
Search

To search for specific articles, you can use Google's advanced search features. Go to www.google.com and enter "site:darrellgrainger.blogspot.com" before your search terms, e.g.

site:darrellgrainger.blogspot.com CSS selectors

will search for "CSS selectors" but only on my site.


Wednesday, August 8, 2007

AJAX

It has been a while since I have posted. I've worked on a number of things but I think the 'hot' topic right now would be testing AJAX. AJAX stands for Asynchronous JavaScript And XML. See Wikipedia's page on AJAX for more information.

A few years back a web site might be set up so you have two frames (left and right). The left frame has a tree view. When you select something in the tree view the web browser sends an HTTP Request to the web server. The web server responds with an HTTP Response to the web browser. The browser then displays it in the right frame.

Basically, every time you click a link, the browser sends a synchronous HTTP Request, the user waits for the response, then the browser displays it.

Along comes AJAX. Now the tree view is JavaScript. When you select something in the tree view, the JavaScript sends an HTTP Request directly to the web server. The web browser is unaware of the request and therefore does not wait for the response. You can continue using the browser while AJAX, asynchronously, waits for the response.

The problem with testing AJAX is that most test software detects when the web browser does an HTTP Request and waits for the HTTP Response. Because AJAX handles the request and response, the test tools are unaware there is a need to wait. If the selection in the tree view took 3 seconds, the test script will click the tree view and nanoseconds later expect the results to be in the right frame.

Solution #1: I can put in a sleep for 3 seconds. The problem is, network conditions change; next time it might take 5 seconds or 7 seconds. We could brute force it and make the sleep 1 hour. But if it sleeps for 1 hour on each click and there are more than two dozen clicks, it will take over a day to run even a simple test case. NOT A GOOD SOLUTION.

Solution #2: If my test suite is written in JavaScript and running in the same authentication domain as the website I'm testing (e.g. Selenium) then I could write my AJAX so it sets a flag when it does the HTTP Request and clears the flag when it gets the HTTP Response. Now the test code can wait for the flag to get set, then wait again for the flag to be cleared, before it assumes the right frame has the results in it. This is a good solution but it requires you to modify the application under test (AUT).
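The wait-for-a-flag idea is the same in any language that can poll. As a rough sketch in shell terms, where check_flag is a hypothetical command that exits 0 once the flag has been cleared (i.e. the HTTP Response has arrived):

# poll the flag once a second, giving up after 60 tries
TRIES=0
until check_flag; do
TRIES=`expr $TRIES + 1`
if [ $TRIES -gt 60 ]; then
echo "Timed out waiting for the AJAX response" >&2
exit 1
fi
sleep 1
done

The wait ends as soon as the flag changes; the one-hour sleep from Solution #1 becomes a generous upper bound that is almost never reached.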

Solution #3: Create a man-in-the-middle setup. The man-in-the-middle technique originated with phishing (fraud) schemes. You have probably seen it. You get an email from 'your bank' telling you to log in and fix a problem with your account. The link says http://mybank.com but the actual link is to http://mybank_com.evil_doers.com. The evil_doers.com website will receive the HTTP Request from your browser, look over the information you are sending, then pass it on to the real mybank.com. When mybank.com receives it, it will log you in and send an HTTP Response back to evil_doers.com. The evil_doers.com site will examine the response, log it and send it back to you.

It is like putting a wiretap on your telephone. We can use this for good. I have host1.com running the web browser. I have host2.com running man-in-the-middle software and it will forward things to testsite.com. On host1.com I would normally go to http://testsite.com/my_fabulous_app/index.jsp. Now I go to http://host2.com/my_fabulous_app/index.jsp. The man-in-the-middle software will be my test software.

Realistically, I could run the web browser and the test software on the same machine. I'd have fewer security issues if I did that and the URL would become http://localhost/my_fabulous_app/index.jsp.

Thursday, July 12, 2007

Bourne shell scripting made easy

Someone was having trouble writing a shell script. A common activity for Bourne shell scripting is to take the output from various commands and use it as the input for other commands. Case in point, we have a server that monitors clients. Whenever we get new monitoring software we have to use the server command line tool to install the cartridge on the server, create the agents for each client, deploy the agents, configure them and activate them.

The general steps are:

1) get a list of the cartridges (ls)
2) using a tool, install them (tool.sh)
3) using the same tool, get a list of agents
4) using the tool, get a list of clients
5) using the tool, for each client create an instance of each agent
6) using the tool, for each agent created deploy to the client
7) using the tool, configure the agents
8) using the tool, activate the agents

Just looking at the first two steps, if I was doing this by hand I would use ls to get a list of all the cartridges. I would then cut and paste the cartridge names into a command to install them.

So a Bourne shell script should just cut the same things out of the ls list.

If the cartridge files all end with the extension .cart I can use:
ls -1 *.cart

If the command to install a cartridge was:
./tool.sh --install_cart [cartridge_name]

I could use:
for c in `ls -1 *.cart`; do
./tool.sh --install_cart $c
done

This is pretty easy and straightforward. What if the command was not as clean as ls? What if the list of agents was something like:
./tool.sh --list_agents
OS: Linux, Level: 2.4, Version: 3.8, Name: Disk
OS: Linux, Level: 2.4, Version: 3.8, Name: Kernel
OS: Windows, Level: 5.1, Version: 3.8, Name: System

To install the agent I only need the Name. If I only wanted the Linux agents, how would I get just the Name? First, you want to narrow it down to the lines you want:
./tool.sh --list_agents | grep "OS: Linux"

This will remove all the other agents from the list and give me:
OS: Linux, Level: 2.4, Version: 3.8, Name: Disk
OS: Linux, Level: 2.4, Version: 3.8, Name: Kernel

Now I need to parse each line. If I use the above command in a for loop I can start with:
for a in `./tool.sh --list_agents | grep "OS: Linux"`; do
echo $a
done

Now I can try adding to the backtick command to narrow things down. The two ways I like to parse a line are with awk or cut. For cut I could use:
for a in `./tool.sh --list_agents | grep "OS: Linux" | cut -d: -f5`; do
echo $a
done

This will break the line at the colons. For the first line, cut would give the fields:
  1. OS
  2. Linux, Level
  3. 2.4, Version
  4. 3.8, Name
  5. Disk

The problem is there is a space in front of Disk. I can add a cut -b2-, which will give me everything from character 2 to the end, i.e. cut off the first character. But what if there is more than one space? This is why I like to use awk. For awk it would be:
for a in `./tool.sh --list_agents | grep "OS: Linux" | awk '{print $8}'`; do
echo $a
done

For awk the fields would become:
  1. OS:
  2. Linux,
  3. Level:
  4. 2.4,
  5. Version:
  6. 3.8,
  7. Name:
  8. Disk

The spaces would not be an issue.

So by using backticks, pipes and grep I can break things apart into just the lines I want. Piping the result of grep to cut or awk then breaks each line apart and keeps just the bits I want.

The only other command I like to use for parsing output like this is sed. I can use sed for things like:
cat file | sed -e '/^$/d'

The // is a regex pattern. The ^ means beginning of line. The $ means end of line. So ^$ would be a blank line. The d is for delete. This will delete blank lines.
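A quick way to see it in action, using printf to fabricate two lines with a blank line between them:

printf 'one\n\ntwo\n' | sed -e '/^$/d'

This prints 'one' and 'two' with the blank line removed.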

Now let's look at a fuller example of usage. I want to list all files in a given directory plus all subdirectories. I want the file size for each file. The ls -lR will give me a listing like:
.:
total 4
drwxrwxrwx+ 2 Darrell None   0 Apr 19 14:56 ListCarFiles
drwxr-xr-x+ 2 Darrell None   0 May  7 21:58 bin
-rw-rw-rw-  1 Darrell None 631 Oct 17  2006 cvsroots

./ListCarFiles:
total 8
-rwxrwxrwx 1 Darrell None 2158 Mar 30 22:37 ListCarFiles.class
-rwxrwxrwx 1 Darrell None 1929 Mar 31 09:09 ListCarFiles.java

./bin:
total 4
-rwxr-xr-x 1 Darrell None 823 May  7 21:58 ps-p.sh

To get rid of the blank lines I can use the sed -e '/^$/d'. To get rid of the path information I can use grep -v ":", assuming there are no colons in the filenames. To get rid of the directories I can use sed -e '/^d/d' because all directory lines start with a 'd'. So the whole thing looks like:
ls -lR | sed -e '/^$/d' -e '/^d/d' | grep -v ":"

But there is actually an easier answer. Rather than cutting out what I don't want, I can use sed to keep what I do want. The sed -n option will output nothing, BUT if the script has a 'p' command it will print that. So I want sed -n with the right 'p' commands. Here is the solution:
ls -lR | sed -n -e '/^-/p'

This is because all the files have '-' at the start of the line. This will output:
-rw-rw-rw-  1 Darrell None 631 Oct 17  2006 cvsroots
-rwxrwxrwx 1 Darrell None 2158 Mar 30 22:37 ListCarFiles.class
-rwxrwxrwx 1 Darrell None 1929 Mar 31 09:09 ListCarFiles.java
-rwxr-xr-x 1 Darrell None 823 May  7 21:58 ps-p.sh

I can now use awk to cut the file size out, i.e. awk '{print $5}'. So the whole command becomes:
ls -lR | sed -n -e '/^-/p' | awk '{print $5}'

If I want to add all the file sizes for a total I can use:
TOTAL=0
for fs in `ls -lR | sed -n -e '/^-/p' | awk '{print $5}'`; do
TOTAL=`expr $TOTAL + $fs`
done
echo $TOTAL

The expr will let me do simple integer math with the output.
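As an aside, awk can sum the sizes itself, which does away with the loop and the expr entirely; this is just an alternative, not a correction:

ls -lR | sed -n -e '/^-/p' | awk '{total += $5} END {print total}'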


NOTE: you can use man to learn more about the various commands I've shown here:

  • man grep
  • man cut
  • man awk
  • man sed
  • man regex
  • man expr

The sed and awk commands are actually powerful enough to have entire chapters written on them. But the man page will get you started.

While you are at it, do a man man.

Enjoy!

Wednesday, June 13, 2007

We're not dead yet...

It has been a while since I posted to my blog. I've been fairly busy moving into my new home. I'm in now and the computer is set up. So it is time to blog again...

We have been hiring people to work in my department, Software Quality Assurance. Because our software products are development and system administration tools, our QA staff needs to know how to program and how to validate the information our tools are providing; do you know AIX, HP-UX, Solaris, Linux (Redhat and SuSE) and Windows? Can you confirm the Disk I/O, Process, Thread, NIC, etc. information is correct? Can you write a multithreaded application which is guaranteed to deadlock so our tools will detect the deadlock? Can you write a J2EE application that exercises all J2EE technologies (EJB, JDBC, Servlets, JSPs, RMI, JNDI, etc.)?

These are the sort of skills the QA staff at my company possess. We interview a lot of people. Most don't have a clue about the basics. No one (myself included) has all the knowledge necessary to do the job well. So how do we do it? An ability to learn and find answers.

As we hire people, some work out but many more don't make it through the probation period; we either terminate them or they quit. I've been trying to put my finger on what the survivors have that the others don't and I think I figured it out. Those who survive have a hacker mentality. One guy I hired, Jerry, found this magazine and thought it would be right up my alley. It was called 2600.

It has been over a decade since I hung out in alt.2600. When I saw the magazine I thought I'd point Jerry to the alt.2600 newsgroup. I was surprised to find out it was gone. I checked google.com to see if the archives were there and there was no hint of alt.2600. If you google "alt 2600" you will find the FAQ and references to the newsgroup but the newsgroup itself is gone. The last time the FAQ was updated was April 2004.

The magazine made me realize though that hackers think differently. Case in point, when Kryptonite locks came out they were advertised as impossible to cut with bolt cutters. I knew someone who took 4 foot bolt cutters and tried. He bent the bolt cutters. I looked at the lock and realized the locking mechanism overlapped the Kryptonite bar by 2mm. A swift whack at this point with a 2 pound hammer and the lock popped open. Most people looked at the ad and tried to figure out how to cut the bar (the ads indicated the bar was uncuttable). I stepped back and thought, the problem is not cutting the bar. That is narrow thinking. The real problem is removing the lock from what it held. Cutting the bar was only one way to do this.

Hackers get into web sites by looking for the weak points. They don't let the requirements lead them. The login web page only lets me enter limited information; don't use the login web page. Create your own web page and set the FORM action to point to the other web site. Design your FORM so you can send more information. Do something you know will fail just to see if there is useful information in the error message. The more you can reveal about the technology, the easier it is to find the weak point.

When I test a piece of software I'm looking for the weak point. This ability to see things from a different point of view lets me find the bugs the developer did not see.

Is being a hacker a dying art?

Friday, April 27, 2007

Sun is not the only vendor of Java

Many people who know about the Java programming language only know about the Sun implementation. But there are actually several different vendors:

  • IBM
  • HP
  • Apple
  • BEA
  • Blackdown

If you are programming Java on the AIX operating system (IBM's version of UNIX) then you would be using IBM Java. If you are programming Java on the HP-UX operating system (Hewlett-Packard's version of UNIX) then you would be using HP Java. Similarly, if you are programming Java on MacOS X then you are using Apple Java. BEA is not a creator of an operating system. They create a J2EE application server called WebLogic. It typically ships with the Sun version of Java and BEA's version of Java. The BEA version is called JRockit. Finally, Blackdown is an implementation associated with Linux.

The idea behind all these different implementations of Java is that they are better in some way. You should get better performance on your web applications if you use JRockit on the BEA WebLogic application server. If you are running Linux, the Blackdown implementation should give you better performance.

If you don't have access to HP-UX, AIX or MacOS X then you will not have the opportunity to use the OS manufacturer's specific version. If you want though, you can download JRockit from BEA. Go to the BEA website, select Products then select JRockit. On the main JRockit page is an option to download JRockit for free. You can get Blackdown for free from the Blackdown website.
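If you are not sure whose Java is currently on your PATH, you can ask it; each vendor identifies itself in the output:

java -version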

Wednesday, April 25, 2007

Identifying UNIX versions

I work in an environment with numerous different versions of UNIX and Linux. Sometimes I'll be accessing multiple machines from my workstation. Occasionally, I need to confirm the OS for the current terminal. The way to determine which version of UNIX you are using is with:
uname -a

For Solaris you would get something like:
SunOS rd-r220-01 5.8 Generic_117350-26 sun4u sparc SUNW,Ultra-60

For HP-UX you would get something like:
HP-UX l2000-cs B.11.11 U 9000/800 158901567 unlimited-user license
or
HP-UX rdhpux04 B.11.23 U ia64 0216397005 unlimited-user license

For AIX you would get something like:
AIX rd-aix09 2 5 00017F8A4C00

From this it is a little harder to see the version. It is actually AIX 5.2. If you check the man page for uname it will help you decode the hexadecimal number at the end. This will tell you things like 4C is the model ID and the 00 is the submodel ID. Additionally, AIX uses other switches to tell you about things the -a normally gives you on other platforms. For example,
uname -p # the processor architecture
uname -M # the model

For Linux things are a little trickier. The uname -a will tell you it is Linux but it will not tell you if it is SuSE Linux Enterprise Server (SLES) 10.0, Redhat AS 5.0, et cetera. To figure this out, look for a text file in /etc/ which ends in 'release', i.e.
cat /etc/*release

This text file will tell you which distribution of Linux you are using.
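Putting this together, here is a minimal sketch of a 'what OS is this?' check, assuming only uname and the /etc/*release convention described above:

# print the Linux distribution if this is Linux,
# otherwise fall back to the full uname output
OS=`uname -s`
if [ "$OS" = "Linux" ]; then
cat /etc/*release
else
uname -a
fi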

Tuesday, April 17, 2007

Going from small programs to large programs

Whenever I'm interviewing someone (or being interviewed) I like to know how many lines of code you have created for one project. I'm not looking for a magic number; people tend to fall into two groups: those who have programmed a few hundred to a thousand lines of code, and those who have worked on something in the tens of thousands.

The reason for asking this is because you can be a junior programmer and still survive programming a few hundred lines of code; surviving tens of thousands of lines takes a different approach.

The trick to programming thousands of lines of code is DON'T. When junior programmers write a program they tend to write the entire program at once. If you are programming 100 lines of code, you can keep the entire concept in your head. Trying to remember 500,000 lines of code would be impossible for all but a few people.

The way you do it is to take the program and break it into sub-programs. You keep breaking it down until you have 5000 small snippets of code. Then you write those snippets, one at a time.

For example, I assigned a co-op student to write a small Bourne shell script. Our product builds in parts and has dependencies. The build system puts all the build output in a specific directory (let's call it $BUILD_DIR). The structure is:

$BUILD_DIR/$PRODUCT/$BRANCH/$BUILD/

What I wanted for the script is for the user to specify the product, branch and build. Then the script would scan the build log for references to any other product in $BUILD_DIR.

The co-op student wrote a getopts loop to get the inputs from the user. Inside the loop was a case statement for each input (product, branch, build, help). In each case statement was an if/else statement for whether you did or didn't get the needed input. If you did not get the needed input, there was a loop to list all the possible inputs.

As you can see, the code to get input, parse it, deal with it, etc. is all in one loop/case/if/else/loop structure.

How could this be written easier?

# Check that $BUILD_DIR is defined and exists

# Get the user input
# Save the product in $PRODUCT
# Save the branch in $BRANCH
# Save the build in $BUILD

# if $BUILD_DIR/$PRODUCT is not defined or does not exist
# list possible inputs for product
# exit

# if $BUILD_DIR/$PRODUCT/$BRANCH is not defined or does not exist
# list possible inputs for branch
# exit

# if $BUILD_DIR/$PRODUCT/$BRANCH/$BUILD is not defined or does not exist
# list possible inputs for build
# exit

# build a list of all other products (omit the current product)

# search $BUILD_DIR/$PRODUCT/$BRANCH/$BUILD for references to anything from
# list of all other products

# print the results

Each break is a separate concept. I would program one at a time. I am going to write the check for $BUILD_DIR. I'm going to think about all the possible problems. The variable could be undefined, check for that. The variable could have the wrong value, check for that. The directory might not be readable by me, check for that. I'd keep thinking of things like this. Once I am positive $BUILD_DIR will hold a good value, I forget about it and focus on getting input from the user. I'm just going to get input from the user. I'm not going to validate it is good input. I'm just going to parse the command line and save all the inputs. Once I have written that, perfectly, I'll move on to validating $PRODUCT. This will be similar to validating $BUILD_DIR. Maybe the code to validate $BUILD_DIR should be a subroutine and I can use it to validate $PRODUCT as well. A sketch of that subroutine appears below.
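To make the first snippet concrete, here is a minimal sketch of that subroutine (the name check_dir is mine, not from the student's script):

# exit with a message if the named variable is empty
# or does not point at a readable directory
check_dir() {
if [ -z "$2" ]; then
echo "$1 is not set" >&2
exit 1
fi
if [ ! -d "$2" -o ! -r "$2" ]; then
echo "$1=$2 is not a readable directory" >&2
exit 1
fi
}

check_dir BUILD_DIR "$BUILD_DIR"
check_dir PRODUCT "$BUILD_DIR/$PRODUCT"

The same subroutine then validates $BRANCH and $BUILD by passing in the longer paths.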

By breaking it down into small, manageable chunks it is just writing a bunch of small code snippets. If you can write one code snippet then writing a dozen is possible.

It is good to get into this habit with small programs. If you practise this technique on small programs then writing the large ones will come naturally.

Tuesday, April 3, 2007

Planning a multi-threaded application

I have used analogies to teach programming to people for over two decades. A while ago I realized that analogies aren't only good for teaching but sometimes you can apply the concepts from something completely different to programming (and you can apply programming concepts to other aspects of life).

For example, I used to talk about Relational Database Management Systems (RDBMS) using file cabinets, drawers, folders, index cards, etc. and I realized that the systems libraries used when I was a kid (before computers were affordable) actually worked the same way, and those concepts helped form the RDBMSes we see today. In other words, before computers existed, secretaries, clerks, librarians, etc. had to keep track of data (books, client information, etc.). If I wanted to search for a book by author they had a set of index cards sorted by author. If I wanted to search for a book by title they had a set of index cards sorted by title. The actual books were sorted, on the shelves, according to the Dewey Decimal System. If a book was added to the library, they had to insert it on a shelf (leaving room for new books so you don't have to shift the entire collection) and an index card for each index was created. The same sort of thing happens in a database: you insert the record for the book and then you update all the indices.

So, how does this relate to multi-threaded applications? Well, we just need to look at a multi-threaded application differently. A professor at U of T used things like Human, Male, Female, Baby, etc. or Animal, Mammal, etc. to teach object-oriented design (OOD). I learned to use project management software; people were 'resources'. You planned the activities of resources. You created dependencies between two resources, e.g. Karen might be working on something but she needs work from Bob on day 4. So at day 4 there is a dependency between Bob and Karen. If the work Bob is doing will take 9 days, I want Bob to start 5 days before Karen, or I want someone to help Bob so it will get done earlier, or I want work for Karen to do while she is waiting for Bob to finish. Basic techniques to ensure my 'resources' are used efficiently.

What if we look at the threads in an application as 'resources' and use project planning software to make sure all the different threads are working together well? Or what about systems with multiple processors? Each processor could be a resource and I want to make sure they are utilized efficiently. If the math co-processor will take 9 seconds to complete a computation and the main CPU will need the result in 4 seconds, I have the same options as with Karen and Bob: I can start the math computation 5 seconds before the CPU needs it, I can have 5 seconds of other work for the CPU while it waits for the math computation, or I can add more math co-processors. As I type this I also realize I could put in a faster math co-processor. In the human scenario, this means I can put someone who is faster than Bob on the task so it will get completed in 4 days rather than 9 days.

So for everyone who thinks learning how to use project management software is only for managing projects, think again.