If you want to use open source test automation frameworks to test web applications, there are two categories to choose from. Some frameworks have their own engines (CSS, JavaScript, HTML, etc.) and others drive a real web browser.
One advantage of the frameworks with their own engines is they have more control and you don't have to deal with the real life implementations of a web browser. You are also not dealing with TCP/IP, network delays, etc.
The disadvantage, on the other hand, is exactly the same thing: you are not dealing with the real life implementations of a web browser, and you are not dealing with TCP/IP, network delays, etc. Basically, if you use a framework with its own engine and your application passes, that does not guarantee it will work in a real web browser like Firefox or Internet Explorer.
Two frameworks which drive a real web browser are
Watij (Web Application Testing In Java) and
Selenium (pronounced sa-LEE-nee-am).
Watij is a Java test framework. You will need to know Java and JUnit to use Watij. The documentation is VERY lacking. The people developing Watij are developers. They are not creating a commercial product. If I were programming a calculator, adding comments to the source code would not be important. Instead, I would want the source code to be self-documenting. The method names should tell me what they do. The class names should imply what methods exist in them (i.e. I'd know what to expect in the file even before I opened it). Additionally, I would write unit test cases to show how the code was expected to be used.
Since Watij is built on the JUnit framework, I expected better documentation. I quickly realized that it was best to add the source code jars to my project, and I occasionally needed to step into the Watij code to figure out how things worked.
Everything was nicely structured, however. You have HtmlElement, which is the base for all other elements. You have things like Button, Table, Link, etc., which inherit from HtmlElement. These are all interfaces, and you have implementations like IEButton. I believe the idea was that some day you might have IEButton, FirefoxButton, ChromeButton, OperaButton, etc.
In Java I might have:
public List<String> doSomething() {
    // declare the variable as the interface, not the implementation
    List<String> text = new ArrayList<String>();
    // more code here
    return text;
}
Similarly, in Watij all the methods return an interface rather than a specific implementation. So the signature for a method returning an IELink is:
public Link getLink(String locator);
Nothing special for a Java test framework. Additionally, a Table extends a TableBody, so a TableBody (and subsequently a Table) can tell you how many rows and columns it has. It can also return a specific TableRow or TableCell. Back to the fact that there is no JavaDoc for the methods: it does not matter, because the code is self-documenting.
When a method needs to return a collection of elements, it returns a dedicated collection type, e.g.
Links links = ie.links();
The implementation of Links is typical Java. It assumes we have Collections, Lists, Maps, etc., so Links implements Iterable<Link>. That means we can have:
for(Link link : links) {
// do something with each link in the collection
}
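To make the pattern concrete, here is a minimal, self-contained sketch of a collection type that works with the enhanced for loop. The Link, SimpleLink and Links types below are hypothetical stand-ins written for this post; they are not Watij's actual classes, only an illustration of the Iterable idea.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-ins for illustration; these are not Watij's classes.
interface Link {
    String text();
}

class SimpleLink implements Link {
    private final String text;

    SimpleLink(String text) {
        this.text = text;
    }

    @Override
    public String text() {
        return text;
    }
}

// Implementing Iterable<Link> is what makes "for (Link link : links)" compile.
class Links implements Iterable<Link> {
    private final List<Link> items = new ArrayList<>();

    void add(Link link) {
        items.add(link);
    }

    @Override
    public Iterator<Link> iterator() {
        return items.iterator();
    }
}

class LinksDemo {
    // Collects the text of every link using the enhanced for loop.
    static List<String> textsOf(Links links) {
        List<String> texts = new ArrayList<>();
        for (Link link : links) {
            texts.add(link.text());
        }
        return texts;
    }

    public static void main(String[] args) {
        Links links = new Links();
        links.add(new SimpleLink("Home"));
        links.add(new SimpleLink("Contact"));
        System.out.println(textsOf(links));
    }
}
```

The caller only ever sees the Link interface, so a different browser implementation could be swapped in without touching the loop.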
Essentially, writing automation is no harder for a Java programmer than writing a simple application.
Selenium, on the other hand, is totally different from Watij. First, it is not a Java framework. Selenium comes in multiple parts and one of them allows you to use Java. The starting point for Selenium is Selenium IDE. You get an add-on for Firefox. When you open the Selenium IDE it turns on recording right away. Everything you do in Firefox is then recorded into an HTML table. The table has three columns. The first column is the command, the second column is the target and the third column is an optional value. For example, if I wanted to click a button, the command would be 'click' and the target would be some way of identifying the button. This could be CSS, an XPath, the id of the button, the name of the button, etc. Another example, using all three columns, would be entering text in a text field. The command would be 'type', the target would be the text field and the value would be the text you want to put in the text field.
This is great for doing some quick and dirty automation, but it does not scale into a test suite. Additionally, if every test case started with logging into the application, you would have duplicate code. If the way you log in changed, you would have to edit all the test cases. This is possibly the number one reason test automation fails: it is not maintainable.
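As a rough illustration of the duplication problem, the sketch below models each recorded row as a command, target and value, and keeps the login steps in one place so every test case reuses them. The Command class, the locators (id=username, id=password, id=login) and the test names are all made up for this example; a real application would have its own.

```java
import java.util.ArrayList;
import java.util.List;

// One row of the recorded table: command, target, optional value.
class Command {
    final String command;
    final String target;
    final String value;

    Command(String command, String target, String value) {
        this.command = command;
        this.target = target;
        this.value = value;
    }

    @Override
    public String toString() {
        return command + " | " + target + " | " + value;
    }
}

class LoginSteps {
    // The only place that knows how to log in (hypothetical locators).
    static List<Command> login(String user, String password) {
        List<Command> steps = new ArrayList<>();
        steps.add(new Command("type", "id=username", user));
        steps.add(new Command("type", "id=password", password));
        steps.add(new Command("click", "id=login", ""));
        return steps;
    }
}

class SharedLoginDemo {
    // Both test cases start from the shared login steps.
    static List<Command> viewProfileTest() {
        List<Command> steps = new ArrayList<>(LoginSteps.login("alice", "secret"));
        steps.add(new Command("click", "link=Profile", ""));
        return steps;
    }

    static List<Command> logoutTest() {
        List<Command> steps = new ArrayList<>(LoginSteps.login("alice", "secret"));
        steps.add(new Command("click", "link=Logout", ""));
        return steps;
    }

    public static void main(String[] args) {
        for (Command step : viewProfileTest()) {
            System.out.println(step);
        }
    }
}
```

If the login form changes, only LoginSteps.login needs editing. With purely recorded test cases, the same three rows would be copied into every table.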
The next part of Selenium is Selenium RC (remote control). After you record the test case using the IDE, you can save it in a variety of different languages. If you are a Java programmer, you can save it as Java code.
To run the Java code you'll need the Selenium RC jars and a Selenium Server. The way it works is that the Selenium RC jar is a client. Your Java code sends commands to the server via the client. The server then uses Selenium Core, the same JavaScript engine that Selenium IDE is built on, to control the web browser.
The nice thing about this is you can now record a snippet using the Selenium IDE, save it as Java and incorporate it into your Java test suite.
This is the theory. The reality is that the way Selenium IDE identifies the elements in the browser is not always the best choice. For example, I might have a set of SPAN elements inside a TABLE to make the cells format nicely. Along comes Internet Explorer 9 and my table looks horrible, so I have to add a DIV around each SPAN. From a customer's point of view everything looks the same, but from a Selenium locator point of view my test cases all break. Selenium can no longer find the cell. So you will need to look at the locator Selenium IDE has selected and decide if you want to change it.
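The SPAN and DIV situation can be reproduced with nothing but the XPath support in the JDK. This is only a sketch: the markup is a simplified, well-formed stand-in for a real page, and the id "total" and both locator strings are invented for the example.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class LocatorDemo {
    // Simplified markup before and after wrapping the SPAN in a DIV.
    static final String BEFORE =
            "<table><tr><td><span id=\"total\">42</span></td></tr></table>";
    static final String AFTER =
            "<table><tr><td><div><span id=\"total\">42</span></div></td></tr></table>";

    // A locator tied to the exact nesting, and one tied to the id.
    static final String POSITIONAL = "/table/tr/td/span";
    static final String BY_ID = "//span[@id='total']";

    // Returns how many nodes in the markup match the XPath expression.
    static int countMatches(String markup, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    expression,
                    new InputSource(new StringReader(markup)),
                    XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("positional, before: " + countMatches(BEFORE, POSITIONAL));
        System.out.println("positional, after:  " + countMatches(AFTER, POSITIONAL));
        System.out.println("by id, after:       " + countMatches(AFTER, BY_ID));
    }
}
```

The positional locator finds the cell before the change and nothing after it, while the id-based locator matches in both versions. That is the kind of review a recorded locator needs before it goes into a suite.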
Additionally, AJAX and timing are a huge issue. For example, you have a form with two select lists. The first list is a list of countries. The second list is empty. You select your country and an AJAX call populates the second list with states, provinces, territories, etc. Selenium IDE will record that you waited 1 second before selecting your province because you selected Canada for the country. If you select United States it takes 2 seconds to populate the second list. A bad automator will just add a few seconds to the wait. A good automator will look for something in the DOM which signals the second list is populated. For example, the second list might be hidden, so you want to wait for the second list to become visible. Knowing that you need to code this is one skill you must have when using Selenium IDE; knowing what to code is even harder.
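Here is one way the "wait for a signal, not for a fixed time" idea might look in plain Java. It is a sketch, not Selenium's own wait mechanism: the Wait helper and the visibleAfter simulation are invented for this post, and the condition stands in for whatever check tells you the second list is ready.

```java
import java.util.function.BooleanSupplier;

class Wait {
    // Polls the condition until it is true or the timeout expires.
    // Returns true if the condition was met, false on timeout.
    static boolean until(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.currentTimeMillis() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}

class WaitDemo {
    // Simulates a list that becomes visible after an unpredictable delay.
    static BooleanSupplier visibleAfter(long delayMillis) {
        final long readyAt = System.currentTimeMillis() + delayMillis;
        return () -> System.currentTimeMillis() >= readyAt;
    }

    public static void main(String[] args) {
        boolean canada = Wait.until(visibleAfter(100), 5000, 50);
        boolean unitedStates = Wait.until(visibleAfter(200), 5000, 50);
        System.out.println("Canada list ready: " + canada);
        System.out.println("United States list ready: " + unitedStates);
    }
}
```

The test waits only as long as it has to, whichever country is selected, and it fails with a clear timeout instead of clicking on an empty list. With Selenium RC the condition could be something like a call to selenium.isVisible() on the second list's locator.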
So the record feature is not that good. It will most often encourage less experienced automators to create code duplication and use sleep/wait statements to handle things like AJAX/Web 2.0 features.
Another issue with Selenium is the Java support. When you look at the Java support it is fairly simplistic. There is no hierarchy to the data structures and methods that return multiple values return arrays of strings. You'll also find gaps: for example, I can find out how many elements match an XPath, but I have no way of iterating over the list of elements. In Watij you can do the following:
List<String> result = new ArrayList<String>();
String myXPath = "//A[contains(text(), 'Phoebe')]";
Links phoebeLinks = ie.links(xpath(myXPath));
for (Link currentPhoebeLink : phoebeLinks) {
    // collect the visible text of each matching link
    result.add(currentPhoebeLink.text());
}
To do the same thing in Selenium RC using Java, in theory, you need to do:
String[] allLinks = selenium.getAllLinks();
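then iterate over all the strings to figure out which ones are the ones you are looking for. However, when I use this method it actually returns an array of one element, and the one string it does contain is "". In other words, it doesn't seem to work for this purpose. (The Selenium RC documentation describes getAllLinks() as returning the IDs of the links, with "" for any link that has no ID, which would explain the result.)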
My workaround for this problem was to use the getHtmlSource() method. This returns all the source code for the current page. I can then load it into something like JTidy and use the JTidy interface to traverse the page and find all the links. To actually iterate over all the links and do something with them, I'd have to create a unique list of the links using JTidy, then go back to Selenium and interact with them.
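The shape of that workaround can be sketched as follows. To keep the example self-contained it uses the DOM parser that ships with the JDK as a stand-in for JTidy, which only works because the sample page is well-formed; real pages usually are not, which is exactly why a tool like JTidy is needed. The LinkFinder class, the sample page and the choice of a link= locator are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class LinkFinder {
    // Parses the page source and returns the unique link texts
    // that contain the given fragment, in document order.
    static List<String> linkTextsContaining(String pageSource, String fragment) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(pageSource)));
            NodeList anchors = doc.getElementsByTagName("a");
            List<String> matches = new ArrayList<>();
            for (int i = 0; i < anchors.getLength(); i++) {
                String text = anchors.item(i).getTextContent().trim();
                if (text.contains(fragment) && !matches.contains(text)) {
                    matches.add(text);
                }
            }
            return matches;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse page source", e);
        }
    }

    // Builds a locator string to hand back to Selenium for each link.
    static String toLocator(String linkText) {
        return "link=" + linkText;
    }

    public static void main(String[] args) {
        String page = "<html><body>"
                + "<a href=\"/a\">Phoebe Smith</a>"
                + "<a href=\"/b\">Home</a>"
                + "<a href=\"/c\">Phoebe Jones</a>"
                + "</body></html>";
        for (String text : linkTextsContaining(page, "Phoebe")) {
            System.out.println(toLocator(text));
        }
    }
}
```

Each locator would then be passed back to Selenium to click or inspect the link, which is a lot of ceremony compared to the Watij loop.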
Essentially, I found that writing a test case in pure Java was simple with Watij and difficult if not sometimes impossible with Selenium RC. Ultimately, the maintenance on Selenium will be the potential downfall.
On the other hand, when I actually run the test cases I find that Internet Explorer has problems. It hangs and locks up occasionally. For an automated test suite this is unacceptable. With Watij I am stuck. With Selenium I can run the tests in Firefox, Opera, Safari or Internet Explorer. So automated nightly build and test is possible with Selenium but not really viable with Watij.
As I see it, Watij is a fantastic house built on a bad foundation. Selenium is a good starter home which can easily be moved to a variety of different foundations.
Neither is a great solution. I spent a year trying to work around the bugs in Internet Explorer with Watij but in the end just gave up. For now, I'm looking at Selenium. Maybe it is time to start doing some Selenium RC programming, i.e. give back to the community.