Wednesday, January 27, 2010

xpath

I have been doing a lot of web testing. The general idea behind all UI test automation tools is to locate an element on a page and then do something with it: set it, clear it, read it, etc.

For the web automation tools you can use:

(a) the position on the screen (x,y coordinates)
(b) the position in the DOM (e.g. /html/body/table/tbody/tr[2]/td[4])
(c) a unique attribute

The position on the screen never works. Different browsers, fonts, resized windows, etc. will change the layout and, ultimately, the screen position of elements. I wouldn't use this. Working with development to provide an alternative means of locating elements will be less work than maintaining automation with x,y coordinates.

The position within the DOM is a little brittle. When a browser is patched or a new browser needs to be supported, it is not uncommon for the developers to throw in some span or div elements to help with layout. So the element at /html/body/table might move to /html/body/div/span/table. The more precise the positioning information, the more brittle it will be.

A unique attribute would be something like the id of a tag. For example:

<td class='username-cell' id='username'>Darrell</td>

I can find this via the class or id attribute, or both. This is where xpath comes in handy. The automation tools I have been using (Watij, Selenium) can use xpath to locate an element. For my td element I can use:
//TD[@class='username-cell']
or
//TD[@id='username']
or
//TD[@class='username-cell' and @id='username']
The id attribute is required to be unique, so if the element has an id, that is the attribute to use. If you start an xpath with // it tells the tool to search anywhere in the DOM. Starting with a single / will start at the root; for a web page that will always be /html.

Xpath can be quite powerful in identifying elements. There are a few 'tricks' you want to know. First, id='foo' is not the same as id=' foo' or id='foo '. The whitespace makes a difference. To get around this I would use:
//TD[contains(@id,'foo')]
Now the whitespace does not matter. You have to be careful with this, however. If there are two matches, what happens depends on the automation tool. If I have:

<td id='user'>darrell</td>
<td id='username'>Darrell</td>
then:
//TD[contains(@id,'user')]
will have unpredictable results. Not something you want in test automation. So how to get around this?
//TD[contains(@id,'user') and not(contains(@id,'username'))]
Any attribute inside the tag can be used via the @ symbol. You can do things like look at the style attribute using @style. Because the order of the properties in a style attribute does not matter to the browser, the contains() function helps a lot.
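For instance, to match an element hidden with an inline style, no matter where that property sits in the style string (a made-up element for illustration):

//DIV[contains(@style,'display: none')]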

Finally, the text between the open and close tags can be matched using the text() function. So if I had:

<a href='http://www.google.com'>Google</a>

I can find it using:
//A[contains(text(),'Google')]
What about the difference between Google and google? For matching either you can use:
//A[contains(lower-case(text()),'google')]
This will take the text in the anchor (e.g. Google), change it to lower case (e.g. google), then compare the resulting string to 'google'.
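One caveat: lower-case() is an XPath 2.0 function. If your tool only supports XPath 1.0, as most browser-based tools do, you can get the same effect with translate():

//A[contains(translate(text(),'ABCDEFGHIJKLMNOPQRSTUVWXYZ','abcdefghijklmnopqrstuvwxyz'),'google')]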

In addition to 'and' there is an 'or' keyword as well, but I usually find it better to narrow things down ('and' filters out) rather than build up the matches ('or' combines).

There is a lot more to know about xpath. If you are curious, ask.

Additionally, I find a developer will put an id on something because he uses it from javascript to find it and all the elements underneath it. So if there is:

<div id='users'>
  <ul>
    <li>Darrell</li>
    <li>Jerome</li>
    <li>Mark</li>
  </ul>
</div>

I would tend to use:

//DIV[contains(@id,'users')]/UL/LI[contains(text(),'Darrell')]

to find the element which contains 'Darrell'. The developer will tend not to change the structure under the id'd tag because that would cause them a lot of maintenance as well.
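To show how this plugs into one of the tools mentioned above, here is a minimal sketch using the Selenium RC Java client; the host, port, browser and URL are placeholders, not from a real project:

import com.thoughtworks.selenium.DefaultSelenium;

public class XpathLookup {
    public static void main(String[] args) {
        // connect to a Selenium RC server (placeholder host/port/browser/URL)
        DefaultSelenium selenium =
                new DefaultSelenium("localhost", 4444, "*firefox", "http://localhost/");
        selenium.start();
        selenium.open("/");
        // read the text of the list item located by the xpath above
        String name = selenium.getText(
                "xpath=//DIV[contains(@id,'users')]/UL/LI[contains(text(),'Darrell')]");
        System.out.println(name); // expect: Darrell
        selenium.stop();
    }
}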

Test Automation Manifesto

I have recently been reading xUnit Test Patterns by Gerard Meszaros. It is an excellent book with a lot of things I learned the hard way. I also found an article by Gerard et al. titled "The Test Automation Manifesto". The manifesto is:

Automated tests should be:

1. Concise: As simple as possible and no simpler.
2. Self Checking: Test reports its own results; needs no human interpretation.
3. Repeatable: Test can be run many times in a row without human intervention.
4. Robust: Test produces same result now and forever. Tests are not affected by changes in the external environment.
5. Sufficient: Tests verify all the requirements of the software being tested.
6. Necessary: Everything in each test contributes to the specification of desired behavior.
7. Clear: Every statement is easy to understand.
8. Efficient: Tests run in a reasonable amount of time.
9. Specific: Each test failure points to a specific piece of broken functionality; unit test failures provide "defect triangulation".
10. Independent: Each test can be run by itself or in a suite with an arbitrary set of other tests in any order.
11. Maintainable: Tests should be easy to understand, modify, and extend.
12. Traceable: To and from the code it tests and to and from the requirements.

In his article he talks about bad code 'smells'. A 'smell' is something which you notice again and again as a problem. Even before you have clearly identified them, you can 'sniff' them out.

The first two code smells he talks about are THE reason record and playback test automation doesn't work.

When you use record and playback to automate, it will (a) hard-code test data and (b) duplicate code. If I'm testing an application which requires me to log in before each test, the recorder will record me logging in for each test case. If I do not refactor the log in to a function that all the test cases use and the developer changes the login, I would have a maintenance nightmare. Additionally, if the username and password change and the test data is hard-coded, I'd have to find all the instances and change them.

By the way, the idea of 'refactoring' is to change the code to be more maintainable without changing the way it runs. For example, if I had the following code:

// test#1
// put "darrell" in username text field
// put "password" in password text field
// click Submit button
//
// test#2
// put "darrell" in username text field
// put "password" in password text field
// click Submit button
//
// test#3
// put "darrell" in username text field
// put "password" in password text field
// click Submit button
//
// test#4
// put "darrell" in username text field
// put "password" in password text field
// click Submit button
//

and it worked as expected, I could refactor it into:

// login(username, password)
// put $username in username text field
// put $password in password text field
// click Submit button

// test#1
// login("darrell", "password")
//
// test#2
// login("darrell", "password")
//
// test#3
// login("darrell", "password")
//
// test#4
// login("darrell", "password")
//

This will run pretty much the same, but now if the Submit button gets changed to a Login button I don't have to change all 4 test cases (what if I had 40,000 test cases?). I just change:

// login(username, password)
// put $username in username text field
// put $password in password text field
// click Login button

and it updates all the test cases. This code still has hard-coded data. How about changing it to put the data in a property file:

# login data
username=darrell
password=password

then change my code to:

// read property file
// username = getProperty("username")
// password = getProperty("password")

// test#1
// login(username, password)
//
// test#2
// login(username, password)
//
// test#3
// login(username, password)
//
// test#4
// login(username, password)
//

Now I can edit the property file if I want to change the username or password.

I picked a property file to hold the test data, but it could just as easily have been a database, spreadsheet, text file, compiled resource, global variables, etc.
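In real code (Java here, since Watij and the Selenium client are Java based), reading that property file is only a few lines. A minimal sketch, assuming the file is named login.properties:

import java.io.FileInputStream;
import java.io.IOException;
import java.util.Properties;

public class TestData {
    public static void main(String[] args) throws IOException {
        // load the login data from the property file shown above
        Properties props = new Properties();
        props.load(new FileInputStream("login.properties"));
        String username = props.getProperty("username");
        String password = props.getProperty("password");
        // the test cases would now call login(username, password)
        System.out.println("will log in as " + username);
    }
}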

Tuesday, January 19, 2010

the math behind test automation

It has been a while since I had time to write to my blog. A lot has happened. In the world of test automation I created a kick ass test suite using Watij. I built out a wonderful set of libraries and had a tonne of reusable code. I could whip out a new test case in minutes; complex test cases might take an hour or two.

New features and code changes required a minimal amount of work. This is usually key to the survival of a test automation framework. I could have used record and playback to create the automation, but small changes in the application would render it useless. Most test tools recommend re-recording the test case. If recording the test case and adding in the appropriate verifications takes longer than manually testing the application, then a record and playback automation tool is pointless.

You need to create reusable code. Test automation is software development. I have always believed this and recently read an article on InfoQ by Dale H. Emery which reiterated this belief.

The only way a test automation suite pays off is by being maintainable. Here is the math:

- let x be the time it takes to run the test suite manually once
- let y be the cost for each unit of x
- let z be the number of iterations we need to run the full test suite
- therefore the cost of manual testing is xyz

- let a be the time it takes to create the automated test suite
- let b be the cost for each unit of a
- assume the time it takes to run the automated suite is infinitesimally small
- therefore the cost of automated testing is ab (running the tests more than once does not incur any significant cost)
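To make that concrete (the numbers here are invented for illustration): if a manual pass takes x = 40 hours at y = $50 per hour and we need z = 10 iterations, manual testing costs xyz = $20,000. If building the automation takes a = 160 hours at the same b = $50 per hour, it costs ab = $8,000 and has paid for itself by the fourth iteration.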

As long as ab < xyz, the return on investment is worth it.

Companies selling test automation tools will often sell them with the idea that you can record and playback the automation almost as easily as manual testing. Even if creating the automation costs twice as much as one manual pass (ab = 2xy), then ab < xyz becomes 2xy < xyz, which reduces to 2 < z. Or in English, if you run the automation more than twice, it pays for itself. Sounds pretty good, doesn't it?

This is an incredibly simplistic view. It does not take into consideration the time spent dealing with bugs in the automation tool, lack of support for new technologies or configurations, training the staff, learning how to avoid false positives, and learning how to eliminate false negatives (these greatly increase the cost of automation, especially if not handled well). After you take all that into consideration, the reality becomes that you need to run the test automation 6 to 10 times with no modification just to break even. And everyone knows that the development team will not change the application in a 6 to 10 week period, right? Bottom line: even on the first release, the test automation does not pay for itself.

So you spent $10,000 on a test automation tool and hired a consultant to use it and create your test suite for another $50,000, only to find out that it never really worked or there were additional costs. If you decide to drop the tool (which has an annual service contract), the sales team jumps into high gear and convinces you that maybe after the next release it will pay for itself. By the third release you'd start seeing a profit. You have invested thousands and want to believe you can make this money pit viable. You desperately believe the sales staff.

The record and playback is a losing battle. The code is just not maintainable. There is no reuse of code. It is equivalent to cut&paste, and any intermediate developer will tell you that cut&paste is a bad thing. If you have a section of code that clicks the Login button and they change the way the button is identified, you find yourself changing a thousand lines of code. If the automation used a library call to click the Login button, then one change is all you need for all the test cases which click the Login button.

And that is the trick. As an automation developer, I record a test case, look at the code produced, and refactor it so common operations are placed in a library. The next time I do a recording, if I see the same chunk of code, I replace it with the library call. The problem now is remembering everything I put into a library and refactoring each recording to use the library calls. For this, structure your library. I like to use an object oriented language for automation. I can then use one class for each feature or 'screen' or 'window'. I basically break the application into 'screens' or states and model my test framework around that. Now my library has an easy to remember structure, just like a language's standard libraries (e.g. the Java Software Development Kit or the C++ Standard Template Library).

So now I'm developing a library similar to the C++ STL or Java SDK. If I'm a failed developer forced to do test automation, I'm probably not a good tester and I definitely don't have the skill set to create the STL.

What else can I do to make the math work out for test automation? Look at maintenance and reporting. If a test framework cannot tell you quickly and easily what the state of testing is, then it is not very good. You need to have good reporting capabilities. If the framework has this built in, all the better. If it just happens as part of the development of a test case, great.

For my last test framework I added log4j messages to all the library calls. If set to OFF it would output nothing. If set to INFO it would print out the test steps as they occurred, with the data being used (this helps to reproduce the test manually, if necessary). If set to DEBUG, it tells you the details and inner workings of the library calls. This is more for maintaining the code than reporting what is happening.

As for code maintenance, you want something that works in a good debugger. Running to a point and then examining the state of the application under test (AUT) is important for fixing broken automation, as is being able to quickly and easily find, manipulate and verify elements of the AUT. Tools like Watij are Java based and take full advantage of the Java programming language. If I have an xpath that matches a dozen text fields, I can ask Watij to give me a List and then use the Iterator to look at each text field.
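As a sketch of what that instrumentation looked like (the class and method names here are mine, not from the original framework):

import org.apache.log4j.Logger;

public class LoginPage {
    private static final Logger log = Logger.getLogger(LoginPage.class);

    public void login(String username, String password) {
        // INFO level: a human-readable test step, enough to reproduce manually
        log.info("login: username=" + username);
        // DEBUG level: the inner workings, for maintaining the library itself
        log.debug("locating username field via //INPUT[@id='username']");
        // ... drive the browser here ...
    }
}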

On the other hand, tools like Selenium are very popular, but Selenium's native language is HTML tables. Each row (TR) in the table is a test step. The first column is the action, the second column is the locator, and the third (optional) column is any extra parameter needed.
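For example, the log in from earlier, recorded in that table form, would look something like this (the locators are invented for illustration):

<tr><td>type</td><td>id=username</td><td>darrell</td></tr>
<tr><td>type</td><td>id=password</td><td>password</td></tr>
<tr><td>clickAndWait</td><td>//input[@value='Submit']</td><td></td></tr>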

You can get Selenium-RC with a Java API, but the Java support uses String[] rather than Lists and Collections. I can find out how many text fields match a given xpath, but I have no direct way of iterating over them. I often find myself adding the support I've grown used to in Watij. This means more code and more maintenance.
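The usual workaround, sketched below, is to count the matches and then index into the match set one element at a time (host, port, browser and URL are placeholders):

import com.thoughtworks.selenium.DefaultSelenium;

public class TextFieldWalk {
    public static void main(String[] args) {
        DefaultSelenium selenium =
                new DefaultSelenium("localhost", 4444, "*firefox", "http://localhost/");
        selenium.start();
        selenium.open("/");
        String xpath = "//INPUT[@type='text']";
        // Selenium RC can count the matches but does not hand back a collection
        int count = selenium.getXpathCount(xpath).intValue();
        for (int i = 1; i <= count; i++) { // xpath indexes are 1-based
            String value = selenium.getValue("xpath=(" + xpath + ")[" + i + "]");
            System.out.println("text field " + i + " = " + value);
        }
        selenium.stop();
    }
}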

The unfortunate thing is nothing is perfect. Watij is easier to use as a Java developer, but it only works with Internet Explorer (and I have found that IE locks up occasionally; the status bar indicates it is waiting for 1 more item to load even though everything is loaded). IE is not a very stable platform, and with Watij you have no choice but to deal with it. Selenium does not have the Java structure and support of Watij, but I ran it continuously on IE and it never locked up (same AUT). Additionally, I can run Selenium on my Mac, a Linux box or Windows. It works with Safari, Firefox and IE.

Watij is easier to maintain, but if the tests hang and never complete, Selenium seems to be the solution. Going forward I will try to write a set of miscellaneous libraries which mirror the Watij structure. Hopefully once that is done, Selenium will be just as powerful as Watij and far more stable.