Wednesday, June 23, 2010

Understanding how xpath relates to web pages

When using automation tools like Selenium or Watij, you often find yourself creating an xpath to find an element. From talking to a few people, I get the impression there is a lack of understanding of how an xpath relates to a web page.

I think the step which is missing for most people is understanding how to look at a web page.

A web page is merely a group of blocks inside blocks. To illustrate I have the following image:

Imagine the outer block is the <BODY> of the web page. Inside the outer block, i.e. the BODY, are two rectangles. Let's say they are <TABLE> elements. The top table, i.e. /HTML/BODY/TABLE[1], has one row and three columns. The lower table, i.e. /HTML/BODY/TABLE[2], has three rows and four columns.
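To make the blocks-inside-blocks idea concrete, here is a minimal Python sketch. The markup is hypothetical: I made it up to match the description above, including a TBODY inside each TABLE, which browsers typically add when they parse a table. The helper name absolute_paths is my own too. It walks the tree and prints the path to every block.

```python
from collections import Counter
import xml.etree.ElementTree as ET

# Hypothetical page: a 1x3 table on top, a 3x4 table below.
PAGE = """<HTML><BODY>
<TABLE><TBODY>
<TR><TD>a</TD><TD>b</TD><TD>c</TD></TR>
</TBODY></TABLE>
<TABLE><TBODY>
<TR><TD>1</TD><TD>2</TD><TD>3</TD><TD>4</TD></TR>
<TR><TD>5</TD><TD>6</TD><TD>7</TD><TD>8</TD></TR>
<TR><TD>9</TD><TD>10</TD><TD>11</TD><TD>12</TD></TR>
</TBODY></TABLE>
</BODY></HTML>"""


def absolute_paths(element, path):
    """Yield the path of this block, then the paths of the blocks inside it."""
    yield path
    totals = Counter(child.tag for child in element)
    seen = Counter()
    for child in element:
        seen[child.tag] += 1
        step = child.tag
        # Only add an index such as [2] when siblings share the same tag.
        if totals[child.tag] > 1:
            step += f"[{seen[child.tag]}]"
        yield from absolute_paths(child, f"{path}/{step}")


root = ET.fromstring(PAGE)
all_paths = list(absolute_paths(root, "/" + root.tag))
for p in all_paths:
    print(p)
```

The output starts with /HTML, /HTML/BODY and /HTML/BODY/TABLE[1], and each extra step in a path is one block further in.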

Let's say that both tables have one row of cells with the class 'foo', i.e. <TD class='foo'>. If I wanted to find all cells with class='foo' whose text contains 'bar' I would use:

    //TD[@class='foo' and contains(text(), 'bar')]

But what if I wanted to search only the second table? Then I would use:

    //TABLE[2]/TBODY/TR/TD[@class='foo' and contains(text(), 'bar')]
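Here is a rough way to try both searches without a browser, using Python's xml.etree.ElementTree. Some assumptions to be aware of: the markup and the helper cells_containing are made up for illustration, the paths start with a '.' because ElementTree searches relative to an element, and ElementTree only supports a subset of xpath with no contains(), so the 'bar' check is done in plain Python instead. A tool with a full xpath engine should accept the expressions exactly as written above.

```python
import xml.etree.ElementTree as ET

# Hypothetical page: each table has one row of class='foo' cells.
PAGE = """<HTML><BODY>
<TABLE><TBODY>
<TR><TD class="foo">rebar</TD><TD class="foo">baz</TD><TD class="foo">bar</TD></TR>
</TBODY></TABLE>
<TABLE><TBODY>
<TR><TD>bar</TD><TD>w</TD><TD>x</TD><TD>y</TD></TR>
<TR><TD class="foo">bar</TD><TD class="foo">qux</TD><TD class="foo">crowbar</TD><TD class="foo">quux</TD></TR>
<TR><TD>p</TD><TD>q</TD><TD>r</TD><TD>s</TD></TR>
</TBODY></TABLE>
</BODY></HTML>"""

root = ET.fromstring(PAGE)


def cells_containing(path, needle):
    """Find cells by path, then keep those whose text contains needle.

    The Python filter stands in for contains(text(), 'bar').
    """
    return [td.text for td in root.findall(path) if needle in (td.text or "")]


# Search the whole page.
whole_page = cells_containing(".//TD[@class='foo']", "bar")
# Search only the second table.
second_table = cells_containing(".//TABLE[2]/TBODY/TR/TD[@class='foo']", "bar")

print(whole_page)    # ['rebar', 'bar', 'bar', 'crowbar']
print(second_table)  # ['bar', 'crowbar']
```

Note that the plain 'bar' cell in the first row of the lower table is skipped by both searches because it has no class='foo'.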

Essentially, the longer the xpath the smaller the area of the web page I am searching. Using //BODY will search the largest square in my example. Using //BODY/TABLE[2] will search the lower table, i.e. the second level in.

If you look at the third row of the lower table you can see the 'cells' contain another level of element. Let's say that each of those cells, i.e. each <TD>, contains a <DIV>. Using //TABLE[2]/TBODY/TR[3]/TD/DIV[1] focuses on the first DIV in each cell of the last row of the lower table.
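A small Python sketch of that last search, again against made-up markup. The sample tables include a TBODY, so the path used here includes it as well, and the leading '.' is only there because ElementTree searches relative to an element. I gave the first cell two DIVs to show what the [1] does.

```python
import xml.etree.ElementTree as ET

# Hypothetical page: the third row of the lower table holds DIVs.
PAGE = """<HTML><BODY>
<TABLE><TBODY>
<TR><TD>a</TD><TD>b</TD><TD>c</TD></TR>
</TBODY></TABLE>
<TABLE><TBODY>
<TR><TD>1</TD><TD>2</TD><TD>3</TD><TD>4</TD></TR>
<TR><TD>5</TD><TD>6</TD><TD>7</TD><TD>8</TD></TR>
<TR>
<TD><DIV>first-a</DIV><DIV>second-a</DIV></TD>
<TD><DIV>first-b</DIV></TD>
<TD><DIV>first-c</DIV></TD>
<TD><DIV>first-d</DIV></TD>
</TR>
</TBODY></TABLE>
</BODY></HTML>"""

root = ET.fromstring(PAGE)

# First DIV inside every cell of the third row of the second table.
first_divs = [d.text for d in root.findall(".//TABLE[2]/TBODY/TR[3]/TD/DIV[1]")]

# One step longer: first DIV inside the first cell only.
one_cell = [d.text for d in root.findall(".//TABLE[2]/TBODY/TR[3]/TD[1]/DIV[1]")]

print(first_divs)  # ['first-a', 'first-b', 'first-c', 'first-d']
print(one_cell)    # ['first-a']
```

Adding the index on TD makes the xpath longer and the search area smaller, from a whole row down to a single cell.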

