Friday, April 20, 2012

How to find a popup window if it does not have a name

When using Selenium 2.0 (WebDriver), you will sometimes find that clicking a link opens a popup window. Before you can interact with the elements in the new window, you need to switch to it. If the new window has a name, you can use the name to switch to the popup window:
driver.switchTo().window(windowName); // windowName is a placeholder for the popup's actual name

However, if the window does not have a name, you need to use its window handle. The problem is that the window handle changes every time you run the code, so you have to find the handle of the popup window at run time. To do this, I call getWindowHandles() to get the handles of all the open windows, then I click the link to open the new window. Next I call getWindowHandles() again to get a second set of window handles. If I remove all the window handles of the first set from the second set, I should end up with a set containing only one element. That element will be the handle of the popup window.

Here is the code:
import java.util.HashSet;
import java.util.Set;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;

String getPopupWindowHandle(WebDriver driver, WebElement link) {

    // get all the window handles before the popup window appears
    Set<String> beforePopup = driver.getWindowHandles();

    // click the link which creates the popup window
    link.click();

    // get all the window handles after the popup window appears
    // (copied into a new set so it is safe to modify)
    Set<String> afterPopup = new HashSet<String>(driver.getWindowHandles());

    // remove all the handles from before the popup window appears
    afterPopup.removeAll(beforePopup);

    // there should be only one window handle left
    if (afterPopup.size() == 1) {
        return afterPopup.iterator().next();
    }
    return null;
}
To use this, I call it with the WebElement that opens the new window when clicked. You will want to add some error handling, but the basic idea of finding the popup window is here. To use the method I would use:
String currentWindowHandle = driver.getWindowHandle();
String popupWindowHandle = getPopupWindowHandle(driver, link);
driver.switchTo().window(popupWindowHandle);
// do stuff on the popup window
// close the popup window
driver.close();
// switch back to the original window
driver.switchTo().window(currentWindowHandle);
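
As one possible take on the error handling mentioned above, the set arithmetic can be pulled out into a small helper that needs no browser to try out. This is only a sketch: the class name PopupHandles, the method findNewHandle and the handle strings are made up for illustration, and the two sets stand in for the results of the two getWindowHandles() calls.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

class PopupHandles {

    // Hypothetical helper: returns the one handle that is in 'after' but not in 'before'.
    // Throws if zero or several new windows appeared, instead of returning null.
    static String findNewHandle(Set<String> before, Set<String> after) {
        // Work on a copy so the caller's sets are left untouched.
        Set<String> added = new LinkedHashSet<>(after);
        added.removeAll(before);
        if (added.size() != 1) {
            throw new IllegalStateException(
                    "Expected exactly one new window but found " + added.size());
        }
        return added.iterator().next();
    }

    public static void main(String[] args) {
        // Made-up handle strings standing in for real window handles.
        Set<String> before = new HashSet<>(Arrays.asList("main-window"));
        Set<String> after = new HashSet<>(Arrays.asList("main-window", "popup-window"));
        System.out.println(findNewHandle(before, after)); // prints "popup-window"
    }
}
```

Throwing an exception rather than returning null makes a missing or unexpected extra popup fail at the point where it happens, rather than later when the handle is used.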

Wednesday, April 4, 2012

Regular Expression

I've been processing files and data a lot lately which means I've been using Regular Expressions.

Regular Expressions are a very powerful pattern matching tool. If you have used MSDOS or the Bourne shell, you are familiar with wildcards: "*.txt", for example, will match all files ending with .txt. Regular Expressions take this to a whole new level.

The first thing to note is that there are different implementations of Regular Expression. The basic concepts are the same and most of the syntax is the same, but there are subtle differences. I'll talk more about this as I give examples of the language.

The second thing to note is that some of the special symbols from MSDOS or the Bourne shell are used by Regular Expression, but they have a completely different meaning. Most notable is the asterisk (*).

The example above, "*.txt", would be a bad Regular Expression. Why? The asterisk means the previous character zero or more times. There is no character preceding the asterisk so it is a syntax error.
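
A quick way to see this in Java is to ask the Pattern class to compile the expression. The sketch below is illustrative only; the class name WildcardIsNotRegex and the helper compiles() are made up for this example.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class WildcardIsNotRegex {

    // Returns true if Java's Pattern class accepts the expression.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            // Thrown when the expression is not valid Regular Expression syntax.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(compiles("*.txt")); // prints "false"
        System.out.println(compiles(".*txt")); // prints "true"
    }
}
```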

For simple things like "*.txt", Regular Expression can be overly complex. The dot (.) means any character. So if I want to emulate the "*" of MSDOS, I would use ".*" in Regular Expression. If I wanted an actual dot I would use "\." in Regular Expression. So the whole "*.txt" in MSDOS becomes ".*\.txt" in Regular Expression. In most languages, however, the backslash (\) is also the escape character for string literals, so the String implementation processes "\." before the Regular Expression parser ever sees it and the backslash never makes it to the parser (in Java, "\." is not a valid string escape and will not even compile). So if you want "\." to reach the Regular Expression parser, you need to write "\\." in your source code: the String implementation parses this, resulting in "\.", and then passes it to the Regular Expression parser.

The language I use most right now is Java. In Java, the Regular Expression parser is the Pattern class (java.util.regex.Pattern), and its API documentation describes the syntax it accepts.
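
As a minimal sketch of the "*.txt" translation described above, the following uses String.matches(), which hands the expression to Pattern and requires the whole string to match. The class name TxtFileMatcher and the file names are made up for illustration.

```java
class TxtFileMatcher {

    // In Java source this is written with a double backslash; the String that
    // reaches the Regular Expression parser is the 7 characters  .*\.txt
    static final String REGEX = ".*\\.txt";

    // True if the whole file name matches the expression.
    static boolean isTxtFile(String fileName) {
        return fileName.matches(REGEX);
    }

    public static void main(String[] args) {
        System.out.println(REGEX.length());             // prints "7"
        System.out.println(isTxtFile("notes.txt"));     // prints "true"
        System.out.println(isTxtFile("notes.txt.bak")); // prints "false"
        System.out.println(isTxtFile("notestxt"));      // prints "false": the dot must be a real dot
    }
}
```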

Some of the basic stuff:

  • Anything not a special character is matched verbatim. So in my example above "txt" only matches "txt".
  • If you want to match a special character you need to escape it. From my example, the dot is a special character. To match a dot and nothing else you use "\\.". I use a double backslash because Java will parse the "\\" before sending it to Pattern.
  • Special characters from things like println() or printf() work the same in Regular Expression. These are "\t" for tab, "\n" for newline, "\r" for carriage return, "\f" for form-feed and "\a" for a bell. A bell in ASCII is control-g or "\x07" but "\a" is better because you shouldn't assume ASCII.
  • You can define sets using square brackets. If I have "[abc]" this will match "a", "b" or "c".
  • You can use the square brackets for negation. If the first character in the set is caret (^) it means 'not'. For example, "[^abc]" would match anything not "a", not "b" and not "c".
  • If you want all digits you could use "[0123456789]" but there is a shorthand for this. A range can be specified using a minus (-) symbol. This example would be "[0-9]". You can also do things like "[a-z]" but alphabetic strings can be problematic if you allow different character sets.
  • If you want upper and lower case letters you might think "[a-Z]" would work but this is an error. The letter 'Z' in ASCII has a value of 90 and 'a' has a value of 97, so the range runs backwards. A second attempt might be "[A-z]". This is closer but in ASCII the symbols '[', '\', ']', '^', '_' and '`' are between 'Z' and 'a'. So you have too many characters in this set. The solution is a union (like in Set Theory). You want "[a-z]" union "[A-Z]". In Regular Expression this is written as "[a-zA-Z]".
  • You can also write a union as "[a-z[A-Z]]". This might seem like extra typing and in some cases it is. Nested sets become more useful when combined with intersection, which Java writes as "&&". What if you wanted all consonants? That would be 21 letters uppercase and 21 letters lowercase: a string with 42 letters, since you cannot use a single range. You could use "[b-df-hj-np-tv-zB-DF-HJ-NP-TV-Z]" but even that is a little ugly. How about "[a-zA-Z&&[^aeiouAEIOU]]"? When I look at that it is pretty obvious what I'm trying to match. It reads as "all letters but not vowels". Do not leave out the "&&": "[a-zA-Z[^aeiouAEIOU]]" is a union of the letters with everything that is not a vowel, so it still matches the vowels.
  • There is 'syntactic sugar' for some things:
    • Rather than "[0-9]" I can use "\d" (the d is for digit)
    • Rather than "[^0-9]" I can use "\D" (uppercase implies NOT)
    • Rather than "[ \t\n\x0b\f\r]" I can use "\s" (the s is for space or whiteSpace)
    • Rather than "[^ \t\n\x0b\f\r]" I can use "\S" (uppercase implies NOT)
  • A 'word' is a String made of letters, digits or underscore. A character of a 'word' therefore would be: "[a-zA-Z\d_]". Syntactic sugar for this is "\w".
  • Conversely, "\W" is for anything that is not a 'word' character.
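
The sets in the list above can be tried out one character at a time. The following is a small sketch for Java's Pattern syntax; the class name CharacterClassDemo and the helper inSet() are made up for this example, and the consonant test uses the "&&" intersection operator of Java's Pattern class.

```java
class CharacterClassDemo {

    // True if the single character c is matched by the given set expression.
    static boolean inSet(String set, char c) {
        return String.valueOf(c).matches(set);
    }

    public static void main(String[] args) {
        System.out.println(inSet("[abc]", 'b'));    // prints "true"
        System.out.println(inSet("[^abc]", 'b'));   // prints "false"
        System.out.println(inSet("[0-9]", '7'));    // prints "true"
        System.out.println(inSet("\\d", '7'));      // prints "true"
        System.out.println(inSet("[A-z]", '_'));    // prints "true": the range is too wide
        System.out.println(inSet("[a-zA-Z]", '_')); // prints "false"
        System.out.println(inSet("\\w", '_'));      // prints "true"
        System.out.println(inSet("\\s", '\t'));     // prints "true"

        // Intersection: letters that are not vowels.
        System.out.println(inSet("[a-zA-Z&&[^aeiouAEIOU]]", 'd')); // prints "true"
        System.out.println(inSet("[a-zA-Z&&[^aeiouAEIOU]]", 'e')); // prints "false"

        // Union: letters, or anything that is not a vowel. The vowel still matches.
        System.out.println(inSet("[a-zA-Z[^aeiouAEIOU]]", 'e'));   // prints "true"
    }
}
```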

Some slightly more advanced stuff would be boundary matchers and capture groups:

  • The caret (^) not in a set means beginning of line. So the expression "^a" matches only if 'a' is the first character in the string. When searching for a substring this can be very helpful. For example, searching "abcdefghi" for "^def" will not find a match but searching for "def" will.
  • The dollar ($) is for end of line. For example, searching "abcdefghi" for "def$" will not find a match but searching for "def" will.
  • Capture groups are used for substitution. For example, if I have a string with my full name, "Darrell Grainger" and I want to change it to "Grainger, Darrell" I would do the following:
String name = "Darrell Grainger";
String flip = name.replaceFirst("(\\w*) (\\w*)", "$2, $1");
  • The "\\w*" matches a run of 'word' characters, so the first one will match "Darrell". By wrapping it with parentheses it becomes a 'capture group'. So the first "(\\w*)" gets saved into "$1" and the second "(\\w*)" gets saved into "$2". In other implementations of Regular Expression, capture groups are saved into things like "\1" rather than "$1".
  • Capture groups are great if you are processing a number of strings in an array. This example will flip the first and second word for any set of strings.
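
The boundary and capture group points above can be sketched as follows. The class name BoundaryAndGroupsDemo, the helpers found() and flip(), and the second name in the array are made up for illustration; found() uses Matcher.find(), which searches for the expression anywhere in the text.

```java
import java.util.regex.Pattern;

class BoundaryAndGroupsDemo {

    // True if the expression is found anywhere in the text (a substring search).
    static boolean found(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    // Swaps the first two space-separated words: "First Last" becomes "Last, First".
    static String flip(String name) {
        return name.replaceFirst("(\\w*) (\\w*)", "$2, $1");
    }

    public static void main(String[] args) {
        System.out.println(found("def", "abcdefghi"));  // prints "true"
        System.out.println(found("^def", "abcdefghi")); // prints "false"
        System.out.println(found("def$", "abcdefghi")); // prints "false"
        System.out.println(found("^abc", "abcdefghi")); // prints "true"
        System.out.println(found("ghi$", "abcdefghi")); // prints "true"

        // The same expression works for every string in an array.
        String[] names = { "Darrell Grainger", "Jane Doe" };
        for (String name : names) {
            System.out.println(flip(name));
        }
        // prints "Grainger, Darrell" then "Doe, Jane"
    }
}
```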
More advanced stuff would be Greedy quantifiers versus Reluctant quantifiers. Let's look at this with capture groups.
String s = "aaabbbaaa";
String s1 = s.replaceFirst("(a*)(.*)", "$2 $1");
String s2 = s.replaceFirst("(a*?)(.*)", "$2 $1");
The string s1 will contain "bbbaaa aaa".
The string s2 will contain "aaabbbaaa ".

For s1, what happened is "(a*)" matched "aaa" and "(.*)" matched "bbbaaa".
For s2, what happened is "(a*?)" was a Reluctant quantifier, so it matched as little as it could: nothing at all. That left the whole string for "(.*)" which, being a Greedy quantifier, captured everything.

What happens under the hood is that the Regular Expression parser works through the pattern from left to right. When it reaches a Greedy quantifier it reads in as much of the string as it can, then checks whether the rest of the pattern matches. If it does not, it pushes one character back out, checks for a match, pushes another character back out, checks for a match. It keeps doing this until the rest of the pattern matches.

A Reluctant quantifier works the other way around. The parser starts with as few characters as possible (for "*?" that is zero) and checks whether the rest of the pattern matches. If it does not, it reads in one more character, checks for a match, reads another character, checks for a match. The moment the rest of the pattern matches, it stops.

So with the string s1 it processed "(a*)" first. Because it is a Greedy quantifier it captured "aaa" into "$1". Then it processed "(.*)" which matched the rest of the string. This captured "bbbaaa" into "$2".

With the string s2 it processed "(a*?)" first. Because it is a Reluctant quantifier it started with the empty string "". The "(.*)" could then match the entire string, so the overall match succeeded straight away and "(a*?)" never needed to read in a character. The empty string "" gets captured into "$1" and the entire string gets captured into "$2".
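
To check what each group captured, the groups can be read directly from a Matcher. The sketch below is illustrative only; the class name CaptureInspection and the helper captures() are made up, and the third expression, "(a*?)(b.*)", is an extra example showing a Reluctant quantifier being forced to take more characters because the rest of the pattern must start with 'b'.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CaptureInspection {

    // Returns the contents of capture groups 1 and 2 for the first match,
    // or null if the expression is not found in the input.
    static String[] captures(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String s = "aaabbbaaa";
        System.out.println(Arrays.toString(captures("(a*)(.*)", s)));   // prints "[aaa, bbbaaa]"
        System.out.println(Arrays.toString(captures("(a*?)(.*)", s)));  // prints "[, aaabbbaaa]"
        System.out.println(Arrays.toString(captures("(a*?)(b.*)", s))); // prints "[aaa, bbbaaa]"
    }
}
```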

Here is a table of the Greedy versus Reluctant quantifiers:

Greedy    Reluctant    Meaning
X?        X??          X, once or not at all
X*        X*?          X, zero or more times
X+        X+?          X, one or more times
X{n}      X{n}?        X, exactly n times
X{n,}     X{n,}?       X, at least n times
X{n,m}    X{n,m}?      X, at least n but not more than m times
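
A few rows of the table can be tried out with a small helper that returns the first match. This is a sketch with made-up names (GreedyVersusReluctantDemo, firstMatch) and made-up sample text.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVersusReluctantDemo {

    // Returns the first substring of text matched by regex, or null if there is none.
    static String firstMatch(String regex, String text) {
        Matcher m = Pattern.compile(regex).matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String text = "[one] and [two]";
        // Greedy: runs to the last closing bracket.
        System.out.println(firstMatch("\\[.*\\]", text));  // prints "[one] and [two]"
        // Reluctant: stops at the first closing bracket.
        System.out.println(firstMatch("\\[.*?\\]", text)); // prints "[one]"

        System.out.println(firstMatch("a{2,3}", "aaaa"));  // prints "aaa"
        System.out.println(firstMatch("a{2,3}?", "aaaa")); // prints "aa"
    }
}
```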

There is more to Regular Expressions but this information is what you need for most situations.

Tuesday, April 3, 2012

Frames and WebDriver

When dealing with iframes and WebDriver, things can quickly get confusing, especially if you add popup windows to the mix.

When you have an iframe, it is a separate DOM. You can look at it as a separate web page inside the current web page. Let's take an example diagram:

If we look at the source code for this it might be something like:
    <iframe src="frame1.html" style="border: red;">...</iframe>
    <iframe id="2" src="frame2.html" style="border: green;">
        <iframe id="2-1" src="frame2-1.html">...</iframe>
        <iframe id="2-2" src="frame2-2.html">...</iframe>
        <iframe id="2-3" src="frame2-3.html">...</iframe>
        <iframe id="2-4" src="frame2-4.html">...</iframe>
        <iframe id="2-5" src="frame2-5.html">...</iframe>
        <iframe id="2-6" src="frame2-6.html">...</iframe>
        <iframe id="2-7" src="frame2-7.html">...</iframe>
        <iframe id="2-8" src="frame2-8.html">...</iframe>
        <iframe id="2-9" src="frame2-9.html">...</iframe>
    </iframe>
    <iframe id="3" src="frame3.html" style="border: brown;">
        <iframe id="3-1" src="frame3-1.html">...</iframe>
        <iframe id="3-2" src="frame3-2.html">...</iframe>
    </iframe>
    <iframe id="4" src="frame4.html" style="border: blue;">
        <iframe id="4-1" src="frame4-1.html">...</iframe>
        <iframe id="4-2" src="frame4-2.html">...</iframe>
        <iframe id="4-3" src="frame4-3.html">...</iframe>
        <iframe id="4-4" src="frame4-4.html">...</iframe>
        <iframe id="4-5" src="frame4-5.html">...</iframe>
        <iframe id="4-6" src="frame4-6.html">...</iframe>
        <iframe id="4-7" src="frame4-7.html">...</iframe>
        <iframe id="4-8" src="frame4-8.html">...</iframe>
        <iframe id="4-9" src="frame4-9.html">...</iframe>
    </iframe>
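All the rectangles in the diagram are iframes. The iframe with the red border would be the first iframe in the HTML code. Inside each iframe will be a full HTML page. It will have the <html></html> tags and everything else which goes inside an HTML page. The nesting shown above is conceptual: the markup for the inner iframes lives in the page loaded by the outer iframe (frame2-1.html is referenced from frame2.html, for example), not in the main page.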

In WebDriver you have a switchTo() method, which returns a WebDriver.TargetLocator interface. If we look at the WebDriver.TargetLocator interface we see, among others, the following methods:

  • frame(int index)
  • frame(String nameOrId)
  • frame(WebElement frameElement)
  • defaultContent()
We can use the index to find the frame. If you have one main page and it contains a set of iframes, this is fine. However, if you have frames within frames it can get a little difficult to follow. Even if you can figure it out today, you will have to go through the whole exercise again if they change the layout by moving, adding or deleting a frame.

The best way to find a frame is with its id attribute, or by finding the frame element using findElement() and then passing it to the third version listed above.

Now here is the most important thing to remember: you cannot jump down two or more frames at once. So if you are at the main content page and want an element in frame3-1.html, you have to switch to frame3.html and then to frame3-1.html. Assuming we are at the main page, the code for this might look like:
driver.switchTo().frame("3");
driver.switchTo().frame("3-1");

Additionally, if I have switched to frame4-7.html and I want to go to frame2-1.html, I have to go back to the top, then to frame2.html, then to frame2-1.html.

So how do you get to the top? If you are dealing with iframes then the defaultContent() method will take you to the main page, above all the iframes. If you are dealing with frames, defaultContent() will take you to the first frame.

So you have two options. You can leave the focus wherever you last left it; then every action in a frame has to assume the focus is somewhere unknown, call driver.switchTo().defaultContent() to get to the top, and go down to the frame it wants. Or you can start at the top, go to the frame you want, then go back to the top when you are done. The first way you go to the top as needed. The second way you ensure you are always at the top. Both ways work. It is just a matter of convention.

Sometimes drawing the layout of the page is a little harder than the diagram above. Additionally, if the layout changes, it can be difficult to alter the picture, depending on how you drew it. What I like to do is draw the relationships like a tree. For example, the diagram above might be drawn as:

From this tree the rules are that you can go down a branch (switchTo().frame() from the parent) or you can get to the root of the tree (defaultContent() for iframes). You cannot jump across nodes or up levels.
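
The tree can also be kept in code, so that the list of frames to switch through is worked out for you. The following is only a sketch with no WebDriver dependency: the class FrameTree and its methods are made up for illustration, and the frame ids are the ones from the HTML example above. With a real driver, you would call driver.switchTo().defaultContent() first and then driver.switchTo().frame(id) for each id in the returned list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class FrameTree {

    // Maps a child frame id to its parent frame id.
    // Frames that sit directly on the main page have no entry.
    private final Map<String, String> parentOf = new HashMap<>();

    // Records that the frame childId is inside the frame parentId.
    void add(String parentId, String childId) {
        parentOf.put(childId, parentId);
    }

    // Lists the frame ids to switch through, top-down, to reach the target
    // frame when starting from the main page (the root of the tree).
    List<String> pathTo(String target) {
        List<String> path = new ArrayList<>();
        for (String id = target; id != null; id = parentOf.get(id)) {
            path.add(id);
        }
        Collections.reverse(path);
        return path;
    }

    public static void main(String[] args) {
        FrameTree tree = new FrameTree();
        tree.add("2", "2-1");
        tree.add("3", "3-1");
        tree.add("4", "4-7");
        System.out.println(tree.pathTo("3-1")); // prints "[3, 3-1]"
        System.out.println(tree.pathTo("2"));   // prints "[2]"
    }
}
```

If the page layout changes, only the add() calls need updating; the code that walks down to a frame stays the same.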