
Tuesday, March 21, 2017

Using Charles from the command line

If you are testing network traffic you are probably familiar with Fiddler. Fiddler is a nice, easy to use tool for monitoring network traffic.

It is very easy to use. You start up Fiddler and it configures your Internet Settings. Now when you start up a web browser, it automatically routes traffic through Fiddler. As you hit web pages in the browser, the HTTP requests and responses show up in Fiddler. It is very easy to read and understand what is going on.

If you are using a macOS computer you will be sad to learn that Fiddler does not exist for macOS. It is a Windows-only product. If you check for free options to do the same thing you find Wireshark (formerly Ethereal). But Wireshark's configuration and output assume you have knowledge of TCP, HTTP, sockets, packets, etc. You can get the information that you need but it is not as easy as Fiddler.

Additionally, to play back a request with some modifications is a lot harder with Wireshark than with Fiddler.

So what do you do? Charles Proxy. Unfortunately, it is not free but at $50 it is a good investment. If you are working at a company with many people needing it, there are discounts available as well.

Now if you get Charles you will find it automatically starts up and changes the Network Settings on your Mac. So all the browsers, and anything else which uses Network Settings, will automatically go through Charles.

What about the command line? For example, I have a Docker script which creates a container, deploys a web service and waits for someone to hit it. What if I'm creating automation using Python, Java, a bash script, etc.? These do not use the macOS Network Settings. So you will see nothing in Charles.

The solution is to add the necessary information to the shell before you launch your test scripts.

The way Charles works is rather simple. If my machine is using and I want to hit ( it might follow the following route:
Charles works by acting as a MITM (Man-In-The-Middle). So if I want Charles to be able to observe the traffic the route might be:
  • CharlesProxy
The way it does this is by creating proxy settings in Network Settings. To create proxy settings on the command line you need to set certain environment variables. For both HTTP and HTTPS traffic, Charles tells macOS to use IP address and port 8888.

For the command line you want to use:
export http_proxy=""
export https_proxy=""
Additionally, Charles tells macOS to bypass certain addresses. What I do is go to System Preferences, select Network, select the Advanced... button, then go to the Proxies tab.

On this page, assuming you are running Charles, you will see a bunch of addresses in the Bypass proxy settings box. Select all of them, copy them into the clipboard, go back to the command line and enter:
export no_proxy="<paste>"
With these three settings, anything you run from the command line will go through Charles.

However, if you close the shell you lose all the settings. If you want to keep the settings you can add them to your ~/.bash_profile text file. Every time you open a shell it will add the proxy information to the shell. HOWEVER, you don't want this if you are not using Charles. To disable this you need to enter:
unset http_proxy
unset https_proxy
unset no_proxy
So what I do is add the following to my ~/.bash_profile text file:
# Charles shortcuts
function charles_on {
        export http_proxy=""
        export https_proxy=""
        export no_proxy=",,,,,,,,,,"
}
function charles_off {
        unset http_proxy
        unset https_proxy
        unset no_proxy
}
By adding this to my ~/.bash_profile text file I can use:
charles_on
to enable Charles. And I can use:
charles_off
to disable Charles. Whenever Charles is not running I MUST disable Charles on the command line.
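
One caveat for Java automation: as far as I know the JVM does not read http_proxy by itself; it normally takes proxy settings from system properties such as http.proxyHost and http.proxyPort. Here is a minimal sketch of bridging the two, assuming the variable holds a URL of the form http://host:port. The class name EnvProxy and the helper fromUrl are made up for this illustration.

```java
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URI;

class EnvProxy {
    // Turn a proxy URL (for example the value of http_proxy) into a java.net.Proxy.
    // Returns Proxy.NO_PROXY when the variable is unset or empty.
    static Proxy fromUrl(String proxyUrl) {
        if (proxyUrl == null || proxyUrl.trim().isEmpty()) {
            return Proxy.NO_PROXY;
        }
        URI uri = URI.create(proxyUrl.trim());
        String host = uri.getHost();
        int port = uri.getPort();
        if (host == null || port == -1) {
            throw new IllegalArgumentException(
                "Expected something like http://host:port but got: " + proxyUrl);
        }
        // createUnresolved avoids a DNS lookup at this point.
        return new Proxy(Proxy.Type.HTTP, InetSocketAddress.createUnresolved(host, port));
    }

    public static void main(String[] args) {
        // Reads whatever charles_on exported into the shell, if anything.
        Proxy proxy = fromUrl(System.getenv("http_proxy"));
        System.out.println("Using proxy: " + proxy);
    }
}
```

The resulting Proxy can be handed to URL.openConnection(Proxy), so only the connections you choose go through Charles.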

Wednesday, August 24, 2016

Mobile web testing

I've been working on a project recently which required testing on a mobile device. The project started in April this year and was focused solely on iOS.

When I looked into what was available for mobile testing I found a number of different tools:

  • KIF
  • Appium
  • Frank
  • Calabash
  • EarlGrey
  • UI Automation
What I found next was a bother. I needed a tool which could be used for testing the IPA we would ship from an app store. Tools like Frank and Calabash are great at automating tests but they require you to build a special version of the app. This would not be the same app you deployed to an app store.

This made it easy to eliminate those two great tools from my list of potential test automation tools.

I then looked into KIF and EarlGrey. They had great reviews and looked really promising until I noticed Apple had made significant changes to the UI Automation framework which broke KIF and EarlGrey. So if I wanted to test against iOS 9.3 developed with Xcode 7 and Swift I was probably not going to want to use KIF or EarlGrey.

So the obvious choice was Appium. However, even Appium seemed to be affected by the changes Apple announced at the July 2015 Developer Conference. :(

Since our iterations were one week and waiting to see who would 'fix' issues with their framework wasn't really an option, we branched Frank and started developing the app using Frank for UI testing. In the meantime we looked at UI Automation (our app was iOS only, so we didn't need to worry about Android support).

Initial use of UI Automation seemed good. So we started automating UI tests with it but continued to keep the Frank tests running in parallel. However after a few iterations we started to see maintaining the UI Automation tests was becoming increasingly difficult. Since I got into UI automation in 1998 I have found that failure to maintain a test automation framework is one of the common reasons for UI automation to fail. We didn't have a team of 20 QA Automation experts to keep the UI Automation framework going. :(

So I had a second look at the previously discarded frameworks. To my surprise and delight I found that support for them had been re-established and I took a second look at using Appium. 

Appium is definitely not fast and I'm looking for ways I can reduce the execution time of the Appium test suite (currently 15 minutes when run on hardware; I'd like to get it down to 5 minutes plus add more tests; maybe run tests in parallel on four or more phones).

Bottom line, Appium seems to be working well for us. I've created a page object model framework. In my next article I'll talk about using the Appium Inspector on a Mac laptop and things I found blocked me or slowed me down.

The difference between NotFoundException, NoSuchElementException and StaleElementReferenceException

Been a while since I posted something here...

Recently, someone on the WebDriver Google Group asked what the difference between NotFoundException, NoSuchElementException and StaleElementReferenceException is. Here is the answer I posted:

The NotFoundException is a superclass of NoSuchElementException. The known direct subclasses of NotFoundException are: NoAlertPresentException, NoSuchContextException, NoSuchElementException, NoSuchFrameException and NoSuchWindowException. So if I want one catch statement to catch all five exceptions and they will all be handled the same way then I can just handle NotFoundException. But if I want to handle any of these five exceptions differently, I can catch the more specific subclass.
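
A small sketch of that catch ordering. The exception classes below are stand-ins I declared so the example compiles without Selenium on the classpath; in a real test you would import the real ones from org.openqa.selenium. The class name ExceptionHierarchyDemo and the handle method are made up for illustration.

```java
class ExceptionHierarchyDemo {
    // Stand-ins that mirror the hierarchy described above (NOT the Selenium classes).
    static class NotFoundException extends RuntimeException {
        NotFoundException(String message) { super(message); }
    }
    static class NoSuchElementException extends NotFoundException {
        NoSuchElementException(String message) { super(message); }
    }
    static class NoSuchFrameException extends NotFoundException {
        NoSuchFrameException(String message) { super(message); }
    }

    // The specific subclass is caught first; the superclass catch handles the rest.
    static String handle(Runnable step) {
        try {
            step.run();
            return "ok";
        } catch (NoSuchElementException e) {
            return "element missing: " + e.getMessage();
        } catch (NotFoundException e) {
            return "something else missing: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(handle(() -> { throw new NoSuchElementException("#valid"); }));
        System.out.println(handle(() -> { throw new NoSuchFrameException("menu"); }));
        System.out.println(handle(() -> { }));
    }
}
```

If the two catch blocks were swapped the code would not compile, because the NotFoundException catch would already cover NoSuchElementException.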

The NoSuchElementException is thrown when the element you are attempting to find is not in the DOM. This can happen for three reasons. 

The first is because the element does not exist and never will. To fix this, change your findElement to be correct.

The second is that you need to do something on the page to make the element appear. For example, the user selects Country and javascript populates a City field. If you attempt to look for a city before you select a country, the city you are looking for does not exist and you get a NoSuchElementException. To fix this you have to make sure the steps in your test are correct.

The third is that the element is generated by javascript but WebDriver attempts to find the element before the javascript has created it. The fix for this is to use WebDriverWait to wait for the element to appear (to be visible and/or clickable).

StaleElementReferenceException is when you find an element, the DOM gets modified then you reference the WebElement. For example,

WebElement we = driver.findElement(By.cssSelector("#valid"));
// you do something which alters the page or a javascript event alters the page;
we.click(); // using the old reference now throws StaleElementReferenceException

A classic example of this might be:

List<WebElement> listOfAnchors = driver.findElements(By.tagName("a"));
for (WebElement anchor : listOfAnchors) {
    // click the anchor, print the title of the new page, then driver.navigate().back()
}

This code will get all the anchor elements into a list. Let's say there are 5 anchors on the page. The list now has 5 WebElement references. We get the first reference and click it. This takes us to a new page. This is a new DOM. We print the title of the new page. Then we use back() to go back to the original page. The DOM looks just like the same DOM but it is a different DOM. So now all the references in the list are stale. On the second iteration, it gets the second reference and clicks it. This will throw a StaleElementReferenceException.

More difficult to debug is:

WebElement we = driver.findElement(...);
// javascript event gets fired by the website;
we.click();

Sometimes this will throw a StaleElementReferenceException but sometimes the timing will be different and the click will work. I've seen many people have this intermittent problem. They add more code which doesn't fix the problem. The extra code just changes the timing and hides the problem. The problem comes back a few days later. So they add more random code. It looks like they fixed the problem but they just changed the timing. So if you get a StaleElementReferenceException and it is not clear why, it is probably this problem and you need to figure out how to make the findElement and click atomic.
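
WebDriver does not give you a truly atomic find-and-click, so one common workaround is to keep the find and the click in a single step and retry that step when the reference goes stale. Below is a minimal sketch of that retry idea. It is a swapped-in technique, not real atomicity. The exception class is a stand-in for org.openqa.selenium.StaleElementReferenceException so the example compiles on its own, and the StaleRetry class and retryOnStale helper are hypothetical names.

```java
import java.util.function.Supplier;

class StaleRetry {
    // Stand-in for org.openqa.selenium.StaleElementReferenceException.
    static class StaleElementReferenceException extends RuntimeException { }

    // Runs findAndUse, which should locate the element AND use it in one go.
    // If the reference turns out to be stale, the whole step is run again.
    static <T> T retryOnStale(int maxAttempts, Supplier<T> findAndUse) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        StaleElementReferenceException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return findAndUse.get();
            } catch (StaleElementReferenceException e) {
                last = e; // remember the failure and try again with a fresh lookup
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulates a page whose DOM is rebuilt twice before it settles down.
        String result = retryOnStale(3, () -> {
            if (++calls[0] < 3) {
                throw new StaleElementReferenceException();
            }
            return "clicked";
        });
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

In a real test the lambda would do the lookup and the click together, for example driver.findElement(By.cssSelector("#valid")).click(), so a stale reference is never carried from one attempt to the next.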

Friday, January 8, 2016

Difference between using current node (.) and text() function in XPath for Selenium locators

Recently someone asked about using the partial link text to find an anchor with an IMG tag in the middle of the text. The HTML snippet was:

<a href="something.html">
<img src="filename.gif">
partial link text
</a>
The initial attempt for a locator was:
"//a[contains(text(),'partial link text')]"

Normally I would expect this to work. However, the text() function does not seem to find it. Peter Jeffery Gale (thanks Peter) noticed that the following locator did work:
"//a[contains(.,'partial link text')]"
The . notation is the current node in the DOM. This is going to be an object of type Node. I'm guessing that the Node is getting cast to a string. Something similar to:
"//a[contains(string(.),'partial link text')]"
The end result seems to be that getting the entire Node, converting it to a string and scanning the string for a substring always works. When a set of nodes is passed to a string function like contains(), XPath 1.0 uses only the first node in the set, so text() here yields only the text before the first inner element. If the text you are looking for is after the inner element you must use the current node to search for the string and not the XPath text() function.
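
The difference can be reproduced outside the browser with the XPath 1.0 engine that ships with the JDK. This is a sketch under a couple of assumptions: the snippet is written as well-formed XML (so the img tag is self-closed), and the JDK's XML parser stands in for the browser's DOM. The class name XPathTextVsDot and the helper countMatches are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathTextVsDot {
    // Same shape as the HTML above; the first text node is just a line break.
    static final String SNIPPET =
        "<a href=\"something.html\">\n"
        + "<img src=\"filename.gif\"/>\n"
        + "partial link text\n"
        + "</a>";

    // Returns how many nodes in SNIPPET the XPath expression selects.
    static int countMatches(String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(SNIPPET)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes =
                (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        // text() is reduced to its first node, the whitespace before <img>: prints 0
        System.out.println(countMatches("//a[contains(text(),'partial link text')]"));
        // . is the string value of <a>, all descendant text joined: prints 1
        System.out.println(countMatches("//a[contains(.,'partial link text')]"));
    }
}
```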

Thursday, September 25, 2014

Silencing ChromeDriver with WebDriver

While setting up a test environment today we decided to have the tests running on the same machine as the build radiator.

A build radiator takes up the entire display. It shows a green bar for each job on the build server. If someone checks in a change and it breaks a test, the bar turns red and everyone stops to fix the build.

The consequence of this is that the build radiator has to be visible to everyone in the room. Having a browser open on the display is not an option.

So we need to run our WebDriver tests without showing the browser or any other output. Our build server is running Linux. So we have WebDriver tests. We can run them from the command line using something like:
java org.testng.TestNG testng.xml
where testng.xml is a TestNG test suite example. When we run it as this we see the browser open and the tests executing. The tests were written using ChromeDriver. When we run this on the build radiator however, we don't want the browser opening. The solution is actually quite easy for Linux. We use an application called Xvfb:
xvfb-run --server-args="-screen 0 1600x1200x24" java org.testng.TestNG testng.xml
The command xvfb-run will run the application using the X Virtual FrameBuffer. The --server-args lets us pass arguments to the server. The "-screen 0" tells xvfb to use screen 0. The "1600x1200x24" tells xvfb to make the virtual display 1600 by 1200 with 24 bit depth. If your application has to work on 1024 by 768 and 16 bit colour then you can use "1024x768x16".

When you execute this you will not see the browser open. It almost seems like nothing is happening. The only thing you will see is the output from TestNG (a dot for a pass, an I for an ignore and an F for a failure) and the output from chromedriver. What if you want to look at the logs and see just the output from TestNG; not interlaced with output from chromedriver?

This requires a few changes to the creation of the WebDriver object. Normally, you might have something like:
System.setProperty("", "./chromedriver");
WebDriver driver = new ChromeDriver();
but this outputs chromedriver log information to the screen. You could use:
System.setProperty("", "./chromedriver");
System.setProperty("", "--disable-logging");
WebDriver driver = new ChromeDriver();
This will stop most of the output but you will see the header for when chromedriver starts up:
Starting ChromeDriver (v2.9.248307) on port 9515
So how do you get rid of this? I was digging through the code for chromedriver (remember it is open source) and I found some code where it was checking for a system property. If this was set to true then it would run with the silent flag set to true. So I tried:
System.setProperty("", "./chromedriver");
System.setProperty("", "--disable-logging");
System.setProperty("", "true");
WebDriver driver = new ChromeDriver();
Sure enough that did it. Complete silence from chromedriver.

Saturday, July 12, 2014


I was recently poking around on my Terminal (Mac OS X) and I noticed one of the shell variables was DIRSTACK.
So I checked the man page for the bash shell to see what I could find about it:
man bash
Reading the man page I find DIRSTACK is an array relating to popd, pushd and dirs. Rather than using cd to change to a directory I can use pushd. For example:
pushd ~/Downloads
This will change directory to ~/Downloads plus it will add the directory to the DIRSTACK array. I can add some more to the DIRSTACK using:
pushd ~/Documents
pushd /Volumes
Now if I issue a dirs I will see:
/Volumes ~/Documents ~/Downloads
If you search for popd, pushd and dirs on the bash man page you will find all the settings for these builtin commands:
dirs [-clpv] [+n] [-n]
+n displays the nth entry from the left, e.g. +2 will display the entry in position 2; this is zero-indexed
-n displays the nth entry from the right; just like +n this is zero-indexed, e.g. -0 is the rightmost entry
-c clears the DIRSTACK
-l displays a longer list, e.g. ~ gets expanded to the full directory name /Users/darrell
-p displays one entry per line
-v displays one entry per line with a number at the start of each line

You might think the -v option is just line numbers but they are more than that. The numbers are directly related to the -n and +n options. Additionally, I can refer to specific entries in the list using ~n. For example, if dirs -v displays:
 0  ~/Public
 1  ~/Downloads
 2  ~/Documents
 3  ~
then ls -l ~2 will be the same as ls -l ~/Documents. I can also use the tilde notation for popping elements off the stack as well. The next command, popd, has the following format:
popd [-n] [+n] [-n]
-n is literally -n; normally popd changes to the directory that ends up at the top of the stack, and -n suppresses this
+n removes the nth entry from the left (zero-indexed), e.g. +2 will remove the third entry from the left
-n removes the nth entry from the right (zero-indexed), e.g. -1 will remove the second entry from the right

The pushd command looks similar:
pushd [-n] [dir]
pushd [-n] [+n] [-n]
-n is literally -n, and like popd it adds to the stack but does not cd to the new directory.
[dir] will push [dir] on the DIRSTACK then cd [dir]
+n will rotate the stack so the nth directory from the left is at the top
-n will rotate the stack so the nth directory from the right is at the top
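
The rotation done by pushd +n and pushd -n, and the removal done by popd +n, can be pictured with a small model. This is a sketch in Java, assuming a plain list where index 0 is the leftmost entry shown by dirs. The class and method names are made up, and the model ignores the cd that the real builtins also perform.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class DirStackSketch {
    // pushd +n: rotate so the nth entry from the left (zero-indexed) is on top.
    static List<String> rotateFromLeft(List<String> stack, int n) {
        List<String> copy = new ArrayList<>(stack);
        Collections.rotate(copy, -n);
        return copy;
    }

    // pushd -n: rotate so the nth entry from the right (zero-indexed) is on top.
    static List<String> rotateFromRight(List<String> stack, int n) {
        return rotateFromLeft(stack, stack.size() - 1 - n);
    }

    // popd +n: remove the nth entry from the left (zero-indexed).
    static List<String> removeFromLeft(List<String> stack, int n) {
        List<String> copy = new ArrayList<>(stack);
        copy.remove(n);
        return copy;
    }

    public static void main(String[] args) {
        // Same stack as the dirs -v listing above.
        List<String> stack = Arrays.asList("~/Public", "~/Downloads", "~/Documents", "~");
        System.out.println(rotateFromLeft(stack, 2));   // [~/Documents, ~, ~/Public, ~/Downloads]
        System.out.println(rotateFromRight(stack, 0));  // [~, ~/Public, ~/Downloads, ~/Documents]
        System.out.println(removeFromLeft(stack, 1));   // [~/Public, ~/Documents, ~]
    }
}
```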


Friday, July 11, 2014

Interactive Ruby Shell

My current project uses Ruby and has a web testing component to it. The obvious choice for testing a web application with Ruby would be Selenium-WebDriver.

If you are familiar with Ruby you should be familiar with the Interactive Ruby Shell or irb.

If I enter irb at a command prompt I am placed at the Interactive Ruby Shell:

1.9.3-p547 :001 > 
Once you are at the Interactive Ruby Shell you can try things to see how they work. In a compiled language like Java you would have to compile the code into class files then execute them. With Ruby you can actually type the lines out and see what happens immediately. For example, to do the basic Selenium example I can enter:
require 'selenium-webdriver'
driver = Selenium::WebDriver.for :chrome
At this point a Chrome browser should open. If it does not, the most likely problem is that chromedriver isn't in your PATH. Before you open the command prompt make sure that chromedriver is in your PATH. If it is in the PATH and you run irb then the Ruby code above should open a Chrome browser. It also assumes you have Chrome installed.

Once the browser opens you can do:
driver.methods - Object.methods
All objects in Ruby have a methods method. All objects also inherit from Object. So the line above says to give me all the methods for driver and subtract all the Object methods from the list. So what will remain are the Selenium WebDriver methods:

 => [:save_screenshot, :screenshot_as, :action, :mouse, :keyboard, :navigate, :switch_to, :manage, :get, :current_url, :title, :page_source, :quit, :close, :window_handles, :window_handle, :execute_script, :execute_async_script, :first, :all, :script, :[], :browser, :capabilities, :ref, :find_element, :find_elements]
From this list I can see all the things I can do with driver, an instance of Selenium WebDriver. So now that I have an instance of WebDriver and I have the browser open I can enter:
driver.get ''
text_field = driver.find_element :id => 'gbqfq'
text_field.send_keys 'Selenium'
puts driver.title
driver.quit
As I type these lines I will see the browser load the page, the text 'Selenium' being typed into the text field, the page title printed in the shell, and finally the browser closing (driver.quit).