Automating Tests at the Browser Layer

With the rise in popularity of agile development, there is much work being done with various kinds of testing in the software development process. Developers as well as testers are looking at creative solutions for test automation. With the popularity of Open Source xUnit test framework tools such as JUnit, NUnit, HTTPUnit, JWebUnit and others, testing when developing software can be fun. Getting a “green bar” (all automated unit tests in the harness passed) has become a game, and folks are getting more creative about bringing these test tools to new areas of applications.

One area of web applications that is difficult to test is the browser layer. We can easily use JUnit or NUnit at the code level, we can create a testable interface and some fixtures for FIT or FITnesse to drive the code with tabular data, and we can run tests at the HTTP layer using HTTPUnit or JWebUnit. Where do we go from there, particularly in a web application that relies on JavaScript and CSS?

Historically, the well-known “capture/replay” testing tools have owned this market. These are an option, but do have some drawbacks. Following the home brew test automation vein that many agile development projects use, there is another option using a scripting language and Internet Explorer.

IE can be controlled using its COM interface (also referred to as OLE or ActiveX) which allows a user to access the IE DOM. This means that all users of IE have an API that is tested, published, and quite stable. In short, the vendor supplies us with a testable interface. We can use a testable interface that is published with the product, and maintained by the vendor. This provides a more stable interface than building one at run-time against the objects in the GUI, and we can use any kind of language we want to drive the API. I’m part of a group that prefers the scripting language Ruby.

A Simple Example

How does it work? We can find methods for the IE COM On the MSDN web site, and use these to create tests. I’ll provide a simple example using Ruby. If you have Ruby installed on your machine, open up the command interpreter, the Interactive Ruby Shell. At the prompt, enter the following (after Brian Marick’s example in Bypassing the GUI, what we type is in bold, while the response from the interpreter is in regular font).

irb> require ‘win32ole’
=>true
irb> ie = WIN32OLE.new(‘InternetExplorer.Application’)
=>#<WIN32OLE:0x2b9caa8>
irb> ie.visible = true
#You should see a new Internet Explorer application appear. Now let’s direct our browser to Google:
irb> ie.navigate(“http://www.google.com”)
=>nil
#now that we are on the Google main page, let’s try a search:
irb> ie.document.all[“q”].value = “pickaxe”
=>”pickaxe”
irb> ie.document.all[“btnG”].click
=>nil

#You should now see a search returned, with “Programming Ruby” high up on the results page. If you click that link, you will be taken to the site with the excellent “Programming Ruby” book known as the “pickaxe” book.

Where do we go from here?

Driving tests this way through the Interactive Ruby Shell may look a little cryptic, and the tests aren’t in a test framework. However, it shows us we can develop tests using those methods, and is a useful tool for computer-assisted exploratory testing, or for trying out new script ideas.

This approach for testing web applications was pioneered by Chris Morris, and taught by Bret Pettichord. Building from both of those sources, Paul Rogers has developed a sophisticated library of Ruby methods for web application testing. An Open Source development group has grown up around this method of testing first known as “WTR” (Web Testing with Ruby). Bret Pettichord and Paul Rogers are spearheading the latest effort known as WATIR. Check out the WATIR details here, and the RubyForge project here. If this project interests you, join the mailing list and ask about contributing.

*This post was made with thanks to Paul Rogers for his review and corrections.