Selenium Testing beyond GUI Browsers.

“Selenium automates browsers”. So goes the introductory line on the Selenium project’s website. To what extent is this a limitation?

The topic of this blog post provokes questions: can we structure a Selenium-based test framework to test beyond GUI browsers? If not, at the very least, how can we improve test-effectiveness by extending an existing Selenium test-suite to work with other test-actions that do not use browsers?

The first question above needs in-depth technical discussion. We focus on the second one here, which can be addressed at a more conceptual level.

We have an example below which illustrates how one can improve the efficiency and coverage of automation by going beyond the Browser GUI and enabling backend operations.

Consider a simple web-application which has a Selenium test suite containing hundreds of automated scripts that need to run daily. In addition, there are test-scenarios that need to be selectively executed during the day using some of the scripts. The framework is robust and the application relatively stable.

All scripts necessarily have to navigate across the web-application to get to specific web-pages. There are often many test scenarios that need to be executed on a specific web-page. The time for a script to get to that web-page depends on the performance of the application and the latency in loading the intermediate web-pages. This in turn increases the overall time needed to execute the automation suite when the script count is in the hundreds. How can this be reduced?

Problem Statement: how do we limit the steps of navigating the web front-end using Selenium scripts so as to reduce the overall test-execution time?

Refer to the illustration below.

Selenium Beyond GUI

Total Execution Time – 2 minutes.

We have the above scenario, in which parameters set on Web-Page-B are necessary for validations on Web-Page-C. Every individual script logs in to and out of the application, and the total time taken is close to 2 minutes.

Now if there are 50 scripts that set different parameter combinations, the Selenium suite that runs scripts on the browser GUI would potentially take 50 * 2 = 100 minutes just for navigating back and forth on the web pages, especially if we need to log in and log out after every script.

The actual verification point however is only on Web-Page-C for every kind of parameter setting.

Selenium Beyond GUI img 2


The parameter setting could be handled by Python or Perl scripts running in the backend. This would then cut down the navigation on the GUI.

Selenium Beyond GUI -img 3

The test flow is handled as below.

·       We test the end-to-end GUI navigation one time. The first test scenario covers this part.

·       At the same time, we trigger a script that directly accesses the backend. There could be several ways to do this – server side scripts, API calls, database queries… This depends on the application architecture and what is being tested.

·       The parameter is set at the backend and the validation is done by the Selenium script on Web-Page-C. This step is iterated over the 50 parameter combinations for the 50 scenarios to be tested.

The key here is to enable the automation framework to detect when the backend parameter has been set, so that it can progress to the GUI validation and then continue iterating between the two steps.
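The iteration described above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not a definitive implementation: `wait_until` and `run_parameter_scenarios` are hypothetical helper names, and the three `fake_*` functions are in-memory stand-ins for the real backend script (API call, database query, and so on) and for the Selenium check on Web-Page-C.

```python
import time
from typing import Callable, Iterable, List, Tuple


def wait_until(condition: Callable[[], bool], timeout: float = 10.0,
               interval: float = 0.1) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def run_parameter_scenarios(
    parameters: Iterable[int],
    set_parameter: Callable[[int], None],
    is_parameter_set: Callable[[int], bool],
    validate_on_page_c: Callable[[int], bool],
    timeout: float = 10.0,
) -> List[Tuple[int, bool]]:
    """Alternate between the backend step and the GUI validation step."""
    results = []
    for value in parameters:
        set_parameter(value)  # backend step: no GUI navigation involved
        if not wait_until(lambda: is_parameter_set(value), timeout=timeout):
            results.append((value, False))  # backend never confirmed the change
            continue
        results.append((value, validate_on_page_c(value)))  # GUI step
    return results


# Hypothetical stand-ins so the sketch runs without a browser or a backend.
backend_state = {"param": None}


def fake_set_parameter(value: int) -> None:
    backend_state["param"] = value


def fake_is_parameter_set(value: int) -> bool:
    return backend_state["param"] == value


def fake_validate_on_page_c(value: int) -> bool:
    # A real version would refresh Web-Page-C with Selenium and assert on it.
    return backend_state["param"] == value


results = run_parameter_scenarios(
    range(1, 51), fake_set_parameter, fake_is_parameter_set,
    fake_validate_on_page_c,
)
```

Passing the backend and GUI steps in as callables keeps the loop independent of the application architecture, which the post notes is what the real implementation depends on.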

Total execution time now comes down drastically, since the full web-page navigation (including log in and log out) is no longer repeated for every scenario.
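A rough back-of-the-envelope comparison, written as a small Python sketch. The 50 scripts and 2 minutes per script come from the example above. The 10 seconds per backend-plus-validation iteration is purely an assumed figure for illustration; the real value depends on the application.

```python
def gui_only_minutes(scripts: int, minutes_per_script: float) -> float:
    """Every script performs the full GUI navigation."""
    return scripts * minutes_per_script


def hybrid_minutes(scripts: int, minutes_full_navigation: float,
                   seconds_per_iteration: float) -> float:
    """One full GUI navigation, then one short iteration per scenario."""
    return minutes_full_navigation + scripts * seconds_per_iteration / 60


# Figures from the example: 50 scripts, 2 minutes each.
baseline = gui_only_minutes(50, 2)  # 50 * 2 = 100 minutes

# ASSUMPTION: 10 seconds to set one parameter at the backend and
# validate it on Web-Page-C. This number is not from the original post.
ASSUMED_SECONDS_PER_ITERATION = 10
hybrid = hybrid_minutes(50, 2, ASSUMED_SECONDS_PER_ITERATION)

print(f"GUI only: {baseline:.0f} min, hybrid: {hybrid:.1f} min")
```

Under that assumption the suite drops from 100 minutes to a little over 10; the saving shrinks or grows with the actual per-iteration cost.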

This is an example of how an existing Selenium suite can be extended with backend operations that improve automation efficiency and overall test effectiveness. The concept is proven; the implementation is heavily dependent on the application architecture and the specific test scenarios.


A couple of words about Page Object

Often in the testing of web applications, every form on a page is described as a separate entity (object). Usually one form (in most cases it represents an isolated piece of functionality) corresponds to one class. Each of these classes consists of web element instances (the various web elements located on the related web form) and methods for interacting with those elements. Let’s imagine there is a Login form in a web application. The form may be described as:

using OpenQA.Selenium;

public class Login
{
   private readonly IWebDriver driver;
   public Login(IWebDriver driver) { this.driver = driver; }
   // The class contains only one method.
   // This is an abridged example of the method: there is no null-reference check, no verification of the driver's actions, no event logging and no exception handling.
   // The method takes two strings as parameters (user name and password) and returns an instance of the web application's start page, StartPage.
   public StartPage LogIn(string userName, string pwd)
   {
      // nameTextboxLocator, passwordTextboxLocator and loginButtonLocator stand for By locators defined elsewhere.
      IWebElement txtName = driver.FindElement(nameTextboxLocator);
      IWebElement txtPwd = driver.FindElement(passwordTextboxLocator);
      IWebElement btnLogin = driver.FindElement(loginButtonLocator);
      txtName.SendKeys(userName);
      txtPwd.SendKeys(pwd);
      btnLogin.Click();
      return new StartPage([parameters]);
   }
}

The test would then look like this:

public void TestLogIn()
{
   var loginPage = new Login(driver);
   var startPage = loginPage.LogIn(userName, password);
}

This approach lets you create a large number of tests quickly and easily, but it hides a number of pitfalls. It works well enough for small, simple projects with only a couple of hundred web elements and methods, and where only a few people do the testing. If the web application under test is very complex, or it is developed quickly and its functionality changes often, or you have a big team of automation testers (most of them juniors), maintaining the framework is likely to become a nightmare, and sooner or later you will be the one called upon to maintain it.

A simple example: what if, after clicking the LogIn button, any of a dozen other pages may be loaded besides StartPage? Imagine that, depending on the type of the user’s contract, the user’s role, profile settings, location (the country the request came from) and the settings of the testing environment (the server the user is connected through), one of eighteen different pages may be loaded. In this case you would have to write eighteen versions of the method to cover all the pages, and later maintain every one of them whenever something changes in the application. Of course, you can use a generic method such as LogIn<T>(userName, password) in your tests, but we are aiming for maximum ease of writing and understanding tests. It is hard to imagine that a customer, a PM or a manual tester understands generics even superficially.

The second pitfall is scaling. If you have described a page, or some part of it, as a single object and one day something changes on this page (or part), you will have to update the object itself, its methods (there may be a lot of them) and the related tests (the number of such tests may be huge). In general, the reasons above are already sufficient to decide not to apply the Page Object pattern to web pages or forms, but to apply it to a page’s elements instead.

In my approach, the object I interact with is a web element, not the entire page. You may call it Element Object. I still describe every web page (or web form) as a separate class, and it still contains the set of web elements located on that page, but it has no methods for interacting with them. Instead, the web elements (buttons, links, images, tables, etc.) are exposed as the page’s properties. Every element type is defined in its own class and has its own methods for interaction. Thus, the Page Object pattern is implemented in relation to web elements, not web pages.

Please keep in mind that the bigger and more complex an object is, the more difficult it is to maintain. You can describe, for example, the entire login form as a single entity, or you can describe every element of the form as a separate entity. The second approach is more flexible.


public class Login
{
   // The class contains a set of properties, not methods.
   // This is the full description of the login form. No additional verification, actions or logging are required: everything, together with all properties and methods of every element, is encapsulated inside the appropriate web element class.
   public Textbox Textbox_UserName { get { return new Textbox("locator of the element on the page", [other parameters]); } }
   public Textbox Textbox_Password { get { return new Textbox("locator of the element on the page", [other parameters]); } }
   public Button Button_LogIn { get { return new Button("locator of the element on the page", [other parameters]); } }
}
public void TestLogIn()
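
To make the Element Object idea concrete, here is a minimal sketch written in Python rather than C#, so that it can run without a browser. Everything in it is hypothetical: `FakeDriver` is an in-memory stand-in for a real WebDriver, the locator strings are invented, and `set_text`, `click` and `log_in_scenario` are illustrative names, not part of the framework described above.

```python
class FakeDriver:
    """Hypothetical stand-in for a browser driver; records actions by locator."""

    def __init__(self):
        self.typed = {}
        self.clicked = []

    def type_into(self, locator, text):
        self.typed[locator] = text

    def click(self, locator):
        self.clicked.append(locator)


class Element:
    """Base class: every element knows its driver and its own locator."""

    def __init__(self, driver, locator):
        self.driver = driver
        self.locator = locator


class Textbox(Element):
    def set_text(self, text):
        self.driver.type_into(self.locator, text)


class Button(Element):
    def click(self):
        self.driver.click(self.locator)


class LoginPage:
    """Page class: only properties returning elements, no interaction methods."""

    def __init__(self, driver):
        self.driver = driver

    @property
    def textbox_user_name(self):
        return Textbox(self.driver, "id=username")  # invented locator

    @property
    def textbox_password(self):
        return Textbox(self.driver, "id=password")  # invented locator

    @property
    def button_log_in(self):
        return Button(self.driver, "id=login")  # invented locator


def log_in_scenario(driver, user_name, password):
    """The test talks to elements; the page only says where they are."""
    page = LoginPage(driver)
    page.textbox_user_name.set_text(user_name)
    page.textbox_password.set_text(password)
    page.button_log_in.click()


driver = FakeDriver()
log_in_scenario(driver, "alice", "secret")
```

With this layout, a change in how text boxes behave is fixed once in `Textbox`, and a change in the login form only touches the locators in `LoginPage`.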

Firebug, Firepath, and other developer tools.

The first things that should come to mind are: what are Firebug and FirePath, and why do I need to learn them? The answer is pretty simple: you need to tell your Selenium code what action to perform, and where. As a human, you know exactly where you want to click on the screen or which items you want to interact with. If you want to click the third of six menu items on the screen, your mind tells your hand to move the cursor to that menu item and click. With Selenium or any other test tool, your test code must know that unique location before it can click. Firebug and FirePath make your life easier by identifying your target element in the Firefox browser.

Installation Steps:

Firefox Users


Start the Firefox browser, open the Firefox add-ons page and search for the Firebug add-on. You will see the following screen. You may see a different version of Firebug depending on your Firefox version. Add the extension to Firefox by clicking the ‘Add to Firefox’ button.
Firepath Addon

This action may ask you to restart Firefox. Make sure you do not have unsaved changes in any Firefox instance open on your computer. Once Firefox restarts, you will see the following icon in the browser’s add-on area. Now you can start using Firebug.



FirePath is a Firebug extension that adds a development tool for editing and identifying the XPath or CSS selector of a web element. To install FirePath, follow the same steps as for Firebug above and restart the browser. Once the browser has restarted, you can start using it.


Now suppose you want to learn about an element: its HTML code, XPath and properties. As an example, in my application under test I want to click the ‘Add to Cart’ button for the Nexus phone. Move your cursor to the desired button or element, right-click and choose ‘Inspect Element with Firebug’ as below.

Firepath Addon

As soon as you click the last option in the above dialog, the developer tool opens and shows all the HTML code for the web page. There is no need to go into detail on the other tabs here; I will cover them in later chapters.

Firepath Addon

You can see the FirePath tab on the above screen too. Now, if you want your script to click this button, you have to give it the locator of the button. The locator is a unique address of an element on the web page, so that your script can find and act on it. To find the locator, you can right-click the highlighted element inside the developer tool on the above screen, select “Copy XPath” or “Copy CSS Path” as shown below, and save it at an appropriate place in your script.

Firepath Addon

FirePath gives you a better way to get the CSS or XPath than copying it from here. Go to the FirePath tab in Firebug, click inspect element and then point to the target element on the web page. It will highlight the target element and let you choose XPath or CSS, as below.

Firepath Addon
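
Whichever tool produces the locator, it is worth checking that it really is unique. The sketch below is only an illustration under assumptions: the HTML fragment and its ids and class names are invented, and it uses Python’s standard-library `xml.etree.ElementTree`, which supports a subset of XPath and needs a leading `.` on the path. In a Selenium script written in Python, the equivalent XPath string would typically be passed to `driver.find_element(By.XPATH, ...)`.

```python
import xml.etree.ElementTree as ET

# Invented page fragment with two "Add to Cart" buttons.
PAGE = """
<html>
  <body>
    <div id="product-nexus">
      <span class="name">Nexus</span>
      <button class="add-to-cart">Add to Cart</button>
    </div>
    <div id="product-other">
      <span class="name">Other phone</span>
      <button class="add-to-cart">Add to Cart</button>
    </div>
  </body>
</html>
""".strip()

root = ET.fromstring(PAGE)


def count_matches(xpath: str) -> int:
    """Return how many elements in the fragment match the XPath."""
    return len(root.findall(xpath))


# Matches every "Add to Cart" button on the page: not a usable locator.
ambiguous = ".//button[@class='add-to-cart']"

# Anchored on the product's container id: matches only the Nexus button.
unique = ".//div[@id='product-nexus']/button[@class='add-to-cart']"

print(count_matches(ambiguous), count_matches(unique))
```

A locator that matches more than one element may make the script act on the wrong one, so anchoring it on a nearby element with a stable id is usually the safer choice.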

In Chrome Browser

If you are using Google Chrome, the developer tools are already built in, so there is nothing to install. Move your mouse cursor to the element on which you want to perform the action, right-click and choose “Inspect”. The developer tools will open automatically. Right-click the highlighted element and you can choose to copy its XPath or CSS selector, as below.

Firepath Addon

Many Chrome extensions are available that can help you find locators from the Chrome browser. You can experiment with them and see whether they are easy enough to use. No tool is perfect, but developers around the world keep building things that make our lives easier. You can see various tabs above; we will talk more about them some other time.


Thanks for reading. Your feedback is valuable in making this website helpful for others. Do not hesitate to correct me if you find any mistake or unclear line. I will try to improve and update it periodically.