
Selenium Testing beyond GUI Browsers.

“Selenium automates browsers”. So goes the introductory line on the Selenium project's website. To what extent is this a limitation?

The topic of this blog provokes two questions. Can we structure a Selenium-based test framework to test beyond GUI browsers? If not, at the very least, how can we improve test effectiveness by extending an existing Selenium test suite to work with other test actions that do not use browsers?

The first question above needs in-depth technical discussion. We focus on the second one here which can be addressed at a more conceptual level.

We have an example below which illustrates how one can improve the efficiency and coverage of automation by going beyond the Browser GUI and enabling backend operations.

Consider a simple web-application which has a Selenium test suite containing hundreds of automated scripts that need to run daily. In addition, there are test-scenarios that need to be selectively executed during the day using some of the scripts. The framework is robust and the application relatively stable.

Every script has to navigate across the web-application to get to specific web-pages, and there are often many test scenarios that need to be executed on one specific web-page. The time a script takes to reach that web-page depends on the performance of the application and the latency in loading the intermediate web-pages. When the script count is in the hundreds, this navigation adds up and increases the overall time needed to execute the automation suite. How can this be reduced?

Problem Statement: how do we limit the steps of navigating the web front-end using Selenium scripts so as to reduce the overall test-execution time?

Refer to the illustration below.

Selenium Beyond GUI

Total Execution Time – 2 minutes.

In the above scenario, there are parameters to be set on Web-Page-B which are necessary for the validations on Web-Page-C. Every individual script logs in to and logs out of the application, and the total time taken is close to 2 minutes.

Now if there are 50 scripts that set different parameter combinations, the Selenium suite that runs scripts on the browser GUI would potentially take 50 * 2 = 100 minutes just for navigating back and forth on the web pages, especially if we need to log in and log out after every script.

The actual verification point however is only on Web-Page-C for every kind of parameter setting.

Selenium Beyond GUI img 2


The parameter setting could be handled by Python or Perl scripts running in the backend. This would then cut down the navigation on the GUI.

Selenium Beyond GUI -img 3

The test flow is handled as below.

·       We test the end-to-end GUI navigation one time. The first test scenario covers this part.

·       At the same time, we trigger a script that directly accesses the backend. There could be several ways to do this: server-side scripts, API calls, database queries and so on. The choice depends on the application architecture and what is being tested.

·       The parameter is set at the backend, and the validation is done by the Selenium script on Web-Page-C. This step is iterated over the 50 parameters to be set for the 50 scenarios to be tested.

The key here is to enable the automation framework to detect when the backend parameter has been set, so that it can progress to the GUI validation, and then to continue iterating between the two steps.
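The iteration between the two steps can be sketched as below. This is only a minimal illustration under assumptions: the three callables (set_backend_parameter, backend_parameter_is, validate_on_page_c) are hypothetical names that stand in for whatever the application offers, for example an API call or database query for the backend part and a Selenium check on Web-Page-C for the validation. The polling helper is one simple way for the framework to detect that the parameter has been set before it moves on to the GUI.

```python
import time
from typing import Callable, Iterable, List, Tuple


def wait_until(condition: Callable[[], bool],
               timeout_s: float = 10.0,
               poll_s: float = 0.5) -> bool:
    """Poll `condition` until it returns True or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)


def run_scenarios(parameters: Iterable[str],
                  set_backend_parameter: Callable[[str], None],
                  backend_parameter_is: Callable[[str], bool],
                  validate_on_page_c: Callable[[str], bool],
                  timeout_s: float = 10.0,
                  poll_s: float = 0.5) -> List[Tuple[str, bool]]:
    """Set each parameter at the backend, wait for it, then validate on the GUI.

    All three callables are hypothetical hooks supplied by the framework:
    - set_backend_parameter: API call, DB query or server-side script
    - backend_parameter_is: reads the backend and confirms the value
    - validate_on_page_c: the Selenium verification on Web-Page-C
    """
    results = []
    for value in parameters:
        set_backend_parameter(value)
        confirmed = wait_until(lambda: backend_parameter_is(value),
                               timeout_s, poll_s)
        if not confirmed:
            # Backend never confirmed the value, so the GUI check is skipped.
            results.append((value, False))
            continue
        results.append((value, validate_on_page_c(value)))
    return results


if __name__ == "__main__":
    # In-memory stand-in for a real backend, for demonstration only.
    backend = {}
    outcome = run_scenarios(
        ["A", "B", "C"],
        set_backend_parameter=lambda v: backend.update(param=v),
        backend_parameter_is=lambda v: backend.get("param") == v,
        validate_on_page_c=lambda v: backend.get("param") == v,
    )
    print(outcome)
```

Keeping the backend access and the GUI validation behind plain callables means the loop itself does not change when the backend mechanism does; only the hooks are swapped.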

Total execution time now comes down drastically, since the web-page navigation is no longer repeated for every scenario.

This is an example of how an existing Selenium suite can be extended with backend operations that improve automation efficiency and overall test effectiveness. The concept is proven; the implementation is heavily dependent on the application architecture and the specific test scenarios.
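A back-of-the-envelope comparison shows the scale of the saving. The 2 minutes per full navigation and the 50 scenarios come from the example above; the 10 seconds per backend-driven iteration is purely an assumed figure for illustration, not a measurement, and the function names are hypothetical.

```python
def gui_only_seconds(scenarios: int, full_navigation_s: int) -> int:
    """Every scenario repeats the full GUI navigation."""
    return scenarios * full_navigation_s


def hybrid_seconds(scenarios: int, full_navigation_s: int,
                   backend_iteration_s: int) -> int:
    """The first scenario navigates the GUI end to end; the remaining ones
    only set the parameter at the backend and validate on Web-Page-C."""
    if scenarios <= 0:
        return 0
    return full_navigation_s + (scenarios - 1) * backend_iteration_s


if __name__ == "__main__":
    # 50 scenarios at 120 s each: 6000 s, i.e. the 100 minutes from the text.
    print(gui_only_seconds(50, 120))
    # With the ASSUMED 10 s per backend-driven iteration: 120 + 49 * 10 = 610 s.
    print(hybrid_seconds(50, 120, 10))
```

The real figure depends entirely on how quickly the backend accepts a parameter and how long the check on Web-Page-C takes.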

Recording Selenium Test Execution

After a long time, I got a few minutes to think about a new blog post. There are many blogs available for almost every solution, but I wanted to write about topics for which either nothing or only very limited information is available on the internet. I have the free version of Sauce Labs, I have always been fascinated by this kind of tool, and I would love to build such a tool sometime in the future. One feature that has always attracted me is the recording of the test session. I tried to find some resources which could help me achieve this in my framework.

I found Microsoft Expression Encoder for recording the screen. You can add its DLLs to your C#-based framework to execute and record the session. In your framework you can write a method which starts recording at the beginning, and later, during the cleanup task, you can have some logic to determine whether you want to save the recording or not. For example, in my framework, I start recording based on my configuration file, and in the teardown method I check whether there was an error. If yes, I encode the recording into WMV format; otherwise I discard it.

Follow these steps to add the code and settings to your framework:

Step 1:

Install Microsoft Expression Encoder. Once you install it, you should find the following folder in the installation location: C:\Program Files (x86)\Microsoft Expression. If you explore the folders inside it, you will find the SDK folder at C:\Program Files (x86)\Microsoft Expression\Encoder 4\SDK, where you can see the Microsoft.Expression.Encoder DLLs which you can use in your framework code.

Step 2:

Include the DLLs from the SDK folder in your framework. You can then write a method that starts the recording and call it from your tests. I have a key-value pair in my framework config file that decides whether I want to record or not. In your test’s [setup] method, you can call this method, which will start recording at the beginning of the test.

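using System;
using System.Drawing;
using System.Windows.Forms; // provides Screen
using OpenQA.Selenium;
using Microsoft.Expression.Encoder;
using Microsoft.Expression.Encoder.Profiles;
using Microsoft.Expression.Encoder.ScreenCapture;

namespace FRAMEWORK
{
    public static class VideoRecorder // placeholder name; the class declaration is missing from the original listing
    {
        private static ScreenCaptureJob job; // assumed declaration; shared with StopRecordingVideo()

        //Call this method in setup method.
        public static void StartRecordingVideo()
        {
            //Provide setting in config file if you want to do recording or not.
            if (testEInfo.isRecording)
            {
                job = new ScreenCaptureJob();
                job.CaptureRectangle = Screen.PrimaryScreen.Bounds;
                job.ShowFlashingBoundary = true;
                //provide the location where you want to save the recording.
                job.OutputPath = AutomationLogging.newLocationInResultFolder;
                job.Start(); // not in the extracted listing; added on the assumption that the capture has to be started here
            }
        }
    }
}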





Step 3:

Now your test has started and your screen is being recorded in the background. Once you reach the [teardown] method, you can decide whether you want to keep the recording or not. In my case, I want to keep the recording only if there is a test failure, so that my developers can review it; there is no point in keeping recordings of passing tests. To do so, I have the following method in the above code, which I call at the very end.

public static void StopRecordingVideo()
{
    if (testEInfo.isRecording)
    {
        string filename = job.ScreenCaptureFileName;
        job.Stop(); // not in the extracted listing; assumed, since the capture must end before it is encoded
        if (AutomationLogging.countOfError > 0)
        {
            // Encode the raw capture only when the test has recorded errors.
            MediaItem src = new MediaItem(filename);
            Job jb = new Job();
            jb.MediaItems.Add(src); // assumed: the listing creates src but never attaches it
            jb.OutputDirectory = AutomationLogging.newLocationInResultFolder;
            string output = ((Microsoft.Expression.Encoder.JobBase)(jb)).ActualOutputDirectory;
            jb.Encode(); // assumed: runs the encoding job
        }
    }
}



During encoding, you may notice that the encoder uses a little more memory and your system may be a little slow. Try it at your end and let me know if you have any questions.

Smart click (part 1)

When using the Selenium WebDriver tool for testing web-based applications, I faced a problem: sometimes the common Click() method does not work for specific elements. Or, for example, it may work fine in the FF browser and fail to click in Internet Explorer. After a little investigation, I noticed that instead of the Click() method I can use methods from the Actions class, or I can click elements with JavaScript. But I also discovered that I had to use one action for FF and another for IE, because often an action worked with one browser but failed with the other. I had to add extra logic for specific elements and it began spreading some mess in my code.

Then a couple of months later “funny” things started to happen regularly. Guys from Mozilla Corp. or Microsoft delivered a new version of their browser. The old version of Selenium WebDriver didn’t support this new browser, so I had to upgrade it to the latest one. But oops… most of my browser-specific clicks now behaved the other way round. Or they did not work at all, and I had to return to the common Click() method. Chaos in my code was about to come.

That’s why I started looking for a method that would work consistently with any browser, webdriver, or elements and would also give me some additional helpful features like events logging with screenshots, handling exceptions, coded-behind verification, etc. Finally, I realized I had to invent it myself.

I asked myself what I do when I click a web element (assume a button) in a browser manually and came up with the following list:

  1. I determine which button I am going to click.
  2. I make sure that I can click the button. I check if it is present on the page, is displayed, is enabled for a click, etc. In other words, I verify the conditions for a successful click.
  3. Then I click the button.
  4. I verify that the click has been successful, that is, that some change has occurred, expected or not. Even if I’ve got an error, it means that the click itself has happened. So, I verify the conditions after the click.

I do all the things from the list above unconsciously. So does everyone else.

So I decided to write my own version of the Click() method reproducing all my manual actions for clicking a button in real life. Here is the recipe:

If you have separate classes for different types of web elements in your testing framework, such as buttons or links, you can add a Click() method to these classes and use a private IWebElement instance defined inside the class for interactions. For example:

public class WebButton
{
    private IWebElement element;

    public WebButton(/*constructor parameters*/)
    {
        // element = instance of IWebElement according to constructor parameters
    }

    public void Click(/*possible parameters*/)
    {
        //logic for clicking the element in a smart way
    }
}

If you use an ordinary IWebElement interface for describing web elements, you can create an extension method for the interface. Like this:

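public static class Extensions
{
    // Note: with no extra parameters, a call to Click() binds to IWebElement's own
    // instance method, so the extension needs at least one parameter (or another name).
    public static void Click(this IWebElement buttonToClick /*, possible parameters*/)
    {
        //logic for smart click on buttonToClick
    }
}

You will be able to invoke the methods in your tests in this way:

//1st case
WebButton button = new WebButton(/*constructor parameters*/);
button.Click(/*possible parameters*/);

//2nd case
IWebElement button = driver.FindElement(By.Id("buttonID"));
button.Click(/*possible parameters*/);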