Interaction with an application can be organized in two different ways. Assume I have a login form that I want to work with. In automated testing, I may treat the form as a separate piece of functionality, a single element/module of the application. Or I may treat it as a set of individual elements, each with its own methods, which together can be used to perform authorization. I prefer the second approach. Here is why.
In the first case, I have to create a method (or a number of methods) for testing the sign-in process. The method will likely be very complicated, or the number of methods will be pretty big. This happens because I must add a new branch to the method for almost every new scenario (or create a new method for each scenario), and the number of scenarios may be really huge. Even for this simple form, there may be a lot of scenarios.
In the end, I may get a super method which does a lot of things, but only God knows which ones exactly, and how, and why. Or I may have dozens of methods with similar names and logic, which forces me to know the particularities of every method in order to select the right one for each test case. Once there are a couple of hundred such methods, my brain and my memory become really well-trained ))
In the second case, I only need instances of the elements required by the current test case. A login form usually has the following: a textbox for the username, a textbox for the password, a dropdown list of available domains, a ‘Remember me’ checkbox, a link to the password restore page and a submit button.
So, a test case may look like:
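The original listing is not reproduced here, so the following is only a minimal sketch of such a test case. It works with each element separately through the plain Selenium API. The class name, the method name and all element ids are assumptions made for illustration.

```csharp
using System;
using OpenQA.Selenium;

public static class LoginScenarios
{
    // One scenario = one small method that touches only the elements it needs.
    // All ids below are hypothetical; replace them with the locators of the real form.
    public static void SignInWithRememberMe(IWebDriver driver, string user, string password)
    {
        if (driver == null) throw new ArgumentNullException(nameof(driver));

        IWebElement username = driver.FindElement(By.Id("username"));
        IWebElement passwordBox = driver.FindElement(By.Id("password"));
        IWebElement rememberMe = driver.FindElement(By.Id("rememberMe"));
        IWebElement submit = driver.FindElement(By.Id("submit"));

        username.Clear();
        username.SendKeys(user);
        passwordBox.Clear();
        passwordBox.SendKeys(password);

        // Tick the checkbox only if it is not ticked already.
        if (!rememberMe.Selected) rememberMe.Click();

        submit.Click();
    }
}
```

A new scenario (for example, signing in without ‘Remember me’) would be another short method next to this one, not a new branch inside a shared login method.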
But the most important thing is that I do not have to add new logic to the form methods for every new scenario. I add new tests instead.
As I said previously, I’m going to develop a framework for testing a web application. Therefore, during testing I will have to interact with web elements. But what is a web element? In terms of an HTML document, a web element is a node of the document, usually written with a start tag and an end tag. Everything from the start tag to the end tag is the HTML element. Some elements are visible, and we know them as buttons, text, links, pictures, checkboxes, etc. Others are not displayed and are used mainly for page formatting: blocks, frames, scripts, containers and so on.
In terms of Selenium WebDriver, a web element is an object that represents an HTML element. It is declared in code as an instance of the IWebElement interface. Basically, IWebElement is self-sufficient: it has a set of properties and methods that allow interaction with most HTML elements. In spite of this, I wrapped IWebElement in my own classes that represent different types of HTML elements: a separate class for every web element type (buttons, textboxes, pictures, and so on).
If you use the plain IWebElement as is, your code will look like the following:
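The original snippet is missing, so this is a minimal sketch of what the plain usage typically looks like. The id "myButton" and the wrapping class are assumptions; FindElement and Click are the standard Selenium calls.

```csharp
using OpenQA.Selenium;

public static class PlainUsage
{
    public static void ClickMyButton(IWebDriver driver)
    {
        // Locate the element and click it: no checks, no logging, no error handling.
        IWebElement myButton = driver.FindElement(By.Id("myButton"));
        myButton.Click();
    }
}
```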
Pretty simple and clear, but…
Assume one of your automated tests has failed and you need to know when and how it happened. For example, you want to know whether some action (say, Click()) was actually performed on myButton. So you have to add more lines of code to verify that:
• myButton exists and the driver can find it
• myButton is displayed
• you can click myButton (it is available for interaction – enabled)
• in addition, you have to write all these actions to the log file and catch all possible exceptions
Thus your code grows approximately 4-5 times.
Then you find out that the myButton.Click() method works fine in Firefox but not in IE. So you have to implement additional logic to make it work with Internet Explorer as well. The code doubles again.
Finally, you want to check that the expected result occurred after the click on myButton. Therefore, you have to know that a valid event has happened: for example, the hash-code of the page has changed, or its URL, or the number of opened windows, and so on. Also, you have to put this information into your log file. And do not forget to handle exceptions…
As you can see, you have to do a lot of things to make a method for clicking a single web element well-designed and easy to use. And you have to do all these things for all the elements on the page. Here is an example:
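The original example is not available, so here is a hedged sketch of the kind of code that results for just two elements. The ids "myButton" and "myLink", the console logging and the URL comparison are assumptions chosen to keep the sketch short; browser-specific workarounds are left out.

```csharp
using System;
using OpenQA.Selenium;

public static class RepetitiveClicks
{
    public static void ClickBoth(IWebDriver driver)
    {
        // First element: find, verify, click, check the result, log, handle errors.
        try
        {
            IWebElement myButton = driver.FindElement(By.Id("myButton"));
            if (!myButton.Displayed) throw new InvalidOperationException("myButton is not displayed");
            if (!myButton.Enabled) throw new InvalidOperationException("myButton is disabled");
            string urlBefore = driver.Url;
            myButton.Click();
            Console.WriteLine($"Clicked myButton; URL changed: {driver.Url != urlBefore}");
        }
        catch (Exception e)
        {
            Console.WriteLine($"Click on myButton failed: {e.Message}");
            throw;
        }

        // Second element: exactly the same steps, only the names differ.
        try
        {
            IWebElement myLink = driver.FindElement(By.Id("myLink"));
            if (!myLink.Displayed) throw new InvalidOperationException("myLink is not displayed");
            if (!myLink.Enabled) throw new InvalidOperationException("myLink is disabled");
            string urlBefore = driver.Url;
            myLink.Click();
            Console.WriteLine($"Clicked myLink; URL changed: {driver.Url != urlBefore}");
        }
        catch (Exception e)
        {
            Console.WriteLine($"Click on myLink failed: {e.Message}");
            throw;
        }
    }
}
```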
The question is: do you notice that a very big part of the code is repeated? And what if there are not two but twenty-two elements on the page? And what if the application has a lot of pages? And what if one day you have to change some part of this code?
I think it will become a problem.
This is the reason why I wrapped the IWebElement into my own classes.
Let’s see the example:
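The author's own listing is missing here. What follows is a minimal sketch of such a wrapper, not the framework's actual class: the constructor signature, the console logging and the URL comparison are assumptions, and the browser-specific “smart” click is omitted.

```csharp
using System;
using OpenQA.Selenium;

// Hypothetical wrapper for buttons: all checks, logging and error handling live here.
public class Button
{
    private readonly IWebDriver _driver;
    private readonly By _locator;
    private readonly string _name;

    public Button(IWebDriver driver, By locator, string name)
    {
        _driver = driver ?? throw new ArgumentNullException(nameof(driver));
        _locator = locator ?? throw new ArgumentNullException(nameof(locator));
        _name = name;
    }

    public void Click()
    {
        try
        {
            // FindElement throws NoSuchElementException when the element is missing.
            IWebElement element = _driver.FindElement(_locator);
            if (!element.Displayed || !element.Enabled)
                throw new InvalidOperationException($"{_name} is not available for interaction");

            string urlBefore = _driver.Url;
            element.Click();
            Log($"Clicked {_name}; URL changed: {_driver.Url != urlBefore}");
        }
        catch (Exception e)
        {
            Log($"Click on {_name} failed: {e.Message}");
            throw;
        }
    }

    private static void Log(string message) =>
        Console.WriteLine($"{DateTime.Now:HH:mm:ss} {message}");
}
```

With a class like this, a test needs a single line per click, for example `new Button(driver, By.Id("myButton"), "myButton").Click();`.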
Now all the code related to any button (verifications, logging, exception handling and “smart” clicks) is placed inside the Button class and is written only ONCE.
The most common use of the PageObject pattern in testing treats a page or a form as a separate entity (object). In code, one such object (in most cases it represents an isolated piece of functionality) usually corresponds to a separate class. Each of these objects/classes consists of web element instances (the various web elements located on the related web form) and methods for interacting with those elements.
Let’s imagine there is a login form in a web application and a start page which is loaded when a user has successfully signed in. With the PageObject pattern the form may be described as:
The test would then look like this:
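Neither listing survives in this copy, so the sketch below shows both parts together under stated assumptions: a LoginForm page object whose Login method returns a StartPage, and a test that uses it. The element ids, the IsLoaded check and the test class are hypothetical.

```csharp
using System;
using OpenQA.Selenium;

public class StartPage
{
    private readonly IWebDriver _driver;

    public StartPage(IWebDriver driver)
    {
        _driver = driver ?? throw new ArgumentNullException(nameof(driver));
    }

    // Assumption: the start page is recognised by a "logout" element.
    public bool IsLoaded() => _driver.FindElements(By.Id("logout")).Count > 0;
}

public class LoginForm
{
    private readonly IWebDriver _driver;

    public LoginForm(IWebDriver driver)
    {
        _driver = driver ?? throw new ArgumentNullException(nameof(driver));
    }

    // The whole form is one object; signing in is one method that returns the next page.
    public StartPage Login(string username, string password)
    {
        _driver.FindElement(By.Id("username")).SendKeys(username);
        _driver.FindElement(By.Id("password")).SendKeys(password);
        _driver.FindElement(By.Id("submit")).Click();
        return new StartPage(_driver);
    }
}

public static class LoginTests
{
    public static void UserCanSignIn(IWebDriver driver)
    {
        StartPage start = new LoginForm(driver).Login("user", "secret");
        if (!start.IsLoaded()) throw new Exception("Start page was not loaded");
    }
}
```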
This approach allows creating a large number of tests quickly and easily. But there are a number of pitfalls hidden inside as well. This way of developing new tests is good enough for small and simple projects with only a couple of hundred web elements and methods, where only a few people do the testing. If the tested web application is very complex, or it is developed very fast and a lot of functionality changes often, or you have a big team of automation testers (most of them juniors or students), maintenance of the framework will likely become your nightmare, because sooner or later you will be the one called upon to maintain it.
A simple example: what if, after clicking the LogIn button, a dozen other pages may be loaded besides StartPage? For example, imagine that depending on the type of the user’s contract, their role, profile settings, location (the country the request came from) and the settings of the testing environment (the server the user is connected through), eighteen different pages may be loaded. In this case, you would have to write eighteen versions of the method to cover all the pages, and later maintain all of them whenever something changes in the application. Of course, you can use a generic method such as Login<T>(username, password) in your tests, but we’re talking about maximum ease of writing and understanding tests. Either way, there must be a verification that the page is loaded for every possible return type, which will likely turn the method into an overcomplicated monster.
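To make the generic variant concrete, here is a minimal sketch of what Login<T> could look like. It is an illustration only: the IPage interface, the requirement that every page class has a public constructor taking IWebDriver, and the element ids are all assumptions.

```csharp
using System;
using OpenQA.Selenium;

// Hypothetical contract that every page returned by Login<T> must fulfil.
public interface IPage
{
    bool IsLoaded();
}

public class StartPage : IPage
{
    private readonly IWebDriver _driver;
    public StartPage(IWebDriver driver) { _driver = driver; }
    public bool IsLoaded() => _driver.FindElements(By.Id("logout")).Count > 0;
}

public class LoginForm
{
    private readonly IWebDriver _driver;

    public LoginForm(IWebDriver driver)
    {
        _driver = driver ?? throw new ArgumentNullException(nameof(driver));
    }

    public T Login<T>(string username, string password) where T : IPage
    {
        _driver.FindElement(By.Id("username")).SendKeys(username);
        _driver.FindElement(By.Id("password")).SendKeys(password);
        _driver.FindElement(By.Id("submit")).Click();

        // Assumes T has a public constructor that accepts an IWebDriver.
        T page = (T)Activator.CreateInstance(typeof(T), _driver);

        // The "is it loaded" check still has to work for every possible T.
        if (!page.IsLoaded())
            throw new InvalidOperationException($"{typeof(T).Name} was not loaded");
        return page;
    }
}
```

A test would call `new LoginForm(driver).Login<StartPage>("user", "secret")`, so the test author still has to know in advance which of the possible pages to ask for.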
Another pitfall is scaling. If you have described a page or some part of it as a single object and one day something changes on this page (or part), you will probably have to update the object itself, its methods (there may be a lot of them) and the related tests (the number of which may be really huge).
Keep in mind that the bigger and more complex an object is, the harder it is to maintain. You can describe, for example, the entire login form as a single entity, or you can describe every element of this form as a separate entity. The second approach is more flexible.
Interaction with an application in Selenium WebDriver revolves around the IWebDriver interface. This interface allows launching a specific type of browser, performing common actions in browser windows or tabs, and interacting with web elements. It is derived from the ISearchContext and IDisposable interfaces; the full specification is in the Selenium API documentation. The interface can work with all the main types of browsers: IE, Firefox, Safari, Chrome, Opera, Android, etc. There are also some additional implementations of the interface created for specific tasks, which are described in the Selenium documentation as well.
Following the principle of maximum simplicity together with maximum functionality, I decided not to reinvent the wheel and left the IWebDriver interface in its original form. In the project, it is used as an instance (non-static) variable. This approach allows creating extension methods for IWebDriver, and it also makes it possible to create several IWebDriver instances for some unbelievable scenario that uses several different browsers. I can hardly believe that someone will do that, but the possibility is there.
This approach assumes that a reference to the IWebDriver instance is passed from one class to another, so a lot of class constructors get an additional parameter. Nevertheless, this approach is the most versatile.
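As a rough sketch of how these two ideas fit together, the code below passes the driver through a constructor and adds one extension method to IWebDriver. The WaitForVisible helper, its polling interval and the "submit" id are assumptions for illustration, not part of Selenium or of the framework described here.

```csharp
using System;
using System.Threading;
using OpenQA.Selenium;

public static class WebDriverExtensions
{
    // Hypothetical helper: polls until the element is found and displayed,
    // or throws WebDriverTimeoutException once the timeout has expired.
    public static IWebElement WaitForVisible(this IWebDriver driver, By locator, TimeSpan timeout)
    {
        if (driver == null) throw new ArgumentNullException(nameof(driver));

        DateTime deadline = DateTime.UtcNow + timeout;
        while (true)
        {
            try
            {
                IWebElement element = driver.FindElement(locator);
                if (element.Displayed) return element;
            }
            catch (NoSuchElementException) { /* not in the DOM yet, retry */ }
            catch (StaleElementReferenceException) { /* DOM was rebuilt, retry */ }

            if (DateTime.UtcNow >= deadline)
                throw new WebDriverTimeoutException($"{locator} was not visible within {timeout}");
            Thread.Sleep(200);
        }
    }
}

public class LoginForm
{
    // The driver is an instance field received through the constructor, not a static.
    private readonly IWebDriver _driver;

    public LoginForm(IWebDriver driver)
    {
        _driver = driver ?? throw new ArgumentNullException(nameof(driver));
    }

    public void Submit() =>
        _driver.WaitForVisible(By.Id("submit"), TimeSpan.FromSeconds(5)).Click();
}
```

Because nothing is static, two LoginForm objects created with two different drivers (say, one Chrome and one Firefox instance) can live in the same test without interfering with each other.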