Getting Started With Selenium

Section 1

What Is Selenium?

Selenium is a free and open-source browser automation library used by millions of people for testing purposes and to automate repetitive web-based administrative tasks.

It has the support of the largest browser vendors, who have integrated Selenium into the browsers themselves. It is also the core technology in countless other automation tools, APIs, and frameworks used to automate application testing.

WebDriver, the protocol at the core of Selenium, is now a W3C (World Wide Web Consortium) Recommendation, which makes it the official standard for automating web browsers.

This is a preview of the Getting Started With Selenium Refcard. To read the entire Refcard, please download the PDF from the link above.

Section 2

Getting Started

Selenium language bindings allow you to write tests and communicate with browsers in multiple languages:

Java
JavaScript
Python
Ruby
C#

In order to start writing tests, you first need to install the bindings for your preferred programming language.

Java (With Maven)

In your test project, add the following to your pom.xml. Once done, you can either let your IDE (Integrated Development Environment) use Maven to import the dependencies, or open a command prompt, cd into the project directory, and run mvn clean test-compile.

    XML
   

<!-- https://mvnrepository.com/artifact/org.seleniumhq.selenium/selenium-java -->
<dependency>
    <groupId>org.seleniumhq.selenium</groupId>
    <artifactId>selenium-java</artifactId>
    <version>LATEST</version>
    <scope>test</scope>
</dependency>

You will need the Java Development Kit (version 8+ for 3.x and 4.x versions of Selenium) and Maven installed on your machine. For more details on Selenium Java bindings, see the API documentation.

JavaScript (npm)

JavaScript offers two different approaches for incorporating Selenium/WebDriver into your tests:

Traditional JavaScript Bindings

    Shell
   

npm install selenium-webdriver

You will need Node.js and npm installed on your machine. For more information about the Selenium JavaScript bindings, check out the API documentation.

WebdriverIO

WebdriverIO is a "next-gen" test framework for getting started with WebDriver in JavaScript. It's a fully featured, W3C-compliant test framework, with full documentation available at webdriver.io.

Python

Use the command below to install the Python bindings for Selenium:

    Shell
   

pip install selenium

You will need to install Python, pip, and setuptools in order for this to work properly. For more information on the Selenium Python bindings, check out the API documentation.

Ruby

Use the following command to install the Selenium Ruby bindings:

    Shell
   

gem install selenium-webdriver

You will need to install a current version of Ruby, which comes with RubyGems. You can find instructions for that on the Ruby project website. For more information on the Selenium Ruby bindings, check out the API documentation.

C# (With NuGet)

Use the following commands from the Package Manager Console window in Visual Studio to install the Selenium C# bindings.

    Shell
   

Install-Package Selenium.WebDriver
Install-Package Selenium.Support

You will need to install Microsoft Visual Studio and NuGet to install these libraries and build your project. For more information on the Selenium C# bindings, check out the API documentation.

The remaining examples are written in Java.


Section 3

Launching a Browser

Selenium requires a "browser driver" in order to launch your intended browser. In all cases (except Safari), this driver must be downloaded and installed separately from the browser itself. For each example below, the code snippet will do no more than launch a single browser on your local machine.

Chrome

To use Chrome, you must download the ChromeDriver binary for your operating system (the highest number is the latest version) and add it to your System Path or specify its location during your test setup.

    Java
   

// Create a new instance of the ChromeDriver
WebDriver driver = new ChromeDriver();

Note: For more information about ChromeDriver, check out the Chromium team's page for ChromeDriver.
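
If ChromeDriver is not on your system path, one common way to specify its location during test setup is to set the webdriver.chrome.driver system property before creating the driver. The block below is a minimal sketch of that step only. The class name ChromeDriverSetup, the helper useDriverAt, and the path /path/to/chromedriver are placeholders for illustration, not part of Selenium.

```java
public class ChromeDriverSetup {
    // System property that Selenium's Chrome bindings read to locate the driver binary.
    static final String DRIVER_PROPERTY = "webdriver.chrome.driver";

    // Hypothetical helper: call it in test setup, before `new ChromeDriver()`.
    static void useDriverAt(String pathToBinary) {
        System.setProperty(DRIVER_PROPERTY, pathToBinary);
    }

    public static void main(String[] args) {
        // Placeholder path; substitute the real location on your machine.
        useDriverAt("/path/to/chromedriver");
        System.out.println(System.getProperty(DRIVER_PROPERTY));
    }
}
```

In a JUnit test, the call would typically sit at the top of the @Before method, ahead of the line that constructs the driver.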


Section 4

Commands and Operations

The most common operations you'll perform in Selenium are navigating to a page and examining WebElements. You can then perform actions with those elements (e.g., click, type text, etc.), ask questions about them (e.g., Is it clickable? Is it displayed?), or pull information out of the element (e.g., the text of an element or the text of a specific attribute within an element).

Visit a Page

    Java
   

driver.get("http://the-internet.herokuapp.com");

Find an Element

    Java
   

// find just one, the first one Selenium finds
WebElement element = driver.findElement(locator);
// find all instances of the element on the page
List<WebElement> elements = driver.findElements(locator);

Work With a Found Element

    Java
   

// chain actions together
driver.findElement(locator).click();
// store the element and then click it
WebElement element = driver.findElement(locator);
element.click();


Section 5

Locators

In order to find an element on the page, you need to specify a locator. There are several locator strategies supported by Selenium:

By Locator	Example (Java)
Class	driver.findElement(By.className("dues"));
CSS Selector	driver.findElement(By.cssSelector(".flash.success"));
ID	driver.findElement(By.id("username"));
Link Text	driver.findElement(By.linkText("Link Text"));
Name	driver.findElement(By.name("elementName"));
Partial Link Text	driver.findElement(By.partialLinkText("nk Text"));
Tag Name	driver.findElement(By.tagName("td"));
XPath	driver.findElement(By.xpath("//input[@id='username']"));

Note: Good locators are unique, descriptive, and unlikely to change. So it's best to start with ID and Class locators. These are the most performant locators available and the most likely ones to be helpfully named. If you need to access something that doesn't have a helpful ID or Class, then use CSS selectors or XPath. But be careful when using these approaches, since they can be very brittle (and slow).


Section 6

An Example Test

To tie these concepts together, here is a simple test written in Java that demonstrates how to use Selenium to exercise a common functionality (e.g., login) by launching a browser, visiting the target page, interacting with the necessary elements, and verifying the page is in the correct place. Note that this example is intended to familiarize users with manipulating elements and the WebDriver API. A better method for abstracting and combining commands follows.

    Java
   

import org.junit.Test;
import org.junit.Before;
import org.junit.After;
import static org.junit.Assert.*;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;
public class TestLogin {
  private WebDriver driver;
  @Before
  public void setUp() {
    driver = new FirefoxDriver();
  }
  @After
  public void tearDown() {
    driver.quit();
  }
  @Test
  public void succeeded() {
    driver.get("http://the-internet.herokuapp.com/login");       
    driver.findElement(By.id("username")).sendKeys("tomsmith");       
    driver.findElement(By.id("password")).sendKeys("SuperSecretPassword!");       
    driver.findElement(By.cssSelector("button")).click();       
    assertTrue("success message not present",             
        driver.findElement(By.cssSelector(".flash.success")).isDisplayed());
  }
}


Section 7

Page Objects

Rather than integrate the calls to Selenium directly into your test methods, you can model your application's behavior in simple objects. This allows you to write your tests using user-centric language, rather than Selenium-centric language. This is called the "Page Object Model."

When your application changes and your tests break, you only have to update your Page Objects in one place in order to accommodate the changes. This gives us reusable functionality across our suite of tests, as well as more readable tests.
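
The idea can be sketched in a few lines. To keep the example runnable without a browser, it talks to a hypothetical Browser interface instead of Selenium's WebDriver, and uses an in-memory FakeBrowser. Both are assumptions made for illustration. In a real suite, the page object would hold a WebDriver and call findElement with the locators shown earlier.

```java
import java.util.HashMap;
import java.util.Map;

public class PageObjectSketch {
    // Hypothetical stand-in for the few WebDriver calls this page needs.
    interface Browser {
        void open(String url);
        void type(String elementId, String text);
        void click(String cssSelector);
        boolean isDisplayed(String cssSelector);
    }

    // Page object: the only place that knows the login page's URL and locators.
    static class LoginPage {
        private final Browser browser;

        LoginPage(Browser browser) {
            this.browser = browser;
            browser.open("http://the-internet.herokuapp.com/login");
        }

        // User-centric action; tests never see the locators.
        void loginAs(String username, String password) {
            browser.type("username", username);
            browser.type("password", password);
            browser.click("button");
        }

        boolean successMessagePresent() {
            return browser.isDisplayed(".flash.success");
        }
    }

    // In-memory fake so the sketch runs without a real browser.
    static class FakeBrowser implements Browser {
        final Map<String, String> typed = new HashMap<>();
        boolean submitted = false;

        public void open(String url) { }
        public void type(String elementId, String text) { typed.put(elementId, text); }
        public void click(String cssSelector) { submitted = true; }
        public boolean isDisplayed(String cssSelector) {
            // Pretend the success message shows once the known user has submitted.
            return submitted && "tomsmith".equals(typed.get("username"));
        }
    }

    public static void main(String[] args) {
        LoginPage login = new LoginPage(new FakeBrowser());
        login.loginAs("tomsmith", "SuperSecretPassword!");
        System.out.println(login.successMessagePresent());
    }
}
```

If the login form's markup changes, only LoginPage needs updating. Every test that calls loginAs stays the same.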


Section 8

Waiting

Waiting for the whole page to load should be a thing of the past. To make your tests work in an asynchronous, JavaScript-heavy world, we need to tell Selenium how to wait for particular elements more intelligently. There are two types of functions for this in Selenium: Implicit Waits and Explicit Waits.

The recommended approach from the Selenium project is to use Explicit Waits, or at the very least to choose either Implicit or Explicit Waits, and to not mix them in your code.

Explicit Waits

Recommended approach
Specify an amount of time and a Selenium action
Selenium will try that action repeatedly until either:
- the action can be accomplished, or...
- the amount of time specified has been reached, throwing a TimeoutException.
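
The retry behavior described above can be illustrated in plain Java. This is a sketch of the polling idea only, not Selenium's implementation: in the Java bindings this role is played by WebDriverWait together with ExpectedConditions. The names waitUntil and WaitTimedOut are hypothetical, and the "element" is simulated by a counter.

```java
import java.time.Duration;
import java.util.Optional;
import java.util.function.Supplier;

public class ExplicitWaitSketch {
    // Hypothetical exception type for this sketch (Selenium throws TimeoutException).
    static class WaitTimedOut extends RuntimeException {
        WaitTimedOut(String message) { super(message); }
    }

    // Retries `action` until it yields a value or `timeout` elapses.
    static <T> T waitUntil(Supplier<Optional<T>> action, Duration timeout, Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            Optional<T> result = action.get();
            if (result.isPresent()) {
                return result.get();
            }
            if (System.nanoTime() >= deadline) {
                throw new WaitTimedOut("condition not met within " + timeout);
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new WaitTimedOut("interrupted while waiting");
            }
        }
    }

    public static void main(String[] args) {
        int[] attempts = {0};
        // Simulated element that only "appears" on the third attempt.
        Supplier<Optional<String>> action = () ->
            ++attempts[0] >= 3 ? Optional.of("Hello World!") : Optional.<String>empty();
        String text = waitUntil(action, Duration.ofSeconds(2), Duration.ofMillis(50));
        System.out.println(text + " after " + attempts[0] + " attempts");
    }
}
```

The action is attempted at least once, and the timeout bounds the total wait rather than each individual attempt.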


Section 9

Screenshots on Failure

Selenium can take screenshots of the browser window. We recommend taking a screenshot whenever a test fails. In JUnit, this is done with a TestWatcher rule.

    Java
   

@Rule
public TestRule watcher = new TestWatcher() {
  @Override
  protected void failed(Throwable th, Description desc) {
    File scrFile = ((TakesScreenshot)driver).getScreenshotAs(OutputType.FILE);
    try {
      FileUtils.copyFile(scrFile,
          new File("failshot_"
              + desc.getClassName()
              + "_" + desc.getMethodName()
              + ".png"));
    } catch (IOException ex) {
      throw new RuntimeException(ex);
    }
  }
};


Section 10

Running Tests in Parallel

In order to run your tests on different browser/operating system combinations simultaneously, you need to initialize a special kind of WebDriver: a RemoteWebDriver. This allows you to execute your tests on a different machine that you maintain (using the Selenium Grid) or one of the many cloud providers (e.g., Sauce Labs, BrowserStack, etc). These providers allow you to pay for the use of cloud servers for test execution but in an environment that you don't have to spend time or resources to maintain.

Selenium Grid

There are two main elements to the Selenium Grid — a hub to manage the tests, and nodes to execute them. The hub ensures your tests end up on the right node and manages all communication between the nodes and your test code. Nodes host the browser/OS combinations and execute your test commands while providing constant feedback to the hub.
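
As a rough mental model of the hub's routing role, the toy sketch below matches a test's requested capabilities against what each node offers. It is an illustration only and does not reflect how Selenium Grid is actually implemented. The class name, the route and caps helpers, and the node names are hypothetical; browserName and platformName are standard WebDriver capability names.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

public class GridRoutingSketch {
    // Hypothetical helper that builds a capabilities map for a node or a request.
    static Map<String, String> caps(String browserName, String platformName) {
        Map<String, String> caps = new HashMap<>();
        caps.put("browserName", browserName);
        caps.put("platformName", platformName);
        return caps;
    }

    // Returns the first node whose offered capabilities include everything requested.
    static Optional<String> route(Map<String, Map<String, String>> nodes,
                                  Map<String, String> requested) {
        for (Map.Entry<String, Map<String, String>> node : nodes.entrySet()) {
            if (node.getValue().entrySet().containsAll(requested.entrySet())) {
                return Optional.of(node.getKey());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> nodes = new LinkedHashMap<>();
        nodes.put("node-1", caps("chrome", "linux"));
        nodes.put("node-2", caps("firefox", "windows"));

        // A test that only cares about the browser, not the operating system.
        Map<String, String> requested = new HashMap<>();
        requested.put("browserName", "firefox");

        System.out.println(route(nodes, requested));
    }
}
```

A request that no node can satisfy yields an empty result here. A real grid has its own queuing and error handling for that case.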

Selenium Grid comes built into the Selenium Standalone Server, which you can download here.


Section 11

Mobile Support

Within the WebDriver ecosystem, there are a few mobile testing solutions for both iOS and Android. Appium supports Selenium/WebDriver as well as the W3C standard for WebDriver and has its own Refcard. To get started with mobile, explore the many resources online, most notably: