
Selenium Testing: A Complete Guide

Selenium is widely loved by web testers around the world thanks to its versatility and simplicity. Testing with Selenium is quite straightforward, which is why it is commonly used by people who want to move from manual to automated testing.
 

In this article, we’ll show you how to do Selenium testing in depth.

History of Selenium


Selenium started in 2004 when Jason Huggins, an engineer at ThoughtWorks, needed to test a web application frequently. To avoid the hassle of manual testing, he built a JavaScript tool called JavaScriptTestRunner to automate user actions like clicking and typing.

Later, this tool was renamed Selenium Core, and it became popular within ThoughtWorks. However, it had a major limitation: it couldn’t bypass a browser’s same-origin policy, which blocked interactions with different domains.

In 2005, Paul Hammant created Selenium Remote Control (RC) to solve this issue. It allowed tests to be written in various programming languages and run across different browsers using a server to inject JavaScript. This made Selenium more flexible and widely adopted.

In 2006, Simon Stewart from Google developed Selenium WebDriver, which directly controlled browsers using their native APIs, making automation faster and more reliable.

As of 2024, Selenium 4 is the latest major version. It offers a simpler API, better browser support, and native support for the W3C WebDriver protocol, making web automation easier and more efficient.

Understanding Selenium

Selenium is an open-source suite of tools designed for automating web browsers. It provides a way to interact with web pages through a programmatic interface, making it possible to perform tasks such as form submission, navigation, and data extraction automatically. 
 

You can write Selenium test scripts in Java, C#, Python, Ruby, JavaScript, and more.
 

Selenium has 3 core components:

1. Selenium WebDriver

This is the core of the Selenium suite. It provides a programming interface to create and execute browser-based test scripts.
 

Simply put, Selenium WebDriver allows you to communicate directly with the browser you want to test on using a dedicated driver for that browser. It controls the browser by sending it commands (like clicking a button or filling out a form) and receiving responses (like checking if a button is displayed or if a form was submitted successfully).
 

There are usually 5 steps in the process:

  1. A WebDriver object is created in the desired programming language. This object corresponds to a specific browser (e.g., ChromeDriver for Google Chrome).
  2. The WebDriver establishes a session with the browser, initiating a controlled instance of the browser.
  3. WebDriver sends commands to the browser (e.g., open a URL, click a button, enter text in a form field). These commands are executed in the browser just as if a human user were performing the actions.
  4. The browser sends responses back to the WebDriver. These responses contain information about the success or failure of the commands, as well as any data requested (e.g., the text of a web element, the status of a form submission).
  5. Once the test script has completed its tasks, the WebDriver terminates the session, closing the browser.
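
Under the hood, each of these steps becomes an HTTP request with a JSON body. As a rough sketch, assuming the endpoints defined by the W3C WebDriver protocol, the exchange for one short session could be laid out as below. The helper name session_commands is hypothetical, the session and element IDs are placeholders that a real driver would return in its responses, and nothing is actually sent over the network:

```python
import json


def session_commands(url, css_selector, session_id="SESSION_ID", element_id="ELEMENT_ID"):
    """List the HTTP requests a client would send for one short session.

    session_id and element_id are placeholders: in a real run the driver
    returns them in its responses to the "new session" and "find element"
    requests.
    """
    base = f"/session/{session_id}"
    return [
        # Step 2: start a session and ask for a browser
        ("POST", "/session", {"capabilities": {"alwaysMatch": {"browserName": "chrome"}}}),
        # Step 3: send commands (open a URL, find an element, click it)
        ("POST", f"{base}/url", {"url": url}),
        ("POST", f"{base}/element", {"using": "css selector", "value": css_selector}),
        ("POST", f"{base}/element/{element_id}/click", {}),
        # Step 5: end the session, which closes the browser
        ("DELETE", base, None),
    ]


for method, path, body in session_commands("https://example.com", "button#submit"):
    print(method, path, json.dumps(body))
```

Step 4 (the responses) is not modelled here; each request above would come back with a JSON body reporting either a value or an error.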

Each browser has its own driver:

  • ChromeDriver for Google Chrome
  • GeckoDriver for Mozilla Firefox
  • EdgeDriver for Microsoft Edge
  • SafariDriver for Safari

Selenium WebDriver offers a rich set of commands for interacting with elements on a web page. You can locate those elements using several strategies:

  • By ID
  • By Name
  • By Class Name
  • By Tag Name
  • By CSS Selector
  • By XPath
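
To make the strategies concrete, here is a small sketch in Python. The element names and selectors are invented for illustration, and RecordingDriver is a hypothetical stand-in so the example runs without a browser. The strategy strings are the values behind By.ID, By.NAME, By.CLASS_NAME, By.TAG_NAME, By.CSS_SELECTOR, and By.XPATH in Selenium's Python bindings, so the same tuples should also work with a real driver's find_element:

```python
# One example locator per strategy. All element names and selectors here are
# made up; replace them with values from the page you are testing.
LOCATORS = {
    "email": ("id", "email"),
    "password": ("name", "password"),
    "error": ("class name", "error-message"),
    "first_link": ("tag name", "a"),
    "submit": ("css selector", "form#login button[type='submit']"),
    "heading": ("xpath", "//h1[contains(text(), 'Sign in')]"),
}


def locate(driver, name):
    """Look up one element through any object exposing find_element(by, value)."""
    by, value = LOCATORS[name]
    return driver.find_element(by, value)


class RecordingDriver:
    """Hypothetical stand-in for a WebDriver that only records what was asked."""

    def __init__(self):
        self.calls = []

    def find_element(self, by, value):
        self.calls.append((by, value))
        return f"<element {by}={value}>"


driver = RecordingDriver()
print(locate(driver, "submit"))
print(driver.calls)
```

In practice, ID and CSS selectors tend to be the most readable choices, with XPath reserved for cases where the element can only be found through its text or position.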

2. Selenium Grid

Selenium Grid is a tool designed to facilitate the parallel execution of test cases across multiple machines and browsers simultaneously. Running tests in parallel significantly reduces the overall time required to complete a test suite. Instead of waiting for one test to finish before starting another, multiple tests run at the same time.
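
The time saving is easy to see in a toy example. The sketch below does not use Selenium Grid at all: fake_test is a hypothetical stand-in that sleeps instead of driving a browser, and a thread pool plays the role of several nodes working at once:

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_test(name, seconds=0.2):
    """Hypothetical stand-in for a browser test: sleeps, then reports a pass."""
    time.sleep(seconds)
    return name, "pass"


def run_sequential(names):
    """Run the tests one after another."""
    return [fake_test(name) for name in names]


def run_parallel(names, workers=4):
    """Run the tests at the same time; each worker acts like one node."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fake_test, names))  # map keeps the input order


tests = ["login", "search", "cart", "checkout"]
for runner in (run_sequential, run_parallel):
    start = time.perf_counter()
    results = runner(tests)
    print(runner.__name__, f"{time.perf_counter() - start:.2f}s", results)
```

With four tests and four workers, the parallel run should take roughly as long as a single test rather than the sum of all four.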
 

Selenium Grid has 2 core components:

  • Hub: a server that receives test requests from clients (test scripts) and routes them to the appropriate nodes. It is responsible for managing the distribution of test requests to nodes based on the test requirements and available resources. 
  • Node: the machines that actually execute the test cases. Each node is connected to the hub and can run tests on a specific browser or operating system. Each node can be configured with various options like the type of browsers it supports, the version of the browser, and the operating system. Nodes can be dynamically added or removed from the grid based on the testing needs.

Here’s how Selenium Grid works:


A typical test execution flow in Selenium Grid happens in these steps:

  1. A test script is written using Selenium WebDriver in a supported programming language.
  2. The test script specifies details like the browser type, version, and operating system, then sends a test request to the Hub.
  3. The Hub receives the request and checks the specified details.
  4. The Hub looks through its list of available Nodes to find one that matches the request.
  5. The Hub picks a suitable Node and assigns the test to it.
  6. The Hub forwards the test request to the selected Node.
  7. The Node gets the test request and starts running the test.
  8. The Node opens the browser, performs the actions from the test script, and interacts with the web application.
  9. As the test runs, the Node returns the response to each command (results, errors, and any requested data such as screenshots) to the Hub.
  10. The Hub relays those responses to the test script, which determines the pass/fail outcome and passes it to any reporting tool.

Communication between the Hub and Nodes is usually done in JSON format, where test requests, details, and results are sent as JSON data.
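
The matching in steps 3 to 5 can be illustrated with a simplified simulation. This is not how Selenium Grid is implemented: the Node class, its fields, and pick_node are all hypothetical, and the sketch only shows the idea of comparing a request against each node's advertised capabilities:

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical description of what one node can run."""
    name: str
    browser: str
    version: str
    platform: str
    busy: bool = False


def pick_node(request, nodes):
    """Return the first idle node whose capabilities satisfy the request.

    Keys that are missing from the request are treated as "any".
    """
    for node in nodes:
        if node.busy:
            continue
        if all(getattr(node, key) == wanted for key, wanted in request.items()):
            node.busy = True  # reserve the node for this test
            return node
    return None  # nothing matched; a real Grid would typically queue the request


nodes = [
    Node("node-1", "chrome", "126", "linux"),
    Node("node-2", "firefox", "127", "windows"),
    Node("node-3", "chrome", "126", "windows"),
]
request = {"browser": "chrome", "platform": "windows"}
print(pick_node(request, nodes))
```

Here the request skips node-1 (wrong platform) and node-2 (wrong browser) and lands on node-3.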

3. Selenium IDE

The primary purpose of Selenium IDE is to make the process of creating and executing automated tests more accessible and less time-consuming, especially for users who may not have extensive programming experience.
 

Selenium IDE allows users to record their interactions with a web application, such as clicking buttons, entering text, and navigating through pages. During the recording session, every action is captured and converted into a test script.

How It Works: Users start the recording session by clicking a button in the IDE, then perform actions in the web application. Selenium IDE automatically generates a sequence of commands based on these actions. This eliminates the need for manual coding, making Selenium more accessible to non-programmers and speeding up test case creation.


After recording, users can run the test script, and Selenium IDE will replay the recorded actions in the browser. The IDE displays the results of the test, including any errors or issues encountered.

Selenium IDE allows users to set breakpoints in their test scripts. When a breakpoint is reached during execution, the IDE pauses the test, enabling users to inspect the state of the application and script.
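
Conceptually, a recording is just a list of steps, each with a command, a target, and an optional value, and playback is a loop over that list. The sketch below is a toy model of that idea rather than Selenium IDE's actual code: the "breakpoint" key, the replay function, and the example steps are all hypothetical, and the executor only prints what it would do:

```python
# Each recorded step has a command, a target, and an optional value.
recorded = [
    {"command": "open", "target": "/login", "value": ""},
    {"command": "type", "target": "id=email", "value": "user@example.com"},
    {"command": "type", "target": "id=password", "value": "secret", "breakpoint": True},
    {"command": "click", "target": "css=button[type='submit']", "value": ""},
]


def replay(steps, execute, on_break=None):
    """Run each step in order and return a log of (index, command, status).

    on_break is called before any step marked with a breakpoint, which is
    where a tool could pause and let the user inspect the page.
    Playback stops at the first step that raises an error.
    """
    log = []
    for index, step in enumerate(steps):
        if step.get("breakpoint") and on_break:
            on_break(index, step)
        try:
            execute(step)
            log.append((index, step["command"], "ok"))
        except Exception as exc:
            log.append((index, step["command"], f"failed: {exc}"))
            break
    return log


def print_step(step):
    """Stand-in executor that prints the step instead of driving a browser."""
    print("run:", step["command"], step["target"], step["value"])


result = replay(recorded, print_step, on_break=lambda i, s: print(f"paused before step {i}"))
print(result)
```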

 

Steps To Start Selenium Testing

1. Prepare your skills

To use Selenium effectively, you should have a basic understanding of at least one programming language. Java, Python, C#, and Ruby are among the most popular choices. Knowledge of programming concepts such as variables, loops, conditionals, and functions is also essential.
 

After that, make sure you understand the basics of web technologies:

  • HTML
  • CSS
  • JavaScript
  • How web elements work (forms, buttons, links, etc.)

Get your IDE ready. Popular IDEs for Selenium are:

  • Eclipse: An open-source IDE that supports Java and other programming languages. It offers powerful features such as code completion, debugging, and plugin support.
  • IntelliJ IDEA: A versatile IDE with both a community (free) edition and a commercial edition. It is known for its advanced code analysis and user-friendly interface.
  • VS Code: A lightweight, open-source code editor developed by Microsoft. It supports numerous programming languages through extensions and offers features such as IntelliSense (smart code completion), debugging, integrated terminal, and version control integration.

2. Installing Selenium

2.1. Java

Step 1: Download Selenium Java bindings from the Selenium website.

Step 2: Add the downloaded JAR files to your project’s build path in your IDE.

Step 3: Alternatively, set up a Maven or Gradle project to manage dependencies instead of adding the JAR files by hand.

Step 4: In that case, add the Selenium dependency (selenium-java) in pom.xml (for Maven) or build.gradle (for Gradle).
 

 

2.2. Python

Step 1: Install Python from the official website.

Step 2: Use pip (the Python package manager) to install Selenium:
 

pip install selenium
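
To confirm that the installation worked, one option is a short check like the sketch below. The helper name selenium_version is hypothetical; it only reads the installed package metadata and does not start a browser:

```python
from importlib import metadata


def selenium_version():
    """Return the installed Selenium version string, or None if it is missing."""
    try:
        return metadata.version("selenium")
    except metadata.PackageNotFoundError:
        return None


version = selenium_version()
print(version or "Selenium is not installed in this environment")
```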


 

2.3. C#

Step 1: Install Visual Studio IDE

Step 2: Create a new project and add Selenium WebDriver from the NuGet Package Manager Console:
 

Install-Package Selenium.WebDriver

3. Setting Up WebDriver For Your Browsers

3.1. Google Chrome

Step 1: Visit the ChromeDriver download page and download the version of ChromeDriver that matches your installed version of Google Chrome.
 

Step 2: Extract the downloaded ZIP file to a directory of your choice. Note the path to the chromedriver executable.
 

Step 3: Add the ChromeDriver executable to your system PATH, or specify the path in your code:
 

System.setProperty("webdriver.chrome.driver", "path/to/chromedriver");

WebDriver driver = new ChromeDriver();
 

3.2. Mozilla Firefox

Step 1: Visit the GeckoDriver releases page and download the appropriate version for your operating system.

Step 2: Extract the downloaded archive (ZIP on Windows, tar.gz on macOS and Linux) to a directory of your choice and note the path to the geckodriver executable.

Step 3: Add the GeckoDriver executable to your system PATH, or specify the path in your code:
 

System.setProperty("webdriver.gecko.driver", "path/to/geckodriver");

WebDriver driver = new FirefoxDriver();
 

3.3. Microsoft Edge

Step 1: Visit the Microsoft Edge WebDriver download page and download the version of EdgeDriver that matches your installed version of Microsoft Edge.

Step 2: Extract the downloaded ZIP file to a directory of your choice and note the path to the msedgedriver executable.

Step 3: Configure EdgeDriver in Your Code

Add the EdgeDriver executable to your system PATH, or specify the path in your code:

 

System.setProperty("webdriver.edge.driver", "path/to/msedgedriver");

WebDriver driver = new EdgeDriver();

3.4. Safari

Step 1: Enable Remote Automation

  • Open Safari and go to Preferences.
  • Navigate to the Advanced tab.
  • Check the box next to "Show Develop menu in menu bar."
  • In the Develop menu, select “Allow Remote Automation.”

Step 2: Install SafariDriver

  • SafariDriver comes pre-installed with Safari on macOS. No additional download is required.

Step 3: Configure SafariDriver in Your Code

  • You can directly use SafariDriver without setting any system property:

WebDriver driver = new SafariDriver();

 

Writing Your First Selenium Tests

A Selenium test is composed of several key components:

  1. WebDriver initialization: the first step is setting up the WebDriver instance. It is a bridge between your test script and the browser. 
  2. Navigate to the web page: use get() or navigate().to() to open the URL of the page you want to test.
  3. Locate web elements: use locators (e.g., ID, name, CSS selector, XPath) to identify the elements you want to interact with.
  4. Perform actions: interact with the page through methods of WebDriver, WebElement, and helper classes such as Select and Alert:
    • sendKeys(): Sends input to an input field.
    • click(): Clicks on a web element.
    • getText(): Retrieves the text of a web element.
    • selectByVisibleText(): Selects an option from a dropdown by visible text (a method of the Select helper class).
    • get(): Navigates to a specified URL.
    • findElement(): Locates a single web element.
    • findElements(): Locates multiple web elements.
    • clear(): Clears the content of an input field.
    • isDisplayed(): Checks if a web element is visible.
    • isEnabled(): Checks if a web element is enabled.
    • isSelected(): Checks if a web element (e.g., checkbox) is selected.
    • submit(): Submits a form.
    • getTitle(): Retrieves the title of the current page.
    • getCurrentUrl(): Retrieves the URL of the current page.
    • navigate().to(): Navigates to a specified URL using the browser's navigation.
    • navigate().back(): Navigates back to the previous page.
    • navigate().forward(): Navigates forward to the next page.
    • navigate().refresh(): Refreshes the current page.
    • switchTo().frame(): Switches to a specific iframe.
    • switchTo().defaultContent(): Switches back to the main content from an iframe.
    • switchTo().alert(): Switches to an alert box.
    • accept(): Accepts an alert (called on the Alert returned by switchTo().alert()).
    • dismiss(): Dismisses an alert (also called on the Alert object).
  5. Assertions: assertions verify that expected conditions or values match actual outcomes during test execution. 
  6. Synchronization Handling: not all web pages load their structure and content in the same way, and elements may appear at different times. Selenium provides several waits to handle this:
    • implicitlyWait(): Sets a default wait time for all element searches.
    • Explicit wait (WebDriverWait): Waits for a specific condition to be true before proceeding.
    • pageLoadTimeout(): Sets the maximum time to wait for a page to load.
    • setScriptTimeout(): Sets the maximum time to wait for an asynchronous script to finish.
  7. Test Data Management: test data is a big part of any test case, especially for data-driven testing. Test data can be hard-coded within the test script, read from external files (e.g., CSV, Excel), or generated dynamically. 
  8. Reporting and Logging: logging test steps, actions, and results helps in troubleshooting and debugging. Frameworks or libraries like Log4j, SLF4J, or built-in logging capabilities of programming languages are valuable for this. 
  9. Cleanup: after each test is completed, it is important to reset the environment to a clean slate so that the test does not affect the results of other tests.
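
Of these components, synchronization is the one that most often trips up new scripts, so it helps to see the idea behind an explicit wait. The sketch below is a simplified illustration of polling, not Selenium's own code: wait_until is a hypothetical helper, and LookupError stands in for the "element not found yet" error that a real wait would ignore while retrying. The defaults mirror the 0.5 second polling interval that WebDriverWait uses in the Python bindings:

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Raises TimeoutError if the condition is still not met after `timeout`
    seconds. LookupError is treated as "not ready yet" and retried.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition()
            if result:
                return result
        except LookupError:
            pass  # stand-in for "element not found yet": try again
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout}s")
        time.sleep(poll)


# Simulated page: the element only "appears" on the third check.
attempts = {"count": 0}


def element_ready():
    attempts["count"] += 1
    return "element" if attempts["count"] >= 3 else None


print(wait_until(element_ready, timeout=2, poll=0.1), "found after", attempts["count"], "checks")
```

Unlike a fixed sleep, this returns as soon as the condition holds, so the test waits only as long as it has to.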

 

Now let’s write a Selenium script for this test case:
 

| Component | Details |
| --- | --- |
| Test Case ID | TC001 |
| Description | Verify Login with Valid Credentials on Etsy |
| Preconditions | User is on the Etsy login popup |
| Test Steps | 1. Enter a valid email address. 2. Enter the corresponding valid password. 3. Click the "Sign In" button. |
| Test Data | Email: validuser@example.com; Password: validpassword123 |
| Expected Result | Users should be successfully logged in and redirected to the homepage or the previously intended page. |
| Actual Result | (To be filled in after execution) |
| Postconditions | User is logged in and the session is active |
| Pass/Fail Criteria | Pass: Test passes if the user is logged in and redirected correctly. Fail: Test fails if an error message is displayed or the user is not logged in. |
| Comments | Ensure the test environment has network access and the server is operational. |


 

The Python script below implements this test case in the following steps:

  1. Import libraries and define test case details (ID, description, login credentials)
  2. Set up WebDriver for Chrome and specify the path to the ChromeDriver executable
  3. Navigate to the Etsy login page
  4. Use WebDriverWait to wait until elements are visible and clickable
  5. Enter the email and password using send_keys, then click the Sign In button
  6. Check whether the user is redirected away from the login page
  7. Clean up by quitting the driver

 

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Test case details
test_case_id = "TC001"
description = "Verify Login with Valid Credentials on Etsy"
email = "validuser@example.com"
password = "validpassword123"
login_url = "https://www.etsy.com/signin"

# Set up WebDriver
chrome_options = Options()
chrome_options.add_argument("--headless")  # Run in headless mode (optional)
chrome_service = Service("path/to/chromedriver")  # Replace with the path to your chromedriver
driver = webdriver.Chrome(service=chrome_service, options=chrome_options)

try:
    # Navigate to the Etsy login page
    driver.get(login_url)

    # Wait until the login fields are visible and the button is clickable
    wait = WebDriverWait(driver, 10)
    email_field = wait.until(EC.visibility_of_element_located((By.ID, "join_neu_email_field")))
    password_field = wait.until(EC.visibility_of_element_located((By.ID, "join_neu_password_field")))
    sign_in_button = wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "button[data-id='signin-btn']")))

    # Enter email and password
    email_field.send_keys(email)
    password_field.send_keys(password)

    # Click the Sign In button
    sign_in_button.click()

    # Wait for the URL to change after the form is submitted
    wait.until(EC.url_changes(login_url))

    # Pass only if we are still on etsy.com and have left the sign-in page
    current_url = driver.current_url
    if "etsy.com" in current_url and "/signin" not in current_url:
        print(f"Test Case {test_case_id}: Pass - User is successfully logged in.")
    else:
        print(f"Test Case {test_case_id}: Fail - User is not redirected to the homepage.")

except Exception as e:
    # Timeouts and missing elements end up here and count as a failure
    print(f"Test Case {test_case_id}: Fail - {e}")

finally:
    # Cleanup: close the browser and end the session
    driver.quit()
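
The script above hard-codes its credentials. For data-driven testing, the same values could be read from an external file instead. The sketch below shows one way to do that with Python's csv module; the column names, the load_login_cases helper, and the second row (TC002) are hypothetical, and an in-memory string stands in for a real CSV file so the example runs on its own:

```python
import csv
import io

# Stand-in for the contents of a file such as login_cases.csv (hypothetical name).
SAMPLE_CSV = """test_case_id,email,password,expect_success
TC001,validuser@example.com,validpassword123,true
TC002,validuser@example.com,wrongpassword,false
"""


def load_login_cases(source):
    """Read login test cases from an open CSV file or file-like object.

    Returns a list of dicts, with expect_success converted to a bool.
    """
    cases = []
    for row in csv.DictReader(source):
        row["expect_success"] = row["expect_success"].strip().lower() == "true"
        cases.append(row)
    return cases


for case in load_login_cases(io.StringIO(SAMPLE_CSV)):
    print(case["test_case_id"], case["email"], "should succeed:", case["expect_success"])
```

Each row could then be fed to the login steps in turn, with the expected outcome deciding whether a rejected login counts as a pass or a fail.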


