Selenium

Selenium is an open-source framework designed for automating web browsers, allowing developers to write scripts that simulate user interactions with web applications. It supports various types of applications, including web, mobile, and hybrid apps, and offers components like WebDriver, IDE, and Grid for efficient testing. Selenium's advantages include cross-browser compatibility, language agnosticism, and strong community support, making it a popular choice for both manual and automated testing in diverse environments.


Selenium: a powerful tool for web browser automation

Introduction to Web Automation Testing

Selenium is an open-source framework primarily used for automating web browsers. It
allows testers and developers to write scripts that can interact with web applications as a user
would, performing actions like clicking buttons, filling forms, navigating pages, and validating
content. Its goal is to make it easier for people to develop software that leverages the power of
the web browser.

Types of Applications

Understanding the different types of applications is crucial for choosing the right testing tools
and strategies.

 Desktop Applications: These are programs installed and run directly on a computer's
operating system (e.g., Windows, macOS, Linux). Examples include Microsoft Word,
Adobe Photoshop, or a custom-built accounting software.

o Testing Focus: Usability, functionality, performance, security, compatibility
across different OS versions.

o Tools: WinAppDriver (for Windows), Appium (for desktop apps via its desktop
drivers), or platform-specific automation frameworks.

 Selenium primarily focuses on web applications, but there are extensions and tools
that can work with desktop applications as well.

 Web Applications: These are applications accessed via a web browser over the internet
or an intranet. They are not installed on the user's machine but run on a server and are
delivered through the browser. Examples include Gmail, Facebook, online banking
portals, and e-commerce sites.

o Testing Focus: Functionality, user interface (UI) consistency, cross-browser
compatibility, responsiveness, performance, security.

o Tools: Selenium is the de facto standard for web application automation. Other
tools include Cypress, Playwright, and TestCafe.

 Selenium is best known for testing web applications by automating the browser
actions.

 Mobile Applications: These are applications designed to run on mobile devices like
smartphones and tablets. They can be native, web, or hybrid.

o Native Mobile Applications: Built specifically for a particular mobile operating
system (e.g., iOS or Android) using their native SDKs. Examples: A banking app
downloaded from the App Store or Google Play.

 Testing Focus: Performance, usability, device compatibility, offline
functionality, gestures.

 Tools: Appium, Espresso (Android), XCUITest (iOS).

o Web Mobile Applications: These are essentially websites designed and
optimized for mobile browsers. They are accessed through a mobile browser and
are not installed as separate apps.

 Testing Focus: Responsiveness, cross-browser compatibility on mobile,
usability on touch interfaces.

 Tools: Selenium (using mobile emulators/simulators or real devices),
Appium.

o Hybrid Mobile Applications: These are a blend of native and web applications.
They are typically built using web technologies (HTML, CSS, JavaScript) and
then wrapped in a native container. This allows them to be installed on a device
like a native app and often access device features. Examples: Apps built with
frameworks like Ionic or Apache Cordova. (React Native is often listed here too,
although it renders native UI components rather than a web view.)

 Testing Focus: Combines testing aspects of both native and web apps,
including UI, functionality, and device integration.

 Tools: Appium is a primary tool for hybrid app automation.


 Selenium can be used for mobile testing through frameworks like Appium, which
builds on the WebDriver protocol to automate native, hybrid, and mobile web apps.

 Hybrid Applications: This term can also refer to applications that combine elements of
different categories, but in the context of mobile, it specifically refers to the blend of
native and web technologies within a single application package.

 Selenium supports testing hybrid applications that combine web technologies with
native functionalities on mobile platforms.

Software Testing Methods or Ways


Software testing can be broadly categorized into two main approaches:

1) Manual Testing:

o Definition: This is a testing process where a human tester manually executes test
cases without the use of automated tools. Testers interact with the application,
follow predefined steps, and observe the results, comparing them against expected
outcomes.

 Manual Testing involves human testers manually conducting test cases without
automation. It requires more time and is prone to human error.

o Process: Testers design test cases, set up test data, execute tests by interacting
with the application's UI or API, record observations, and report defects.

o When it's used:

 Exploratory Testing: When testers explore the application to discover
unexpected behavior.

 Usability Testing: To gather feedback on user experience.

 Ad-hoc Testing: Unstructured testing without specific test cases.

 Early stages of development: When the application is highly unstable or
frequently changing.

 Testing UI elements: Where subtle visual discrepancies are important.

o Pros:

 Can uncover bugs that automation might miss (e.g., usability issues).

 More flexible for testing subjective aspects.

 Requires less upfront investment in tool setup and script development.

o Cons:

 Time-consuming and resource-intensive.

 Prone to human error and inconsistency.

 Difficult to perform repetitive tasks efficiently.

 Limited scalability for large regression suites.

2) Test Automation:

o Definition: This is a software testing technique that uses specialized software
tools to execute pre-scripted tests, compare actual outcomes to predicted
outcomes, and generate test reports. The goal is to automate the execution of tests
to improve efficiency, accuracy, and coverage.

 Test Automation: Involves using software tools like Selenium to automate the
execution of test cases. It offers faster feedback, wider test coverage, and
repeatability.

o Process: Testers write automated test scripts using programming languages and
testing frameworks. These scripts are then executed automatically, often in a
continuous integration (CI) environment.

o When it's used:

 Regression Testing: To ensure that new code changes haven't negatively
impacted existing functionality.

 Repetitive Test Cases: For tasks that need to be performed many times.

 Performance Testing: To simulate load and measure response times.

 Cross-Browser/Cross-Platform Testing: To efficiently test on multiple
environments.

 Data-Driven Testing: To test an application with various data sets.

o Pros:

 Increases speed and efficiency of testing.

 Reduces human error and provides consistent results.

 Enables testing of large test suites and frequent regression testing.

 Frees up manual testers for more complex tasks.

 Can be integrated into CI/CD pipelines for continuous feedback.

o Cons:

 Requires significant upfront investment in tools, infrastructure, and
training.

 Test scripts need to be maintained as the application evolves.

 May not be suitable for highly exploratory or usability testing.

 Identifying and fixing bugs in automation scripts can be challenging.

Synergy: In practice, a combination of manual and automated testing often yields the best
results. Manual testing handles the exploratory and usability aspects, while automation handles
the repetitive and regression testing tasks.

Selenium
Introduction
Selenium is not a single tool but rather a suite of tools and libraries designed for automating web
browsers. It provides a way for developers and testers to write scripts in various programming
languages that can then be executed across different browsers and operating systems.

Key aspects of Selenium:

 Open-Source: Free to use, modify, and distribute.

 Browser Automation: Its core purpose is to control web browsers programmatically.

 Language Agnostic: Supports popular programming languages like Java, Python, C#,
Ruby, JavaScript, etc.

 Platform Independent: Scripts can run on Windows, macOS, Linux, and various mobile
platforms (with additional tools).

 Cross-Browser Support: Works with major browsers like Chrome, Firefox, Safari,
Edge, Internet Explorer, etc.

Selenium Components
The Selenium project is composed of several key components, each serving a specific purpose:

1. Selenium WebDriver:

o Description: This is the core API and the most important component of
Selenium. It provides a programmatic interface to control web browsers.
WebDriver interacts directly with the browser's native automation support (often
through the W3C WebDriver standard) or through browser-specific drivers.

 Selenium WebDriver is a powerful tool for writing automated tests on various
browsers.

o Functionality: Allows you to write code to perform actions like finding elements
on a page, clicking buttons, entering text, navigating URLs, and retrieving
information from the DOM.
o Example: driver.get("https://www.google.com");

2. Selenium IDE (Integrated Development Environment):

o Description: A browser extension (available for Chrome and Firefox) that allows
for record-and-playback functionality. You can record your interactions with a
web application and generate simple test scripts.

 Selenium IDE is a record-and-playback tool for creating simple scripts without
programming.

o Functionality: Primarily used for creating simple test cases quickly or for
debugging. It can export scripts in various languages (Java, Python, etc.).

o Limitations: Less robust for complex scenarios, offers only basic control flow, and
is not suitable for large-scale automation projects.

3. Selenium Grid:

o Description: A tool that allows you to run tests simultaneously across multiple
machines, browsers, and operating systems. This significantly speeds up test
execution, especially for large regression suites.

 Selenium Grid enables running tests on multiple machines in parallel.

o Functionality: It works by distributing test executions to different "nodes"
(machines) that are set up to run specific browser/OS combinations. A "hub"
manages the distribution of tests.

o Use Case: Parallel execution of tests, cross-browser testing, cross-platform
testing.

4. Selenium RC (Remote Control - Deprecated):

o Description: An older component of Selenium that injected JavaScript into the
browser to perform automation. It required a Selenium Server to be running.

o Status: Selenium WebDriver has replaced Selenium RC due to its greater
efficiency, stability, and direct browser interaction. Use WebDriver for new
projects.

Selenium vs. Other Testing Tools

Selenium is a popular choice, but it's important to understand how it compares to other testing
tools, especially in the web automation space.

Feature | Selenium | Cypress | Playwright | Postman
--------|----------|---------|------------|--------
Primary Focus | Web Browser Automation (End-to-End Testing) | Modern Web Application Testing (End-to-End, Integration) | Web Application Testing (End-to-End), Cross-Browser & Cross-Platform (including mobile emulation) | API Testing (REST, GraphQL, SOAP), Manual API Interaction
Architecture | WebDriver protocol, external drivers for each browser. | Runs within the browser, direct DOM access, no WebDriver protocol dependency. | Uses its own protocol over WebSocket, direct browser control. | Client-Server interaction for API requests.
Languages | Java, Python, C#, Ruby, JavaScript, etc. | JavaScript/TypeScript only. | JavaScript/TypeScript, Python, Java, .NET. | JavaScript (for scripting), but primarily configuration-based.
Browser Support | Wide (Chrome, Firefox, Safari, Edge, IE, etc.) | Limited (Chrome, Firefox, Edge, Electron). Experimental WebKit (Safari engine) support relies on Playwright's WebKit build. | Excellent (Chromium, Firefox, WebKit). Covers Chrome, Edge, Firefox, Safari (via WebKit), mobile emulation. | Not applicable (tests APIs, not browsers).
Execution Speed | Generally good, can be slower due to WebDriver communication overhead. | Very fast due to in-browser execution. | Very fast, similar to or faster than Cypress. | Fast for API calls.
Reliability | Can be prone to timing issues and flakiness if not handled properly. | Known for its stability and built-in retry mechanisms. | Very stable and reliable, excellent handling of asynchronous operations. | Reliable for API testing.
Ease of Use | Moderate to high learning curve, especially for setting up infrastructure. | Relatively easy to set up and use, good developer experience. | Moderate learning curve, powerful but more feature-rich. | Very easy for manual API interaction and simple script creation.
Key Features | Cross-browser, cross-language, large community, extensive integrations. | Auto-waiting, time-travel debugging, real-time reloads, network mocking, visual testing. | Auto-waiting, parallel execution, cross-browser emulation (including mobile), network interception, screenshot/video recording. | Request/response inspection, environment management, mock servers, test runners, collaboration features.
Use Cases | Comprehensive end-to-end testing, cross-browser compatibility. | Modern web app testing, component testing, integration testing, rapid development cycles. | End-to-end testing across all modern browsers and devices, API testing integration. | Functional testing of APIs, performance testing of APIs, contract testing, mock services.
Framework Type | Library/API. Requires external test runners/frameworks (e.g., TestNG, JUnit). | All-in-one testing framework. | All-in-one testing framework. | API Development and Testing Tool.

When to choose Selenium:

 You need to test across a wide range of browsers, including legacy ones such as Internet Explorer.

 You need to use programming languages other than JavaScript.

 You have a large existing test suite written in Selenium.

 You need deep integration with various CI/CD tools and reporting frameworks.

Advantages of Selenium

Selenium offers several compelling advantages, making it a popular choice for web automation:

1. Open Source and Free: This is a significant advantage, as it eliminates licensing costs
and allows for customization.

2. Cross-Browser Compatibility: Supports testing across major web browsers like
Chrome, Firefox, Safari, Edge, and Internet Explorer, ensuring your web application
functions consistently across different user environments.

3. Language Agnostic: Testers can write automation scripts in their preferred programming
languages (Java, Python, C#, Ruby, JavaScript), allowing teams to leverage existing
skills.

4. Platform Independence: Scripts can be executed on various operating systems
(Windows, macOS, Linux), providing flexibility in test execution environments.

5. Large Community Support: Being a popular open-source tool, Selenium benefits from
a vast and active community. This means readily available resources, tutorials, forums,
and quick solutions to common problems.

6. Flexibility and Extensibility: Selenium is a framework, not a monolithic tool. It can be
extended and integrated with other tools and frameworks to create robust test automation
solutions.

7. Integration Capabilities: Seamlessly integrates with popular testing frameworks
(TestNG, JUnit, Pytest), build automation tools (Maven, Gradle), CI/CD tools (Jenkins,
GitLab CI), and reporting tools (ExtentReports, Allure).

8. Cost-Effective: Eliminates the need for expensive commercial automation tools, making
it an attractive option for organizations of all sizes.

9. Supports Agile Development: Its ability to run tests quickly and frequently makes it
ideal for Agile and DevOps environments, providing rapid feedback on code changes.

10. Simulates Real User Behavior: By interacting with the browser like a real user,
Selenium provides realistic end-to-end testing.

Integration of Selenium with Other Tools

The real power of Selenium often comes from its ability to integrate with other tools in the
software development and testing ecosystem. This creates a comprehensive and efficient
automation framework.

1. Test Management Tools:

o Examples: Jira (with plugins like Zephyr or Xray), TestRail, qTest.

o Integration: Test cases can be managed in these tools, and Selenium test results
can be linked back, providing a centralized view of testing progress and defect
tracking.

2. Unit Testing Frameworks:

o Examples: JUnit (Java), TestNG (Java), Pytest (Python), NUnit (.NET).

o Integration: Selenium WebDriver acts as the API, but these frameworks are used
to structure the tests, define test methods, manage test execution order, and handle
assertions.

3. Build Automation & Dependency Management Tools:

o Examples: Maven, Gradle, npm.

o Integration: These tools manage project dependencies (including Selenium
libraries), compile code, and trigger test execution as part of the build process.

4. Continuous Integration/Continuous Deployment (CI/CD) Tools:

o Examples: Jenkins, GitLab CI, CircleCI, Travis CI, Azure DevOps.

o Integration: Selenium tests are automatically triggered whenever new code is
committed, ensuring continuous testing and rapid feedback. This is crucial for
DevOps practices.

5. Reporting Tools:

o Examples: ExtentReports, Allure Report, ReportNG.

o Integration: Provide more sophisticated and visually appealing test reports than
default framework reports. They can include screenshots, logs, and detailed
pass/fail status, which are invaluable for analysis.

6. Page Object Model (POM) and Data-Driven Frameworks:

o Integration: While not external tools, these are design patterns and testing
methodologies that are often implemented with Selenium to improve test
maintainability, readability, and reusability.

 POM: Encapsulates page elements and interactions into separate classes,
making tests cleaner and easier to update.

 Data-Driven: Separates test data from test logic, allowing tests to be run
with multiple data sets from external sources (like Excel, CSV, or
databases).

7. Behavior-Driven Development (BDD) Frameworks:

o Examples: Cucumber, SpecFlow.

o Integration: Selenium WebDriver can be used as the underlying automation
library for BDD frameworks. This allows tests to be written in a human-readable
format (like Gherkin) which is then mapped to Selenium code.

8. Cloud Testing Platforms:

o Examples: BrowserStack, Sauce Labs, LambdaTest.

o Integration: These platforms provide access to a vast array of real devices and
browser versions in the cloud. You can configure Selenium to run your tests on
these platforms, enabling extensive cross-browser and cross-device testing
without maintaining your own infrastructure.

9. API Testing Tools (for Hybrid Testing):

o Examples: Postman, RestAssured.

o Integration: For applications that have both a UI and an API, Selenium can be
used to automate UI tests, while tools like RestAssured or Postman can be used to
automate API tests. This provides comprehensive testing coverage.

By strategically integrating Selenium with these tools, organizations can build powerful,
efficient, and scalable test automation solutions that significantly improve the quality and
reliability of their web applications.
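
The Page Object Model and data-driven ideas from item 6 can be sketched together in a few lines. This is a minimal illustration, not a definitive implementation: `FakeDriver`, its `type_into`/`click` methods, the locators, and the credentials are all hypothetical stand-ins so the sketch runs without a browser. In a real suite the page object would call WebDriver methods such as `find_element` instead.

```python
import csv
import io


class FakeDriver:
    """Hypothetical stand-in for a WebDriver session (no browser needed)."""

    def __init__(self):
        self.fields = {}       # locator -> text typed into it
        self.title = "Login"

    def type_into(self, locator, text):
        self.fields[locator] = text

    def click(self, locator):
        # Toy application rule: only this one credential pair logs in.
        ok = (self.fields.get("#user"), self.fields.get("#pass")) == ("alice", "s3cret")
        self.title = "Dashboard" if ok else "Login"


class LoginPage:
    """Page object: locators and interactions for one page live in one class."""

    USER = "#user"
    PASSWORD = "#pass"
    SUBMIT = "#submit"

    def __init__(self, driver):
        self.driver = driver

    def log_in(self, user, password):
        self.driver.type_into(self.USER, user)
        self.driver.type_into(self.PASSWORD, password)
        self.driver.click(self.SUBMIT)
        return self.driver.title


# Data-driven part: test data is kept apart from test logic. Inline CSV is used
# here; a real suite might read a file, spreadsheet, or database instead.
TEST_DATA = """user,password,expected_title
alice,s3cret,Dashboard
alice,wrong,Login
bob,s3cret,Login
"""


def run_login_cases(csv_text):
    """Run the same login check once per CSV row; return (user, password, passed)."""
    results = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        page = LoginPage(FakeDriver())
        actual = page.log_in(row["user"], row["password"])
        results.append((row["user"], row["password"], actual == row["expected_title"]))
    return results


if __name__ == "__main__":
    for user, password, passed in run_login_cases(TEST_DATA):
        print(user, password, "PASS" if passed else "FAIL")
```

If a locator changes, only `LoginPage` needs editing; if a new scenario is needed, only a data row is added.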

Selenium Components: Detailed Notes


Selenium is a powerful open-source automation testing framework used for web application
testing. It supports multiple programming languages (Java, Python, C#, etc.) and browsers.
Below is a detailed breakdown of its core components, their purposes, and use cases.

1. Purposes and Functionalities of Selenium Components

Component | Purpose & Key Functionalities
----------|------------------------------
Selenium IDE | Record and playback of tests, beginner-friendly, Firefox/Chrome extension
Selenium RC | Legacy tool (deprecated), allowed tests in any programming language via JS injection
Selenium WebDriver | Modern, direct communication with browsers, supports advanced interactions (clicks, typing, assertions)
Selenium Grid | Distributed test execution across multiple machines (browsers, OS, devices)

2. Understanding the Components

🧩 Selenium IDE (Integrated Development Environment)

 What? A browser plugin for recording and playing back tests (like a macro tool).

 Functionality:

o No coding required for basic tests.

o Supports export to programming languages (Java, Python, etc.).

o Limited to simple test cases (not suitable for complex logic).

 Use Case: Quick prototyping, exploratory testing, or regression testing for small projects.

 Limitations:

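o Conditional logic and data-driven testing are supported only in a basic form.

o The original Firefox-only IDE was discontinued; the current Selenium IDE is a
rewritten extension for Chrome and Firefox. Record-and-playback alternatives
include Katalon Recorder and Playwright's code generator.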

🚫 Selenium RC (Remote Control) [Deprecated]

 What? The original Selenium tool that allowed test scripting in multiple languages.

 How It Worked:

o JavaScript was injected into browsers to simulate user actions.

o Required an RC server to act as a proxy between the test and the browser.

 Why Deprecated?

o Slower and less reliable due to JS injection.

o Replaced by WebDriver, which communicates directly with browsers via native
APIs.

🚗 Selenium WebDriver

 What? The standard tool for browser automation, replacing RC.

 Key Features:

o Direct communication with browsers via native APIs (no JS injection).

o Supports multiple languages (Java, Python, C#, etc.).

o Handles dynamic content (AJAX, JavaScript-heavy apps).

o Supports advanced interactions (drag-and-drop, keyboard/mouse events).

 Architecture:

 Test Script → WebDriver API → Browser-specific Driver (ChromeDriver, GeckoDriver)
→ Browser

 Use Case: Most modern Selenium projects use WebDriver for reliable and efficient test
automation.

🔗 Selenium Grid
 What? A tool for running tests in parallel across multiple machines, browsers, and OS.

 Key Features:

o Hub-Node Architecture:

 Hub: Central point that receives test requests and routes them to nodes.

 Node: Machines (physical/virtual) that register with the hub and execute
tests.

o Supports cross-browser and cross-device testing.

o Reduces test execution time by running tests in parallel.

 Use Case:

o Testing on multiple browser versions (Chrome 120, Firefox 115, etc.).

o Testing on different OS (Windows, macOS, Linux).

o CI/CD pipelines where fast feedback is crucial.
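
The hub-node idea can be illustrated with a toy model. This is only a sketch of the routing concept under simplifying assumptions: the `Node` and `Hub` classes, the node names, and the "first free match wins" rule are made up for illustration and are not Selenium Grid's actual implementation (a real Grid, for example, queues requests it cannot serve immediately).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    capabilities: dict      # what this node can run, e.g. browserName/platformName
    max_sessions: int = 1
    active: int = 0

    def matches(self, requested):
        # Every requested key must be present with the same value on the node.
        return all(self.capabilities.get(k) == v for k, v in requested.items())

    def has_capacity(self):
        return self.active < self.max_sessions


@dataclass
class Hub:
    nodes: list = field(default_factory=list)

    def register(self, node):
        self.nodes.append(node)

    def route(self, requested):
        """Return the name of the first free node that satisfies the request, else None."""
        for node in self.nodes:
            if node.matches(requested) and node.has_capacity():
                node.active += 1
                return node.name
        return None


hub = Hub()
hub.register(Node("win-chrome", {"browserName": "chrome", "platformName": "windows"}))
hub.register(Node("linux-firefox", {"browserName": "firefox", "platformName": "linux"}, max_sessions=2))

first = hub.route({"browserName": "firefox"})
second = hub.route({"browserName": "chrome", "platformName": "windows"})
third = hub.route({"browserName": "chrome"})   # the only Chrome node is already busy
print(first, second, third)
```

The third request returns `None` because the single Chrome node has no free slot, which is the situation extra nodes are meant to relieve.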

3. When to Use Selenium Grid?

Scenario | Use Grid?
---------|----------
Testing on a single browser/OS | ❌ No (WebDriver alone suffices)
Testing on multiple browsers (Chrome, Firefox, Safari) | ✅ Yes
Testing on different OS versions (Windows 10/11, macOS Ventura/Sonoma) | ✅ Yes
Running tests in parallel to save time | ✅ Yes
Cloud-based testing (e.g., BrowserStack, Sauce Labs) | ✅ Yes (Grid is built-in)

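Example Grid Command:

bash

# Grid 3 syntax (selenium-server-standalone.jar)
# Start Grid Hub
java -jar selenium-server-standalone.jar -role hub
# Start Grid Node (register with hub)
java -jar selenium-server-standalone.jar -role node -hub http://localhost:4444/grid/register

# Grid 4 equivalents (selenium-server-<version>.jar)
java -jar selenium-server-<version>.jar hub
java -jar selenium-server-<version>.jar node --hub http://localhost:4444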

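WebDriver Code for Grid:

python

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

# Describe the browser the Grid should provide (Selenium 4 style;
# Chrome's Options already sets browserName to "chrome")
options = Options()
options.set_capability("platformName", "windows")
# A specific release can be requested the same way via "browserVersion"

# Connect to Grid hub
driver = webdriver.Remote(
    command_executor="http://localhost:4444/wd/hub",
    options=options,
)
try:
    driver.get("https://example.com")
    print(driver.title)
finally:
    # Always release the Grid slot, even if the test fails
    driver.quit()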
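
Running the same check on several browsers at once is where a Grid pays off. The sketch below shows one possible way to fan a check out in parallel with the standard library's thread pool. `run_on_all` and `fake_title_check` are hypothetical names; the fake check stands in for code that would open a `webdriver.Remote` session against the hub, so the sketch runs without a Grid.

```python
from concurrent.futures import ThreadPoolExecutor

BROWSERS = ["chrome", "firefox", "edge"]


def run_on_all(browsers, check):
    """Run check(browser_name) for every browser in parallel; return {browser: result}."""
    with ThreadPoolExecutor(max_workers=len(browsers)) as pool:
        results = pool.map(check, browsers)   # map preserves input order
        return dict(zip(browsers, results))


def fake_title_check(browser_name):
    # Placeholder for: open a remote session for `browser_name` on the hub,
    # load the page, read driver.title, and quit the session.
    return f"Example Domain ({browser_name})"


if __name__ == "__main__":
    print(run_on_all(BROWSERS, fake_title_check))
```

Each worker thread would hold its own driver session; sharing one session between threads is best avoided.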

4. Comparison of Selenium Tools

Feature | Selenium IDE | Selenium RC | WebDriver | Grid
--------|--------------|-------------|-----------|-----
Coding Required? | No (record/playback) | Yes | Yes | No (but needs WebDriver)
Modern? | No (limited) | No (deprecated) | Yes (standard) | Yes (for scaling)
Parallel Execution? | No | No | No (unless used with Grid) | Yes
Cross-Browser Support? | Limited | Yes | Yes | Yes
Best For? | Quick tests | Legacy projects | Most automation needs | Scalable, distributed testing

5. Best Practices
✔ Prefer WebDriver over RC/IDE for production-grade tests.
✔ Use Grid for parallel and cross-browser testing.
✔ Leverage Page Object Model (POM) for better code maintainability.
✔ Integrate with CI/CD tools (Jenkins, GitHub Actions) for automated testing.
✔ Use cloud platforms (BrowserStack, LambdaTest) for scalable Grid testing without
infrastructure.

Conclusion

 Selenium IDE: Quick, no-code testing (but limited).

 Selenium RC: Legacy, do not use for new projects.

 Selenium WebDriver: The core tool for modern browser automation.

 Selenium Grid: Essential for large-scale, distributed testing.


Selenium WebDriver: Detailed Notes

1. What is WebDriver?

 Definition: WebDriver is a cross-platform API for automating web browsers. It
provides a programming interface for driving the browser, which test frameworks
build on to create and execute test cases.

 Key Features:

o Direct communication with browsers via native APIs (no JS injection like
Selenium RC).

o Supports multiple programming languages (Java, Python, C#, etc.).

o Works with modern browsers (Chrome, Firefox, Edge, Safari).

o Handles dynamic content (AJAX, JavaScript-heavy apps).

o Supports advanced interactions (mouse movements, keyboard inputs,
drag-and-drop).

2. Selenium Architecture (With WebDriver)

+--------------------+     +-----------------------+     +-----------------------------+
| Test Script        |     | WebDriver API         |     | Browser-specific Driver     |
| (Java/Python/etc.) | --> | (Selenium Client Lib) | --> | (ChromeDriver, GeckoDriver) |
+--------------------+     +-----------------------+     +-----------------------------+
                                                                        |
                                                                        v
                                                         +-----------------------------+
                                                         | Browser (Chrome/Firefox)    |
                                                         +-----------------------------+

 Test Script: Written in any supported language.

 WebDriver API: Acts as an interface between the script and the browser.

 Browser-specific Driver: Converts WebDriver commands to browser-native actions.

 Browser: Executes the commands (e.g., click, type, navigate).
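
Between the client library and the browser-specific driver, commands travel as HTTP requests with JSON bodies, as defined by the W3C WebDriver standard. The sketch below only builds a few of those requests to show their shape; it does not send anything. The helper function names and the IDs `abc123` and `el-1` are made up for illustration, while the methods, paths, and body keys follow the W3C endpoints for New Session, Navigate To, Find Element, Element Click, and Delete Session.

```python
import json


def new_session(browser_name):
    # W3C "New Session": requested capabilities go under "alwaysMatch".
    body = {"capabilities": {"alwaysMatch": {"browserName": browser_name}}}
    return ("POST", "/session", body)


def navigate_to(session_id, url):
    return ("POST", f"/session/{session_id}/url", {"url": url})


def find_element(session_id, css):
    return ("POST", f"/session/{session_id}/element", {"using": "css selector", "value": css})


def click(session_id, element_id):
    return ("POST", f"/session/{session_id}/element/{element_id}/click", {})


def delete_session(session_id):
    return ("DELETE", f"/session/{session_id}", None)


if __name__ == "__main__":
    # "abc123" and "el-1" are invented IDs; a real driver returns them in its responses.
    steps = [
        new_session("chrome"),
        navigate_to("abc123", "https://example.com"),
        find_element("abc123", "#submit"),
        click("abc123", "el-1"),
        delete_session("abc123"),
    ]
    for method, path, body in steps:
        print(method, path, json.dumps(body))
```

Calls such as `driver.get(...)` or `element.click()` in a test script end up as requests of this kind, which the driver then translates into browser-native actions.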

3. Third-Party Drivers and Plugins

WebDriver requires browser-specific drivers to communicate with each browser. Some popular
ones:

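Browser | Driver | Underlying protocol / notes
--------|--------|----------------------------
Chrome | ChromeDriver | Chrome DevTools Protocol
Firefox | GeckoDriver | Marionette Protocol
Edge | Microsoft EdgeDriver | WebDriver W3C Standard
Safari | SafariDriver | Bundled with Safari on macOS; also available in Safari Technology Preview
Headless Browsers | Same drivers, with the browser started in headless mode | Puppeteer and Playwright are separate automation libraries, not WebDriver drivers
Mobile Browsers | Appium | Supports Android/iOS automation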

Additional Plugins & Libraries

 Selenium Manager (Selenium 4.6+): A helper bundled with Selenium
(selenium-manager) that locates or auto-downloads the matching browser driver.

 BrowserStack/Sauce Labs: Cloud-based testing platforms with built-in drivers.

 Katalon Studio, TestNG, JUnit: Testing frameworks that integrate with WebDriver.

4. Driver Requirements

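To use WebDriver with a manually managed driver (Selenium releases before 4.6, or when you
want to pin a specific driver), you must: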

1. Install the browser-specific driver (e.g., ChromeDriver for Chrome).

2. Ensure driver version matches the browser version (e.g., Chrome 120 needs
ChromeDriver 120).

3. Set the driver executable path in the code or system environment.

4. For headless mode, additional configurations may be required (e.g., disabling GPU
acceleration in Chrome).
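
The version rule in step 2 can be expressed as a small check. This is a simplified sketch that compares only the major version (the first dotted component); the function names are hypothetical and the version strings are illustrative values, not output read from a real browser.

```python
def major_version(version):
    """Return the leading numeric component of a dotted version string."""
    return int(version.strip().split(".")[0])


def driver_matches_browser(browser_version, driver_version):
    # Simplification: only the major version is compared.
    return major_version(browser_version) == major_version(driver_version)


if __name__ == "__main__":
    print(driver_matches_browser("120.0.6099.109", "120.0.6099.71"))    # same major version
    print(driver_matches_browser("120.0.6099.109", "119.0.6045.105"))   # driver is one major behind
```

A check like this can run at the start of a test session to fail fast with a clear message instead of an obscure session-creation error.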

Example (Setting Driver Path in Python):

python

from selenium import webdriver
from selenium.webdriver.chrome.service import Service

# Option 1: Direct path (Selenium 3 style; the executable_path argument was
# deprecated in Selenium 4 and later removed, so it is shown commented out)
# driver = webdriver.Chrome(executable_path="/path/to/chromedriver")

# Option 2: Use service (Selenium 4+ recommended)
service = Service("/path/to/chromedriver")
driver = webdriver.Chrome(service=service)

Selenium 4.6+ Auto-Driver Management (No Manual Download):

python

from selenium import webdriver

# Selenium Manager auto-downloads ChromeDriver if not present
driver = webdriver.Chrome()

5. Simple Program in Selenium WebDriver (Python Example)


python

from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By
import time

# Initialize WebDriver (Chrome)
driver = webdriver.Chrome()

# Open a webpage
driver.get("https://www.example.com")

# Maximize window
driver.maximize_window()

# Print page title
print("Page Title:", driver.title)

# Find an element by ID and click it
try:
    element = driver.find_element(By.ID, "some-button-id")
    element.click()
except NoSuchElementException as e:
    print("Element not found:", e)

# Wait for 3 seconds (for demonstration only)
time.sleep(3)

# Close the browser
driver.quit()

6. WebDriver Methods (Key APIs)

WebDriver provides numerous methods for browser interaction. Here are the most commonly
used ones:

✅ Navigation Methods

Method | Description
-------|------------
driver.get("url") | Loads a web page.
driver.back() | Goes back to previous page.
driver.forward() | Goes forward to next page.
driver.refresh() | Reloads current page.

✅ Element Locators

Method | Description
-------|------------
driver.find_element(By.ID, "id") | Finds element by ID.
driver.find_element(By.NAME, "name") | Finds element by name.
driver.find_element(By.XPATH, "xpath") | Finds element by XPath.
driver.find_element(By.CSS_SELECTOR, "css") | Finds element by CSS selector.
driver.find_elements(By.TAG_NAME, "tag") | Finds all elements with given tag.

✅ Element Interaction Methods

Method | Description
-------|------------
element.click() | Clicks on an element.
element.send_keys("text") | Types text into an input field.
element.clear() | Clears text from an input field.
element.submit() | Submits a form.
element.get_attribute("attribute") | Gets an attribute value (e.g., href, value).
element.is_displayed() | Checks if element is visible.
element.is_enabled() | Checks if element is enabled.
element.is_selected() | Checks if element (checkbox/radio) is selected.

✅ Browser Window Methods

Method | Description
-------|------------
driver.maximize_window() | Maximizes browser window.
driver.minimize_window() | Minimizes browser window.
driver.fullscreen_window() | Puts browser in fullscreen mode.
driver.get_window_size() | Gets current window size.
driver.set_window_size(width, height) | Sets window size.


✅ Wait Methods

Method | Description
-------|------------
driver.implicitly_wait(10) | Sets a session-wide timeout: element lookups wait up to 10 seconds for elements to appear.
WebDriverWait(driver, 10).until(condition) | Explicit wait for a condition (e.g., element visibility).

✅ JavaScript Execution

python

# Scroll to the bottom of the page
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

✅ Cookie & Session Management

Method | Description
-------|------------
driver.get_cookies() | Gets all cookies.
driver.add_cookie(cookie_dict) | Adds a cookie.
driver.delete_cookie("name") | Deletes a specific cookie.
driver.delete_all_cookies() | Deletes all cookies.

7. Best Practices for WebDriver

✔ Use explicit waits (WebDriverWait) instead of time.sleep().
✔ Follow the Page Object Model (POM) for maintainable code.
✔ Handle dynamic elements with retries or conditional waits.
✔ Quit the driver properly (driver.quit()) to release resources.
✔ Use logging and screenshots to debug failed tests.
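The "retries" advice for dynamic elements can be sketched as a small helper. `retry`, `StaleElement` and `click_flaky_button` are hypothetical names; in a real suite the exception passed as `retry_on` would typically be Selenium's StaleElementReferenceException.

```python
import time


def retry(action, retry_on, attempts=3, delay=0.2):
    """Call `action()` up to `attempts` times, retrying only on `retry_on` errors."""
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the original error
            time.sleep(delay)


class StaleElement(Exception):
    """Hypothetical stand-in for Selenium's StaleElementReferenceException."""


calls = []


def click_flaky_button():
    # Simulates an element that is re-rendered twice before it can be clicked.
    calls.append("try")
    if len(calls) < 3:
        raise StaleElement("element was re-rendered")
    return "clicked"


result = retry(click_flaky_button, retry_on=StaleElement, attempts=5, delay=0.01)
```

Retrying only on a named exception type keeps genuine failures (wrong locator, assertion errors) visible instead of hiding them behind repeated attempts.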

Example: Explicit Wait in Python

python


from selenium.webdriver.common.by import By

from selenium.webdriver.support.ui import WebDriverWait

from selenium.webdriver.support import expected_conditions as EC

# Wait up to 10 seconds for an element to be clickable

wait = WebDriverWait(driver, 10)

element = wait.until(EC.element_to_be_clickable((By.ID, "dynamic-button")))

element.click()
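The Page Object Model recommendation can be sketched in a few lines. Everything here is an assumption for illustration: the `LoginPage` class, the element IDs, and the stub classes that record calls so the example runs without a browser. The locator strings "id" and "css selector" are the values behind Selenium's By.ID and By.CSS_SELECTOR.

```python
class LoginPage:
    """Page object: locators and actions for one page live in one class."""

    # (strategy, value) pairs, in the same form By.ID / By.CSS_SELECTOR produce.
    USERNAME = ("id", "username")
    PASSWORD = ("id", "password")
    SUBMIT = ("css selector", "button[type='submit']")

    def __init__(self, driver):
        self.driver = driver

    def log_in(self, username, password):
        self.driver.find_element(*self.USERNAME).send_keys(username)
        self.driver.find_element(*self.PASSWORD).send_keys(password)
        self.driver.find_element(*self.SUBMIT).click()


class StubElement:
    """Hypothetical element that logs interactions instead of performing them."""

    def __init__(self, log, locator):
        self.log, self.locator = log, locator

    def send_keys(self, text):
        self.log.append(("send_keys", self.locator, text))

    def click(self):
        self.log.append(("click", self.locator))


class StubDriver:
    """Hypothetical driver; pass a real WebDriver instance in practice."""

    def __init__(self):
        self.log = []

    def find_element(self, by, value):
        return StubElement(self.log, (by, value))


driver = StubDriver()
LoginPage(driver).log_in("alice", "s3cret")
```

Tests then call `log_in(...)` and never touch locators directly, so a changed ID is fixed in one place.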


Types of Browser Launching in Selenium: Detailed Notes

1. Desired Capabilities
 Definition: Desired Capabilities are a set of key-value pairs that define the requirements
of the test environment (browser, OS, mobile device, etc.).

 Purpose:

o Specify browser options (e.g., headless mode, incognito, mobile emulation).

o Set platform-specific preferences (e.g., disabling images, enabling geolocation).

o Work with Selenium Grid for distributed testing.

 Example (Python):

python


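from selenium import webdriver

# Legacy style (Selenium 3): capabilities passed as a plain dictionary
capabilities = {
    "browserName": "chrome",
    "platformName": "Windows 10",
    "goog:chromeOptions": {
        "args": ["--headless", "--disable-gpu"]
    }
}

# The desired_capabilities argument was removed in Selenium 4.10
driver = webdriver.Chrome(desired_capabilities=capabilities)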

Selenium 4+ Update:
 DesiredCapabilities is deprecated. Use Options classes instead:

python


from selenium import webdriver
from selenium.webdriver.chrome.options import Options

chrome_options = Options()
chrome_options.add_argument("--headless")
chrome_options.add_argument("--disable-gpu")

driver = webdriver.Chrome(options=chrome_options)

2. Downloading Driver Files

Each browser requires a browser-specific driver to communicate with Selenium WebDriver.

Browser Driver Download Link Notes

Chrome ChromeDriver Must match Chrome browser version

Firefox GeckoDriver Works with Firefox ≥ 47

Edge Microsoft EdgeDriver Supports Chromium-based Edge

Safari SafariDriver Built into Safari 10+ (macOS only)

Internet Explorer IEDriverServer Requires IE 8+ (Deprecated)

Selenium 4+ Auto-Driver Management:

 Selenium Manager (bundled with Selenium 4.6 and later) automatically downloads the matching
driver if none is found locally.

python


from selenium import webdriver

driver = webdriver.Chrome() # No need to specify driver path

3. Downloading Selenium JAR Files (Java Projects)

 For Java-based projects, you need the Selenium JAR files.

 Download Links:

o Selenium Official Releases

o Maven dependency (recommended):

xml


<!-- Selenium Java Dependency (Maven) -->

<dependency>

<groupId>org.seleniumhq.selenium</groupId>

<artifactId>selenium-java</artifactId>
<version>4.12.0</version> <!-- Check latest version -->

</dependency>

4. Chrome Browser Launching

Steps:

1. Install Chrome Browser.

2. Download ChromeDriver (or use Selenium Manager).

3. Launch via WebDriver.

Example (Python):

python


from selenium import webdriver

from selenium.webdriver.chrome.options import Options

# Optional: Headless mode

chrome_options = Options()

chrome_options.add_argument("--headless=new") # Chrome 112+ recommended

driver = webdriver.Chrome(options=chrome_options)

driver.get("https://www.example.com")

print(driver.title)
driver.quit()

Chrome Options:

 --incognito: Launch in private mode.

 --start-maximized: Start browser maximized.

 --disable-extensions: Disable extensions.
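When the same suite runs locally and in CI, it can help to assemble these flags in one place. The sketch below is one possible way to do that; `build_chrome_args` is a hypothetical helper, and `--window-size` is a Chrome flag that is not in the list above.

```python
def build_chrome_args(headless=False, incognito=False, window_size=None):
    """Return the list of Chrome flags for the requested settings."""
    args = []
    if headless:
        args.append("--headless=new")
    if incognito:
        args.append("--incognito")
    if window_size is not None:
        width, height = window_size
        # A fixed size gives headless runs a predictable viewport.
        args.append(f"--window-size={width},{height}")
    else:
        args.append("--start-maximized")
    return args


ci_args = build_chrome_args(headless=True, window_size=(1920, 1080))
local_args = build_chrome_args(incognito=True)
```

Each returned flag would then be passed to `chrome_options.add_argument(...)` before creating the driver.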

5. Safari Browser Launching

 Requirements:

o macOS only.

o Safari 10 or later.

o Enable the Develop menu in Safari Preferences > Advanced > Show Develop menu.

o Enable Remote Automation in Develop > Allow Remote Automation.

Example (Python):

python


from selenium import webdriver

# SafariDriver is built into Safari (no separate driver needed)

driver = webdriver.Safari()

driver.get("https://www.example.com")
print(driver.title)

driver.quit()

Limitations:

 No headless mode support (as of 2023).

 Limited customization compared to Chrome/Firefox.

6. Internet Explorer (IE) Browser Launching

 Deprecated (Microsoft Edge is recommended).

 Requirements:

o IE 8 or later.

o IEDriverServer.exe (must match IE version).

Example (Java):

java


import org.openqa.selenium.WebDriver;
import org.openqa.selenium.ie.InternetExplorerDriver;
import org.openqa.selenium.ie.InternetExplorerOptions;

public class IEExample {
    public static void main(String[] args) {
        System.setProperty("webdriver.ie.driver", "path/to/IEDriverServer.exe");

        // InternetExplorerDriver expects an InternetExplorerOptions object, not a Map
        InternetExplorerOptions ieOptions = new InternetExplorerOptions();
        ieOptions.ignoreZoomSettings();               // sets ignoreZoomSetting
        ieOptions.destructivelyEnsureCleanSession();  // sets ie.ensureCleanSession

        WebDriver driver = new InternetExplorerDriver(ieOptions);
        driver.get("https://www.example.com");
        System.out.println(driver.getTitle());
        driver.quit();
    }
}

Common IE Issues:

 Pop-up blockers may interfere.

 Zoom level must be 100%.

 Slow performance compared to modern browsers.

7. Installing FireBug and FirePath (Deprecated)

 FireBug: A legacy Firefox extension for inspecting HTML/CSS (replaced by Firefox DevTools).

 FirePath: An add-on for FireBug to evaluate XPath expressions (now integrated into
DevTools).

 Modern Alternative:
o Use Firefox Developer Tools (F12) for inspecting elements and generating
locators.

o Use XPath Checker or SelectorGadget for XPath/CSS selectors.

8. Firefox Browser Launching

Steps:

1. Install Firefox Browser.

2. Download GeckoDriver (or use Selenium Manager).

3. Launch via WebDriver.

Example (Python):

python


from selenium import webdriver

from selenium.webdriver.firefox.options import Options

# Optional: Headless mode

firefox_options = Options()

firefox_options.add_argument("--headless")

driver = webdriver.Firefox(options=firefox_options)

driver.get("https://www.example.com")
print(driver.title)

driver.quit()

Firefox Options:

 --private: Launch in private browsing mode.

 --width=1920 --height=1080: Set window size.

 --profile: Specify a custom Firefox profile.

Mozilla Profile Example:

python


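from selenium import webdriver
from selenium.webdriver.firefox.options import Options
from selenium.webdriver.firefox.firefox_profile import FirefoxProfile

profile = FirefoxProfile()
profile.set_preference("dom.popup_maximum", 0)  # Caps how many pop-ups a page may open

# The firefox_profile argument was removed in Selenium 4.10; attach the profile to Options
firefox_options = Options()
firefox_options.profile = profile

driver = webdriver.Firefox(options=firefox_options)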

Summary of Browser Launching

Browser Driver Headless Support Notes

Chrome ChromeDriver Yes (--headless=new) Most widely used


Firefox GeckoDriver Yes (--headless) Strong DevTools support

Edge MS EdgeDriver Yes (--headless=new) Chromium-based

Safari SafariDriver No (macOS only) Limited customization

IE IEDriverServer No Deprecated, avoid if possible


Comprehensive Guide to Web Locators in Selenium

Introduction to Locators

Locators are the backbone of web automation, allowing testers to identify and interact with
elements on a web page. Different locator strategies offer various ways to find elements, each
with its own advantages and limitations.

Types of Locators

1. ID Locator

python


driver.find_element(By.ID, "element_id")

 Description: Finds elements by their unique ID attribute

 Characteristics:

o Most preferred locator (fastest and most reliable)

o IDs should be unique per HTML specification


o Rarely changes between page loads

 When to use: Always prefer ID when available

 Example: <input id="username" type="text">

2. Name Locator

python


driver.find_element(By.NAME, "element_name")

 Description: Finds elements by their name attribute

 Characteristics:

o Not always unique (multiple elements can share same name)

o More likely to change than IDs

o Commonly used for form elements

 When to use: When ID isn't available but name is unique

 Example: <input name="email" type="text">

3. ClassName Locator

python


driver.find_element(By.CLASS_NAME, "class_name")

 Description: Finds elements by their class attribute


 Characteristics:

o Rarely unique (classes are meant for styling groups)

o Very likely to change if UI is updated

 When to use: When no better options exist and class is stable

 Example: <div class="btn btn-primary">Submit</div>

4. TagName Locator

python


driver.find_element(By.TAG_NAME, "tag_name")

 Description: Finds elements by their HTML tag

 Characteristics:

o Least specific locator

o Only useful when combined with other strategies

 When to use: For collecting groups of similar elements

 Example: <table>, <input>, <a>

5. LinkText Locator

python


driver.find_element(By.LINK_TEXT, "exact_link_text")
 Description: Finds anchor (<a>) elements by exact text

 Characteristics:

o Works only with hyperlinks

o Text content must match exactly

o Vulnerable to UI text changes

 When to use: For navigation links with stable text

 Example: <a href="/about">About Us</a>

6. PartialLinkText Locator

python


driver.find_element(By.PARTIAL_LINK_TEXT, "partial_text")

 Description: Finds anchor elements by partial text match

 Characteristics:

o More flexible than LinkText

o Still limited to anchor elements

 When to use: When you know part of the link text

 Example: <a href="/contact">Contact Support</a>

7. XPath Locator

python


driver.find_element(By.XPATH, "xpath_expression")

 Description: Powerful XML path language for element location

 Characteristics:

o Most flexible locator strategy

o Can traverse DOM hierarchy

o Two types: Absolute (/) and Relative (//)

o Slower than CSS selectors

 When to use: When other locators fail or complex navigation needed

 Examples:

o //input[@id='username']

o //div[@class='container']//a[contains(text(),'Login')]

8. CSS Selector Locator

python


driver.find_element(By.CSS_SELECTOR, "css_selector")

 Description: Uses CSS selector syntax to find elements

 Characteristics:

o Faster than XPath in most browsers


o More readable than XPath

o Can use combinations of attributes

 When to use: Preferred over XPath when possible

 Examples:

o input#username

o div.container > a.btn-primary

Locator Stability

Factors Affecting Locator Stability:

1. Dynamic Elements:

o Elements with auto-generated IDs/classes (common in modern frameworks)

o Example: id="input-12x8h9s3" changes on each page load

2. Application Changes:

o UI redesigns often change class names and structures

o Text content may be updated for localization or A/B testing

3. Context Changes:

o Elements may appear/disappear based on application state

o Modals, dynamic menus, and lazy-loaded content

Most Stable to Least Stable Locators:

1. ID (if properly implemented)

2. Name (for form elements)

3. CSS Selectors (well-designed)

4. XPath (relative paths)


5. LinkText/PartialLinkText

6. ClassName

7. TagName

Best Practices

1. Preference Order:

o ID > Name > CSS Selector > XPath > Others

2. Creating Robust Locators:

python


# Good - uses stable ID

username = driver.find_element(By.ID, "login-username")

# Better CSS - combines tag and attribute

submit_btn = driver.find_element(By.CSS_SELECTOR, "button[type='submit']")

# Robust XPath - uses contains for partial matches

menu_item = driver.find_element(By.XPATH, "//a[contains(@class,'nav-item') and text()='Products']")

3. Avoid:

o Absolute XPaths (/html/body/div[3]/div[2]) - breaks easily

o Overly generic selectors (.btn) - may match multiple elements


o Locators dependent on text that changes frequently
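The preference order (ID > Name > CSS Selector > others) can be written down as a small decision function. This is only a sketch: `pick_locator` and the pattern used to spot auto-generated IDs such as "input-12x8h9s3" are assumptions and would need tuning for a real application. The returned strategy strings are the values behind By.ID, By.NAME, By.CSS_SELECTOR and By.TAG_NAME.

```python
import re

# Heuristic (an assumption): an ID looks auto-generated when it ends in a
# separator plus six or more alphanumerics that include at least one digit.
_GENERATED_ID = re.compile(r"[-_](?=[a-z0-9]*\d)[a-z0-9]{6,}$", re.IGNORECASE)


def pick_locator(tag, attrs):
    """Return a (strategy, value) pair following ID > Name > CSS Selector."""
    element_id = attrs.get("id")
    if element_id and not _GENERATED_ID.search(element_id):
        return ("id", element_id)
    if attrs.get("name"):
        return ("name", attrs["name"])
    if attrs.get("type"):
        return ("css selector", f"{tag}[type='{attrs['type']}']")
    return ("tag name", tag)  # last resort: least specific strategy


stable_id = pick_locator("input", {"id": "login-username"})
generated_id = pick_locator("input", {"id": "input-12x8h9s3", "name": "email"})
submit_button = pick_locator("button", {"type": "submit"})
bare_link = pick_locator("a", {})
```

The second call shows the point of the stability ranking: an ID that changes on every page load is skipped in favour of the name attribute.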

Browser DevTools for Locator Inspection

Chrome/Firefox/Edge:

1. Right-click element → "Inspect"

2. In Elements panel:

o Right-click element → "Copy" → Various locator options

o Hover to see element boundaries

3. Console testing:

javascript


// Test CSS selector

document.querySelector("input#username")

// Test XPath

$x("//input[@id='username']")

Cross-Browser Differences:

 Generated XPaths/CSS may vary slightly between browsers

 Some attributes may be browser-specific

 Shadow DOM handling differs

Handling Dynamic Elements

1. Contains Matching:
python


# CSS

driver.find_element(By.CSS_SELECTOR, "div[id*='partial-id']")

# XPath

driver.find_element(By.XPATH, "//div[contains(@id,'partial-id')]")

2. Starts/Ends With:

python


# CSS

driver.find_element(By.CSS_SELECTOR, "div[id^='start-text']")

driver.find_element(By.CSS_SELECTOR, "div[id$='end-text']")

3. Text Matching:

python


# XPath text contains


driver.find_element(By.XPATH, "//button[contains(text(),'Submit')]")
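The partial-match patterns above follow a regular shape, so they can be generated rather than typed by hand. The helpers `css_attr` and `xpath_attr` below are hypothetical. Note that XPath 1.0, which browsers implement, has starts-with() but no ends-with(), so the suffix case is emulated with substring().

```python
_CSS_OPERATORS = {"equals": "=", "contains": "*=", "starts_with": "^=", "ends_with": "$="}


def css_attr(tag, attr, value, mode="equals"):
    """Build a CSS attribute selector such as div[id^='start-text']."""
    if "'" in value:
        raise ValueError("value must not contain a single quote")
    return f"{tag}[{attr}{_CSS_OPERATORS[mode]}'{value}']"


def xpath_attr(tag, attr, value, mode="equals"):
    """Build the XPath 1.0 counterpart of the same attribute test."""
    if "'" in value:
        raise ValueError("value must not contain a single quote")
    if mode == "equals":
        test = f"@{attr}='{value}'"
    elif mode == "contains":
        test = f"contains(@{attr},'{value}')"
    elif mode == "starts_with":
        test = f"starts-with(@{attr},'{value}')"
    elif mode == "ends_with":
        # No ends-with() in XPath 1.0: compare the last len(value) characters.
        test = f"substring(@{attr}, string-length(@{attr}) - {len(value) - 1})='{value}'"
    else:
        raise ValueError(f"unknown mode: {mode}")
    return f"//{tag}[{test}]"


css_contains = css_attr("div", "id", "partial-id", "contains")
css_prefix = css_attr("div", "id", "start-text", "starts_with")
xpath_contains = xpath_attr("div", "id", "partial-id", "contains")
xpath_suffix = xpath_attr("div", "id", "end-text", "ends_with")
```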

By understanding these locator strategies and their characteristics, you can create more reliable
and maintainable test automation scripts that withstand application changes and dynamic content.

XPath – Detailed Reference Notes


(Prepared: 27-Jul-2025)

────────────────────────────────────────

1. What is XPath?
• A query language for XML/HTML that navigates nodes (elements, attributes, text,
comments, etc.).
• In web automation (Selenium, Cypress, Playwright), XPath is the usual fallback locator when
id/class/css are insufficient.
• Absolute path starts with “/”; relative path starts with “//”.
• Node tests: element names, “*”, “text()”, “comment()”, “node()”.
• Predicates are enclosed in [] and can be chained, e.g. //div[@id='main'][1].

────────────────────────────────────────
2. Contains (partial‐match)
Syntax: contains(string, substring)
Example:
//input[contains(@class,'btn-primary')] – matches <input class="btn btn-primary lg">.
Notes:
• Case-sensitive.
• Works on attributes, text, or combined: //span[contains(text(),'Save') and
contains(@class,'icon')].

────────────────────────────────────────
3. Text XPath (exact text)
Syntax: text()='Exact Text'
Example:
//a[text()='Sign in'] – exact match, fails if extra whitespace.
Combine with normalize-space(): //button[normalize-space(text())='Add to cart'].
────────────────────────────────────────
4. Text Contains XPath (substring in text)
Syntax: contains(text(),'partial')
Example:
//p[contains(text(),'Terms and Conditions')]
Note: Use normalize-space() to trim:
//label[contains(normalize-space(text()),'I agree')].
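To see why the exact-text match in section 3 fails on padded markup and what normalize-space() changes, the function can be mimicked in plain Python. `normalize_space` below is a hypothetical re-implementation for illustration. It follows the XPath 1.0 definition, which treats only space, tab, carriage return and line feed as whitespace.

```python
import re

_XML_WHITESPACE = " \t\r\n"


def normalize_space(text):
    """Mimic XPath 1.0 normalize-space(): collapse whitespace runs, then trim."""
    return re.sub(r"[ \t\r\n]+", " ", text).strip(_XML_WHITESPACE)


# Text as it might sit inside an indented <button> element.
raw_label = "\n    Add to   cart\n  "

exact_match = raw_label == "Add to cart"                        # text()='Add to cart'
normalized_match = normalize_space(raw_label) == "Add to cart"  # normalize-space(text())=...

# A non-breaking space (&nbsp;) is not XML whitespace, so it is left untouched.
nbsp_label = normalize_space("Add\u00a0to cart")
```

The last line shows a common trap: labels that contain &nbsp; still will not match a plain-space literal, even after normalize-space().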

────────────────────────────────────────
5. Attribute with Contains
General form: //*[contains(@attr,'value')]
Useful when classes are dynamic: //div[contains(@class,'product-')] matches product-123,
product-456.
Chain multiple attributes:
//input[@type='text' and contains(@name,'user') and contains(@placeholder,'Search')].

────────────────────────────────────────
6. Axes Reference (quick cheat-sheet)

Axis (abbrev.) Direction / Target Example

following All nodes after the closing tag //tr[1]/following::td


ancestor All parent/grand-parent elements //span[@id='x']/ancestor::div
child (default) Direct children //ul[@id='menu']/child::li
preceding All nodes before the start tag //button[@id='b']/preceding::h2
following-sibling Siblings after current node //li[@class='active']/following-sibling::li
parent (..) One level up //img/parent::a
self (.) Current node itself //div[@role='main']/self::div
descendant Any level below //section/descendant::p
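The axes are easier to remember once you see which nodes they return. Python's standard-library ElementTree supports only a small XPath subset without axes, so the sketch below emulates `ancestor::` and `following-sibling::` by hand on a tiny made-up document. The helper names and the markup are assumptions for illustration, and this is not how a browser evaluates XPath.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<div id='outer'>"
    "<ul id='menu'>"
    "<li>Home</li><li class='active'>Products</li><li>About</li><li>Contact</li>"
    "</ul>"
    "</div>"
)

# ElementTree has no parent pointers, so build a child -> parent map once.
parent_of = {child: parent for parent in doc.iter() for child in parent}


def ancestors(node):
    """Elements on the ancestor:: axis, nearest first."""
    found = []
    while node in parent_of:
        node = parent_of[node]
        found.append(node)
    return found


def following_siblings(node):
    """Elements on the following-sibling:: axis, in document order."""
    siblings = list(parent_of[node])
    return siblings[siblings.index(node) + 1:]


active = doc.find(".//li[@class='active']")
ancestor_ids = [el.get("id") for el in ancestors(active)]    # //li[@class='active']/ancestor::*
later_items = [el.text for el in following_siblings(active)]  # .../following-sibling::li
```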

────────────────────────────────────────
7. Deep-dive & Practical Examples
A. Contains & Text together
//a[contains(text(),'Download') and contains(@href,'.pdf')]

B. Dynamic table row selection


//table[@id='orders']/tbody/tr[contains(.,'Pending')]/following-sibling::tr[1]
– finds the next row after a row whose text contains “Pending”.

C. Complex CSS-class matching


//div[contains(concat(' ',normalize-space(@class),' '), ' active ')]
– safest way to match exact class token.

D. Chained Axes
//label[text()='Email']/following-sibling::input[1]
– finds the first <input> after the label “Email”.

E. Parent + Ancestor
//span[text()='Error']/parent::div/ancestor::form[@id='loginForm']
– navigates up to the enclosing form.
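Example C is worth a closer look, because a plain contains(@class,'active') also matches classes such as "inactive". The sketch below builds the padded expression and mirrors its logic in Python; `class_token_xpath` and `has_class_token` are hypothetical helpers.

```python
def class_token_xpath(tag, token):
    """Build the XPath from example C for any tag and class token."""
    return f"//{tag}[contains(concat(' ',normalize-space(@class),' '), ' {token} ')]"


def has_class_token(class_attr, token):
    """Python mirror of the same test: pad both sides with spaces, then search."""
    padded = " " + " ".join(class_attr.split()) + " "
    return f" {token} " in padded


expression = class_token_xpath("div", "active")

naive_hit = "active" in "btn inactive"                  # what contains(@class,'active') does
token_hit = has_class_token("btn inactive", "active")   # padded comparison rejects it
real_hit = has_class_token("  btn   active ", "active")
```

Padding the attribute and the token with spaces turns a substring search into a whole-word search, which is why the expression is described as the safest way to match a class token.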

────────────────────────────────────────
8. Tips & Tricks

1. Always prefer relative XPath (starts with “//”).

2. Avoid * when possible—use specific tag names to improve speed.

3. Use normalize-space() to handle whitespace in text/attributes.

4. Combine multiple predicates for robustness instead of long, brittle paths.

5. In DevTools (Chrome):
• Press F12 → Elements → Ctrl+F → enter XPath to test instantly.
• $x("//your/xpath") in console returns matching nodes.

6. Performance: XPath is slower than CSS selectors in some browsers; cache element when
reused.
7. Escape quotes:
• In Java: By.xpath("//div[contains(@id, \"'quote'\")]")
• In Python: '//div[contains(@id,\'quote\')]'.
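Tip 7 covers quoting in the host language, but XPath 1.0 itself has no escape character inside a string literal. A value that contains both quote types therefore has to be assembled with concat(). The helper `xpath_literal` below is a hypothetical sketch of that technique.

```python
def xpath_literal(value):
    """Return `value` as an XPath 1.0 string literal, using concat() if needed."""
    if "'" not in value:
        return f"'{value}'"
    if '"' not in value:
        return f'"{value}"'
    # Both quote types present: split on single quotes and glue the pieces
    # back together with concat(), quoting each single quote as "'".
    pieces = []
    for index, part in enumerate(value.split("'")):
        if index:
            pieces.append('"\'"')
        if part:
            pieces.append(f"'{part}'")
    return "concat(" + ", ".join(pieces) + ")"


plain = xpath_literal("Save")
apostrophe = xpath_literal("Women's")
mixed = xpath_literal('He said "it\'s"')

# Typical use: build a text predicate from arbitrary input.
button_xpath = f"//button[text()={apostrophe}]"
```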

────────────────────────────────────────
9. Quick Reference Card (printable)

//tag[@attr='value'] Exact attribute


//tag[contains(@attr,'val')] Contains attribute
//tag[text()='Exact'] Exact text
//tag[contains(text(),'part')] Partial text
//tag[@id='x']/following::input Following axis
//tag[@id='x']/ancestor::div Ancestor axis
//tag[@id='x']/child::span Child axis
//tag[@id='x']/preceding::h1 Preceding axis
//tag[@id='x']/following-sibling::li Following-sibling axis
//tag[@id='x']/parent::div Parent axis
//tag[@id='x']/self::tag Self axis
//tag[@id='x']/descendant::a Descendant axis

────────────────────────────────────────
End of notes
