
Web Scraping 2

This document outlines a Python project focused on data science, specifically web scraping, data visualization, and dashboard creation using libraries like Pandas and Beautiful Soup. It covers the definition of web scraping, its applications, legal considerations, and the workflow involved in the process. The project aims to enhance participants' skills and provide a valuable addition to their job portfolios.

Uploaded by

doha419311

Python Project for Data Science

17/04/2024
Web Scraping

Course Overview
You will perform specific data science and data analytics tasks such as extracting data,
web scraping, visualizing data and creating a dashboard. This project will showcase your
proficiency with Python and using libraries such as Pandas and Beautiful Soup within a
Jupyter Notebook. Upon completion you will have an impressive project to add to your job
portfolio.

Session Content
• What is Web Scraping?

• How do we do it?

• Tools to use

• Ethics for scraping

• Demo

What is Web Scraping?
• Web scraping is a technique for gathering data or information from web pages.

• Web scraping is a method for extracting data from a website that does not have an API, or for extracting a large amount of data that we cannot get through an API because of rate limiting.

• Through web scraping we can extract any data that we can see while browsing the web.

• You could revisit your favorite website every time it updates to check for new information, or you could write a web scraper to have it do that for you.

Web Scraping in Real Life
• Extract product information
• Extract job postings and internships
• Extract offers and discounts from deal-of-the-day websites
• Extract data to build a search engine
• Gather weather data
• Etc.

Advanced Web Scraping Vs. API

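• Web scraping is not bound by an API's rate limits (although a site may still throttle or block clients that send too many requests)

• You can access the website and gather data anonymously

• Some websites don't have an API

• Some data is not accessible through an API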

Workflow
• Web scraping follows this workflow:

• Get the website – using an HTTP library

• Parse the HTML document – using any parsing library

• Store the results – in a database, a CSV file, a text file, etc.
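
The three workflow steps can be sketched as follows. This is a minimal illustration, not the course's own code: to keep it runnable without installing anything, it uses only the Python standard library (urllib for HTTP, html.parser for parsing, csv for storage) rather than the third-party libraries named elsewhere in these slides. The function names fetch, parse_links and store_csv, the LinkParser class, and the sample HTML are all hypothetical.

```python
import csv
import io
from html.parser import HTMLParser
from urllib.request import urlopen


def fetch(url: str) -> str:
    """Step 1: get the page over HTTP and return its HTML as text."""
    with urlopen(url, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


class LinkParser(HTMLParser):
    """Step 2: collect a (text, href) pair for every <a> tag that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> tag currently open, if any
        self._text = []     # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def parse_links(html: str):
    parser = LinkParser()
    parser.feed(html)
    return parser.links


def store_csv(rows, out):
    """Step 3: write the results as CSV to any file-like object."""
    writer = csv.writer(out)
    writer.writerow(["text", "href"])
    writer.writerows(rows)


# Offline demonstration on an inline page (assumed sample data), so no
# network access is needed. With a real site, use: html = fetch(url)
SAMPLE_HTML = (
    '<html><body><a href="/jobs">Jobs</a> '
    '<a href="/deals">Deals</a></body></html>'
)
rows = parse_links(SAMPLE_HTML)
buffer = io.StringIO()
store_csv(rows, buffer)
print(buffer.getvalue())
```

To save to disk instead of an in-memory buffer, pass a file opened with open("links.csv", "w", newline="") to store_csv. In practice a dedicated parsing library usually replaces the hand-written LinkParser class.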

Libraries
• BeautifulSoup (bs4)
• lxml
• Selenium
• re
• Scrapy

Is Web Scraping Legal?

In short, web scraping in itself is not illegal. However, some rules need to be followed: web scraping becomes illegal when non-publicly available data is extracted.

Demo 1
Beautiful Soup
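
The slides do not include the demo code itself, so the following is only a sketch of what a Beautiful Soup parsing step might look like. It assumes the bs4 package is installed, and the HTML snippet, the class names product, name and price, and the variable names are all made up for illustration. The page is parsed from an inline string so that the example runs without network access.

```python
from bs4 import BeautifulSoup

# Hypothetical page content; a real scraper would download this first.
SAMPLE_HTML = """
<html><body>
  <h1>Deals of the day</h1>
  <ul>
    <li class="product"><span class="name">Keyboard</span> <span class="price">25.00</span></li>
    <li class="product"><span class="name">Mouse</span> <span class="price">12.50</span></li>
  </ul>
</body></html>
"""

# "html.parser" is Python's built-in parser; lxml could be used instead if installed.
soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# find() returns the first matching tag; get_text(strip=True) trims whitespace.
title = soup.find("h1").get_text(strip=True)

# find_all() returns every matching tag; class_ filters on the CSS class.
products = []
for item in soup.find_all("li", class_="product"):
    name = item.find("span", class_="name").get_text(strip=True)
    price = float(item.find("span", class_="price").get_text(strip=True))
    products.append({"name": name, "price": price})

print(title)
print(products)
```

A list of dictionaries like products can be passed directly to pandas.DataFrame for the later analysis and visualization steps.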

Demo 2
Selenium

Questions & Answers

Thank you!
