Big Data Lecture # 1

The document outlines the course 'Programming for Big Data' which covers the significance of big data, its characteristics, and the differences between big data and small data. It discusses the 5 Vs of big data (Volume, Velocity, Variety, Veracity, and Value), types of big data, and its impacts on business, science, and society. Additionally, it highlights various sources and examples of big data across different sectors.

Uploaded by

Taha Ahmad

Course Name: Programming for Big Data

Course Code: CSDS-4423


Credit Hours: 3
Instructor: Prof. Khalid Rasheed Ch.
Lecture #: 1
Programming for Big Data

Today's Agenda
Importance of Subject.
Data, Information, Data processing
What is big data?
Difference between big data and small data
Sources of big data
Technologies of Big data
5 Vs of big data
Types of Big data
Impacts of big data
Data
• A collection of raw facts and figures related to an object.
• Object → a person (such as a student), an organization, an event, or any
other thing.
• Data → in the form of text, numbers, images, sounds, and videos.
• Processed to produce meaningful information.

Information
• The processed data.
• Provides useful meanings.
• Data is used as input for processing and information is the output of this
processing.
Data Processing
• A series of actions or activities performed on data to convert it
into meaningful information is called data processing.
• Also called operations on data.
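As a minimal illustration of data processing, the Python sketch below turns raw data (a list of student marks) into information (a short summary). The student names, the marks, and the pass mark of 50 are hypothetical values chosen only for illustration.

```python
# Raw data: hypothetical (name, marks) records for illustration only.
raw_data = [("Ali", 72), ("Sara", 45), ("Bilal", 88), ("Hina", 51)]

PASS_MARK = 50  # assumed threshold, not taken from the lecture


def process(records, pass_mark=PASS_MARK):
    """Convert raw records (input) into summary information (output)."""
    marks = [m for _, m in records]
    return {
        "students": len(marks),
        "average": sum(marks) / len(marks),
        "passed": sum(1 for m in marks if m >= pass_mark),
    }


information = process(raw_data)
print(information)
```

The list of marks alone says little; the summary produced by `process` is the kind of meaningful output the slide calls information.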
Methods of data processing
▪ Manual data processing
▪ Mechanical data processing
▪ Electronic data processing
Big data
▪ Big data refers to massive and complex
datasets that are difficult to store, process, and
analyze using traditional methods. These
datasets are growing at an ever-increasing rate,
driven by factors such as sensors, social media,
and the Internet of Things (IoT).
Big data
▪ Big data is data that exceeds the processing
capacity of conventional database systems.
▪ The data is too big, moves too fast, or does not
fit the structures of your database architecture.
▪ To gain value from this data, you must
choose an alternative way to process it.
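One common alternative, sketched here under simplifying assumptions, is to process the data in small pieces instead of loading it all at once. The helper names `read_in_chunks` and `running_total` and the simulated data source are hypothetical; distributed frameworks apply the same idea across many machines.

```python
from itertools import islice


def read_in_chunks(stream, chunk_size):
    """Yield lists of at most chunk_size items from any iterable."""
    iterator = iter(stream)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def running_total(stream, chunk_size=1000):
    """Sum values chunk by chunk so memory use stays bounded."""
    total = 0
    for chunk in read_in_chunks(stream, chunk_size):
        total += sum(chunk)
    return total


# Simulated source: a lazy range stands in for a dataset too large for memory.
simulated_source = range(1, 1_000_001)
print(running_total(simulated_source))
```

Only one chunk is held in memory at a time, so the same code works whether the source has a thousand items or many millions.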
Difference between big data and
small data
Small Data                       | Big Data
Mostly structured                | Mostly unstructured
Stored in MB, GB, TB             | Stored in PB, EB
Increases gradually              | Increases exponentially
Locally present, centralized     | Globally present, distributed
SQL Server, Oracle, single node  | Hadoop, Spark, multi-node cluster
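To give a sense of the storage scales in the comparison above, the sketch below expresses a byte count in the largest fitting unit. It assumes decimal (SI) units, where each step is a factor of 1000; the function name `human_size` is hypothetical.

```python
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB"]


def human_size(num_bytes):
    """Express a byte count in the largest unit that keeps the value below 1000."""
    value = float(num_bytes)
    for unit in UNITS:
        # Stop at the first unit that fits, or at EB if nothing larger exists.
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:.1f} {unit}"
        value /= 1000


print(human_size(5 * 10**9))    # gigabyte range: small-data territory
print(human_size(3 * 10**15))   # petabyte range: big-data territory
```

One petabyte is a million gigabytes, which is why the jump from the left column to the right column usually forces a move from a single node to a cluster.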
Sources of big data
Examples of social network and e-commerce data
▪ Number of followers
▪ Follower trends
▪ Number of likes
▪ Number of clicks
▪ Number of subscribers
▪ Number of customers
▪ Number of visitors
5 Vs of Big Data
1. Volume: This refers to the amount of data generated and collected. With the spread of
sensors, devices, and digital systems, organizations are now dealing with massive volumes of
data. This includes structured data (such as databases) as well as unstructured data (like text,
images, videos, etc.).
2. Velocity: Velocity refers to the speed at which data is generated, processed, and analyzed.
With the rise of real-time analytics and streaming data technologies, organizations need to
handle data that streams in at high velocity. Examples include social media feeds, sensor
data from IoT devices, and financial transactions.
3. Variety: Variety refers to the diversity of data types and sources. Big data encompasses not
only structured data found in traditional databases but also unstructured and
semi-structured data like text, images, videos, social media posts, sensor data, log files, and
more. Dealing with this variety requires flexible storage, processing, and analysis techniques.
4. Veracity: Veracity refers to the accuracy and reliability of the data. Big data often comes from
disparate sources, which can lead to inconsistencies, inaccuracies, and errors. Ensuring the
veracity of data involves processes such as data cleaning, validation, and quality assurance to
make sure that the data is accurate and reliable for analysis.
5. Value: Finally, value refers to the ultimate goal of big data initiatives: extracting meaningful
insights from the data to drive decision-making, improve processes, innovate, and
gain competitive advantage. While handling the volume, velocity, variety, and veracity of
data is essential, the true measure of success lies in deriving actionable insights and value
from the data.
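The data cleaning and validation mentioned under veracity can be sketched as follows. The sensor readings, the field names, and the valid temperature range are hypothetical assumptions for illustration, not values from the lecture.

```python
# Hypothetical sensor readings from different sources, some of them faulty.
readings = [
    {"sensor": "s1", "temp_c": 21.5},
    {"sensor": "s2", "temp_c": None},    # missing value
    {"sensor": "s1", "temp_c": 21.5},    # exact duplicate
    {"sensor": "s3", "temp_c": 999.0},   # implausible value
    {"sensor": "s4", "temp_c": 19.0},
]

VALID_RANGE = (-50.0, 60.0)  # assumed plausible range in degrees Celsius


def clean(records, valid_range=VALID_RANGE):
    """Drop missing, out-of-range, and duplicate records."""
    low, high = valid_range
    seen = set()
    cleaned = []
    for rec in records:
        temp = rec.get("temp_c")
        if temp is None or not (low <= temp <= high):
            continue  # fails validation
        key = (rec["sensor"], temp)
        if key in seen:
            continue  # duplicate of an earlier record
        seen.add(key)
        cleaned.append(rec)
    return cleaned


cleaned = clean(readings)
print(len(readings), "->", len(cleaned))
```

Real pipelines use richer rules, but the pattern is the same: check each record against explicit expectations before it reaches analysis.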
Types of Big Data
Structured data: This is the most traditional type of data, and
it is typically stored in relational databases. Structured data is
highly organized and easy to query and analyze. An example
of structured data would be a customer database that
includes columns for name, address, and phone number.
Semi-structured data: This type of data is less organized than
structured data, but it still has some internal structure.
Examples of semi-structured data include log files, emails, and
social media posts.
Unstructured data: This is the most challenging type of data
to work with, as it has no inherent structure. Examples of
unstructured data include text documents, images, and
videos.
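The three types can be contrasted with a short Python sketch that uses only the standard library. All sample values (the customer row, the log entry, and the review text) are hypothetical.

```python
import csv
import io
import json

# Structured: fixed columns, as in a relational table or CSV file.
csv_text = "name,address,phone\nAyesha,Lahore,0300-0000000\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))
print(rows[0]["name"])

# Semi-structured: self-describing keys, but fields may vary per record.
log_line = '{"level": "INFO", "user": "ayesha", "tags": ["login", "mobile"]}'
entry = json.loads(log_line)
print(entry["tags"])

# Unstructured: free text with no schema; any structure must be inferred.
review = "Great phone, battery lasts long but the camera is average."
word_count = len(review.split())
print(word_count)
```

The structured row can be queried by column name directly, the log entry needs parsing before its fields are usable, and the review text offers nothing beyond raw words until further analysis is applied.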
Impacts of Big data
• Revolutionized Business: Big data analytics empowers
businesses to make data-driven decisions, personalize
marketing campaigns, optimize operations, and develop
innovative products and services.
• Scientific Advancement: Big data is transforming research
in various fields like medicine, where analyzing vast
datasets can accelerate drug discovery and personalized
healthcare.
• Societal Benefits: Big data can be harnessed for social
good, such as optimizing traffic flow in cities, predicting
and mitigating natural disasters, and improving public
safety through crime analysis.
• Improved Decision-Making: By providing deeper insights
into customer behavior and market trends, big data
empowers individuals and organizations to make more
informed choices.
Examples of big data
▪ Transportation
▪ Advertising and marketing
▪ Government
▪ Banking and finance
▪ Medical, entertainment
▪ Cybersecurity
