Facets of Data Science

Data science involves data collection, cleaning, analysis, and visualization, focusing on structured, semi-structured, and unstructured data. It employs various analytical techniques including descriptive, diagnostic, predictive, and prescriptive analytics, alongside machine learning methods. The data science lifecycle encompasses problem definition, data preparation, exploration, model building, and deployment.

Uploaded by

Gulbir Singh

Data science encompasses several key facets, including data collection, cleaning, analysis, and
visualization. It also involves understanding different data types, like structured, semi-structured,
and unstructured data, and applying various analytical techniques such as descriptive, diagnostic,
predictive, and prescriptive analytics.
Here's a more detailed breakdown:
1. Data Collection and Preparation:
Identifying the structure of data:
Recognizing the organization and format of data (e.g., structured, unstructured) is crucial for
effective processing.
Accessing and importing data:
Gathering data from various sources, including databases, APIs, and files, and importing it into a
usable format.
Cleaning, filtering, reorganizing, augmenting, and aggregating data:
Preparing data for analysis by removing errors, inconsistencies, and irrelevant records,
restructuring it, adding derived or external fields where needed, and summarizing it into a form
suited to the question being asked.
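The import, clean, filter, and aggregate steps above can be sketched with Python's standard library. This is a minimal illustration, not a prescribed workflow: the CSV content, the column names (region, product, units), and the helper function names are all hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical raw export with stray whitespace, a missing value,
# and a non-numeric entry in the 'units' column.
RAW_CSV = """region,product,units
north, widget ,10
south,widget,
north,gadget,7
south,gadget,abc
north,widget,5
"""


def load_rows(text):
    """Import: parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def clean_rows(rows):
    """Clean: trim whitespace and drop rows whose 'units' is not an integer."""
    cleaned = []
    for row in rows:
        row = {key: value.strip() for key, value in row.items()}
        try:
            row["units"] = int(row["units"])
        except ValueError:
            continue  # missing or malformed value: skip the row
        cleaned.append(row)
    return cleaned


def total_units_by(rows, key):
    """Aggregate: sum 'units' for each distinct value of `key`."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += row["units"]
    return dict(totals)


rows = clean_rows(load_rows(RAW_CSV))
north_only = [row for row in rows if row["region"] == "north"]  # filter
totals = total_units_by(rows, "product")
print(totals)
```

Dropping unrepairable rows is only one possible policy. Depending on the analysis, imputing a value or flagging the row for review may be more appropriate.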
2. Data Analysis:
Descriptive Analysis: Summarizing and describing the main features of a dataset, often using
measures like mean, median, and standard deviation.
Diagnostic Analysis: Investigating the reasons behind observed patterns and trends in the data.
Predictive Analysis: Building models to forecast future outcomes based on historical data.
Prescriptive Analysis: Using data to recommend optimal actions or decisions.
Statistical Analysis: Applying statistical methods to extract insights and knowledge from data.
Machine Learning: Using algorithms that learn patterns from data instead of following rules
written by hand for each task.
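As a small illustration of the first two kinds of analysis, the sketch below computes the descriptive measures named above (mean, median, standard deviation) and a Pearson correlation coefficient, which is one common starting point for diagnostic questions. The ad_spend and sales figures are invented for the example, and a correlation on its own does not establish a cause.

```python
import math
import statistics


def describe(values):
    """Descriptive analysis: central tendency and spread of one variable."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }


def pearson_r(xs, ys):
    """Strength of the linear relationship between two variables (-1 to 1)."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented example data: weekly ad spend and sales, in arbitrary units.
ad_spend = [1, 2, 3, 4, 5]
sales = [2, 4, 5, 4, 5]

summary = describe(sales)
r = pearson_r(ad_spend, sales)
print(summary)
print(round(r, 3))
```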
3. Data Visualization:
Presenting findings effectively:
Using charts, graphs, and other visual representations to communicate insights and trends to a
wider audience.
Data Visualization Techniques:
Selecting appropriate visualization methods to reveal patterns and relationships within the data.
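A bar chart is a typical choice for comparing totals across categories. The sketch below draws one as plain text so that it runs without any plotting library; in practice a dedicated plotting library would normally be used. The function name text_bar_chart and the product totals are hypothetical.

```python
def text_bar_chart(counts, width=20):
    """Return one line per category, scaled so the largest bar spans `width` characters."""
    largest = max(counts.values())
    label_width = max(len(label) for label in counts)
    lines = []
    for label, value in counts.items():
        bar = "#" * round(value / largest * width)
        lines.append(f"{label:<{label_width}} | {bar} {value}")
    return lines


# Invented totals per product category.
units_by_product = {"widget": 15, "gadget": 7, "gizmo": 3}
chart_lines = text_bar_chart(units_by_product)
for line in chart_lines:
    print(line)
```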
4. Key Data Characteristics (the "five Vs" of big data):
Volume: The sheer amount of data generated and processed.
Variety: The diversity of data types, including structured, unstructured, and semi-structured
data.
Velocity: The speed at which data is generated and processed.
Veracity: The quality and accuracy of the data.
Value: The usefulness of the data for decision-making.
5. Data Science Lifecycle:
Defining the problem: Clearly outlining the business or research question that needs to be
addressed.
Data collection and preparation: Gathering the necessary data and ensuring its quality.
Data exploration and analysis: Investigating the data to identify patterns, trends, and insights.
Model building and evaluation: Developing and testing predictive or analytical models.
Deployment and maintenance: Putting the model into production and monitoring its performance
over time, retraining it when results degrade.
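The model building and evaluation step can be sketched as follows: hold back part of the data, fit a model on the rest, and compare its error on the held-back part against a simple baseline. This is a minimal sketch under stated assumptions. The data is synthetic (a noisy straight line), the 75/25 split is an arbitrary choice, and a one-variable least-squares line stands in for whatever model a real project would use.

```python
import random
import statistics


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


def mean_absolute_error(actual, predicted):
    """Average size of the prediction errors, in the units of the target."""
    return statistics.mean([abs(a - p) for a, p in zip(actual, predicted)])


# Synthetic data (an assumption for illustration): y = 3x + 2 plus noise.
rng = random.Random(0)  # fixed seed so the run is repeatable
xs = list(range(20))
ys = [3 * x + 2 + rng.uniform(-1, 1) for x in xs]

# Hold out 25% of the rows for evaluation.
indices = list(range(len(xs)))
rng.shuffle(indices)
cut = int(0.75 * len(indices))
train, test = indices[:cut], indices[cut:]

# Build the model on the training rows only.
slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])

# Evaluate on rows the model has not seen.
actual = [ys[i] for i in test]
predictions = [slope * xs[i] + intercept for i in test]
baseline = statistics.mean([ys[i] for i in train])  # always predict the training mean

model_mae = mean_absolute_error(actual, predictions)
baseline_mae = mean_absolute_error(actual, [baseline] * len(test))
print(f"model MAE: {model_mae:.2f}, baseline MAE: {baseline_mae:.2f}")
```

Comparing against a baseline shows whether the model adds anything over the simplest possible guess. The same held-out check can be repeated after deployment to detect declining performance.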
