Introduction to Business Analytics
By Dr. Kingshuk Srivastava
Changing Life in the Digital Age

Transformation is critical: companies must shift to a data-driven business.
72% of companies are vulnerable to disruption within three years.
Why? … Suddenly!
Why we are all vulnerable to seismic shifts

Internal Threats
• Siloed data and systems
• Gaps in expertise and skills
• Inability to react quickly

External Threats
• Born-on-digital companies that steal market share or rewrite customer expectations
• New business models that reinvent our industry and change the game altogether
• An estimated 274,000 startups worldwide each day
The Shift to a Data-Driven Organization

[Figure: value rises as uses of data mature]
Stages: Operations → Reporting & Data Warehousing → Self-Service Analytics → Decision Science → New Business Models
Value delivered: Efficiency → Data Modernization → Data Monetization
What is Data Science?
Data science is a "concept to unify statistics, data analysis
and their related methods" in order to "understand and
analyze an actual phenomena" with data.
Why Analytics?
• The process of collecting, organizing, and analyzing large data sets to discover the most useful and important information
• Organizations have far more data than ever before
• Analytics solutions help organizations make better and faster decisions
• Analytics identifies opportunities for improvement
• Companies are increasingly using business analytics to understand their data and pursue their business goals
Analytics is the discovery and communication of meaningful patterns in data, and is especially valuable in areas rich with recorded information.
Significance of Analytics
• Convert extensive data into powerful insights that drive efficient decisions
• Base your decisions and strategies on data rather than intuition
• Apply the right analytics to your data to achieve the desired improvements
• Achieve breakthrough results
What is Data Analytics?
Analytics is the use of:
• data,
• information technology,
• statistical analysis,
• quantitative methods, and
• mathematical or computer-based models
to help managers gain improved insight into their business operations and make better, fact-based decisions.
Business Analytics & Business Intelligence is a subset of Data Analytics
Components Leading to Analytics

Data science projects require multiple skills:
• Domain Expertise: supply chain, CRM, financials, networking, engineering research
• Computer Science: scripting, SQL, Python, R, Scala, data pipelines, Big Data / Apache Spark
• Math & Stats: machine learning, computational mathematics
Another approach to differentiating types of analytics is by segment: Understanding, Decision, and Action.
Components of Data Analytics: Further Understanding
• Application areas and domains
• Sector-specific specializations
• Graph analytics
Types of Data
The V’s of Big Data
Volume, Velocity, Variety, Veracity, and Value
Data Collection Techniques
• Observations,
• Tests,
• Surveys,
• Document analysis
(the research literature)
Quantitative Methods
Experiment: A research situation with at least one independent
variable, which is manipulated by the researcher.
Independent Variable: The variable manipulated in the study; the
presumed cause of the outcome.
Dependent Variable: The variable affected by the independent
variable; the effect of the study.
y = f(x)
Which is which here? (x is the independent variable; y is the dependent variable.)
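A minimal Python sketch of this relationship (the data points are hypothetical): x is set by the researcher, y is measured, and f is estimated by fitting a line.

```python
import numpy as np

# Hypothetical experiment: x is manipulated, y is observed.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])   # independent variable
y = np.array([1.1, 2.9, 5.2, 7.1, 8.8])   # dependent variable

# Estimate f in y = f(x) with a degree-1 (linear) fit.
slope, intercept = np.polyfit(x, y, deg=1)
print(f"estimated f(x) = {slope:.2f}*x + {intercept:.2f}")
```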
Key Factors for High-Quality Experimental Design
Data should not be contaminated by poor measurement or errors
in procedure.
Eliminate confounding variables from the study, or minimize their
effects on the variables of interest.
Representativeness: does your sample represent the population
you are studying? Use random sampling techniques.
What Makes a Good Quantitative
Research Design?
4 Key Elements
1. Freedom from Bias
2. Freedom from Confounding
3. Control of Extraneous Variables
4. Statistical Precision to Test Hypothesis
Bias: When observations favor some individuals in the
population over others.
Confounding: When the effects of two or more variables cannot
be separated.
Extraneous Variables: Any variable other than the independent
variable that has an effect on the dependent variable.
These variables need to be identified and minimized.
e.g., For erosion potential as a function of clay content, rainfall
intensity, vegetation, and duration would be considered extraneous
variables.
Precision versus accuracy
"Precise" means sharply defined or measured.
"Accurate" means truthful or correct.
The four combinations:
• Both accurate and precise
• Accurate but not precise
• Precise but not accurate
• Neither accurate nor precise
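One way to see the distinction, sketched in Python with simulated measurements (the true value and noise levels are made up): the bias of the sample mean reflects accuracy, while the spread reflects precision.

```python
import numpy as np

# Simulated measurements of a quantity whose true value is 10.0 (assumption).
rng = np.random.default_rng(seed=0)
true_value = 10.0

accurate_not_precise = rng.normal(loc=10.0, scale=2.0, size=100)  # unbiased, noisy
precise_not_accurate = rng.normal(loc=12.0, scale=0.1, size=100)  # biased, tight

for name, sample in [("accurate, not precise", accurate_not_precise),
                     ("precise, not accurate", precise_not_accurate)]:
    bias = sample.mean() - true_value  # accuracy: closeness to the truth
    spread = sample.std()              # precision: scatter of the measurements
    print(f"{name}: bias={bias:+.2f}, spread={spread:.2f}")
```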
Sampling
Sampling is the problem of accurately acquiring the necessary
data in order to form a representative view of the problem.
This is much more difficult to do than is generally realized.
Overall Methodology:
* State the objectives of the survey
* Define the target population
* Define the data to be collected
* Define the variables to be determined
* Define the required precision & accuracy
* Define the measurement "instrument"
* Define the sample size & sampling method, then
select the sample
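A minimal Python sketch of the last step, simple random sampling, under assumed inputs (the population of customer IDs and the sample size are hypothetical):

```python
import random

# Hypothetical population: 10,000 customer IDs.
population = list(range(1, 10001))

random.seed(42)  # reproducibility

# The sample size would come from the required precision & accuracy.
sample_size = 100
sample = random.sample(population, sample_size)  # without replacement
print(sample[:10])
```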
Data Preprocessing
Data Quality: Why Preprocess the Data?
• Measures for data quality: A multidimensional view
• Accuracy: correct or wrong, accurate or not
• Completeness: not recorded, unavailable, …
• Consistency: some modified but some not, dangling, …
• Timeliness: timely update?
• Believability: how much are the data to be trusted?
• Interpretability: how easily can the data be understood?
Major Tasks in Data Preprocessing
• Data cleaning
• Fill in missing values, smooth noisy data, identify or remove outliers, and
resolve inconsistencies
• Data integration
• Integration of multiple databases, data cubes, or files
• Data reduction
• Dimensionality reduction
• Numerosity reduction
• Data compression
• Data transformation and data discretization (a short sketch follows this list)
• Normalization
• Concept hierarchy generation
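A short pandas sketch of two of the transformation tasks above, min-max normalization and equal-frequency discretization, on a hypothetical income column:

```python
import pandas as pd

# Hypothetical attribute to transform.
df = pd.DataFrame({"income": [12000, 35000, 47000, 81000, 150000]})

# Normalization: min-max scaling to [0, 1].
lo, hi = df["income"].min(), df["income"].max()
df["income_scaled"] = (df["income"] - lo) / (hi - lo)

# Discretization: equal-frequency binning into three concept levels.
df["income_level"] = pd.qcut(df["income"], q=3, labels=["low", "medium", "high"])
print(df)
```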
Data Cleaning
• Data in the real world is dirty: lots of potentially incorrect data, e.g., faulty instruments, human or computer
error, transmission error
• incomplete: lacking attribute values, lacking certain attributes of interest, or containing only aggregate data
• e.g., Occupation=“ ” (missing data)
• noisy: containing noise, errors, or outliers
• e.g., Salary=“−10” (an error)
• inconsistent: containing discrepancies in codes or names (a consistency-check sketch follows this list), e.g.,
• Age=“42”, Birthday=“03/07/2010”
• Was rating “1, 2, 3”, now rating “A, B, C”
• discrepancy between duplicate records
• Intentional (e.g., disguised missing data)
• Jan. 1 as everyone’s birthday?
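A small pandas sketch of catching the Age/Birthday inconsistency above (the table, the day/month/year date format, and the one-year tolerance are assumptions):

```python
import pandas as pd

# Hypothetical records; assumes birthdays are day/month/year.
df = pd.DataFrame({
    "age": [42, 15],
    "birthday": ["03/07/2010", "03/07/2010"],
})
df["birthday"] = pd.to_datetime(df["birthday"], format="%d/%m/%Y")

# Derive age from birthday and flag rows that disagree with the stated age.
as_of = pd.Timestamp("2025-01-01")
derived_age = (as_of - df["birthday"]).dt.days // 365
inconsistent = df[(df["age"] - derived_age).abs() > 1]  # 1-year tolerance
print(inconsistent)
```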
Incomplete (Missing) Data
• Data is not always available
• E.g., many tuples have no recorded value for several
attributes, such as customer income in sales data
• Missing data may be due to
• equipment malfunction
• deletion due to inconsistency with other recorded data
• data not entered due to misunderstanding
• certain data not considered important at the time of entry
• history or changes of the data not registered
• Missing data may need to be inferred
How to Handle Missing Data?
• Ignore the tuple: usually done when class label is missing (when
doing classification)—not effective when the % of missing values per
attribute varies considerably
• Fill in the missing value manually: tedious + infeasible?
• Fill it in automatically (a short sketch follows this list) with
• a global constant: e.g., “unknown”, a new class?!
• the attribute mean
• the attribute mean for all samples belonging to the same class:
smarter
• the most probable value: inference-based such as Bayesian
formula or decision tree
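A pandas sketch of the automatic fill-in strategies above on a hypothetical table (a -1 sentinel plays the role of the “unknown” global constant for a numeric attribute):

```python
import pandas as pd

# Hypothetical table with missing incomes.
df = pd.DataFrame({
    "class":  ["A", "A", "B", "B", "B"],
    "income": [50.0, None, 30.0, None, 40.0],
})

# Global constant ("-1" stands in for "unknown").
df["income_const"] = df["income"].fillna(-1)

# Attribute mean.
df["income_mean"] = df["income"].fillna(df["income"].mean())

# Attribute mean per class: the "smarter" option above.
df["income_class_mean"] = df["income"].fillna(
    df.groupby("class")["income"].transform("mean")
)
print(df)
```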
Noisy Data
• Noise: random error or variance in a measured variable
• Incorrect attribute values may be due to
• faulty data collection instruments
• data entry problems
• data transmission problems
• technology limitation
• inconsistency in naming conventions
• Other data problems which require data cleaning
• duplicate records
• incomplete data
• inconsistent data
How to Handle Noisy Data?
• Binning
• first sort the data and partition it into (equal-frequency) bins
• then smooth by bin means, bin medians, or bin boundaries (a binning sketch
follows this list)
• Regression
• smooth by fitting the data into regression functions
• Clustering
• detect and remove outliers
• Combined computer and human inspection
• detect suspicious values and check by human (e.g., deal with
possible outliers)
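A pandas sketch of equal-frequency binning and smoothing by bin means, on a hypothetical already-sorted series of prices:

```python
import pandas as pd

# Hypothetical values, already sorted as the first step requires.
prices = pd.Series([4, 8, 15, 21, 21, 24, 25, 28, 34])

# Partition into three equal-frequency bins, then smooth by bin means.
bins = pd.qcut(prices, q=3)
smoothed = prices.groupby(bins, observed=True).transform("mean")
print(pd.DataFrame({"raw": prices, "bin": bins.astype(str), "smoothed": smoothed}))
```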
Data Cleaning as a Process
• Data discrepancy detection
• Use metadata (e.g., domain, range, dependency, distribution)
• Check field overloading
• Check the uniqueness rule, consecutive rule, and null rule (a rule-check sketch follows this list)
• Use commercial tools
• Data scrubbing: use simple domain knowledge (e.g., postal code,
spell-check) to detect errors and make corrections
• Data auditing: analyze the data to discover rules and relationships and to
detect violators (e.g., use correlation and clustering to find outliers)
• Data migration and integration
• Data migration tools: allow transformations to be specified
• ETL (Extraction/Transformation/Loading) tools: allow users to specify
transformations through a graphical user interface
• Integration of the two processes
• Iterative and interactive (e.g., Potter’s Wheel)
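A pandas sketch of the simple rule checks mentioned above, covering a uniqueness rule, a null rule, and a postal-code domain rule (the orders table and the six-digit code format are assumptions):

```python
import pandas as pd

# Hypothetical orders table; assumes postal codes must be six digits.
orders = pd.DataFrame({
    "order_id": [101, 102, 102, 104],
    "postal_code": ["248007", None, "110001", "ZZ999"],
})

# Uniqueness rule: order_id must not repeat.
duplicates = orders[orders["order_id"].duplicated(keep=False)]

# Null rule: postal_code must be present.
nulls = orders[orders["postal_code"].isna()]

# Simple domain rule: postal codes must be exactly six digits
# (missing values also fail this check).
bad_codes = orders[~orders["postal_code"].fillna("").str.fullmatch(r"\d{6}")]

print(duplicates, nulls, bad_codes, sep="\n\n")
```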
THANK YOU