INTRODUCTION TO DATA-DRIVEN DECISION MAKING WITH EMPHASIS ON ANALYTICS AND AI
Andreas Zaras, Data Scientist
CHARACTERISTICS OF TODAY’S BUSINESS ENVIRONMENT
It is characterized by:
Organizations strive to survive by acquiring competitive advantage.
THE DECISION MAKING PROCESS
1. Define the Problem
2. Decide Who Should Decide
3. Collect Data
4. Identify & Evaluate Alternatives
5. Decide
6. Implement
7. Follow-Up & Assessment
LIVING IN A DATA FLOODED WORLD! (1 / 6)
LIVING IN A DATA FLOODED WORLD! (2 / 6)
Primary Sources of Today’s Big Data
Social Data (Social Media Platforms):
• Likes
• Tweets & Retweets
• Comments, Posts
• Media Uploads
Machine Data (Industrial Equipment, Sensors In):
• Medical Devices
• Smart Meters
• Road Cameras
• Satellites
• Internet of Things
Transactional Data (On-Line & Offline Transactions):
• Invoices
• Payment Orders
• Delivery Receipts
• Storage Records
LIVING IN A DATA FLOODED WORLD! (3 / 6)
Some Big Data Statistics
• 90% of all data in the world has been created in the last two years (Source: IBM).
• Internet users generate about 2.5 quintillion bytes of data each day, equal to 100 times the total number of ants on the globe (Source: Data Never Sleeps).
• Today it would take a person approximately 181 million years to download all the data from the internet (Source: Physics.org).
• In 2020 there will be 40 times more bytes of data than there are stars in the observable universe (Source: Data Never Sleeps).
• By 2020, every person will generate 1.7 megabytes of data every second (Source: Data Never Sleeps).
LIVING IN A DATA FLOODED WORLD! (4 / 6)
Source: Data Never Sleeps 8.0
LIVING IN A DATA FLOODED WORLD! (5 / 6)
[Diagram labels: Web/email, Call Center, Survey; Customers, Partners; Corporate Data; Data Exploitation]
LIVING IN A DATA FLOODED WORLD! (6 / 6)
A NEW PROFESSION IS BORN: THE DATA SCIENTIST!
• Data Access
• Data Processing
• Business Intelligence
• Business Analytics
• Business Knowledge
• Presentation of Results
APPLICATIONS OF DATA SCIENCE
Data Science Applications:
• Fraud Detection
• Promotional Modeling
• Demand Forecasting
• Customer Response Models
• Credit Scoring
• Customer Segmentation
• Recommender Systems
• Inventory Optimization
• Churn Prediction
INFORMATION SYSTEMS
Information Systems (Hardware, Software, Data, People, Networks)
• Data Capturing: Transactional Systems (support day-to-day operations)
• Data Exploitation: Decision Support Systems (support decision making)
STAGES OF DATA EXPLOITATION
1. Data Management
2. Business Intelligence
3. Business Analytics
DATA MANAGEMENT
BUSINESS INTELLIGENCE
In order of increasing degree of intelligence:
1. Standard Reports: What happened?
2. Ad-Hoc Reports: How many, how often, when?
3. OLAP Reporting: What exactly is the problem?
4. Alerts: What actions are needed?
BUSINESS ANALYTICS
Business Analytics (BA), as defined by the International Institute of Analytics in 2010, is:
“The broad use of data and quantitative analysis to support the decision making process…”
Business Analytics combines BSI and BMI.
BMI: Business Modeling Intelligence
• Soft Systems
• System Dynamics
• Decision Analysis
• Simulation
• Queuing Theory
• Math Programming
BSI: Business Statistical Intelligence
• Descriptive Stats
• Inferential Stats
• Data Reduction
• Econometrics
• Time Series Analysis
• Sampling Techniques
Shared by BMI and BSI:
• Machine Learning
• Recommender Systems
• Forecasting
BUSINESS ANALYTICS
In order of increasing degree of intelligence:
1. Stats Analysis: What happened & why is this happening?
2. Forecasting: What if these trends continue?
3. Machine Learning: What will happen next?
4. Optimization: What’s the best that can happen?
STATISTICAL ANALYSIS
Statistics is the field of study that is concerned with the following activities:
• Collecting, organizing and summarizing data.
• Making inferences about a body of data when only a part of the data is observed.
• Interpreting and communicating the results of the first two activities.
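The first two activities can be illustrated with a minimal sketch. The sample values below are hypothetical, and the interval uses a normal approximation (z = 1.96), which is an assumption; for a sample this small a t-based interval would normally be preferred.

```python
import math
import statistics

# Hypothetical sample: units sold on 10 observed days (only part of the data).
sample = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]

# Activity 1: summarize the observed data.
mean = statistics.mean(sample)
stdev = statistics.stdev(sample)  # sample standard deviation

# Activity 2: infer something about the whole population from the sample.
# Approximate 95% confidence interval for the population mean
# (normal approximation, an assumption made for brevity).
standard_error = stdev / math.sqrt(len(sample))
z = 1.96
confidence_interval = (mean - z * standard_error, mean + z * standard_error)

print(f"mean={mean}, stdev={stdev:.2f}")
print(f"approx. 95% CI for the mean: "
      f"({confidence_interval[0]:.2f}, {confidence_interval[1]:.2f})")
```

The third activity, interpretation, is then a matter of stating what the interval does and does not say about the unobserved days.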
FORECASTING
Historical demand:
Date             Demand (Units)
January 1993     2554
February 1993    2890
March 1993       3240
…                …
October 1997     3390
November 1997    3212
December 1997    3019

Forecast (produced on 31/12/1997):
Date             Demand (Units)
January 1998     4480
February 1998    4670
…                …
November 1998    4789
December 1998    4760

[Chart: demand in units (axis 2000 to 6000) from Jan-93 to Jan-98; series: Actual, Forecast, UCLM, LCLM]
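As a rough illustration of what a forecasting method does, the sketch below applies a simple three-month moving average to the last three 1997 observations listed above. This is not the method behind the slide's forecast (the 1998 figures shown there presumably come from a richer model that captures trend or seasonality); the function name is hypothetical.

```python
def moving_average_forecast(history, window=3):
    """Forecast the next period as the mean of the last `window` observations."""
    if len(history) < window:
        raise ValueError("history is shorter than the window")
    recent = history[-window:]
    return sum(recent) / window


# October, November and December 1997 demand, taken from the table above.
last_quarter_1997 = [3390, 3212, 3019]

forecast_jan_1998 = moving_average_forecast(last_quarter_1997)
print(forecast_jan_1998)  # 3207.0
```

A moving average only smooths recent history, so it cannot anticipate a seasonal jump; that limitation is one reason more elaborate time series models are used in practice.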
MACHINE LEARNING: THE CASE OF CUSTOMER CHURN
Customer   Age   SMS (#)   Call (Min)   Internet (MB)   Churn
John       35    100       30           500             Yes
Sophie     18    200       60           300             No
Victor     38    50        120          400             No
Laura      44    25        80           600             Yes

Split conditions shown on the diagram: Call < 80, Internet < 300
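To make the idea of learning a rule from such a table concrete, here is a minimal sketch that searches for the single best threshold rule (a one-level decision tree, often called a decision stump) on the four customers above. It is an illustration only, not the model from the slide; with four rows any rule it finds is far too fragile for real use, and the helper name `best_stump` is hypothetical.

```python
# The four customers from the table above.
customers = [
    {"name": "John",   "Age": 35, "SMS": 100, "Call": 30,  "Internet": 500, "churn": True},
    {"name": "Sophie", "Age": 18, "SMS": 200, "Call": 60,  "Internet": 300, "churn": False},
    {"name": "Victor", "Age": 38, "SMS": 50,  "Call": 120, "Internet": 400, "churn": False},
    {"name": "Laura",  "Age": 44, "SMS": 25,  "Call": 80,  "Internet": 600, "churn": True},
]
FEATURES = ["Age", "SMS", "Call", "Internet"]


def best_stump(rows, features):
    """Return (accuracy, feature, threshold, churn_if_above) for the best single split."""
    best = None
    for feature in features:
        values = sorted({row[feature] for row in rows})
        # Candidate thresholds: midpoints between consecutive distinct values.
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            for churn_if_above in (True, False):
                correct = 0
                for row in rows:
                    predicted = (row[feature] > threshold) == churn_if_above
                    correct += predicted == row["churn"]
                accuracy = correct / len(rows)
                if best is None or accuracy > best[0]:
                    best = (accuracy, feature, threshold, churn_if_above)
    return best


accuracy, feature, threshold, churn_if_above = best_stump(customers, FEATURES)
direction = ">" if churn_if_above else "<="
print(f"Predict churn when {feature} {direction} {threshold} (accuracy {accuracy:.0%})")
```

On this toy table the search settles on the Internet usage column, because it is the only feature that separates the two churners from the two non-churners with one threshold.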
FORECASTING VS MACHINE LEARNING
e-Commerce
• Forecasting: Forecast the demand for each SKU for the next six months.
• Machine Learning: Predict which customers are likely to respond to a promotion.
Telco
• Forecasting: Forecast how many new customers will sign a contract in the next quarter.
• Machine Learning: Predict which customers are likely to churn (change provider).
OPTIMIZATION EXAMPLE: THE CASE OF INVENTORY
[Chart: inventory level over time (0, t1, t2), with the order level marked as X Units and the Safety Stock level marked below it]
When to order?
How much inventory should we order?
What should be the level of the safety stock?
OPTIMIZATION EXAMPLE: THE CASE OF INVENTORY (2)
Decision Variables
When to order?
How much should we order?
How much should the safety stock be?
Objectives
• Minimize inventory cost
• Minimize customer dissatisfaction (minimize stock-outs, minimize lost customers)
Inventory Total Cost
• Holding Cost, e.g. Insurance, Security, Obsolescence, Rent
• Reorder Cost, e.g. Transportation, Order, Inspection, Communication
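One common way to trade off the two cost components above is the classic Economic Order Quantity (EOQ) model. The sketch below is an illustration under that model's assumptions (constant demand, fixed cost per order, holding cost proportional to average stock); it is not necessarily the approach intended on the slide, and all numbers and function names are hypothetical.

```python
import math


def total_cost(order_qty, annual_demand, reorder_cost, holding_cost):
    """Annual inventory cost = holding cost on average stock + cost of placing orders."""
    holding = holding_cost * order_qty / 2            # average stock is Q / 2
    ordering = reorder_cost * annual_demand / order_qty  # D / Q orders per year
    return holding + ordering


def economic_order_quantity(annual_demand, reorder_cost, holding_cost):
    """Order size that minimizes total_cost under the EOQ assumptions."""
    return math.sqrt(2 * annual_demand * reorder_cost / holding_cost)


def reorder_point(daily_demand, lead_time_days, safety_stock):
    """Stock level at which a new order should be placed."""
    return daily_demand * lead_time_days + safety_stock


# Hypothetical inputs: 1200 units/year, 50 per order, 3 per unit per year.
eoq = economic_order_quantity(annual_demand=1200, reorder_cost=50, holding_cost=3)
cost_at_eoq = total_cost(eoq, annual_demand=1200, reorder_cost=50, holding_cost=3)
print(eoq, cost_at_eoq)  # 200.0 600.0

# Hypothetical inputs: 4 units/day, 5-day lead time, 10 units of safety stock.
print(reorder_point(daily_demand=4, lead_time_days=5, safety_stock=10))  # 30
```

The order quantity answers "how much", the reorder point answers "when", and the safety stock is an input here; choosing it well requires a model of demand uncertainty, which this sketch leaves out.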
SOCIAL NETWORK ANALYTICS
Social Network Definition
It can be any set of nodes connected by edges in a particular business setting.
Examples of social networks:
• Telephone calls between customers of a telco provider.
• E-mail traffic between people.
• Spread of illness between patients.
• Research papers connected by citations.
Social Network Analytics (SNA)
SNA comprises a variety of mathematical and statistical metrics that are derived from the data
of a social network and can provide insights and uncover useful information.
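As one example of such a metric, the sketch below computes degree centrality (the share of other nodes each node is directly connected to) for a small call network. The customer names and edges are hypothetical, and degree centrality is only one of many SNA metrics.

```python
from collections import defaultdict

# Hypothetical telephone calls between customers; each pair is an undirected edge.
calls = [
    ("Ann", "Bob"),
    ("Ann", "Cleo"),
    ("Ann", "Dan"),
    ("Bob", "Cleo"),
    ("Dan", "Eve"),
]


def degree_centrality(edges):
    """Map each node to (number of distinct neighbours) / (number of other nodes)."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    others = len(neighbours) - 1
    if others <= 0:
        return {node: 0.0 for node in neighbours}
    return {node: len(adjacent) / others for node, adjacent in neighbours.items()}


centrality = degree_centrality(calls)
for node, score in sorted(centrality.items(), key=lambda item: -item[1]):
    print(f"{node}: {score:.2f}")
```

In a telco setting, a customer with high centrality might be prioritized in retention campaigns, on the assumption that their behaviour influences their contacts.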
TEXT ANALYTICS
Text Analytics or Text Mining
It is a set of quantitative techniques that can help organizations derive
potentially valuable business insights from text-based content (e.g. Word
documents, emails, postings on social media etc.).
This can be achieved by transforming textual (unstructured) data to
structured data with the objective to identify patterns and associations
between words and phrases.
Text analytics software can help by mapping text into numeric
representations which can then be linked with structured data in a database
and analyzed with traditional data mining techniques.
Fields contributing to Text Analytics:
• Linguistics
• Computer Science
• Machine Learning
• Natural Language Processing
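The mapping of text into numeric representations can be sketched with a simple bag-of-words model, where each document becomes a vector of word counts. This is a minimal illustration; the example sentences are hypothetical, and real tools add steps such as stop-word removal, stemming or weighting.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def bag_of_words(documents):
    """Return (vocabulary, vectors): one count vector per document."""
    tokenized = [tokenize(document) for document in documents]
    vocabulary = sorted({token for tokens in tokenized for token in tokens})
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append([counts[term] for term in vocabulary])
    return vocabulary, vectors


# Hypothetical customer comments.
documents = ["Great service, great price", "Poor service"]

vocabulary, vectors = bag_of_words(documents)
print(vocabulary)  # ['great', 'poor', 'price', 'service']
print(vectors)     # [[2, 0, 1, 1], [0, 1, 0, 1]]
```

Once text is in this form, each row can be joined to structured customer data and fed to the same techniques used for numeric columns.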
DEFINITION OF RECOMMENDER SYSTEMS
In a general way, recommender systems are algorithms aimed at suggesting relevant items to users
(items being movies to watch, text to read, products to buy or anything else depending on industries).
NETFLIX EXAMPLE
User       Rambo II   Rocky IV   Harry Potter   Lord of the Rings   Game of Thrones
John       5          5          ?              ?                   1
Sophie     4          1          1              1                   ?
Victor     5          4          1              1                   ?
Laura      ?          ?          4              4                   4
Patrick    ?          ?          5              5                   5
Clarisse   1          1          ?              4                   ?
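One way to fill in a "?" is user-based collaborative filtering: predict a user's rating as a weighted average of other users' ratings, weighting each by how similarly they rated the movies both have seen. The sketch below is an illustration only, not Netflix's actual algorithm; the similarity measure (1 / (1 + mean absolute rating difference)) is a simple choice made here as an assumption.

```python
# Ratings from the table above; missing ratings ("?") are simply omitted.
ratings = {
    "John":     {"Rambo II": 5, "Rocky IV": 5, "Game of Thrones": 1},
    "Sophie":   {"Rambo II": 4, "Rocky IV": 1, "Harry Potter": 1, "Lord of the Rings": 1},
    "Victor":   {"Rambo II": 5, "Rocky IV": 4, "Harry Potter": 1, "Lord of the Rings": 1},
    "Laura":    {"Harry Potter": 4, "Lord of the Rings": 4, "Game of Thrones": 4},
    "Patrick":  {"Harry Potter": 5, "Lord of the Rings": 5, "Game of Thrones": 5},
    "Clarisse": {"Rambo II": 1, "Rocky IV": 1, "Lord of the Rings": 4},
}


def similarity(a, b):
    """Similarity in (0, 1] based on the movies both users rated; 0.0 if none."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    mean_abs_diff = sum(abs(a[movie] - b[movie]) for movie in shared) / len(shared)
    return 1 / (1 + mean_abs_diff)


def predict(user, movie, all_ratings):
    """Similarity-weighted average of other users' ratings for `movie`."""
    weighted_sum = 0.0
    weight_total = 0.0
    for other, other_ratings in all_ratings.items():
        if other == user or movie not in other_ratings:
            continue
        weight = similarity(all_ratings[user], other_ratings)
        weighted_sum += weight * other_ratings[movie]
        weight_total += weight
    return weighted_sum / weight_total if weight_total else None


john_harry_potter = predict("John", "Harry Potter", ratings)
print(f"Predicted rating, John / Harry Potter: {john_harry_potter:.2f}")
```

Because John's ratings sit closest to Victor's, and Victor rated Harry Potter low, the prediction comes out on the low side of the scale.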
COMPUTER VISION
Computer Vision, often abbreviated as CV, is defined as a field
of study that seeks to develop techniques to help computers
“see” and understand the content of digital images
such as photographs and videos.
Applications of Computer Vision
• Face Recognition
• Number Plate Recognition
• Autonomous Driving
• Cancer Detection (X-Rays)
• Tumor Detection
• Mask Detection
• Theft Detection
• Social Distance
• Waiting Time Analytics
• Customer Tracking
• Cancer Cell Classification
ARTIFICIAL INTELLIGENCE
Artificial Intelligence is the field of science concerned with enabling machines to think
and make decisions as human beings do, mainly by learning from huge amounts of past data. Analytics is at the heart
of AI, which comprises Machine Learning techniques, Natural Language Processing and Evolutionary Algorithms.
[Diagram labels: AI, ML, NLP, DL, EA]
THE ANALYTICS PROCESS MODEL
1. Identify Business Problem
2. Identify Data Sources
3. Select the Data (Preprocessing)
4. Clean the Data (Preprocessing)
5. Transform the Data (Preprocessing)
6. Analyze the Data (Analytics)
7. Interpret, Evaluate and Deploy the Model (Post-Processing)
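The select, clean, transform and analyze steps can be sketched as a chain of small functions. The records, field names and the specific choices made below (dropping missing values, min-max scaling, taking a mean) are hypothetical stand-ins for whatever a real project would require.

```python
# Hypothetical raw records from a data source.
raw_records = [
    {"customer": "A", "spend": 120.0},
    {"customer": "B", "spend": None},   # missing value
    {"customer": "C", "spend": 80.0},
    {"customer": "D", "spend": 200.0},
]


def select(records, field):
    """Select the Data: keep only the field needed for the analysis."""
    return [record[field] for record in records]


def clean(values):
    """Clean the Data: drop missing values."""
    return [value for value in values if value is not None]


def transform(values):
    """Transform the Data: min-max scale values to the range [0, 1]."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(value - low) / (high - low) for value in values]


def analyze(values):
    """Analyze the Data: here, simply the mean of the prepared values."""
    return sum(values) / len(values)


selected = select(raw_records, "spend")
cleaned = clean(selected)
scaled = transform(cleaned)
result = analyze(scaled)
print(cleaned, scaled, round(result, 3))
```

Keeping each step as its own function makes it easier to revisit an earlier step when evaluation in the post-processing phase shows a problem.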