Machine Learning
Louis Fippo Fitime
October 20, 2020
Introduction
- Age of automation and intelligent systems
- Enormous potential across industries and enterprises
- Intelligent systems and data-driven organizations are
  becoming a reality
- Advancements in tools and techniques are only helping it
  expand further
Data is the new oil and Machine Learning is a powerful
concept and framework for making the best out of it.
Aim of this course (1)
The core idea is to give you enough background on:
- why we need Machine Learning,
- the fundamental building blocks of Machine Learning,
- what Machine Learning offers us presently.
This will enable you to learn how best to leverage
Machine Learning to get the maximum from your data.
Aim of this course (2)
- Understand formal definitions,
- Concepts and foundations with regard to learning algorithms,
- Data management,
- Model building, evaluation, and deployment.
Aim of this course (3)
Practical aspect of the course
- Specific use cases,
- Specific problems,
- Real-world cases.
ML Basics
The need for Machine Learning
Why make machines learn?
- Lack of sufficient human expertise in a domain,
- Scenarios and behavior can keep changing over time,
- Humans have sufficient expertise in the domain but it is
  extremely difficult to formally explain or translate this
  expertise into computational tasks,
- Addressing domain-specific problems at scale with huge
  volumes of data.
Traditional Programming Paradigm
Figure 1: Traditional Programming Paradigm
Why Machine Learning?
Figure 2: ML Programming Paradigm
Solving a Machine Learning problem
- Leverage device data and logs,
- Decide key data attributes that could be useful for building a
  model,
- Observe and capture device attributes and their behavior over
  various time periods,
- Feed these input and output pairs to a suitable Machine
  Learning algorithm to build a model,
- Deploy this model so that, for new values of device
  attributes, it can predict whether a specific device is behaving
  normally or might cause a potential outage.
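The steps above can be sketched in a few lines of plain Python. This is only an illustration: the attribute names (CPU load, temperature), the readings, and the nearest-centroid rule are assumptions made for the example, not the data or algorithm used in the course.

```python
from math import dist

# Hypothetical historical observations: (cpu_load %, temperature C) -> label.
history = [
    ((35.0, 48.0), "normal"),
    ((42.0, 52.0), "normal"),
    ((38.0, 50.0), "normal"),
    ((91.0, 83.0), "outage"),
    ((88.0, 79.0), "outage"),
    ((95.0, 86.0), "outage"),
]

def train(samples):
    """Learn one centroid (mean attribute vector) per label."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(v / counts[label] for v in acc)
            for label, acc in sums.items()}

def predict(model, features):
    """Return the label whose centroid is closest to the new reading."""
    return min(model, key=lambda label: dist(model[label], features))

model = train(history)                  # build the model from input/output pairs
print(predict(model, (40.0, 51.0)))     # a reading close to the normal group
print(predict(model, (90.0, 84.0)))     # a reading close to the outage group
```

The "deployment" step here is simply calling `predict` on attribute values the model has not seen before.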
General definition
The need for Machine Learning:
- Making data-driven decisions
- Efficiency and scale
"Machine Learning is the field of study that gives computers the
ability to learn without being explicitly programmed."
(Arthur Samuel, 1959)
Formal definition
"A computer program is said to learn from experience E with
respect to some class of tasks T and performance measure P, if its
performance at tasks in T, as measured by P, improves with
experience E." (Tom Mitchell, 1997)
Figure 3: ML Venn diagram
Task, Experience & Performance
A task, T, can usually be defined as a Machine Learning task
based on the process or workflow that the system should follow to
operate on data points or samples.
- Classification or categorization
- Regression
- Anomaly detection
- Structured annotation
- Translation
- Clustering or grouping
- Transcription
Task, Experience & Performance
The process of consuming a dataset of data samples
or data points, such that a learning algorithm or model learns
inherent patterns, is defined as the experience, E, which is gained
by the learning algorithm.
The performance, P, is usually a quantitative measure or metric
that’s used to see how well the algorithm or model is performing
the task, T, with experience, E.
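A small sketch can make T, E and P concrete. Everything in it is an assumption chosen for illustration: the task (deciding whether a reading is "high"), the hidden rule `x >= 50`, the midpoint-threshold learner, and the data.

```python
def learn_threshold(examples):
    """Experience E: place the decision threshold midway between the
    largest negative and the smallest positive example seen so far."""
    largest_negative = max(x for x, label in examples if label == 0)
    smallest_positive = min(x for x, label in examples if label == 1)
    return (largest_negative + smallest_positive) / 2

def accuracy(threshold, test_set):
    """Performance P: fraction of test points classified correctly."""
    correct = sum((x >= threshold) == bool(label) for x, label in test_set)
    return correct / len(test_set)

# Task T: decide whether a reading is "high" (1) or "low" (0).
# The true rule (x >= 50) is hidden from the learner.
training_stream = [(5, 0), (60, 1), (20, 0), (55, 1), (40, 0), (52, 1)]
test_set = [(x, int(x >= 50)) for x in range(0, 100, 5)]

scores = []
for n in (2, 4, 6):
    threshold = learn_threshold(training_stream[:n])
    scores.append(accuracy(threshold, test_set))
    print(f"{n} examples: threshold {threshold}, accuracy {scores[-1]:.2f}")
```

On this hand-picked data the accuracy rises as the learner sees more examples, which is what "learning" means in the formal definition. Real data gives no such guarantee for every additional sample.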
Machine Learning: a true multi-disciplinary field
Figure 4: Discipline Venn diagram
Artificial Intelligence
The art, science and engineering of making intelligent agents,
machines and programs.
Figure 5: Diverse major facets under the AI umbrella
Deep Learning
A Deep Learning based approach tries to build machine intelligence
by representing data as a layered hierarchy of concepts, where each
layer of concepts is built from other, simpler layers.
Figure 6: Performance comparison of Deep Learning and traditional
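The idea of a layered hierarchy of concepts can be illustrated with a tiny two-layer network that computes XOR. This is a sketch under simplifying assumptions: the weights are set by hand rather than learned, and a step activation is used for readability.

```python
def step(z):
    """Threshold activation: fire (1) when the weighted sum is positive."""
    return 1 if z > 0 else 0

def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum followed by the activation."""
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)

def xor_network(x1, x2):
    # Layer 1: two simple concepts built directly from the raw inputs.
    h_or = neuron((x1, x2), (1, 1), -0.5)        # x1 OR x2
    h_nand = neuron((x1, x2), (-1, -1), 1.5)     # NOT (x1 AND x2)
    # Layer 2: a more complex concept built from the layer-1 concepts.
    return neuron((h_or, h_nand), (1, 1), -1.5)  # h_or AND h_nand, i.e. XOR

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", xor_network(a, b))
```

No single neuron of this kind can compute XOR, but combining two simpler concepts in a second layer can. In practice the weights would be learned, typically with backpropagation.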
Important Concepts of DL
- Artificial Neural Networks
- Backpropagation
- Multilayer Perceptrons
- Convolutional Neural Networks
- Recurrent Neural Networks
- Long Short-Term Memory Networks
- Autoencoders
Artificial Neural Networks illustration
Figure 7: A typical artificial neural network
Machine Learning Methods
1. Methods based on the amount of human supervision in
   the learning process
   - Supervised learning
   - Unsupervised learning
   - Semi-supervised learning
   - Reinforcement learning
2. Methods based on the ability to learn from incremental
   data samples
   - Batch learning
   - Online learning
3. Methods based on their approach to generalization from
   data samples
   - Instance-based learning
   - Model-based learning
Supervised Learning
Supervised learning methods take in data samples (known as
training data) together with the output associated with each
sample (known as labels or responses) during the model training
process. The main objective is to learn a mapping or association
between input data samples x and their corresponding outputs y
based on multiple training data instances.
Main supervised learning methods:
- Classification
- Regression
Supervised Learning : Classification
Classification tasks predict output labels, classes or responses
that are categorical in nature for input data, based on what the
model has learned in the training phase.
Figure 8: Illustration of Classification
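As a minimal sketch of classification, the following k-nearest-neighbours classifier predicts a categorical label for a new sample by majority vote among the closest training samples. The features (height, weight), the labels and the choice of k-NN are assumptions made for illustration.

```python
from collections import Counter
from math import dist

# Hypothetical labelled training data: (height cm, weight kg) -> species.
training_data = [
    ((25.0, 4.0), "cat"), ((23.0, 3.5), "cat"), ((27.0, 5.0), "cat"),
    ((60.0, 25.0), "dog"), ((55.0, 20.0), "dog"), ((65.0, 30.0), "dog"),
]

def knn_classify(sample, data, k=3):
    """Predict the majority label among the k nearest training samples."""
    nearest = sorted(data, key=lambda item: dist(item[0], sample))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

print(knn_classify((26.0, 4.5), training_data))
print(knn_classify((58.0, 24.0), training_data))
```

k-NN keeps the training samples themselves instead of fitting parameters, so it is also an example of instance-based learning.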
Supervised Learning : Regression
Machine Learning tasks where the main objective is value
estimation can be termed as regression tasks.
Figure 9: Supervised learning: regression models for house price prediction
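A minimal sketch of regression for house price estimation, using ordinary least squares with a single feature. The areas and prices are made-up numbers that happen to lie exactly on a line. Real data would be noisy, and real models usually use many features.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical training data: living area (m^2) -> price (thousands).
areas = [50.0, 80.0, 100.0, 120.0, 150.0]
prices = [120.0, 180.0, 220.0, 260.0, 320.0]

slope, intercept = fit_line(areas, prices)
predicted = slope * 90.0 + intercept   # estimate the price of a 90 m^2 house
print(round(slope, 3), round(intercept, 3), round(predicted, 3))
```

Unlike classification, the output is a continuous value, not a category.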
Unsupervised Learning
In unsupervised learning, the model or algorithm tries to learn
inherent latent structures, patterns and relationships from the
given data without any help or supervision. Unsupervised learning
methods can be categorized under the following broad areas of ML
tasks:
- Clustering
- Dimensionality reduction
- Anomaly detection
- Association rule-mining
Unsupervised Learning : Clustering
Clustering methods are Machine Learning methods that try to find
patterns of similarity and relationships among data samples in our
dataset and then cluster these samples into various groups, such
that each group or cluster of data samples has some similarity,
based on the inherent attributes or features.
Figure 10: Unsupervised learning: clustering log messages
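One common clustering method is k-means. The sketch below runs Lloyd's algorithm on one-dimensional values. The data, the number of clusters and the starting centroids are assumptions chosen so that the example is deterministic.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Lloyd's algorithm in one dimension with caller-supplied start centroids."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for v in values:
            # Assignment step: attach each value to its nearest centroid.
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Hypothetical unlabelled measurements, e.g. response times in seconds.
values = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9, 9.0, 9.2, 8.8]
centroids, clusters = kmeans_1d(values, centroids=[0.0, 5.0, 10.0])
print(clusters)
```

No labels are given to the algorithm. The groups emerge only from the similarity (here, numeric closeness) of the samples.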
Unsupervised Learning : Dimensionality Reduction
These methods reduce the number of feature variables by
extracting or selecting a set of principal or representative features.
Figure 11: Unsupervised learning: dimensionality reduction
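A very simple form of dimensionality reduction is feature selection by variance: features that barely change across samples carry little information and can be dropped. The sketch below is an illustration of selection only, on made-up data. Feature extraction methods such as principal component analysis work differently and are not shown here.

```python
from statistics import pvariance

# Hypothetical dataset: each row is a sample, each column a feature.
rows = [
    [1.0, 200.0, 0.5, 7.0],
    [1.0, 180.0, 0.6, 3.0],
    [1.0, 220.0, 0.5, 9.0],
    [1.0, 210.0, 0.4, 1.0],
]

def select_by_variance(data, k):
    """Keep the k columns with the largest variance (a simple feature
    selection heuristic; it ignores differences in feature scale)."""
    columns = list(zip(*data))
    ranked = sorted(range(len(columns)),
                    key=lambda i: pvariance(columns[i]), reverse=True)
    keep = sorted(ranked[:k])
    return keep, [[row[i] for i in keep] for row in data]

kept, reduced = select_by_variance(rows, k=2)
print(kept)
print(reduced)
```

Because variance depends on the unit of each feature, features are normally scaled before a heuristic like this is applied.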
Unsupervised Learning : Anomaly Detection
Here we are interested in finding occurrences of rare events or
observations that do not normally occur, judged against
historical data samples.
Figure 12: Unsupervised learning: anomaly detection
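A minimal sketch of anomaly detection based on historical samples: a new observation is flagged when it lies far from the historical mean, measured in standard deviations. The data and the threshold of three standard deviations are assumptions for illustration.

```python
from statistics import mean, pstdev

def find_anomalies(history, new_values, z_threshold=3.0):
    """Flag new values lying more than z_threshold standard deviations
    from the mean of the historical samples."""
    mu, sigma = mean(history), pstdev(history)
    return [v for v in new_values if abs(v - mu) > z_threshold * sigma]

# Hypothetical daily transaction amounts observed in the past.
history = [100, 102, 98, 101, 99, 100, 103, 97]
print(find_anomalies(history, [101, 250, 96, 20]))
```

This rule assumes that normal behavior is roughly symmetric around a single typical value. Other methods are needed when that does not hold.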
Unsupervised Learning : Association Rule-Mining
Typically, association rule-mining is a data mining method used to
examine and analyze large transactional datasets to find patterns
and rules of interest.
Figure 13: Unsupervised learning: association rule-mining
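Two standard measures used to judge a candidate rule are support and confidence. The sketch below computes them for a rule of the form "bread implies butter" on a handful of made-up transactions. The item names and baskets are illustrative assumptions.

```python
# Hypothetical market-basket transactions.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"butter", "milk"},
    {"bread", "butter", "jam"},
]

def support(itemset, baskets):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def confidence(antecedent, consequent, baskets):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    return (support(antecedent | consequent, baskets)
            / support(antecedent, baskets))

rule_support = support({"bread", "butter"}, transactions)
rule_confidence = confidence({"bread"}, {"butter"}, transactions)
print(round(rule_support, 2), round(rule_confidence, 2))
```

Rule-mining algorithms search over many candidate itemsets and keep the rules whose support and confidence exceed chosen minimums.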
Reinforcement Learning
We have an agent that we want to train over a period of time to
interact with a specific environment and improve its performance
over a period of time with regard to the type of actions it performs
on the environment.
1. Prepare agent with set of initial policies and strategy
2. Observe environment and current state
3. Select optimal policy and perform action
4. Get corresponding reward (or penalty)
5. Update policies if needed
6. Repeat Steps 2 - 5 iteratively until the agent learns the
optimal policies
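The loop above can be sketched with tabular Q-learning, one common reinforcement learning algorithm. The environment (a five-cell corridor with a reward at the right end), the hyperparameters and the epsilon-greedy action choice are all assumptions made for this illustration.

```python
import random

N_STATES = 5            # corridor cells 0..4; cell 4 is the goal
ACTIONS = ("left", "right")
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2

def step(state, action):
    """Environment: move one cell, clamp at the left wall, reward 1 at the goal."""
    nxt = max(0, state - 1) if action == "left" else state + 1
    return nxt, (1.0 if nxt == N_STATES - 1 else 0.0)

def choose_action(q, state, rng):
    """Epsilon-greedy: usually exploit the best-known action, sometimes explore."""
    if rng.random() < EPSILON:
        return rng.choice(ACTIONS)
    best = max(q[state].values())
    return rng.choice([a for a in ACTIONS if q[state][a] == best])

def train(episodes=500, seed=0):
    rng = random.Random(seed)
    # Step 1: initial policy, here an all-zero action-value table.
    q = [{a: 0.0 for a in ACTIONS} for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(1000):                        # safety cap on episode length
            action = choose_action(q, state, rng)    # steps 2-3: observe, act
            nxt, reward = step(state, action)        # step 4: reward
            target = reward + GAMMA * max(q[nxt].values())
            q[state][action] += ALPHA * (target - q[state][action])  # step 5
            state = nxt
            if state == N_STATES - 1:
                break
    return q

q_table = train()
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

After training, the greedy policy read from the table should move right in every non-goal cell, since that is the shortest way to the reward.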
Figure 14: Reinforcement learning: training a robot to play chess
Machine Learning overview
Figure 15: Machine learning overview
Machine Learning pipeline
Figure 16: Machine learning pipeline
Machine Learning Challenges (1)
- Data quality issues lead to problems,
- Data acquisition, extraction, and retrieval are extremely
  tedious and time-consuming processes,
- Lack of good-quality and sufficient training data in many
  scenarios,
- Formulating business problems clearly with well-defined goals
  and objectives,
Machine Learning Challenges (2)
- Feature extraction and engineering, especially hand-crafting
  features,
- Overfitting or underfitting models,
- The curse of dimensionality: too many features can be a real
  hindrance,
- Complex models can be difficult to deploy in the real world.
Real-World Applications of Machine Learning (1)
- Product recommendations in online shopping platforms,
- Sentiment and emotion analysis,
- Anomaly detection,
- Fraud detection and prevention,
- Content recommendation (news, music, movies, and so on)
Real-World Applications of Machine Learning (2)
- Weather forecasting
- Stock market forecasting
- Market basket analysis
- Customer segmentation
- Object and scene recognition in images and video
Real-World Applications of Machine Learning (3)
- Speech recognition
- Churn analytics
- Click-through predictions
- Failure/defect detection and prevention
- E-mail spam filtering
Python ML Ecosystem