Machine Learning 2 deep Learning: An Intro

Machine Learning to Deep Learning: An Introduction
S. I. Krishan
Integrated Knowledge Solutions
www.iksinc.online/home

Agenda
• What is machine learning?
• Machine learning terminology
• Overview of machine learning methods
• Machine learning to deep learning
• Summary and Q & A
iksinc@yahoo.com

What is Machine Learning?

What is Machine Learning?
• Machine learning deals with making computers learn to make
predictions or decisions without being explicitly programmed
for the task. Instead, the system is shown a large number of
examples of the underlying task and learns by optimizing a
performance criterion over those examples.

Why Machine Learning?

Buzz about Machine Learning
"Every company is now a data company,
capable of using machine learning in the cloud
to deploy intelligent apps at scale, thanks to
three machine learning trends: data flywheels,
the algorithm economy, and cloud-hosted
intelligence."
Three factors are making machine learning hot: cheap data,
the algorithm economy, and cloud-based solutions.

Data is Getting Cheaper
For example, Tesla has 780 million miles of driving
data, and adds another million every 10 hours.

Algorithm Economy

Algorithm Economy Players in ML

Cloud-Based Intelligence
Emerging machine intelligence platforms hosting pre-trained
machine learning models-as-a-service are making it easy for
companies to get started with ML, allowing them to rapidly take
their applications from prototype to production.
Many open source machine learning and deep learning frameworks
running in the cloud allow easy leveraging of pre-trained, hosted
models to tag images, recommend products, and do general natural
language processing tasks.

Apps for Excel

Machine Learning Terminology

Feature Vectors in ML
• A machine learning system builds models using properties of objects being
modeled. These properties are called features or attributes and the process of
measuring/obtaining such properties is called feature extraction. It is common to
represent the properties of objects as feature vectors.
Example features: sepal width, sepal length, petal width, petal length
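As a minimal sketch of the idea, the four flower measurements named on this slide can be packed into a feature vector. The function name `to_feature_vector`, the fixed feature order, and the measurement values are illustrative assumptions, not part of the original slides.

```python
# Fixed feature order (an illustrative choice); every flower must use the same order.
FEATURE_NAMES = ["sepal_width", "sepal_length", "petal_width", "petal_length"]


def to_feature_vector(measurements):
    """Feature extraction: turn named measurements into an ordered list of floats."""
    return [float(measurements[name]) for name in FEATURE_NAMES]


# Hypothetical measurements (in centimetres) for one flower.
flower = {
    "sepal_length": 5.1,
    "sepal_width": 3.5,
    "petal_length": 1.4,
    "petal_width": 0.2,
}

feature_vector = to_feature_vector(flower)
print(feature_vector)  # [3.5, 5.1, 0.2, 1.4]
```

Keeping the order fixed matters because a model only sees positions in the vector, not the feature names.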

Learning Styles
• Supervised Learning
– Training data comes with answers, called labels
– The goal is to produce labels for new data

Supervised Learning Models
• Classification models
– Predict customer churn
– Tag objects in a given image
– Determine whether an incoming email is spam or not

Supervised Learning Models
• Regression models
– Predict credit card balance of customers
– Predict the number of 'likes' for a posting
– Predict peak load for a utility given weather information
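To make the regression idea concrete, here is a rough sketch that fits a straight line by ordinary least squares and uses it to predict a numeric value. The temperature and load numbers are made up for illustration, and `fit_line` is a hypothetical helper name; a real peak-load model would use many more features and examples.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one input feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data: temperature (input) and peak load (label).
temperatures = [20.0, 25.0, 30.0, 35.0]
peak_loads = [50.0, 60.0, 70.0, 80.0]

slope, intercept = fit_line(temperatures, peak_loads)

# Predict the label for a new, unseen temperature.
predicted_load = slope * 40.0 + intercept
print(slope, intercept, predicted_load)
```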

Learning Styles
• Unsupervised Learning
– Training data comes without labels
– The goal is to group data into different categories based on similarities
Grouped Data

Unsupervised Learning Models
• Segment/cluster customers into different groups
• Organize a collection of documents based on their content
• Make product recommendations

Learning Styles
• Reinforcement Learning
– Training data comes without labels
– The learning system receives feedback from its operating
environment to know how well it is doing
– The goal is to improve performance over time using that feedback

Reinforcement Learning

Overview of Machine Learning Methods

Walk Through an Example: Flower Classification
• Build a classification model to differentiate between
two classes of flowers

How Do We Go About It?
• Collect a large number of both types of flowers with the help
of an expert
• Measure some attributes that can help differentiate between
the two types of flowers. Let those attributes be petal area
and sepal area.

Scatter plot of 100 examples of flowers

We can separate the flower
types using the linear boundary
shown next. The parameters of
the line represent the learned
classification model.
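As a sketch of what "the parameters of the line" means in practice, a linear classifier only needs one weight per feature plus a bias term. The weights, the class names "type A" and "type B", and the sample measurements below are hypothetical; in a real system the parameters would be learned from the collected flowers.

```python
def predict(petal_area, sepal_area, w_petal, w_sepal, bias):
    """Classify a flower by which side of the line w_petal*x + w_sepal*y + bias = 0 it falls on."""
    score = w_petal * petal_area + w_sepal * sepal_area + bias
    return "type A" if score >= 0 else "type B"


# Hypothetical learned parameters; these three numbers are the whole model.
model = {"w_petal": 1.0, "w_sepal": -0.5, "bias": -2.0}

print(predict(6.0, 4.0, **model))  # score = 6.0 - 2.0 - 2.0 = 2.0, so "type A"
print(predict(1.0, 5.0, **model))  # score = 1.0 - 2.5 - 2.0 = -3.5, so "type B"
```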

Another possible boundary. This
boundary cannot be expressed
via a single equation. However, a tree
structure can be used to express
it. Note that this boundary
predicts the collected data
better than the linear one.

Yet another possible boundary.
This boundary classifies the
collected data without any error.
Is this a better boundary?

Model Complexity
• There are tradeoffs between the complexity of models and
their performance in the field. A good design (model choice)
weighs these tradeoffs.
• A good design should avoid overfitting. How?
– Divide the entire data into three sets
• Training set (about 70% of the total data). Use this set to build the model.
• Test set (about 20% of the total data). Use this set to estimate how
accurate the model will be after deployment.
• Validation set (remaining 10% of the total data). Use this set to determine
the appropriate settings for the free parameters (hyperparameters) of the
model. May not be required in some cases.
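The 70/20/10 split described above can be sketched as follows. The function name `split_dataset`, its parameters, and the fixed random seed are assumptions made for illustration; libraries such as scikit-learn offer their own splitting utilities.

```python
import random


def split_dataset(examples, train_frac=0.7, test_frac=0.2, seed=0):
    """Shuffle the examples, then cut them into training, test, and validation sets."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # shuffle so each set is a random sample
    n_train = round(len(shuffled) * train_frac)
    n_test = round(len(shuffled) * test_frac)
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]  # whatever remains (about 10%)
    return train, test, validation


# 100 stand-in examples, identified here only by their index.
train, test, validation = split_dataset(range(100))
print(len(train), len(test), len(validation))  # 70 20 10
```

Shuffling before cutting guards against any ordering in the collected data leaking into one of the sets.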

Measuring Model Performance
• True Positive: Correctly identified as relevant
• True Negative: Correctly identified as not relevant
• False Positive: Incorrectly labeled as relevant
• False Negative: Incorrectly labeled as not relevant
Image: Cat vs. No Cat examples showing a true positive, a true negative,
a false negative, and a false positive

Precision, Recall, and Accuracy
• Precision
– Percentage of predicted positive labels that are correct
– Precision = (# true positives) / (# true positives + # false positives)
• Recall
– Percentage of actual positive examples that are correctly labeled
– Recall = (# true positives) / (# true positives + # false negatives)
• Accuracy
– Percentage of all labels that are correct
– Accuracy = (# true positives + # true negatives) / (# of samples)
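The three formulas above can be computed directly from the four confusion counts. This is a small sketch with made-up labels (1 = positive, 0 = negative); the helper names are hypothetical, and returning 0.0 when a denominator is zero is a convention chosen here, not something the slides specify.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, false positives, true negatives, and false negatives."""
    tp = fp = tn = fn = 0
    for truth, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def precision_recall_accuracy(y_true, y_pred, positive=1):
    """Apply the slide's formulas; undefined ratios are reported as 0.0 (an assumption)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(y_true) if len(y_true) else 0.0
    return precision, recall, accuracy


# Hypothetical labels: 4 actual positives, 6 actual negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

precision, recall, accuracy = precision_recall_accuracy(y_true, y_pred)
print(precision, recall, accuracy)  # 2/3, 0.5, 0.7
```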

Sum-of-Squares Error for Regression Models
For a regression model, the error is measured by squaring the
difference between the predicted output value and the target value for each
training (or test) example and summing these squares over all examples.
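Written out, the sum-of-squares error is E = sum over all examples i of (target_i - prediction_i)^2. A minimal sketch of that computation follows; the function name and the sample numbers are illustrative assumptions.

```python
def sum_of_squares_error(targets, predictions):
    """Sum of squared differences between target and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions))


# Hypothetical target values and model outputs for three examples.
targets = [3.0, 5.0, 2.0]
predictions = [2.5, 5.0, 4.0]

error = sum_of_squares_error(targets, predictions)
print(error)  # 0.25 + 0.0 + 4.0 = 4.25
```

Squaring makes every difference count as positive and penalizes large misses more heavily than small ones.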

Bias and Variance
• Bias: expected difference between the model’s
prediction and the truth
• Variance: how much the learned model changes
from one training set to another
• Model Scenarios
– High Bias: Model makes inaccurate predictions on training
data
– High Variance: Model does not generalize to new datasets
– Low Bias: Model makes accurate predictions on training
data
– Low Variance: Model generalizes to new datasets

The Guiding Principle for Model Selection: Occam’s Razor

Model Building Algorithms
• Supervised learning algorithms
– Linear methods
– k-NN classifier
– Neural networks
– Support vector machines
– Decision trees
– Ensemble methods

Illustration of k-NN Model
Predicted label of test example with 1-NN model: Versicolor
Predicted label of test example with 3-NN model: Virginica
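The way a k-NN prediction can change with k can be sketched in a few lines. The training points below are hypothetical (petal length, petal width) pairs chosen so that the single nearest neighbor is Versicolor while two of the three nearest are Virginica; they are not the data behind the original figure.

```python
import math
from collections import Counter


def knn_predict(training_data, query, k):
    """Label the query by majority vote among its k nearest training examples."""
    neighbors = sorted(training_data, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical labeled examples: ((petal length, petal width), species).
training_data = [
    ((4.8, 1.6), "Versicolor"),
    ((5.2, 1.9), "Virginica"),
    ((5.1, 1.5), "Virginica"),
    ((4.0, 1.2), "Versicolor"),
    ((6.0, 2.2), "Virginica"),
]
test_example = (4.9, 1.6)

print(knn_predict(training_data, test_example, k=1))  # Versicolor
print(knn_predict(training_data, test_example, k=3))  # Virginica
```

There is no training step here: the stored examples are the model, and k is a free parameter of the kind a validation set helps to choose.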

Illustration of Decision Tree Model
Petal width <= 0.8?
  Yes: Setosa
  No: Petal length <= 4.75?
    Yes: Versicolor
    No: Virginica
The decision tree is automatically generated by a machine learning algorithm.
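Once learned, a tree like this one is just a chain of threshold tests. The sketch below hand-codes the two splits shown on the slide; the function name and the sample measurements are illustrative, and in practice the thresholds come from the learning algorithm rather than being typed in.

```python
def classify_iris(petal_length, petal_width):
    """Follow the two splits of the illustrated decision tree."""
    if petal_width <= 0.8:
        return "Setosa"
    if petal_length <= 4.75:
        return "Versicolor"
    return "Virginica"


# Hypothetical measurements for three flowers.
print(classify_iris(petal_length=1.4, petal_width=0.2))  # Setosa
print(classify_iris(petal_length=4.5, petal_width=1.5))  # Versicolor
print(classify_iris(petal_length=5.5, petal_width=2.0))  # Virginica
```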

Model Building Algorithms
• Unsupervised learning
– k-means clustering
– Agglomerative clustering
– Self-organizing feature maps
– Recommendation systems

K-means Clustering
• Step 1: Choose the number of clusters, k, and initial cluster centers
• Step 2: Assign data points to clusters based on distance to cluster centers
• Step 3: Update cluster centers and reassign data points
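The k-means steps (choose k and initial centers, assign points to the nearest center, update centers and reassign) can be sketched as a short loop. The points and initial centers below are made up, and stopping when the assignments no longer change is a common convention assumed here rather than something stated on the slides.

```python
import math


def kmeans(points, initial_centers, max_iters=100):
    """Alternate between assigning points to centers and recomputing the centers."""
    centers = [tuple(c) for c in initial_centers]
    assignments = None
    for _ in range(max_iters):
        # Assign each point to the index of its nearest center.
        new_assignments = [
            min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # nothing moved, so the clustering has settled
        assignments = new_assignments
        # Move each center to the mean of the points assigned to it.
        for j in range(len(centers)):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:
                centers[j] = tuple(sum(coord) / len(members) for coord in zip(*members))
    return centers, assignments


# Hypothetical 2-D data with two visible groups; k = 2 initial centers.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.0), (8.0, 8.0), (9.0, 9.0), (8.0, 10.0)]
centers, assignments = kmeans(points, initial_centers=[(0.0, 0.0), (10.0, 10.0)])
print(assignments)  # [0, 0, 0, 1, 1, 1]
print(centers)
```

The result depends on the initial centers, which is why implementations often run the algorithm from several starting points.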

Illustration of Recommendation System

Machine Learning to Deep Learning

Machine Learning Limitation
• Traditional machine learning methods operate on manually
designed features.
• Designing such features for tasks involving
computer vision, speech understanding, and natural
language processing is extremely difficult. This puts a
limit on the performance of the system.
Feature Extractor -> Trainable Classifier

Processing Sensory Data is Hard
How do we bridge this gap between the pixels and
meaning via machine learning?

Sensory Data Processing is Challenging

End-to-End Learning
Coming up with features is often difficult and time consuming, and it requires
expert knowledge.
So why not build integrated learning systems that perform end-to-end
learning, i.e., learn the representation as well as the classification from raw
data without any engineered features?
Feature Learner -> Trainable Classifier
In a nutshell, deep learning is an approach that performs end-to-end learning,
typically through a series of successive abstractions.

Deep Learning
• It is a subfield of machine learning that has shown
remarkable success in applications that require
processing pictures, videos, speech, and text.
• Deep learning is characterized by:
– Extremely large amounts of training data
– Neural networks with a very large number of layers
– Training times running into weeks in many instances
– End-to-end learning (no human-designed rules/features are
used)

Deep Learning Models
Convolutional Neural Networks

Deep Learning Models
Recurrent Neural Networks

Deep Q Network (DQN)

CNN Application Example: Object Detection and Labeling

RNN Application Example

DQN Application Example

Training Deep Neural Networks
• The workhorse is the backpropagation algorithm, which is based on
chain-rule differentiation.
• Training consists of optimizing a suitable loss function using
the stochastic gradient descent (SGD) algorithm to adjust the
network's parameters, which typically number in the millions.
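To show the mechanics at the smallest possible scale, the sketch below trains a "network" with just two parameters (a weight and a bias) using per-example SGD, with the gradients obtained by the chain rule. The data, learning rate, epoch count, and function name are assumptions chosen for illustration; real deep networks apply the same idea layer by layer across millions of parameters, usually through a framework's automatic differentiation.

```python
import random


def train_sgd(examples, learning_rate=0.05, epochs=300, seed=0):
    """Fit prediction = w * x + b by stochastic gradient descent on squared error."""
    rng = random.Random(seed)
    data = list(examples)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        rng.shuffle(data)  # "stochastic": visit examples in random order
        for x, y in data:
            prediction = w * x + b
            # Chain rule: loss = (prediction - y)^2
            d_loss_d_pred = 2.0 * (prediction - y)
            grad_w = d_loss_d_pred * x    # d(prediction)/dw = x
            grad_b = d_loss_d_pred * 1.0  # d(prediction)/db = 1
            # Step each parameter against its gradient.
            w -= learning_rate * grad_w
            b -= learning_rate * grad_b
    return w, b


# Hypothetical noiseless data generated from y = 2x + 1.
examples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b = train_sgd(examples)
print(round(w, 3), round(b, 3))  # close to 2.0 and 1.0
```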

Deep Learning Examples: Automatic Description Generation of Images
Training Data Examples
Generated Captions

Deep Learning Examples: Predicting Heart Attacks

Why Deep Learning Now?

Transfer Learning
• Models trained for one task can be reused for
another task via transfer learning
– Faster training
– Lower data requirements

Learning Bias
There is an urban legend that back in the 1990s, the US government commissioned a
project to detect tanks in pictures. The researchers built a neural network and used it
to classify the images. Once the product was actually put to the test, it did not perform
well at all. On further inspection, they noticed that the model had learned the weather
patterns instead of the tanks: the training images with tanks were taken on a cloudy day,
and the images with no tanks were taken on a sunny day. This is a prime example of why we
need to understand what a neural net has actually learned.

Learning Bias

Architecture Deficiencies

Summary
• Machine learning and deep learning are growing in their
usage
• Many of the ideas have been around for a long time but are
being put into practice now because of technology progress
• Several open source software resources (R, RapidMiner,
scikit-learn, PyTorch, TensorFlow, etc.) are available for learning via
experimentation
• Applications based on vision, speech, and natural language
processing are excellent candidates for deep learning
• Need to filter hype from reality

https://iksinc.online/home/