Machine Learning 2 deep Learning: An Intro
Machine Learning to Deep
Learning: An Introduction
S. I. Krishan
Integrated Knowledge Solutions
www.iksinc.online/home
Agenda
• What is machine learning?
• Machine learning terminology
• Overview of machine learning methods
• Machine learning to deep learning
• Summary and Q & A
iksinc@yahoo.com
What is Machine Learning?
What is Machine
Learning?
• Machine learning deals with making computers learn
to make predictions/decisions without being explicitly
programmed. Instead, the system is shown a large number of
examples of the underlying task and learns by optimizing
a performance criterion.
Why Machine
Learning?
Buzz about Machine
Learning
"Every company is now a data company,
capable of using machine learning in the cloud
to deploy intelligent apps at scale, thanks to
three machine learning trends: data flywheels,
the algorithm economy, and cloud-hosted
intelligence."
Three factors are making machine learning hot: cheap data,
the algorithm economy, and cloud-based solutions.
Data is Getting Cheaper
For example, Tesla has 780 million miles of driving
data, and adds another million every 10 hours
Algorithm Economy
Algorithm Economy
Players in ML
Cloud-Based
Intelligence
Emerging machine intelligence
platforms hosting pre-trained
machine learning models-as-a-service
are making it easy for
companies to get started with ML,
allowing them to rapidly take their
applications from prototype to
production.
Many open source machine learning and
deep learning frameworks running in the
cloud allow easy leveraging of
pre-trained, hosted models to tag images,
recommend products, and do general
natural language processing tasks.
Apps for Excel
Machine Learning Terminology
Feature Vectors in ML
• A machine learning system builds models using properties of objects being
modeled. These properties are called features or attributes and the process of
measuring/obtaining such properties is called feature extraction. It is common to
represent the properties of objects as feature vectors.
Example features of a flower: sepal width, sepal length, petal width, petal length
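As a minimal sketch of what a feature vector looks like in practice, the four flower measurements can be collected into an ordered list of numbers. The class name `Flower`, the method `to_feature_vector`, and the measurement values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Flower:
    # Measurements in centimetres (hypothetical example values).
    sepal_length: float
    sepal_width: float
    petal_length: float
    petal_width: float

    def to_feature_vector(self) -> List[float]:
        """Feature extraction: turn the object's properties into an ordered vector."""
        return [self.sepal_length, self.sepal_width,
                self.petal_length, self.petal_width]


flower = Flower(sepal_length=5.1, sepal_width=3.5, petal_length=1.4, petal_width=0.2)
feature_vector = flower.to_feature_vector()
print(feature_vector)
```

The order of the entries is a convention; what matters is that every object is described by the same properties in the same order.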
Learning Styles
• Supervised Learning
– Training data comes with answers, called labels
– The goal is to produce labels for new data
Supervised Learning
Models
• Classification models
– Predict customer
churning
– Tag objects in a given
image
– Determine whether
an incoming email is
spam or not
Supervised Learning
Models
• Regression models
– Predict credit card
balance of customers
– Predict the number
of 'likes' for a posting
– Predict peak load for
a utility given
weather information
Learning Styles
• Unsupervised Learning
– Training data comes without labels
– The goal is to group data into different categories based on similarities
Unsupervised Learning
Models
• Segment/ cluster
customers into
different groups
• Organize a collection
of documents based
on their content
• Make product
recommendations
Learning Styles
• Reinforcement Learning
– Training data comes without labels
– The learning system receives feedback from its operating
environment to know how well it is doing
– The goal is to perform better
Reinforcement
Learning
Overview of Machine Learning Methods
Walk Through An Example:
Flower Classification
• Build a classification model
to differentiate between
two classes of flower
How Do We Go
About It?
• Collect a large number of both types of flowers with the help
of an expert
• Measure some attributes that can help differentiate between
the two types of flowers. Let those attributes be petal area
and sepal area.
Scatter plot of 100 examples of flowers
We can separate the flower
types using the linear boundary
shown next. The parameters of
the line represent the learned
classification model.
Another possible boundary. This
boundary cannot be expressed
via a single equation. However, a tree
structure can be used to express
it. Note that this boundary
predicts the collected data
better than the linear one.
Yet another possible boundary.
This boundary predicts the collected
data without any error. Is this a better
boundary?
Model Complexity
• There are tradeoffs between the complexity of models and
their performance in the field. A good design (model choice)
weighs these tradeoffs.
• A good design should avoid overfitting. How?
– Divide the entire data into three sets
• Training set (about 70% of the total data). Use this set to build the model
• Test set (about 20% of the total data). Use this set to estimate how accurately
the model will perform after deployment
• Validation set (remaining 10% of the total data). Use this set to determine
the appropriate settings for the free parameters of the model. May not be
required in some cases.
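The three-way split described above can be sketched in a few lines. This is an illustrative sketch only: the function name `split_dataset`, its parameters, the fixed random seed, and the stand-in data (the integers 0 to 99) are assumptions, not part of the original slides.

```python
import random


def split_dataset(examples, train_frac=0.7, test_frac=0.2, seed=0):
    """Shuffle, then split into training, test, and validation sets (about 70/20/10)."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # shuffle so each set is a random sample
    n_train = int(round(train_frac * len(shuffled)))
    n_test = int(round(test_frac * len(shuffled)))
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]  # whatever remains (about 10%)
    return train, test, validation


examples = list(range(100))  # stand-in for 100 labeled examples
train, test, validation = split_dataset(examples)
print(len(train), len(test), len(validation))
```

Shuffling before splitting matters when the collected data is ordered, for instance when all examples of one class were recorded first.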
Measuring Model
Performance
• True Positive: Correctly identified as relevant
• True Negative: Correctly identified as not relevant
• False Positive: Incorrectly labeled as relevant
• False Negative: Incorrectly labeled as not relevant
Image: Cat vs. No Cat, with examples of a true positive, a true negative, a false negative, and a false positive
Precision, Recall, and
Accuracy
• Precision
– Percentage of positive labels that are correct
– Precision = (# true positives) / (# true positives + # false positives)
• Recall
– Percentage of positive examples that are correctly labeled
– Recall = (# true positives) / (# true positives + # false negatives)
• Accuracy
– Percentage of correct labels
– Accuracy = (# true positives + # true negatives) / (# of samples)
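The three formulas above translate directly into code. The sketch below assumes the four counts are already known and that the denominators are non-zero; the function name and the example counts are hypothetical.

```python
def precision_recall_accuracy(tp, tn, fp, fn):
    """Compute precision, recall, and accuracy from confusion-matrix counts.

    Assumes tp + fp > 0 and tp + fn > 0 (no guard against division by zero).
    """
    precision = tp / (tp + fp)                  # correct positive labels / all positive labels
    recall = tp / (tp + fn)                     # correct positive labels / all positive examples
    accuracy = (tp + tn) / (tp + tn + fp + fn)  # correct labels / all samples
    return precision, recall, accuracy


# Hypothetical counts for a "cat vs. no cat" classifier on 100 images.
precision, recall, accuracy = precision_recall_accuracy(tp=40, tn=45, fp=10, fn=5)
print(precision, recall, accuracy)
```

With these counts, precision is 40/50, recall is 40/45, and accuracy is 85/100, which shows that the three measures can differ noticeably on the same predictions.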
Sum-of-Squares Error
for Regression Models
For a regression model, the error is measured by squaring the
difference between the predicted output value and the target value for each
training (test) example and summing these squares over all examples.
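Written out, with y_i the predicted value and t_i the target value of example i, the sum-of-squares error is E = sum over i of (y_i - t_i)^2. A minimal sketch of that computation follows; the function name and the sample numbers are hypothetical.

```python
def sum_of_squares_error(predictions, targets):
    """Sum of squared differences between predicted and target values."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    return sum((y - t) ** 2 for y, t in zip(predictions, targets))


# Hypothetical predictions and targets for three examples.
error = sum_of_squares_error([2.5, 0.0, 2.0], [3.0, -0.5, 2.0])
print(error)
```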
Bias and Variance
• Bias: expected difference between model’s
prediction and truth
• Variance: how much the model differs among
training sets
• Model Scenarios
– High Bias: Model makes inaccurate predictions on training
data
– High Variance: Model does not generalize to new datasets
– Low Bias: Model makes accurate predictions on training
data
– Low Variance: Model generalizes to new datasets
The Guiding Principle
for Model Selection:
Occam’s Razor
Model Building
Algorithms
• Supervised learning algorithms
– Linear methods
– k-NN classifier
– Neural networks
– Support vector machine
– Decision tree
– Ensemble method
Illustration of k-NN Model
Test example
Predicted label of test example with 1-NN model: Versicolor
Predicted label of test example with 3-NN model: Virginica
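The effect of k on the predicted label can be reproduced with a small k-NN sketch. The training points below (petal length, petal width) are made-up values chosen so that the single nearest neighbor is Versicolor while two of the three nearest are Virginica; they are not the data behind the slide's figure, and the function name `knn_predict` is hypothetical.

```python
import math
from collections import Counter


def knn_predict(training_data, query, k):
    """Label a query point by majority vote among its k nearest training points."""
    by_distance = sorted(training_data, key=lambda item: math.dist(item[0], query))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical (petal length, petal width) measurements with labels.
training_data = [
    ((4.9, 1.5), "Versicolor"),
    ((4.5, 1.5), "Versicolor"),
    ((4.0, 1.3), "Versicolor"),
    ((4.8, 1.8), "Virginica"),
    ((5.1, 1.8), "Virginica"),
    ((5.6, 2.1), "Virginica"),
]
test_example = (4.9, 1.6)

label_1nn = knn_predict(training_data, test_example, k=1)
label_3nn = knn_predict(training_data, test_example, k=3)
print(label_1nn, label_3nn)
```

An odd k avoids ties when there are two classes, which is one reason small odd values are a common choice.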
Illustration of Decision
Tree Model
Petal width <= 0.8?
– Yes: Setosa
– No: Petal length <= 4.75?
   – Yes: Versicolor
   – No: Virginica
The decision tree is automatically generated by a machine learning algorithm.
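Once learned, a decision tree of this kind is just a set of nested threshold tests. The sketch below hard-codes the two thresholds from the slide (0.8 and 4.75); the function name and the sample measurements are hypothetical, and a learning algorithm would normally produce the thresholds from training data rather than have them typed in.

```python
def classify_flower(petal_length, petal_width):
    """Apply the two-level decision tree from the slide to one flower."""
    if petal_width <= 0.8:
        return "Setosa"
    if petal_length <= 4.75:
        return "Versicolor"
    return "Virginica"


# Hypothetical measurements (petal length, petal width) in centimetres.
predictions = [
    classify_flower(1.4, 0.2),
    classify_flower(4.5, 1.5),
    classify_flower(5.5, 2.0),
]
print(predictions)
```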
Model Building
Algorithms
• Unsupervised learning
– k-means clustering
– Agglomerative clustering
– Self-organizing feature maps
– Recommendation system
K-means Clustering
• Choose the number of clusters, k, and initial
cluster centers
• Assign data points to clusters based on
distance to cluster centers
• Update cluster centers and reassign data
points.
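The three steps above can be sketched as a short loop. This is a minimal illustration, assuming Euclidean distance, initial centers supplied by the caller, and stopping when assignments no longer change; the function name `k_means` and the six 2-D points are hypothetical.

```python
import math


def k_means(points, initial_centers, max_iters=100):
    """Basic k-means: alternate assignment and center-update steps."""
    centers = [tuple(c) for c in initial_centers]
    labels = []
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each center to the mean of its assigned points.
        for j in range(len(centers)):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, labels


# Hypothetical 2-D data forming two well-separated groups; k = 2.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centers, labels = k_means(points, initial_centers=[(1.0, 1.0), (8.0, 8.0)])
print(labels)
print(centers)
```

The result depends on the initial centers, so in practice the procedure is often run several times from different starting points.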
Illustration of
Recommendation
System
Machine Learning to Deep Learning
Machine Learning
Limitation
• Machine learning methods operate on manually
designed features.
• The design of such features for tasks involving
computer vision, speech understanding, and natural
language processing is extremely difficult. This puts a
limit on the performance of the system.
Feature Extractor → Trainable Classifier
Processing Sensory Data
is Hard
How do we bridge this gap
between the pixels and
meaning via machine
learning?
Sensory Data Processing
is Challenging
End-to-End Learning
Coming up with features is often difficult and time consuming, and requires
expert knowledge.
So why not build integrated learning systems that perform end-to-end
learning, i.e., learn the representation as well as the classification from raw
data without any engineered features?
Feature Learner → Trainable Classifier
An approach that performs end-to-end learning, typically through
a series of successive abstractions, is, in a nutshell, deep learning.
Deep Learning
• It’s a subfield of machine learning that has shown
remarkable success in dealing with applications
requiring processing of pictures, videos, speech, and
text.
• Deep learning is characterized by:
– Extremely large amounts of training data
– Neural networks with a very large number of layers
– Training time running into weeks in many instances
– End-to-end learning (no human-designed rules/features are
used)
Deep Learning Models
• Convolutional Neural Networks
• Recurrent Neural Networks
• Deep Q Network (DQN)
CNN Application Example:
Object Detection and Labeling
RNN Application
Example
DQN Application
Example
Training Deep Neural
Networks
• The workhorse is the backpropagation algorithm, based on
chain-rule differentiation
• Training consists of optimizing a suitable loss function using
the stochastic gradient descent (SGD) algorithm to adjust the
network parameters, which typically run into the millions.
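To make the idea concrete at the smallest possible scale, the sketch below trains a single linear unit y = w*x + b with SGD on a squared-error loss, computing the gradients by the chain rule. It is not a deep network: real networks repeat the same chain-rule step layer by layer over millions of parameters. The function name, learning rate, epoch count, and the synthetic data (generated from t = 2x + 1) are all assumptions made for illustration.

```python
import random


def train_linear_unit(data, learning_rate=0.1, epochs=2000, seed=0):
    """Fit y = w*x + b by stochastic gradient descent on the loss (y - t)**2."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    examples = list(data)
    for _ in range(epochs):
        rng.shuffle(examples)              # "stochastic": visit examples in random order
        for x, t in examples:
            y = w * x + b                  # forward pass
            dloss_dy = 2.0 * (y - t)       # derivative of (y - t)**2 with respect to y
            grad_w = dloss_dy * x          # chain rule: dloss/dw = dloss/dy * dy/dw
            grad_b = dloss_dy * 1.0        # chain rule: dloss/db = dloss/dy * dy/db
            w -= learning_rate * grad_w    # gradient descent update
            b -= learning_rate * grad_b
    return w, b


# Synthetic, noise-free data from t = 2x + 1 (hypothetical).
data = [(x, 2.0 * x + 1.0) for x in (0.0, 0.25, 0.5, 0.75, 1.0)]
w, b = train_linear_unit(data)
print(w, b)
```

On this noise-free data the learned parameters approach w = 2 and b = 1; with noisy data or a larger learning rate the behavior would differ.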
Deep Learning Examples:
Automatic Description
Generation of Images
Training Data Examples
Generated Captions
Deep Learning Examples:
Predicting Heart Attacks
Why Deep Learning Now?
Transfer Learning
• Trained models for one task can be used for
another task via transfer learning
– Faster training
– Data needs are low
Learning Bias
There is an urban legend that back in the '90s, the US government commissioned a
project to detect tanks in pictures. The researchers built a neural network and used it
to classify the images. Once the product was actually put to the test, it did not perform at all.
On further inspection they noticed that the model had learnt the weather patterns
instead of the tanks. The training images with tanks were taken on a cloudy day and
the images with no tanks were taken on a sunny day. This is a prime example of why we
need to understand what a neural net has actually learned.
Architecture Deficiencies
Summary
• Machine learning and deep learning are growing in their
usage
• Many of the ideas have been around for a long time but are
being put into practice now because of technology progress
• Several open source software resources (R, RapidMiner,
scikit-learn, PyTorch, TensorFlow, etc.) are available for learning via
experimentation
• Applications based on vision, speech, and natural language
processing are excellent candidates for deep learning
• Need to filter hype from reality
https://iksinc.online/home/
