Machine Learning & Application
Ashok Rao
Former Head, Network Project
CEDT, IISc, Bangalore
< ashokrao.mys@gmail.com >
Presentation Outline
What is Machine Learning ?
Why Machine Learning ?
Common Learning Schemes / Models / structure
Supervised learning
Unsupervised learning
Hybrid Learning, Semi-Supervised Learning,
Classifiers
Hybrid Classifiers
Panel of classifiers
One example using Subspace (PCA) methods
Discussion and Conclusions.
What would have happened if there were no learning?
Our dictionary defines "to learn" as:
To get knowledge of something by study, experience, or being taught.
To become aware by information or from observation.
Learning = improving with experience at some task:
Improve over task T
with respect to performance measure P
based on experience E
What is Machine Learning?
"The goal of machine learning is to build computer systems that can adapt and learn from their experience."
- Tom Dietterich
Machine Learning
What is Machine Learning?
The ability to form rules automatically, and to use them subsequently (for decisions), by exposing a system (algorithm, structure, data, sensors, etc.) to input data (information).
Why Machine Learning?
Society and information-related tasks are getting increasingly complex, data grows at an exponential rate, and responses must be quick and consistent (e.g. genome and genetic data, the pharma industry).
Structure, patterns and "rules", if they can be extracted, would be very valuable and useful in finding appropriate responses. E.g. data mining, text-to-speech conversion, etc.
Machine Learning in the Medical Domain
Health care is among the most critical of needs.
While technology has improved (CT scan, MRI, PET, in 3-D too, computer-assisted surgery, tele-medicine, etc.),
still about 70% of the world's population does not have quality, reliable health care.
Automating diagnosis and testing, and if possible doing it remotely (more effective if portable), would help.
Rather than fully automating health care, it is better to provide computer-assisted options.
One such example is radiological diagnosis of scan data.
This helps in complex and borderline cases and easily allows for second (and multiple) "opinions".
Machine Learning
Where does it fit? What is it not?
[Diagram: Machine Learning at the intersection of Artificial Intelligence, Statistics / Mathematics, Data Mining, Computer Vision and Robotics]
(No definition of a field is perfect; the diagram above is just one interpretation)
Applications of machine learning
• identify the words in handwritten text
• understand a spoken language
• predict risks in safety-critical systems
• detect errors in a network
• fraud detection
• price and market prediction
• credit card approval
Many applications are immensely hard to program directly. These almost always turn out to be "pattern recognition" tasks. Two options:
1. Program the computer to do the pattern recognition task directly.
2. Program the computer to be able to learn from examples ("training" data).
Human vs. Machine
• Human: evolved (in large part) for pattern recognition problems. Machine: designed to solve logic and arithmetic problems.
• Human: can solve gazillions of PR problems in an hour. Machine: can solve gazillions of arithmetic and logical problems in an hour.
• Human: huge number of parallel but relatively slow and unreliable processors. Machine: usually one very fast processor.
• Human: not perfectly precise. Machine: absolute precision.
• Human: not perfectly reliable. Machine: highly reliable.
Application Dependent
Sensor → Preprocessing → Feature extraction → Classification / Clustering / Prediction
Data Analytics Model
Classification and regression:
  Frequency: OneR, Naïve Bayesian, Decision tree
  Similarity: K-NN, SVM, GA
Modeling (Clustering):
  Hierarchical: Agglomerative
  Partitional: K-Means, SOM
Prediction: Historical data
Exploration:
  Categorical: Count, Pie chart, Bar chart, Entropy
  Numerical: Min, Max, Mean, Variance, Histogram, Correlation, Plot, Skewness
Classification
• Data: a set of data records (also called examples, instances or cases) described by
  • k attributes: A1, A2, …, Ak
  • a class: each example is labelled with a pre-defined class
• Goal: to learn a classification model from the data that can be used to predict the classes of new (future, or test) cases/instances.
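As a concrete sketch of this setup, here is the K-NN classifier mentioned in the model taxonomy: classify a new case by a majority vote among the k nearest labelled records. (A minimal illustration; the function names and toy data are ours, not from the slides.)

```python
import math
from collections import Counter

def knn_predict(train, labels, x, k=3):
    """Classify x by majority vote among the k nearest training points."""
    nearest = sorted(range(len(train)),
                     key=lambda i: math.dist(train[i], x))[:k]
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Toy data: two well-separated groups of labelled records.
train = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["A", "A", "A", "B", "B", "B"]
print(knn_predict(train, labels, (0.5, 0.5)))  # "A"
print(knn_predict(train, labels, (5.5, 5.5)))  # "B"
```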
Prediction
• Quantitative
  • Causal model: Regression
  • Time series: Moving Average, Exponential Smoothing, ARIMA, Kalman Filter
• Qualitative
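The first two time-series methods in the list can each be sketched in a few lines. (Our function names; the window size, smoothing factor alpha, and toy series are illustrative choices, not from the slides.)

```python
def moving_average(series, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    return sum(series[-window:]) / window

def exp_smoothing(series, alpha=0.5):
    """Single exponential smoothing: s_t = alpha*x_t + (1 - alpha)*s_{t-1}."""
    s = series[0]
    for x in series[1:]:
        s = alpha * x + (1 - alpha) * s
    return s

data = [10, 12, 11, 13, 12, 14]
print(moving_average(data))        # (13 + 12 + 14) / 3 = 13.0
print(exp_smoothing(data, 0.5))    # 13.0
```

Exponential smoothing weighs recent observations more heavily (geometrically decaying weights), whereas the moving average weighs the last `window` observations equally.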
Clustering
• The goal of clustering is to
  • group data points that are close (or similar) to each other
  • identify such groupings (or clusters) in an unsupervised manner
• Unsupervised: no information is provided to the algorithm on which data points belong to which clusters
• Example
[Figure: a scatter of unlabelled points. What should the clusters be for these data points?]
Supervised learning
Color | Shape  | Size  | Output
Blue  | Torus  | Big   | Y
Blue  | Square | Small | Y
Blue  | Star   | Small | Y
Red   | Arrow  | Small | N
Learn to approximate the function F(x1, x2, x3) → t from a training set of (x, t) pairs.
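OneR, listed in the model taxonomy earlier, is one of the simplest learners for exactly this kind of table: it keeps the single attribute whose value-to-majority-class rule makes the fewest training errors. A sketch on the training table above (the implementation and names are ours):

```python
from collections import Counter, defaultdict

def one_r(rows, target):
    """OneR: pick the one attribute whose value -> majority-class rule
    makes the fewest errors on the training rows."""
    attrs = [a for a in rows[0] if a != target]
    best = None
    for a in attrs:
        groups = defaultdict(Counter)
        for r in rows:
            groups[r[a]][r[target]] += 1
        rule = {v: c.most_common(1)[0][0] for v, c in groups.items()}
        errors = sum(1 for r in rows if rule[r[a]] != r[target])
        if best is None or errors < best[2]:
            best = (a, rule, errors)
    return best[0], best[1]

# The training table from the slide.
rows = [
    {"Color": "Blue", "Shape": "Torus",  "Size": "Big",   "Output": "Y"},
    {"Color": "Blue", "Shape": "Square", "Size": "Small", "Output": "Y"},
    {"Color": "Blue", "Shape": "Star",   "Size": "Small", "Output": "Y"},
    {"Color": "Red",  "Shape": "Arrow",  "Size": "Small", "Output": "N"},
]
attr, rule = one_r(rows, "Output")
print(attr, rule)  # Color {'Blue': 'Y', 'Red': 'N'} -- zero training errors
```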
Supervised learning
Training data:
X1 | X2 | X3 | T
B  | T  | B  | Y
B  | S  | S  | Y
B  | S  | S  | Y
R  | A  | S  | N
Learner → Hypothesis
Testing data (predict T):
X1 | X2 | X3 | T
B  | A  | S  | ?
Y  | C  | S  | ?
Key issue: generalization
[Figure: labelled training examples ("yes" / "no") and unseen cases ("?")]
A rich (but not exhaustive) training set guards against over-fitting (e.g. character recognition, A-Z).
Unsupervised learning
What if there are no output labels?
Supervised vs. unsupervised learning
Supervised:
• Learning based on a training set where the labelling of instances represents the target (categorization) function
• Each data item in the dataset analyzed has been classified
• Needs help from the data
• Needs a great amount of data
• Outcome: a classification decision
• Examples: Neural Networks (NN), Decision Trees, Support Vector Machines (SVM)
Unsupervised:
• Learning based on un-annotated instances (the training data doesn't specify what we are trying to learn)
• Each data item in the dataset analyzed is not classified
• Doesn't need help from the data
• A great amount of data is not necessarily needed
• Outcome: a grouping of objects (instances and groups of instances)
• Examples: Clustering (Mixture Modeling), Self-Organizing Map (SOM)
(Humans are good at creating groups/categories/clusters from data)
Supervised learning success stories
Face detection
Steering an autonomous car across the US
Detecting credit card fraud
Medical diagnosis
…
Hypothesis spaces
Decision trees
Neural networks
K-nearest neighbors
Naïve Bayes classifier
Support vector machines (SVMs)
Boosted decision stumps (Ada-Boost)
…
Perceptron
(single-layer neural net)
Linearly separable data
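The perceptron learning rule is short enough to sketch in full: whenever a training point is misclassified, nudge the weights toward it; on linearly separable data this converges. (A minimal illustration with our own names and an AND-style toy dataset, labels in {-1, +1}.)

```python
def perceptron_train(X, y, epochs=20, lr=1.0):
    """Perceptron rule: w <- w + lr * t * x whenever sign(w.x) != t."""
    w = [0.0] * (len(X[0]) + 1)              # last entry acts as the bias
    for _ in range(epochs):
        for x, t in zip(X, y):
            xb = list(x) + [1.0]             # append constant bias input
            if (sum(wi * xi for wi, xi in zip(w, xb)) > 0) != (t > 0):
                w = [wi + lr * t * xi for wi, xi in zip(w, xb)]
    return w

def predict(w, x):
    s = sum(wi * xi for wi, xi in zip(w, list(x) + [1.0]))
    return 1 if s > 0 else -1

# Linearly separable data (logical AND, encoded as -1 / +1).
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [-1, -1, -1, 1]
w = perceptron_train(X, y)
print([predict(w, x) for x in X])  # [-1, -1, -1, 1]
```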
Which separating hyperplane?
The best linear separator is the one with the largest margin.
What if the data is not linearly separable?
Multilayer Perceptrons (hidden & output layers)
Design questions: how many layers? Feedback or feed-forward?
Kernel trick
[Figure: points not linearly separable in 2-D (x, y) become separable in 3-D (z1, z2, z3)]
The map (x, y) → (x², √2·xy, y²) implicitly takes the data from 2-D to 3-D, making the problem linearly separable.
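The point of the trick is that the 3-D inner product never has to be computed explicitly: for the map above it equals the squared 2-D inner product, the polynomial kernel K(p, q) = (p·q)². A quick check (toy points are our choice):

```python
import math

def phi(x, y):
    """Explicit feature map: (x, y) -> (x^2, sqrt(2)*x*y, y^2)."""
    return (x * x, math.sqrt(2) * x * y, y * y)

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

p, q = (1.0, 2.0), (3.0, -1.0)
lhs = dot(phi(*p), phi(*q))   # inner product after the explicit 3-D map
rhs = dot(p, q) ** 2          # polynomial kernel computed directly in 2-D
print(lhs, rhs)               # equal: the kernel avoids the explicit map
```

This is why kernel machines such as SVMs can work in very high-dimensional (even infinite-dimensional) feature spaces at the cost of a 2-D computation.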
Occam's Razor
In Latin: "Entia non sunt multiplicanda praeter necessitatem"
("Entities should not be multiplied unnecessarily")
- William of Ockham, c. 1320 AD
- What does this mean?
Implications of Occam's Razor
Simplicity is the order of things:
Simple explanation.
Simple model.
Simple structure.
What if the facts are "complex"?
- Treat them as a combination of "simple" parts.
Boosting
Simple classifiers (weak learners) can have their performance boosted by taking weighted combinations.
Boosting maximizes the margin.
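AdaBoost (the algorithm behind the boosted decision stumps listed earlier) is the standard instance of this idea: reweight the training points after each round so the next weak learner focuses on the mistakes, then combine the stumps with weights. A compact sketch, with our own names and a deliberately easy 1-D toy dataset:

```python
import math

def stump_predict(s, x):
    """Decision stump s = (feature, threshold, sign): predict sign if x[f] > t."""
    f, t, sign = s
    return sign if x[f] > t else -sign

def best_stump(X, y, w):
    """Exhaustively pick the stump with the lowest weighted error."""
    best, best_err = None, float("inf")
    for f in range(len(X[0])):
        for t in sorted({x[f] for x in X}):
            for sign in (1, -1):
                err = sum(wi for x, yi, wi in zip(X, y, w)
                          if stump_predict((f, t, sign), x) != yi)
                if err < best_err:
                    best, best_err = (f, t, sign), err
    return best, best_err

def adaboost(X, y, rounds=3):
    w = [1 / len(X)] * len(X)
    ensemble = []
    for _ in range(rounds):
        s, err = best_stump(X, y, w)
        err = max(err, 1e-10)                      # avoid division by zero
        alpha = 0.5 * math.log((1 - err) / err)    # weight of this weak learner
        ensemble.append((alpha, s))
        w = [wi * math.exp(-alpha * yi * stump_predict(s, x))
             for wi, x, yi in zip(w, X, y)]        # up-weight the mistakes
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def ensemble_predict(ensemble, x):
    score = sum(a * stump_predict(s, x) for a, s in ensemble)
    return 1 if score > 0 else -1

X = [(1,), (2,), (3,), (4,)]
y = [-1, -1, 1, 1]
model = adaboost(X, y)
print([ensemble_predict(model, x) for x in X])  # [-1, -1, 1, 1]
```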
What is Cluster Analysis?
• Finding groups of objects such that the objects in a group will be similar (or related) to one another and different from (or unrelated to) the objects in other groups
• Intra-cluster distances are minimized; inter-cluster distances are maximized
Difficulties of Representation
Hierarchical Clustering
• Build a tree-based hierarchical taxonomy (dendrogram) from a set of documents:
animal
├── vertebrate: fish, reptile, amphib., mammal
└── invertebrate: worm, insect, crustacean
Hierarchical Clustering
Ste Ste Ste Ste Ste
agglomerative
p0 p1 p2 p3 p4
a
ab
b
abcde
c
cde
d
de
e
divisive
Ste Ste Ste Ste Ste
p4 p3 p2 p1 p0
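A single-linkage agglomerative sketch of the merge sequence above: repeatedly merge the two closest clusters, where cluster distance is the minimum point-to-point distance. (The 1-D coordinates are our choice, picked so the merge order comes out as in the diagram.)

```python
def single_link(c1, c2, pos):
    """Single-linkage distance between two clusters of labelled 1-D points."""
    return min(abs(pos[a] - pos[b]) for a in c1 for b in c2)

def agglomerate(pos):
    """Repeatedly merge the two closest clusters; record each merge."""
    clusters = [frozenset([p]) for p in pos]
    merges = []
    while len(clusters) > 1:
        i, j = min(((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
                   key=lambda ij: single_link(clusters[ij[0]],
                                              clusters[ij[1]], pos))
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
        merges.append("".join(sorted(merged)))
    return merges

# 1-D positions chosen (by us) to reproduce the slide's merge order.
pos = {"a": 1.0, "b": 1.4, "c": 5.0, "d": 8.0, "e": 8.5}
print(agglomerate(pos))  # ['ab', 'de', 'cde', 'abcde']
```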
Partitional Clustering
[Figure: original points and a partitional clustering of them]
Partitional Clustering
• Partitioning method: construct a partition of n objects into a set of K clusters
• Given: a set of objects and the number K
• Find: a partition of K clusters that optimizes the chosen partitioning criterion
  – Globally optimal: exhaustively enumerate all partitions
  – Effective heuristic methods: K-means clustering
K-means Clustering
• Partitional clustering approach
• Each cluster is associated with a centroid (center point)
• Each point is assigned to the cluster with the closest centroid
• Number of clusters, K, must be specified
• The basic algorithm is very simple
K-Means Clustering
Step 1: Select k random seeds (initial seeds, if k = 3) such that d(ki, kj) > dmin.
Step 2: Assign each point to its nearest seed and compute new centroids.
Iterate (reassign points, recompute centroids) until stability.
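The iteration above can be sketched directly (a minimal illustration; the seeding here is plain random sampling without the d(ki, kj) > dmin check, and the names and toy points are ours):

```python
import math
import random

def kmeans(points, k, iters=100, seed=0):
    """Basic k-means: assign each point to its nearest centroid, then move
    each centroid to the mean of its points; repeat until stable."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)          # initial seeds
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:                        # assignment step
            i = min(range(k), key=lambda j: math.dist(p, centroids[j]))
            clusters[i].append(p)
        new = [tuple(sum(c) / len(cl) for c in zip(*cl)) if cl else centroids[i]
               for i, cl in enumerate(clusters)]  # update step
        if new == centroids:                    # stability reached
            break
        centroids = new
    return centroids, clusters

pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, clusters = kmeans(pts, 2)
print(sorted(centroids))  # roughly (1/3, 1/3) and (31/3, 31/3)
```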
ML enabling technologies
Faster computers
More data
The web
Parallel corpora (machine translation)
Multiple sequenced genomes
Gene expression arrays
New ideas
Kernel trick
Large margins
Boosting
Graphical models
…
Some Select references
The web:
Kevin Murphy, MIT AI Lab, PPT slides.
Avrim Blum, Carnegie Mellon University, PPT slides.
Bishop, C. Pattern Recognition and Machine Learning. Springer, 2006.
Principal Component Analysis (PCA)
PCA seeks a projection that best represents the data in a least-squares sense.
PCA reduces the dimensionality of feature space by restricting attention to those directions along which the scatter of the data cloud is greatest.
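The direction of greatest scatter is the top eigenvector of the data's covariance matrix, which can be found by power iteration. A small sketch (our names; the toy data, scattered mostly along the line y = x, is our choice):

```python
import random

def principal_direction(X, iters=200):
    """Power iteration on the covariance matrix: returns the unit vector
    along which the centred data has the greatest scatter."""
    n, p = len(X), len(X[0])
    means = [sum(x[j] for x in X) / n for j in range(p)]
    Xc = [[x[j] - means[j] for j in range(p)] for x in X]   # centre the data
    cov = [[sum(r[i] * r[j] for r in Xc) / (n - 1) for j in range(p)]
           for i in range(p)]                               # covariance matrix
    v = [1.0] * p
    for _ in range(iters):
        v = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sum(c * c for c in v) ** 0.5
        v = [c / norm for c in v]                           # renormalize
    return v

rng = random.Random(0)
X = [(t + 0.05 * rng.gauss(0, 1), t + 0.05 * rng.gauss(0, 1))
     for t in (rng.gauss(0, 1) for _ in range(200))]
v = principal_direction(X)
print(v)  # close to +/-(0.707, 0.707): the y = x direction
```

Projecting each centred point onto v gives the 1-D representation with the smallest least-squares reconstruction error, which is exactly the sense in which PCA "best represents" the data.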
We shall build on this idea next:
Subspace Methods
Training and Classification
Complex Models (GMM)
Statistical Methods (Monte Carlo Methods)
and more.