Learning Associations
• Basket analysis:
P(Y | X): the probability that somebody who buys X also buys Y, where X and Y are products/services.
Market-Basket transactions
Example: P(chips | beer) = 0.7
TID Items
1 Bread, Milk
2 Bread, Diaper, Beer, Eggs
3 Milk, Diaper, Beer, Coke
4 Bread, Milk, Diaper, Beer
5 Bread, Milk, Diaper, Coke
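As a rough illustration (not part of the original slides), P(Y | X) can be estimated directly from transaction counts; the sketch below uses the five transactions above, with function and variable names of my own choosing.

```python
# Estimate P(Y | X) = (#baskets containing X and Y) / (#baskets containing X).
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diaper", "Beer", "Eggs"},
    {"Milk", "Diaper", "Beer", "Coke"},
    {"Bread", "Milk", "Diaper", "Beer"},
    {"Bread", "Milk", "Diaper", "Coke"},
]

def conditional(y, x, baskets):
    """Empirical estimate of P(y | x) from a list of baskets."""
    with_x = [b for b in baskets if x in b]
    return sum(y in b for b in with_x) / len(with_x)

print(conditional("Diaper", "Beer", transactions))  # 3/3 = 1.0 for this toy data
```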
Training and testing
[Diagram: from the universal set (unobserved), data acquisition yields the training set (observed); practical usage is evaluated on the testing set (unobserved).]
Training and testing
• Training is the process of making the system able to learn.
• No free lunch rule:
– The training set and the testing set come from the same distribution
– We need to make some assumptions or introduce a bias
Performance
• There are several factors affecting the performance:
– Types of training provided
– The form and extent of any initial background
knowledge
– The type of feedback provided
– The learning algorithms used
• Two important factors:
– Modeling
– Optimization
Classification: Applications
• Face recognition: Pose, lighting, occlusion (glasses,
beard), make-up, hair style
• Character recognition: Different handwriting styles.
• Speech recognition: Temporal dependency.
– Use of a dictionary or the syntax of the language.
– Sensor fusion: Combine multiple modalities, e.g., visual (lip image) and acoustic, for speech
• Medical diagnosis: From symptoms to illnesses
• Web Advertising: Predict if a user clicks on an ad on the
Internet.
Steps
[Diagram: Training: training images + training labels → image features → training → learned model. Testing: test image → image features → learned model → prediction.]
Prediction: Regression
• Example: price of a used car
y = wx + w0
• x: car attributes
y: price
• In general, y = g(x | θ), where g(·) is the model and θ its parameters
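A minimal sketch of fitting the linear model above by least squares, assuming made-up car data (x = age in years, y = price); np.polyfit stands in for whichever estimator the course actually uses.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 5.0, 7.0, 10.0])     # hypothetical car ages
y = np.array([18.0, 16.5, 14.0, 11.0, 8.5, 5.0])  # hypothetical prices (k$)

w, w0 = np.polyfit(x, y, deg=1)   # least-squares fit of y = w*x + w0
print(f"g(x | theta): y = {w:.2f}*x + {w0:.2f}")
print("predicted price of a 4-year-old car:", w * 4 + w0)
```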
Regression Applications
• Navigating a car: Angle of the steering
wheel (CMU NavLab)
• Kinematics of a robot arm
(x,y)
α2
α1= g1(x,y)
α2= g2(x,y)
α1
Inductive Learning
• Given examples of a function (X, F(X))
• Predict function F(X) for new examples X
– Discrete F(X): Classification
– Continuous F(X): Regression
– F(X) = Probability(X): Probability estimation
Supervised Learning: Uses
Example: decision trees are tools that create rules
• Prediction of future cases: Use the rule to
predict the output for future inputs
• Knowledge extraction: The rule is easy to
understand
• Compression: The rule is simpler than the
data it explains
• Outlier detection: Exceptions that are not
covered by the rule, e.g., fraud
Algorithms
• The success of a machine learning system also depends
on the algorithms.
• The algorithms control the search to find and build the
knowledge structures.
• The learning algorithms should extract useful information
from training examples.
Algorithms
• Supervised learning
– Prediction
– Classification (discrete labels), Regression (real values)
• Unsupervised learning
– Clustering
– Probability distribution estimation
– Finding association (in features)
– Dimension reduction
• Semi-supervised learning
• Reinforcement learning
– Decision making (robot, chess machine)
Algorithms
[Diagram: supervised, unsupervised, and semi-supervised learning.]
What are we seeking?
• Supervised: low E-out, or maximize probabilistic terms
– E-in: error on the training set
– E-out: error on the testing set
• Unsupervised: minimum quantization error, minimum distance, MAP, MLE (maximum likelihood estimation)
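A small sketch, with hypothetical data and a simple linear model, of the distinction above: the same error measure is E-in when evaluated on the (observed) training set and E-out when evaluated on a held-out testing set.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 200)
y = 2.0 * x + 1.0 + rng.normal(0, 1.0, 200)         # hypothetical noisy targets

x_train, y_train = x[:150], y[:150]                 # observed training set
x_test, y_test = x[150:], y[150:]                   # held-out testing set

w, w0 = np.polyfit(x_train, y_train, deg=1)         # fit on training data only
e_in = np.mean((w * x_train + w0 - y_train) ** 2)   # E-in: error on training set
e_out = np.mean((w * x_test + w0 - y_test) ** 2)    # E-out: error on testing set
print(e_in, e_out)
```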
Machine learning structure
• Supervised learning
Unsupervised Learning
• Learning “what normally happens”
• No output
• Clustering: Grouping similar instances
• Other applications: Summarization,
Association Analysis
• Example applications
– Customer segmentation in CRM
– Image compression: Color quantization
– Bioinformatics: Learning motifs
Machine learning structure
• Unsupervised learning
Clustering Analysis
• Definition
Grouping unlabeled data into clusters, for the purpose of
inference of hidden structures or information
• Dissimilarity measurement (a small sketch follows this list)
– Distance: Euclidean (L2), Manhattan (L1), …
– Angle: inner product, …
– Non-metric: rank, intensity, …
• Types of Clustering
– Hierarchical
• Agglomerative or divisive
– Partitioning
• K-means, VQ, MDS, …
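For concreteness, a tiny sketch of the dissimilarity measures listed above; the vectors are illustrative only.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 0.0, 4.0])

euclidean = np.linalg.norm(a - b)      # L2 distance
manhattan = np.abs(a - b).sum()        # L1 distance
# Angle-based (inner-product) similarity: cosine of the angle between a and b
cosine = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
print(euclidean, manhattan, cosine)
```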
K-means clustering (figure: Matlab help page)
• Find K partitions with the total intra-cluster variance minimized
• Iterative method
– Initialization: randomized y_i
– Assignment of x (y_i fixed)
– Update of y_i (x fixed)
• Problem? Trapped in local minima (MacKay, 2003)
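A minimal NumPy sketch of the iterative procedure just outlined (random initialization of the centers y_i, assignment of each x, update of y_i); the function name and defaults are my own.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]   # init: random y_i
    for _ in range(n_iter):
        # Assignment step: each x goes to its nearest center (y_i fixed)
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each center becomes the mean of its points (x fixed)
        for i in range(k):
            if np.any(labels == i):
                centers[i] = X[labels == i].mean(axis=0)
    return centers, labels

# Because the initialization is random, the result can indeed be trapped in a
# local minimum, which motivates the deterministic annealing approach below.
```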
Deterministic Annealing (DA)
• Deterministically avoid local minima
• No stochastic process (random walk)
• Trace the global solution by changing the level of randomness
• Statistical mechanics (figure: Maxima and Minima, Wikipedia)
– Gibbs distribution
– Helmholtz free energy F = D − TS
▪ Average energy D = <E_x>
▪ Entropy S = − Σ_x P(E_x) ln P(E_x)
▪ F = − T ln Z
• In DA, we minimize F
Analogy to the physical annealing process
• Control energy (randomness) by temperature (high → low)
• Start with a high temperature (T = 1)
▪ Soft (or fuzzy) association probabilities
▪ Smooth cost function with one global minimum
• Lower the temperature (T → 0)
▪ Hard association
▪ The full complexity is revealed; clusters emerge
• Minimize F iteratively, using E(x, y_j) = ||x − y_j||^2
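A rough sketch of deterministic-annealing clustering under the Gibbs distribution with E(x, y_j) = ||x − y_j||^2; the cooling schedule and inner-loop count are assumptions of mine, not the slides' exact update equations.

```python
import numpy as np

def da_cluster(X, k, T_start=1.0, T_min=1e-3, cooling=0.9, seed=0):
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Start with all centers near the data mean; they separate as T is lowered
    centers = X.mean(axis=0) + 1e-3 * rng.standard_normal((k, X.shape[1]))
    T = T_start
    while T > T_min:
        for _ in range(20):                      # inner loop at fixed temperature
            d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
            # Gibbs distribution: soft association probabilities P(y_j | x)
            p = np.exp(-(d2 - d2.min(axis=1, keepdims=True)) / T)
            p /= p.sum(axis=1, keepdims=True)
            # Centroid update: weighted mean under the soft associations
            centers = (p.T @ X) / (p.sum(axis=0)[:, None] + 1e-12)
        T *= cooling                             # lower the temperature: associations harden
    return centers
```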
Dimension Reduction
• Definition
Process of transforming high-dimensional data into low-dimensional data, to improve accuracy or understanding, or to remove noise
• Curse of dimensionality
Complexity grows exponentially in volume as extra dimensions are added
• Types (Koppen, 2000)
– Feature selection: choose representatives (e.g., filter, …)
– Feature extraction: map to a lower dimension (e.g., PCA, MDS, …)
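A minimal sketch of feature extraction by PCA (one of the methods named above): project the centered data onto the top-d principal directions.

```python
import numpy as np

def pca_project(X, d):
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # center the data
    # Principal directions from the SVD of the centered data matrix
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:d].T                         # low-dimensional representation
```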
Machine Learning in a
Nutshell
• Tens of thousands of machine learning
algorithms
• Hundreds of new ones every year
• Every machine learning algorithm has
three components:
– Representation
– Evaluation
– Optimization
Generative vs. Discriminative
Classifiers
Generative models
• Represent both the data and the labels
• Often make use of conditional independence and priors
• Examples: Naïve Bayes classifier, Bayesian network
• Models of the data may apply to future prediction problems

Discriminative models
• Learn to directly predict the labels from the data
• Often assume a simple boundary (e.g., linear)
• Examples: logistic regression, SVM, boosted decision trees
• Often easier to predict a label from the data than to model the data
Slide credit: D. Hoiem
Classifiers: Logistic Regression
• Maximize the likelihood of the label given the data, assuming a log-linear model
[Figure: two classes (male, female) in a 2-D feature space; x1 = pitch of voice]
log [ P(x1, x2 | y = 1) / P(x1, x2 | y = −1) ] = w^T x
P(y = 1 | x1, x2) = 1 / (1 + exp(−w^T x))
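A minimal sketch of the model above: the posterior P(y = 1 | x) = 1 / (1 + exp(−w^T x)), trained by a few gradient-ascent steps on the log-likelihood; the learning rate and iteration count are arbitrary choices of mine.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.1, n_iter=1000):
    """X: (n, d) features (add a constant column for a bias term), y in {0, 1}."""
    X, y = np.asarray(X, dtype=float), np.asarray(y, dtype=float)
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = sigmoid(X @ w)                       # current P(y = 1 | x)
        w += lr * X.T @ (y - p) / len(y)         # gradient of the log-likelihood
    return w
```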
Classifiers: Nearest neighbor
[Figure: training examples from class 1 and class 2, and a test example to be classified.]
f(x) = label of the training example nearest to x
All we need is a distance function for our inputs
No training required!
Slide credit: L. Lazebnik
Nearest Neighbor Classifier
• Assign label of nearest training data point to each test data point
[Figure: partitioning of the feature space for two-category 2-D and 3-D data.]
Source: D. Lowe
K-nearest neighbor
• It can be used for both classification and regression problems.
• However, it is more widely used in classification problems in the
industry.
• K-nearest neighbours is a simple algorithm that
– stores all available cases and
– classifies new cases by a majority vote of its k neighbours.
– The case is assigned to the class most common amongst its K nearest neighbours, measured by a distance function.
– These distance functions can be Euclidean, Manhattan, Minkowski, or Hamming distance.
• The first three are used for continuous variables and
• the fourth (Hamming) for categorical variables.
– If K = 1, then the case is simply assigned to the class of its nearest neighbour.
– At times, choosing K turns out to be a challenge when performing KNN modelling.
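A minimal sketch of K-nearest-neighbour classification by majority vote with Euclidean distance; with k = 1 it reduces to the nearest-neighbour rule f(x) from the earlier slide. The function names are my own.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    X_train = np.asarray(X_train, dtype=float)
    dists = np.linalg.norm(X_train - np.asarray(x, dtype=float), axis=1)  # Euclidean
    nearest = np.argsort(dists)[:k]               # indices of the k closest cases
    votes = Counter(y_train[i] for i in nearest)  # majority vote of the neighbours
    return votes.most_common(1)[0][0]
```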
Naïve Bayes
• Bayes' theorem provides a way to calculate
– the posterior probability P(c|x) from P(c), P(x), and P(x|c):
P(c|x) = P(x|c) P(c) / P(x)
– Here,
– P(c|x) is the posterior probability of class (target)
given predictor (attribute).
– P(c) is the prior probability of class.
– P(x|c) is the likelihood which is the probability of predictor given class.
– P(x) is the prior probability of predictor
Naïve Bayes Example
• Let's understand it using an example.
– We have a training data set of weather and the corresponding target variable 'Play'.
– We need to classify whether players will play or not based on the weather conditions.
– Let's follow the steps below.
• Step 1: Convert the data set to frequency table
• Step 2: Create Likelihood table by finding the probabilities like
– Overcast probability = 0.29 and
– probability of playing is 0.64.
Naïve Bayes
– Step 3: Now, use Naive Bayesian equation to calculate the posterior
probability for each class.
• The class with the highest posterior probability is the outcome of prediction.
• Problem:
– Players will play if the weather is sunny. Is this statement correct?
– We can solve it using the method discussed above:
• P(Yes | Sunny) = P(Sunny | Yes) * P(Yes) / P(Sunny)
• Here we have P(Sunny | Yes) = 3/9 = 0.33,
P(Sunny) = 5/14 = 0.36,
P(Yes) = 9/14 = 0.64
• Now, P(Yes | Sunny) = 0.33 * 0.64 / 0.36 = 0.60,
• which is the higher probability.
• Naïve Bayes uses a similar method to
– predict the probability of different classes based on various attributes.
– This algorithm is mostly used in text classification and
– in problems having multiple classes.
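The slide's worked example, written out as a few lines of arithmetic using the frequencies quoted above:

```python
# P(Yes | Sunny) = P(Sunny | Yes) * P(Yes) / P(Sunny)
p_sunny_given_yes = 3 / 9     # 0.33
p_yes = 9 / 14                # 0.64
p_sunny = 5 / 14              # 0.36

p_yes_given_sunny = p_sunny_given_yes * p_yes / p_sunny
print(round(p_yes_given_sunny, 2))   # 0.6, so "play" is the more probable class
```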
EM algorithm
• Problems in ML estimation
– The observation X is often not complete
– A latent (hidden) variable Z exists
– It is hard to explore the whole parameter space
• Expectation-Maximization algorithm
– Objective: find the ML estimate of θ over the latent distribution P(Z | X, θ)
– Steps
0. Init – choose a random θ_old
1. E-step – compute the expectation under P(Z | X, θ_old)
2. M-step – find θ_new which maximizes the expected likelihood
3. Go to step 1 after updating θ_old → θ_new
Problem
• Estimate the hidden parameters θ = {μ, σ} from the given data, drawn from k Gaussian distributions
• Gaussian distribution (Mitchell, 1997)
• Maximum likelihood
– With Gaussian densities (P = N), solve either by brute force or by a numerical method
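A minimal sketch of the EM loop for a 1-D mixture of k Gaussians, estimating θ = {μ, σ}; keeping the mixing weights uniform is my simplification for brevity, not part of the slides.

```python
import numpy as np

def em_gaussian_mixture(x, k, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    mu = rng.choice(x, size=k, replace=False)       # init: random theta_old
    sigma = np.full(k, x.std() + 1e-6)
    for _ in range(n_iter):
        # E-step: responsibilities P(Z | X, theta_old)
        dens = np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: theta_new maximizing the expected log-likelihood
        nk = resp.sum(axis=0)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk) + 1e-6
    return mu, sigma
```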