Lecture 16 - Classification


Supervised Learning - Classification


Techniques used in ML applications

Supervised learning:
• Defined by its use of labeled datasets.
• The datasets are designed to train or "supervise" algorithms into classifying data or predicting outcomes accurately.
• Using labeled inputs and outputs, the model can measure its accuracy and learn over time.

Supervised learning covers two kinds of tasks: regression and classification.
• Classification techniques include logistic regression, support vector machines, decision trees, and random forests.
Background
• There are three methods to establish a classifier:
a) Model a classification rule directly.
   Examples: k-NN, decision trees, perceptron, SVM
b) Model the probability of class memberships given input data.
   Example: feedforward ANN (multi-layer perceptron)
c) Make a probabilistic model of the data within each class.
   Examples: naive Bayes, model-based classifiers
• a) and b) are examples of discriminative classification.
• c) is an example of generative classification.
• b) and c) are both examples of probabilistic classification.
Learning a Logistic Regression Model
• How do we learn a logistic regression model $h_\theta(x) = g(\theta^T x)$, where $\theta = [\theta_0, \ldots, \theta_m]$ and $x = [x_0, \ldots, x_m]$?
• By minimizing the following cost function:

$$\mathrm{Cost}(h_\theta(x), y) = -y \log \frac{1}{1 + e^{-\theta^T x}} - (1 - y) \log\left(1 - \frac{1}{1 + e^{-\theta^T x}}\right)$$

• That is:

$$\underset{\theta}{\text{minimize}} \;\; \frac{1}{n} \sum_{i=1}^{n} \mathrm{Cost}(h_\theta(x^{(i)}), y^{(i)})$$

which, written out, gives the cost function $J(\theta)$:

$$J(\theta) = \frac{1}{n} \sum_{i=1}^{n} \left[ -y^{(i)} \log \frac{1}{1 + e^{-\theta^T x^{(i)}}} - (1 - y^{(i)}) \log\left(1 - \frac{1}{1 + e^{-\theta^T x^{(i)}}}\right) \right]$$
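As a quick illustration, here is a minimal Python sketch of this cost function (NumPy assumed; the names sigmoid, cost_J, X, y, and theta are illustrative, not from the slides):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_J(theta, X, y):
    """J(theta) for a design matrix X of shape (n, m+1) whose first column
    is all ones, and binary labels y of shape (n,)."""
    h = sigmoid(X @ theta)  # h_theta(x^(i)) for every example
    # Average cross-entropy over the n training examples
    return np.mean(-y * np.log(h) - (1 - y) * np.log(1 - h))
```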
Gradient Descent For Logistic Regression
• Outline:
  • Have cost function $J(\theta)$, where $\theta = [\theta_0, \ldots, \theta_m]$.
  • Start off with some guesses for $\theta_0, \ldots, \theta_m$. It does not really matter what values you start off with, but a common choice is to set them all initially to zero.
  • Repeat until convergence:

$$\theta_j := \theta_j - \alpha \frac{\partial J(\theta)}{\partial \theta_j}$$

• Note: update all $\theta_j$ simultaneously. Here $\frac{\partial J(\theta)}{\partial \theta_j}$ is the partial derivative of the cost with respect to $\theta_j$, and $\alpha$ is the learning rate, which controls how big a step we take when we update $\theta_j$.
• Applying the partial derivatives yields the final update formula (note that the $\frac{1}{n}$ factor from $J(\theta)$ is folded into the learning rate here, matching the worked example below):

$$\theta_j := \theta_j - \alpha \sum_{i=1}^{n} \left( \frac{1}{1 + e^{-\theta^T x^{(i)}}} - y^{(i)} \right) x_j^{(i)}$$
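A compact sketch of this loop in Python (NumPy assumed; a fixed iteration count stands in for a real convergence test, and all names are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_descent(X, y, alpha=0.5, n_iters=100):
    """Batch gradient descent for logistic regression.
    X: (n, m+1) design matrix with a leading column of ones; y: (n,) labels."""
    theta = np.zeros(X.shape[1])        # common choice: start at all zeros
    for _ in range(n_iters):
        errors = sigmoid(X @ theta) - y         # (h_theta(x^(i)) - y^(i)) per example
        theta = theta - alpha * (X.T @ errors)  # update all theta_j simultaneously
    return theta
```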
Inference After Learning
• After learning the parameters $\theta = [\theta_0, \ldots, \theta_m]$, we can predict the output of any new unseen $x = [x_0, \ldots, x_m]$ as follows:

$$\text{If } h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}} < 0.5, \text{ predict } 0; \qquad \text{else if } h_\theta(x) \geq 0.5, \text{ predict } 1.$$
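In code, this thresholding step is a one-liner (a self-contained sketch; the helper name predict is illustrative):

```python
import numpy as np

def predict(theta, x):
    """Predict 0 or 1 for a single feature vector x (with x_0 = 1)."""
    h = 1.0 / (1.0 + np.exp(-(theta @ x)))  # h_theta(x)
    return 1 if h >= 0.5 else 0
```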
A Concrete Example: The Training Phase
• Let us apply logistic regression to the spam email recognition problem, assuming $\alpha = 0.5$ and starting with $\theta = [0, 0, 0, 0, 0, 0]$.

A Training Dataset:

|         | and | vaccine | the | of | nigeria | y |
|---------|-----|---------|-----|----|---------|---|
| Email a | 1   | 1       | 0   | 1  | 1       | 1 |
| Email b | 0   | 0       | 1   | 1  | 0       | 0 |
| Email c | 0   | 1       | 1   | 0  | 0       | 1 |
| Email d | 1   | 0       | 0   | 1  | 0       | 0 |
| Email e | 1   | 0       | 1   | 0  | 1       | 1 |
| Email f | 1   | 0       | 1   | 1  | 0       | 0 |

• A 1 means that a word (e.g., "and") is present in an email (e.g., "Email a"); a 0 means that it is absent (e.g., "and" in "Email b").
• There are 5 words (or features): $x_1$ = and, $x_2$ = vaccine, $x_3$ = the, $x_4$ = of, $x_5$ = nigeria.
• We define 6 parameters; the first one, $\theta_0$, is the intercept.
• The parameter vector is $\theta = [\theta_0, \theta_1, \theta_2, \theta_3, \theta_4, \theta_5]$ and the feature vector is $x = [x_0, x_1, x_2, x_3, x_4, x_5]$, where $x_0 = 1$ is added to every example to account for the intercept:

|         | x0 = 1 | x1 = and | x2 = vaccine | x3 = the | x4 = of | x5 = nigeria | y |
|---------|--------|----------|--------------|----------|---------|--------------|---|
| Email a | 1      | 1        | 1            | 0        | 1       | 1            | 1 |
| Email b | 1      | 0        | 0            | 1        | 1       | 0            | 0 |
| Email c | 1      | 0        | 1            | 1        | 0       | 0            | 1 |
| Email d | 1      | 1        | 0            | 0        | 1       | 0            | 0 |
| Email e | 1      | 1        | 0            | 1        | 0       | 1            | 1 |
| Email f | 1      | 1        | 0            | 1        | 1       | 0            | 0 |
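As a sketch, this dataset can be written as NumPy arrays with the bias column prepended (the names X_words, X, and y are illustrative):

```python
import numpy as np

# Columns: and, vaccine, the, of, nigeria
X_words = np.array([
    [1, 1, 0, 1, 1],   # Email a
    [0, 0, 1, 1, 0],   # Email b
    [0, 1, 1, 0, 0],   # Email c
    [1, 0, 0, 1, 0],   # Email d
    [1, 0, 1, 0, 1],   # Email e
    [1, 0, 1, 1, 0],   # Email f
])
y = np.array([1, 0, 1, 0, 1, 0])  # spam labels

# Prepend x_0 = 1 to every row to account for the intercept theta_0
X = np.hstack([np.ones((X_words.shape[0], 1)), X_words])
```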


Recap: Gradient Descent For Logistic Regression
• Repeat until convergence:

$$\theta_j := \theta_j - \alpha \sum_{i=1}^{n} \left( \frac{1}{1 + e^{-\theta^T x^{(i)}}} - y^{(i)} \right) x_j^{(i)}$$

• First, let us calculate the factor $\frac{1}{1 + e^{-\theta^T x^{(i)}}} - y^{(i)}$ for every example in our training dataset.
A Concrete Example: The Training Phase
• With $\theta = [0, 0, 0, 0, 0, 0]$ we get $\theta^T x = 0$ for every example, so $\frac{1}{1 + e^{-\theta^T x}} = \frac{1}{1 + e^{0}} = 0.5$ throughout:

| $x$ | $y$ | $\theta^T x$ | $(\frac{1}{1 + e^{-\theta^T x}} - y) \times x_0$ |
|---|---|---|---|
| [1,1,1,0,1,1] | 1 | [0,0,0,0,0,0]·[1,1,1,0,1,1] = 0 | (0.5 − 1) × 1 = −0.5 |
| [1,0,0,1,1,0] | 0 | [0,0,0,0,0,0]·[1,0,0,1,1,0] = 0 | (0.5 − 0) × 1 = 0.5 |
| [1,0,1,1,0,0] | 1 | [0,0,0,0,0,0]·[1,0,1,1,0,0] = 0 | (0.5 − 1) × 1 = −0.5 |
| [1,1,0,0,1,0] | 0 | [0,0,0,0,0,0]·[1,1,0,0,1,0] = 0 | (0.5 − 0) × 1 = 0.5 |
| [1,1,0,1,0,1] | 1 | [0,0,0,0,0,0]·[1,1,0,1,0,1] = 0 | (0.5 − 1) × 1 = −0.5 |
| [1,1,0,1,1,0] | 0 | [0,0,0,0,0,0]·[1,1,0,1,1,0] = 0 | (0.5 − 0) × 1 = 0.5 |

• Second, let us calculate the whole term $\left(\frac{1}{1 + e^{-\theta^T x^{(i)}}} - y^{(i)}\right) x_j^{(i)}$ for every example and for every $\theta_j$, where $j$ runs from 0 to $m$.
• Third, let us sum these terms over the examples and compute every new $\theta_j$.
• Since $x_0 = 1$ for every email, the rightmost column above already covers $j = 0$. Repeating the computation for $j = 1, \ldots, 5$ and applying the update $\theta_j = \theta_j - \alpha \sum_i \left(\frac{1}{1 + e^{-\theta^T x^{(i)}}} - y^{(i)}\right) x_j^{(i)}$ with $\alpha = 0.5$:

| $j$ | $(\frac{1}{1 + e^{-\theta^T x^{(i)}}} - y^{(i)}) x_j^{(i)}$ for Emails a–f | Sum | New $\theta_j = \theta_j - \alpha \times$ Sum |
|---|---|---|---|
| 0 | −0.5, 0.5, −0.5, 0.5, −0.5, 0.5 | 0  | 0 − 0.5 × 0 = 0 |
| 1 | −0.5, 0, 0, 0.5, −0.5, 0.5      | 0  | 0 − 0.5 × 0 = 0 |
| 2 | −0.5, 0, −0.5, 0, 0, 0          | −1 | 0 − 0.5 × (−1) = 0.5 |
| 3 | 0, 0.5, −0.5, 0, −0.5, 0.5      | 0  | 0 − 0.5 × 0 = 0 |
| 4 | −0.5, 0.5, 0, 0.5, 0, 0.5       | 1  | 0 − 0.5 × 1 = −0.5 |
| 5 | −0.5, 0, 0, 0, −0.5, 0          | −1 | 0 − 0.5 × (−1) = 0.5 |

• New parameter vector: $\theta = [0, 0, 0.5, 0, -0.5, 0.5]$.
• This completes one iteration of gradient descent; training would normally repeat the updates until convergence, but this example stops here.
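The whole iteration above can be reproduced in a few lines of Python (a sketch under the same assumptions: $\alpha = 0.5$ and $\theta$ initialized to zeros):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Design matrix with x_0 = 1 prepended; columns: x0, and, vaccine, the, of, nigeria
X = np.array([
    [1, 1, 1, 0, 1, 1],   # Email a
    [1, 0, 0, 1, 1, 0],   # Email b
    [1, 0, 1, 1, 0, 0],   # Email c
    [1, 1, 0, 0, 1, 0],   # Email d
    [1, 1, 0, 1, 0, 1],   # Email e
    [1, 1, 0, 1, 1, 0],   # Email f
])
y = np.array([1, 0, 1, 0, 1, 0])

alpha = 0.5
theta = np.zeros(6)

# One gradient-descent iteration, updating all theta_j simultaneously
errors = sigmoid(X @ theta) - y         # the +/-0.5 factors from the table
theta = theta - alpha * (X.T @ errors)  # X.T @ errors gives the six column sums
print(theta)                            # [ 0.   0.   0.5  0.  -0.5  0.5]
```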
A Concrete Example: Testing
• Let us now test logistic regression on the spam email recognition problem, using the just-learnt $\theta = [0, 0, 0.5, 0, -0.5, 0.5]$.
• Note: Testing is typically done over a portion of the dataset that is not used during training, but rather kept aside only for testing the accuracy of the algorithm's predictions.
• In this example, we will test over all the examples that we used during training, just for illustrative purposes.
• Decision rule: if $h_\theta(x) \geq 0.5$, $y' = 1$; else $y' = 0$.

| $x$ | $y$ | $\theta^T x$ | $h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}}$ | Predicted class (or $y'$) |
|---|---|---|---|---|
| [1,1,1,0,1,1] | 1 | [0,0,0.5,0,−0.5,0.5]·[1,1,1,0,1,1] = 0.5  | 0.622459331 | 1 |
| [1,0,0,1,1,0] | 0 | [0,0,0.5,0,−0.5,0.5]·[1,0,0,1,1,0] = −0.5 | 0.377540669 | 0 |
| [1,0,1,1,0,0] | 1 | [0,0,0.5,0,−0.5,0.5]·[1,0,1,1,0,0] = 0.5  | 0.622459331 | 1 |
| [1,1,0,0,1,0] | 0 | [0,0,0.5,0,−0.5,0.5]·[1,1,0,0,1,0] = −0.5 | 0.377540669 | 0 |
| [1,1,0,1,0,1] | 1 | [0,0,0.5,0,−0.5,0.5]·[1,1,0,1,0,1] = 0.5  | 0.622459331 | 1 |
| [1,1,0,1,1,0] | 0 | [0,0,0.5,0,−0.5,0.5]·[1,1,0,1,1,0] = −0.5 | 0.377540669 | 0 |

• No mispredictions!
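The same testing pass as a self-contained Python sketch (thresholding at 0.5, as in the decision rule above):

```python
import numpy as np

X = np.array([[1, 1, 1, 0, 1, 1], [1, 0, 0, 1, 1, 0], [1, 0, 1, 1, 0, 0],
              [1, 1, 0, 0, 1, 0], [1, 1, 0, 1, 0, 1], [1, 1, 0, 1, 1, 0]])
y = np.array([1, 0, 1, 0, 1, 0])
theta = np.array([0, 0, 0.5, 0, -0.5, 0.5])  # the just-learnt parameters

h = 1.0 / (1.0 + np.exp(-(X @ theta)))       # h_theta(x) per example
y_pred = (h >= 0.5).astype(int)              # decision rule
print(h.round(4))                            # [0.6225 0.3775 0.6225 0.3775 0.6225 0.3775]
print("accuracy:", np.mean(y_pred == y))     # 1.0 -> no mispredictions
```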
A Concrete Example: Inference
• Let us infer whether a given new email, say, k = [1, 0, 1, 0, 0, 1], is spam or not, using logistic regression with the just-learnt parameter vector $\theta = [0, 0, 0.5, 0, -0.5, 0.5]$.

|         | x0 = 1 | x1 = and | x2 = vaccine | x3 = the | x4 = of | x5 = nigeria | y |
|---------|--------|----------|--------------|----------|---------|--------------|---|
| Email a | 1      | 1        | 1            | 0        | 1       | 1            | 1 |
| Email b | 1      | 0        | 0            | 1        | 1       | 0            | 0 |
| Email c | 1      | 0        | 1            | 1        | 0       | 0            | 1 |
| Email d | 1      | 1        | 0            | 0        | 1       | 0            | 0 |
| Email e | 1      | 1        | 0            | 1        | 0       | 1            | 1 |
| Email f | 1      | 1        | 0            | 1        | 1       | 0            | 0 |
| Email k | 1      | 0        | 1            | 0        | 0       | 1            | ? |

Our Training Dataset, plus the new Email k
$$\theta^T x = [0, 0, 0.5, 0, -0.5, 0.5] \cdot [1, 0, 1, 0, 0, 1] = 0.5 \times 1 + 0.5 \times 1 = 1$$

$$h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}} = \frac{1}{1 + e^{-1}} = 0.731 \geq 0.5 \;\Rightarrow\; \text{Class 1 (i.e., Spam)}$$
• Hence, Email k gets the label y = 1 in the table above.
• Somewhat interesting: the model considered "vaccine" and "nigeria" indicative of spam!
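And the same inference step as a short Python sketch:

```python
import numpy as np

theta = np.array([0, 0, 0.5, 0, -0.5, 0.5])  # the just-learnt parameters
x_k = np.array([1, 0, 1, 0, 0, 1])           # Email k, with x_0 = 1 prepended

h = 1.0 / (1.0 + np.exp(-(theta @ x_k)))     # theta^T x = 1, so h = 0.731...
print("spam" if h >= 0.5 else "not spam")    # -> spam
```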
