Logistic Regression: Classification
Machine Learning
Slides from CS-229 by Andrew Ng
Classification
Email: Spam / Not Spam?
Online Transactions: Fraudulent (Yes / No)?
Tumor: Malignant / Benign?
$y \in \{0, 1\}$
0: “Negative Class” (e.g., benign tumor)
1: “Positive Class” (e.g., malignant tumor)
Andrew Ng
[Figure: two plots of “Malignant?” ((Yes) 1 / (No) 0) against Tumor Size. Annotation: the threshold will change because of these new points.]
Threshold classifier output $h_\theta(x)$ at 0.5:
If $h_\theta(x) \ge 0.5$, predict “y = 1”
If $h_\theta(x) < 0.5$, predict “y = 0”
Classification: y = 0 or 1
With linear regression, $h_\theta(x)$ can be > 1 or < 0
Logistic Regression: $0 \le h_\theta(x) \le 1$
Logistic Regression: Hypothesis Representation
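Logistic Regression Model
Want $0 \le h_\theta(x) \le 1$
$h_\theta(x) = g(\theta^T x)$, where $g(z) = \frac{1}{1 + e^{-z}}$
$g$ is the sigmoid function, also called the logistic function.
[Figure: plot of $g(z)$, rising from 0 toward 1 and crossing 0.5 at $z = 0$.]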
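The logistic hypothesis can be sketched in a few lines of Python, assuming the standard sigmoid $g(z) = 1 / (1 + e^{-z})$. The function names `sigmoid` and `hypothesis` are illustrative choices, not from the slides, and `x` is assumed to already contain the bias feature $x_0 = 1$.

```python
import math
from typing import Sequence


def sigmoid(z: float) -> float:
    """Logistic function g(z) = 1 / (1 + e^(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z use the equivalent form e^z / (1 + e^z),
    # so math.exp never receives a large positive argument.
    ez = math.exp(z)
    return ez / (1.0 + ez)


def hypothesis(theta: Sequence[float], x: Sequence[float]) -> float:
    """h_theta(x) = g(theta^T x); assumes x[0] == 1 (bias feature)."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return sigmoid(z)
```

Splitting on the sign of `z` is a design choice for numerical safety only; both branches compute the same function.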
Interpretation of Hypothesis Output
$h_\theta(x)$ = estimated probability that y = 1 on input x
Example: If $h_\theta(x) = 0.7$
Tell patient that there is a 70% chance of the tumor being malignant
$h_\theta(x) = P(y = 1 \mid x; \theta)$: “probability that y = 1, given x, parameterized by $\theta$”
Logistic Regression: Decision boundary
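Logistic regression
$h_\theta(x) = g(\theta^T x)$, where $g(z) = \frac{1}{1 + e^{-z}}$
[Figure: $g(z)$ plotted against $z$, rising from 0 to 1 and crossing 0.5 at $z = 0$.]
Suppose predict “y = 1” if $h_\theta(x) \ge 0.5$
Alternatively, if $z \ge 0$, i.e. $\theta^T x \ge 0$
predict “y = 0” if $h_\theta(x) < 0.5$
Alternatively, if $z < 0$, i.e. $\theta^T x < 0$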
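The equivalence between thresholding the hypothesis at 0.5 and thresholding $z = \theta^T x$ at 0 can be checked with a short derivation, assuming the standard sigmoid $g(z) = 1 / (1 + e^{-z})$:

```latex
\begin{aligned}
g(z) \ge 0.5
  &\iff \frac{1}{1 + e^{-z}} \ge \frac{1}{2} \\
  &\iff 1 + e^{-z} \le 2 \\
  &\iff e^{-z} \le 1 \\
  &\iff z \ge 0 .
\end{aligned}
```

So predicting “y = 1” needs only the sign of $\theta^T x$; the sigmoid itself does not have to be evaluated.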
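Decision Boundary
[Figure: the $x_1$-$x_2$ plane, both axes marked 1 to 3, with a linear decision boundary.]
Predict “y = 1” if $\theta^T x \ge 0$
Suppose $h_\theta(x) = g(\theta_0 + \theta_1 x_1 + \theta_2 x_2)$ with, for example, $\theta = [-3, 1, 1]^T$
Predict “y = 1” if $-3 + x_1 + x_2 \ge 0$, i.e.
$x_1 + x_2 \ge 3$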
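As a small worked example of a linear decision boundary, the sketch below classifies a few points with the rule $x_1 + x_2 \ge 3$. The parameter vector `THETA = (-3, 1, 1)` is an assumed illustrative value that produces this boundary, and the sample points are made up for the example.

```python
from typing import Sequence

# Assumed illustrative parameters: theta^T x = -3 + x1 + x2.
THETA = (-3.0, 1.0, 1.0)


def predict(theta: Sequence[float], x1: float, x2: float) -> int:
    """Return 1 if theta^T x >= 0 (equivalently h_theta(x) >= 0.5), else 0."""
    z = theta[0] * 1.0 + theta[1] * x1 + theta[2] * x2  # x_0 = 1 is the bias feature
    return 1 if z >= 0 else 0


# Hypothetical sample points; (2, 1) and (1.5, 1.5) lie exactly on the boundary.
points = [(1.0, 1.0), (2.0, 1.0), (1.5, 1.5), (3.0, 2.0)]
labels = [predict(THETA, x1, x2) for x1, x2 in points]
print(labels)  # [0, 1, 1, 1]
```

Points on the line $x_1 + x_2 = 3$ have $\theta^T x = 0$, so they fall on the “y = 1” side of the rule.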
Logistic Regression: Cost function
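Training set: $\{(x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), \dots, (x^{(m)}, y^{(m)})\}$
m examples, with $x = [x_0, x_1, \dots, x_n]^T$, $x_0 = 1$, $y \in \{0, 1\}$
$h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}}$
How to choose parameters $\theta$?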
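Cost function
Linear regression: $J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \frac{1}{2} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$
$\mathrm{Cost}(h_\theta(x), y) = \frac{1}{2} \left( h_\theta(x) - y \right)^2$
[Figure: two sketches of $J(\theta)$ against $\theta$, labeled “non-convex” and “convex”.]
With the sigmoid hypothesis, this squared-error cost is “non-convex” in $\theta$; logistic regression needs a “convex” cost.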
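Logistic regression cost function
$\mathrm{Cost}(h_\theta(x), y) = \begin{cases} -\log(h_\theta(x)) & \text{if } y = 1 \\ -\log(1 - h_\theta(x)) & \text{if } y = 0 \end{cases}$
If y = 1: Cost = 0 when $h_\theta(x) = 1$, and Cost $\to \infty$ as $h_\theta(x) \to 0$.
[Figure: $-\log(h_\theta(x))$ plotted for $h_\theta(x)$ from 0 to 1.]
If y = 0: Cost = 0 when $h_\theta(x) = 0$, and Cost $\to \infty$ as $h_\theta(x) \to 1$.
[Figure: $-\log(1 - h_\theta(x))$ plotted for $h_\theta(x)$ from 0 to 1.]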
Logistic Regression: Simplified cost function
Logistic regression cost function
$\mathrm{Cost}(h_\theta(x), y) = -y \log(h_\theta(x)) - (1 - y) \log(1 - h_\theta(x))$
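Because y is always 0 or 1, the single-expression cost reduces to $-\log(h_\theta(x))$ when y = 1 and to $-\log(1 - h_\theta(x))$ when y = 0. A minimal sketch comparing the two forms follows; the function names are illustrative, and `h` is assumed to lie strictly between 0 and 1 so the logarithms are defined.

```python
import math


def cost_piecewise(h: float, y: int) -> float:
    """Two-case form: -log(h) if y == 1, -log(1 - h) if y == 0."""
    return -math.log(h) if y == 1 else -math.log(1.0 - h)


def cost_combined(h: float, y: int) -> float:
    """Single-expression form: -y*log(h) - (1 - y)*log(1 - h)."""
    return -y * math.log(h) - (1 - y) * math.log(1.0 - h)


# The two forms agree for both labels; a confident wrong prediction is costly.
for h in (0.01, 0.3, 0.7, 0.99):
    for y in (0, 1):
        assert abs(cost_piecewise(h, y) - cost_combined(h, y)) < 1e-12
```

For instance, with y = 1 the cost at h = 0.99 is close to zero, while the cost at h = 0.01 is much larger.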
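Logistic regression cost function
$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log\left(1 - h_\theta(x^{(i)})\right) \right]$
To fit parameters $\theta$: $\min_\theta J(\theta)$
To make a prediction given new $x$:
Output $h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}}$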
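The averaged cost $J(\theta)$ over a training set can be sketched as below. The tiny data set (a bias feature plus one made-up feature) and the two candidate parameter vectors are assumptions for illustration only. The code assumes every $h_\theta(x^{(i)})$ stays strictly between 0 and 1.

```python
import math
from typing import Sequence


def sigmoid(z: float) -> float:
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def hypothesis(theta: Sequence[float], x: Sequence[float]) -> float:
    """h_theta(x) = g(theta^T x); x[0] is assumed to be the bias feature 1."""
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))


def cost_function(theta, X, y) -> float:
    """J(theta) = -(1/m) * sum(y*log(h) + (1 - y)*log(1 - h))."""
    total = 0.0
    for xi, yi in zip(X, y):
        h = hypothesis(theta, xi)
        total += yi * math.log(h) + (1 - yi) * math.log(1.0 - h)
    return -total / len(X)


# Hypothetical training set: x = [1, feature], labels 0/1.
X = [[1.0, 1.0], [1.0, 2.0], [1.0, 4.0], [1.0, 5.0]]
y = [0, 0, 1, 1]

j_zero = cost_function([0.0, 0.0], X, y)   # every h is 0.5, so J = log(2)
j_good = cost_function([-3.0, 1.0], X, y)  # boundary at feature = 3
```

With all parameters at zero every prediction is 0.5, so the cost equals $\log 2$ regardless of the labels; a parameter vector that separates the two classes gives a lower cost.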
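Gradient Descent
$J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log\left(1 - h_\theta(x^{(i)})\right) \right]$
Want $\min_\theta J(\theta)$:
Repeat {
    $\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta)$
} (simultaneously update all $\theta_j$)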
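The partial derivative used in the update can be worked out as follows. This is a sketch that assumes the cross-entropy cost $J(\theta) = -\frac{1}{m} \sum_i [y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log(1 - h_\theta(x^{(i)}))]$, the hypothesis $h_\theta(x) = g(\theta^T x)$, and the sigmoid identity $g'(z) = g(z)(1 - g(z))$:

```latex
\begin{aligned}
\frac{\partial h_\theta(x)}{\partial \theta_j}
  &= h_\theta(x)\,\bigl(1 - h_\theta(x)\bigr)\, x_j \\[4pt]
\frac{\partial J(\theta)}{\partial \theta_j}
  &= -\frac{1}{m} \sum_{i=1}^{m}
     \left[ \frac{y^{(i)}}{h_\theta(x^{(i)})}
          - \frac{1 - y^{(i)}}{1 - h_\theta(x^{(i)})} \right]
     h_\theta(x^{(i)})\,\bigl(1 - h_\theta(x^{(i)})\bigr)\, x_j^{(i)} \\
  &= -\frac{1}{m} \sum_{i=1}^{m}
     \left[ y^{(i)} \bigl(1 - h_\theta(x^{(i)})\bigr)
          - \bigl(1 - y^{(i)}\bigr)\, h_\theta(x^{(i)}) \right] x_j^{(i)} \\
  &= \frac{1}{m} \sum_{i=1}^{m}
     \bigl( h_\theta(x^{(i)}) - y^{(i)} \bigr)\, x_j^{(i)} .
\end{aligned}
```

The result has the same form as the gradient of the squared-error cost in linear regression, with a different $h_\theta$.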
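Gradient Descent
Want $\min_\theta J(\theta)$:
Repeat {
    $\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$
} (simultaneously update all $\theta_j$)
Algorithm looks identical to linear regression!
What’s the difference then?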
Recall
Linear regression: $h_\theta(x) = \sum_{i=0}^{n} \theta_i x_i = \theta^T x$
Logistic regression: $h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}}$
Gradient Descent
Algorithm looks identical to linear regression!
The hypothesis $h_\theta(x)$ has changed now!
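A minimal batch gradient descent sketch for logistic regression is shown below, assuming the update $\theta_j := \theta_j - \alpha \frac{1}{m} \sum_i (h_\theta(x^{(i)}) - y^{(i)}) x_j^{(i)}$. The learning rate, iteration count, and toy data are illustrative assumptions, not values from the slides. The only line that differs from linear regression is the one computing the hypothesis.

```python
import math
from typing import List, Sequence


def sigmoid(z: float) -> float:
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def hypothesis(theta: Sequence[float], x: Sequence[float]) -> float:
    # Linear regression would return the dot product itself;
    # logistic regression passes it through the sigmoid.
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))


def gradient_descent(X, y, alpha: float = 0.1, iterations: int = 5000) -> List[float]:
    """Batch gradient descent on the logistic regression cost J(theta)."""
    m, n = len(X), len(X[0])
    theta = [0.0] * n
    for _ in range(iterations):
        errors = [hypothesis(theta, xi) - yi for xi, yi in zip(X, y)]
        grad = [sum(e * xi[j] for e, xi in zip(errors, X)) / m for j in range(n)]
        # Build the new vector from the old one so all theta_j update simultaneously.
        theta = [t - alpha * g for t, g in zip(theta, grad)]
    return theta


# Hypothetical training set: x = [1, feature], labels 0/1.
X = [[1.0, 1.0], [1.0, 2.0], [1.0, 4.0], [1.0, 5.0]]
y = [0, 0, 1, 1]
theta = gradient_descent(X, y)
predictions = [1 if hypothesis(theta, xi) >= 0.5 else 0 for xi in X]
```

Computing the whole gradient before touching `theta` is what makes the update simultaneous; updating one component at a time inside the loop would mix old and new parameter values.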
Logistic Regression (Binary Classification)
Gradient descent on $J(\theta)$ fits a single classifier for $y \in \{0, 1\}$.
How to extend for multi-class classification?
Logistic Regression: Multi-class classification: One-vs-all
Multiclass classification
Email tagging: Work, Friends, Family, Hobby, etc.
Medical diagnosis: Not ill, Cold, Flu, etc.
Weather: Sunny, Cloudy, Rain, Snow, etc.
Images: Cat, Table, Person, etc.
[Figure: two scatter plots in the $x_1$-$x_2$ plane, titled “Binary classification” and “Multi-class classification”.]
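One-vs-all (one-vs-rest):
[Figure: a three-class scatter plot in the $x_1$-$x_2$ plane, followed by three binary plots, one each for Class 1, Class 2, and Class 3, in which that class is separated from all the others.]
$h_\theta^{(i)}(x) = P(y = i \mid x; \theta)$ $(i = 1, 2, 3)$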
One-vs-all
Train a logistic regression classifier $h_\theta^{(i)}(x)$ for each class $i$ to predict the probability that $y = i$.
On a new input $x$, to make a prediction, pick the class $i$ that maximizes $h_\theta^{(i)}(x)$.
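The one-vs-all procedure can be sketched as follows: train one binary logistic regression classifier per class (that class relabeled as 1, all others as 0), then predict the class whose classifier gives the highest probability. The helper names, the learning rate, the iteration count, and the toy three-class data are all assumptions for illustration.

```python
import math
from typing import Dict, List, Sequence


def sigmoid(z: float) -> float:
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def hypothesis(theta: Sequence[float], x: Sequence[float]) -> float:
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))


def train_binary(X, y, alpha: float = 0.1, iterations: int = 2000) -> List[float]:
    """Batch gradient descent for one binary logistic regression classifier."""
    m, n = len(X), len(X[0])
    theta = [0.0] * n
    for _ in range(iterations):
        errors = [hypothesis(theta, xi) - yi for xi, yi in zip(X, y)]
        grad = [sum(e * xi[j] for e, xi in zip(errors, X)) / m for j in range(n)]
        theta = [t - alpha * g for t, g in zip(theta, grad)]
    return theta


def train_one_vs_all(X, labels, classes) -> Dict[str, List[float]]:
    """Train one classifier per class: that class is 1, every other class is 0."""
    return {c: train_binary(X, [1 if l == c else 0 for l in labels]) for c in classes}


def predict_one_vs_all(models: Dict[str, List[float]], x: Sequence[float]) -> str:
    """Pick the class whose classifier outputs the largest h(x)."""
    return max(models, key=lambda c: hypothesis(models[c], x))


# Hypothetical data: x = [1, x1, x2], three well-separated classes.
X = [[1.0, 0.0, 2.0], [1.0, 0.0, 3.0],
     [1.0, -2.0, -1.0], [1.0, -3.0, -2.0],
     [1.0, 2.0, -1.0], [1.0, 3.0, -2.0]]
labels = ["A", "A", "B", "B", "C", "C"]
models = train_one_vs_all(X, labels, classes=["A", "B", "C"])
train_predictions = [predict_one_vs_all(models, xi) for xi in X]
```

The classifier outputs are compared directly rather than thresholded, so a prediction is always produced even when no single classifier exceeds 0.5.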