Lecture 6 Logistic Regression

The document provides an overview of logistic regression, a statistical method used for predicting binary outcomes based on continuous independent variables. It explains the logistic function, logit transformation, and the interpretation of parameters in the model, as well as applications in classification tasks. Additionally, it discusses types of logistic regression, including binary, multinomial, and ordinal, and introduces the concept of gradient descent for minimizing cost functions.


Logistic Regression

Dr. Dinesh Kumar Vishwakarma


Professor,
Department of Information Technology,
Delhi Technological University, Delhi
Logistic Regression: Intro
 Logistic regression extends the ideas of linear
regression to the situation where the dependent
variable, 𝑌 , is categorical.
 Now suppose the dependent variable y is binary.
 It takes on two values “Success” (1) or “Failure” (0)
 We are interested in predicting y from a continuous
independent variable x.
 This is the situation in which Logistic Regression is
used.
Logistic Regression: Linear vs Logistic
[Figure: comparison of a straight-line linear fit and an S-shaped logistic fit to binary outcome data]
Example
 Based on a student's UG CGPA, will the student get admission to a PG program? Yes/No.
 The values of y are 1 (Success) or 0 (Failure); the values of x range over a continuum. Another binary example: raining or not.
 A categorical variable can divide the observations into classes: for a stock, holding/selling/buying gives a categorical variable with 3 categories, the "hold" class, the "sell" class, and the "buy" class.
 It can be used for classifying a new observation into
one of the classes, based on the values of its predictor
variables (called “classification").
Applications
 Logistic regression is used in applications such as:
 Classifying customers as returning or non-returning
(classification)
 Finding factors that differentiate between male and female
top executives (profiling)
 Predicting the approval or disapproval of a loan based on
information such as credit scores (classification).
 Popular examples of binary response outcomes are
 success/failure, yes/no, buy/don't buy, default/don't default,
and survive/die.
 We code the values of a binary response Y as 0 and 1.
Introduction Logistic Regression
 Most important model for categorical
response (yi) data
 Categorical response with 2 levels (binary: 0
and 1)
 Categorical response with ≥ 3 levels (nominal
or ordinal)
 Predictor variables (xi) can take on any form:
binary, categorical, and/or continuous.
Logistic Curve
[Figure: S-shaped ("sigmoid") logistic curve; Probability from 0.0 to 1.0 on the y-axis versus x from 1 to 21 on the x-axis]
Logistic Function

$$P(\text{"Success"} \mid X) = \frac{e^{\beta_0 + \beta_1 X}}{1 + e^{\beta_0 + \beta_1 X}}$$

[Figure: P("Success"|X) from 0.0 to 1.0 plotted against X]
Logit Transformation
 The logistic regression model is given by

$$P(Y \mid X) = \frac{e^{\beta_0 + \beta_1 X}}{1 + e^{\beta_0 + \beta_1 X}}$$

 which is equivalent to

$$\ln\left(\frac{P(Y \mid X)}{1 - P(Y \mid X)}\right) = \beta_0 + \beta_1 X$$

This is called the Logit Transformation.
Logit Transformation
 Logistic regression models transform probabilities into logits:

$$\text{logit}(p_i) = \log\left(\frac{p_i}{1 - p_i}\right)$$

 where
• i indexes all cases (observations).
• p_i is the probability that the event (a sale, for example) occurs in the i-th case.
• log is the natural log (to the base e).
Comparing LP and Logit Models
[Figure: two panels plotting probability against the predictor. LP model: P(p_i) is assumed linear in the predictor. Logit model: the logit transform yields an S-shaped curve.]
Logistic regression model with a
single continuous predictor
 $$\text{logit}(p_i) = \log(\text{odds}) = \beta_0 + \beta_1 X_1$$
 where $\text{logit}(p_i)$ is the logit transformation of the probability of the event,
 $\beta_0$ is the intercept of the regression line, and
 $\beta_1$ is the slope of the regression line.
The Logistic Regression Model
 Let p denote P[y = 1] = P[Success]. This quantity will increase with the value of x.
The ratio

$$\frac{p}{1 - p}$$

is called the odds ratio. This quantity will also increase with the value of x, ranging from zero to infinity.
The quantity

$$\ln\left(\frac{p}{1 - p}\right)$$

is called the log odds ratio.
Example: odds ratio, log odds ratio
Suppose a die is rolled:
Success = "roll a six", p = 1/6

The odds ratio:

$$\frac{p}{1 - p} = \frac{1/6}{5/6} = \frac{1}{5}$$

The log odds ratio:

$$\ln\left(\frac{p}{1 - p}\right) = \ln\left(\frac{1}{5}\right) = \ln(0.2) = -1.6094$$
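The die example above can be checked numerically; a minimal sketch in Python:

```python
import math

# Worked check of the die example: Success = "roll a six"
p = 1 / 6

odds = p / (1 - p)         # (1/6) / (5/6) = 1/5
log_odds = math.log(odds)  # natural log of the odds

print(odds)      # 0.2
print(log_odds)  # about -1.6094
```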
The Logistic Regression Model
Assumes the log odds ratio is linearly related to x, i.e.:

$$\ln\left(\frac{p}{1 - p}\right) = \beta_0 + \beta_1 x$$

In terms of the odds ratio:

$$\frac{p}{1 - p} = e^{\beta_0 + \beta_1 x}$$

The Logistic Regression Model
Solving for p in terms of x:

$$\frac{p}{1 - p} = e^{\beta_0 + \beta_1 x}
\;\Rightarrow\; p = e^{\beta_0 + \beta_1 x}(1 - p)
\;\Rightarrow\; p + p\,e^{\beta_0 + \beta_1 x} = e^{\beta_0 + \beta_1 x}$$

or

$$p = \frac{e^{\beta_0 + \beta_1 x}}{1 + e^{\beta_0 + \beta_1 x}}$$
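As a quick numerical check of the algebra above, a sketch of the logistic function in Python (the coefficient values b0, b1 below are hypothetical, chosen only for illustration):

```python
import math

def sigmoid(z):
    """Logistic function: maps the log odds z = b0 + b1*x to a probability p."""
    return math.exp(z) / (1 + math.exp(z))  # equivalently 1 / (1 + exp(-z))

# Hypothetical coefficients (not from the lecture)
b0, b1 = -2.0, 0.5
x = 4.0
z = b0 + b1 * x              # log odds; here b0 + b1*x = 0
p = sigmoid(z)
print(p)                     # 0.5, since the log odds are zero

# Round trip: recover the log odds from p
print(math.log(p / (1 - p)))  # 0.0
```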
Interpretation of the parameter β0
• β0 determines the intercept: at x = 0,

$$p = \frac{e^{\beta_0}}{1 + e^{\beta_0}}$$

[Figure: logistic curve of p (0 to 1) versus x (0 to 10), showing the intercept value at x = 0]
Interpretation of the parameter β1
• β1 determines (along with β0) when p is 0.50:

$$p = \frac{e^{\beta_0 + \beta_1 x}}{1 + e^{\beta_0 + \beta_1 x}} = \frac{1}{1 + 1} = \frac{1}{2}
\quad \text{when} \quad \beta_0 + \beta_1 x = 0, \ \text{or} \ x = -\frac{\beta_0}{\beta_1}$$

[Figure: logistic curve of p (0 to 1) versus x (0 to 10), crossing p = 0.5 at x = -β0/β1]
Interpretation of the parameter β1…
Also

$$\frac{dp}{dx} = \frac{d}{dx}\,\frac{e^{\beta_0 + \beta_1 x}}{1 + e^{\beta_0 + \beta_1 x}}
= \frac{e^{\beta_0 + \beta_1 x}\,\beta_1\left(1 + e^{\beta_0 + \beta_1 x}\right) - e^{\beta_0 + \beta_1 x}\,\beta_1\,e^{\beta_0 + \beta_1 x}}{\left(1 + e^{\beta_0 + \beta_1 x}\right)^2}$$

$$= \frac{e^{\beta_0 + \beta_1 x}\,\beta_1}{\left(1 + e^{\beta_0 + \beta_1 x}\right)^2}
= \frac{\beta_1}{4} \quad \text{when} \quad x = -\frac{\beta_0}{\beta_1}$$

β1/4 is the rate of increase in p with respect to x when p = 0.50.
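The β1/4 result can be verified numerically with a central-difference derivative; a sketch (the coefficient values b0, b1 are hypothetical):

```python
import math

def p(x, b0, b1):
    """Logistic model probability at x."""
    z = b0 + b1 * x
    return math.exp(z) / (1 + math.exp(z))

b0, b1 = -3.0, 1.5            # hypothetical values
x_mid = -b0 / b1              # the x where p = 0.5
h = 1e-6
# Central-difference estimate of dp/dx at x_mid
slope = (p(x_mid + h, b0, b1) - p(x_mid - h, b0, b1)) / (2 * h)
print(slope)                  # about b1 / 4 = 0.375
```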
Interpretation of the parameter β1
• β1 determines the slope when p is 0.50:

$$\text{slope} = \frac{\beta_1}{4}$$

[Figure: logistic curve of p (0 to 1) versus x (0 to 10), with a tangent of slope β1/4 drawn at p = 0.50]
Binary Classification
 In logistic regression we take two steps:
 The first step yields estimates of the probabilities of belonging to each class. In the binary case we get an estimate of P(Y = 1), the probability of belonging to class 1 (which also tells us the probability of belonging to class 0).
 In the next step we use a cutoff value on these probabilities in order to classify each case into one of the classes. A cutoff of 0.5 means that cases with an estimated probability of P(Y = 1) > 0.5 are classified as belonging to class 1, whereas cases with P(Y = 1) < 0.5 are classified as belonging to class 0.
 The cutoff need not be set at 0.5.
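The two steps can be sketched in Python (the probability estimates below are made-up illustrative values, standing in for the output of step one):

```python
# Step 1 output: estimated probabilities of class 1 (illustrative values)
probs = [0.91, 0.42, 0.55, 0.08, 0.73]

# Step 2: apply a cutoff to classify each case
cutoff = 0.5
classes = [1 if p > cutoff else 0 for p in probs]
print(classes)   # [1, 0, 1, 0, 1]

# The cutoff need not be 0.5; a stricter cutoff changes the assignments
strict = [1 if p > 0.8 else 0 for p in probs]
print(strict)    # [1, 0, 0, 0, 0]
```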
Types of Logistic Regression
 Binary Logistic Regression
The categorical response has only two possible
outcomes. Example: Spam or Not
 Multinomial Logistic Regression
Three or more categories without ordering.
Example: Predicting which food is preferred more
(Veg, Non-Veg, Vegan)
 Ordinal Logistic Regression
Three or more categories with ordering. Example:
Movie rating from 1 to 5
Cost Function
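The equation for this slide appears to have been lost in extraction. The cost standardly minimized for logistic regression, consistent with the J(θ) referenced in the gradient descent slides that follow, is the cross-entropy cost (a reconstruction, not taken verbatim from the original slide):

```latex
J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\Big[\, y_i \log h_\theta(x_i)
  + (1 - y_i)\log\big(1 - h_\theta(x_i)\big) \Big],
\qquad h_\theta(x) = \frac{1}{1 + e^{-\theta^{T} x}}
```

Here m is the number of training cases and h_θ(x_i) is the model's estimated probability that y_i = 1.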
Gradient Descent
 Now the question arises: how do we reduce the cost
value? This can be done by using Gradient
Descent.
 The main goal of Gradient Descent is to minimize the
cost value, i.e. min J(θ).
 To minimize our cost function, we need to run the
gradient descent function on each parameter.
Gradient Descent…
 Objective: To minimize the cost function we
have to run the gradient descent function on
each parameter
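A minimal from-scratch sketch of running gradient descent on each parameter of the two-parameter logistic model, assuming the standard cross-entropy cost J(θ); the learning rate, epoch count, and toy data below are illustrative choices, not from the lecture:

```python
import math

def gradient_descent(xs, ys, lr=0.1, epochs=5000):
    """Fit b0, b1 by minimizing the cross-entropy cost with gradient descent.
    Each iteration updates every parameter using its partial derivative of J."""
    b0, b1 = 0.0, 0.0
    m = len(xs)
    for _ in range(epochs):
        # Predicted probabilities under the current parameters
        preds = [1 / (1 + math.exp(-(b0 + b1 * x))) for x in xs]
        # Gradients of J with respect to b0 and b1
        g0 = sum(p - y for p, y in zip(preds, ys)) / m
        g1 = sum((p - y) * x for p, y, x in zip(preds, ys, xs)) / m
        # Simultaneous update of all parameters
        b0 -= lr * g0
        b1 -= lr * g1
    return b0, b1

# Toy data (made up): larger x makes success more likely
xs = [1, 2, 3, 4, 5, 6]
ys = [0, 0, 0, 1, 1, 1]
b0, b1 = gradient_descent(xs, ys)
p_at_5 = 1 / (1 + math.exp(-(b0 + b1 * 5)))
print(p_at_5 > 0.5)   # high x is predicted as success
```

In practice libraries use more robust optimizers, but the parameter-by-parameter update above is the idea the slide describes.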
