Lecture 7 Logistic Regression

The document compares linear regression and logistic regression, highlighting that linear regression is suitable for continuous outcomes while logistic regression is appropriate for binary outcomes. It explains the logistic function, how to interpret coefficients, and provides examples of modeling binary outcomes using logistic regression. Additionally, it discusses variable selection methods for multiple regression, including forward selection, backward elimination, and stepwise selection.

ASC399

LOGISTIC REGRESSION
Linear Regression vs Logistic Regression
• Linear regression models have a particular form.
• The regression formula is the equation for a
straight line.
• Among the properties of a straight line is that it
goes on forever, continuously in both directions.
• These properties make linear regression models
well-suited to estimating continuous quantities
that can take a wide range of values.
Linear Regression vs Logistic Regression
• The same properties that make linear
regression models appropriate for modeling
unbounded, continuous targets make them
unsuitable for modeling binary outcomes such
as yes/no or good/bad.
• Logistic regression is a regression model suitable for modeling binary outcomes.
Modeling Binary Outcomes
• Modeling a binary outcome tries to answer "What is the probability that this record belongs to class one?"
• Because probabilities are numbers, modeling a binary outcome is an estimation task.
Logistic Function
• The goal is to estimate the probability that an event occurs, p.
• The first step is to transform that probability p into odds, by taking the ratio of p over 1 - p. Recall that odds and probability say exactly the same thing, but while probabilities are restricted to the range 0 to 1, odds go from 0 to infinity.

$$\text{Odds} = \frac{p}{1-p}$$
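For example, a probability of p = .75 corresponds to odds of .75/(1 - .75) = 3, i.e. 3-to-1 in favor of the event.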
Example setup:
Y – Cancer Outcome (1 = Improved, 0 = Otherwise)
p = P(Improved)
Odds = p/(1 - p)
X1 – Survival Rating by Physician
Taking the log of the odds allows the relationships shown above to become linear.
Logistic Function
• Setting up the log odds as the target variable for the regression makes the regression equation look like:

$$\text{logit} = \ln\!\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 X$$

• This is the logit form of the logistic regression model.
• A method called maximum likelihood is used to find the best-fit line for logistic regression.
Logistic Function
• Thus the odds now become:

$$\text{Odds} = \frac{p}{1-p} = e^{\beta_0 + \beta_1 X}$$

• Solving for the probability p requires a bit of algebra, the result of which is:

$$p = \frac{e^{\beta_0 + \beta_1 X}}{1 + e^{\beta_0 + \beta_1 X}} = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X)}}$$
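As a quick illustration, here is a minimal Python sketch of these transformations; the coefficient and predictor values are arbitrary placeholders, not estimates from any model in these slides.

import numpy as np

beta0, beta1 = -1.0, 0.5   # arbitrary illustrative coefficients
x = 2.0

logit = beta0 + beta1 * x   # the log odds
odds = np.exp(logit)        # odds = p / (1 - p)
p = odds / (1 + odds)       # equivalently: 1 / (1 + np.exp(-logit))
print(logit, odds, p)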
Interpreting the Coefficients
• We can use either the original or
exponentiated logistic coefficients for
interpretation. The two types of logistic
coefficient differ in that they reflect the
relationship of the independent variable with
the two forms of the dependent variable, as
shown here:
Logistic Coefficient    Reflects Changes in ...
Original                Logit (logged odds)
Exponentiated           Odds
Coefficients of Metric (Interval/Ratio) Independent Variables

Exponentiated coefficient (e^b)    .20     .50     1.0     1.5     1.8
e^b - 1.0                          -.80    -.50    0.0     .50     .80
Percentage change in odds          -80%    -50%    0%      50%     80%

The model's predicted probability of occurrence is lower when e^b < 1, unchanged when e^b = 1, and higher when e^b > 1. Likewise, for any positive change in the independent variable (X), the odds will decrease when e^b < 1, stay the same when e^b = 1, and increase when e^b > 1.

• If e^b = 0.20:
– A one-unit change in X will reduce the odds of Y by 80%
– Thus an inverse relationship
• If e^b = 1.5:
– A one-unit change in X will increase the odds of Y by 50%
– Thus a direct relationship
• If e^b = 1.0:
– A one-unit change in X will not change the odds of Y
– Thus no relationship between X and Y
Coefficients of Nonmetric (Categorical/Dummy) Independent Variables
• Dummy variables represent a single category of a nonmetric variable.
• A dummy variable takes on just the values 1 or 0, indicating the presence or absence of a characteristic.
• Any time a dummy variable is used, it is essential to note the reference or omitted category.
– E.g. Gender – 1 (Male), 0 (Female) => X1 = 1 or 0; the omitted or reference category is Female
– E.g. Race – Malay, Chinese, Indian => X1 = 1 (Malay), X2 = 1 (Chinese); the omitted or reference category is Indian
Coefficients of Nonmetric (Categorical/Dummy) Independent Variables
• If the nonmetric variable is gender, the two
possibilities are male and female.
• The dummy variable can be defined as
representing males (i.e., value of 1 if male, 0 if
female) or females (i.e., value of 1 if female, 0
if male).
• Whichever way is chosen, however,
determines how the coefficient is interpreted.
Coefficients of Nonmetric (Categorical/Dummy) Independent Variables
• Let's assume that a 1 is given to females, so the exponentiated coefficient represents the odds for females relative to males (the reference category).
• If the exponentiated coefficient is 1.25, then females have 25 percent higher odds than males (1.25 - 1.0 = .25).
• Likewise, if the exponentiated coefficient is .80, then the odds for females are 20 percent lower (.80 - 1.0 = -.20) than for males.
Example
A researcher is interested in how variables such as GRE (Graduate Record Exam scores), GPA (grade point average) and prestige of the undergraduate institution affect admission into graduate school. The outcome variable, admit/don't admit, is binary.

This data set has a binary response (outcome, dependent) variable called admit, which is equal to 1 if the individual was admitted to graduate school, and 0 otherwise. There are three predictor variables: gre, gpa, and rank. We will treat the variables gre and gpa as continuous. The variable rank takes on the values 1 through 4. Institutions with a rank of 1 have the highest prestige, while those with a rank of 4 have the lowest.
$$\ln\!\left(\frac{p}{1-p}\right) = -5.5414 + .00226\,\text{GRE} + .804\,\text{GPA} + 1.5514\,\text{Rank}_1 + .876\,\text{Rank}_2 + .2112\,\text{Rank}_3$$
X       b        e^b (odds ratio)   e^b - 1    % change in odds
GRE     0.00226  1.002263           0.002263   0.23
GPA     0.804    2.234461           1.234461   123.45
Rank 1  1.5514   4.718071           3.718071   371.81
Rank 2  0.876    2.401275           1.401275   140.13
Rank 3  0.2112   1.235159           0.235159   23.52

Interpretation:
• A one-unit change in GRE will increase the odds of admission to graduate school by 0.23%.
• A one-unit change in GPA will increase the odds of admission to graduate school by 123.45%.
• Having attended an undergraduate institution with a rank of 1 will increase the odds of admission to graduate school by 371.81%, compared to attending an institution with a rank of 4.
• Having attended an undergraduate institution with a rank of 2 will increase the odds of admission to graduate school by 140.13%, compared to attending an institution with a rank of 4.
• Having attended an undergraduate institution with a rank of 3 will increase the odds of admission to graduate school by 23.52%, compared to attending an institution with a rank of 4.
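A model like this can be fit in Python; here is a minimal sketch using statsmodels, assuming the admissions data are in a local CSV file (the file name admissions.csv is a placeholder) with columns admit, gre, gpa, and rank.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Placeholder file name; assumes columns admit, gre, gpa, rank.
df = pd.read_csv("admissions.csv")

# Treat rank as categorical, with rank 4 as the omitted/reference category,
# matching the slide's coding of the Rank 1, Rank 2, Rank 3 dummies.
model = smf.logit("admit ~ gre + gpa + C(rank, Treatment(reference=4))", data=df).fit()
print(model.summary())

# Odds ratios (e^b) and the percentage change in odds for each coefficient.
odds_ratios = np.exp(model.params)
print(100 * (odds_ratios - 1))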
$$\ln\!\left(\frac{p}{1-p}\right) = -5.5414 + .00226\,\text{GRE} + .804\,\text{GPA} + 1.5514\,\text{Rank}_1 + .876\,\text{Rank}_2 + .2112\,\text{Rank}_3$$

Odds:

$$\frac{p}{1-p} = e^{\,-5.5414 + .00226\,\text{GRE} + .804\,\text{GPA} + 1.5514\,\text{Rank}_1 + .876\,\text{Rank}_2 + .2112\,\text{Rank}_3}$$

Probability:

$$p = \frac{1}{1 + e^{-(-5.5414 + .00226\,\text{GRE} + .804\,\text{GPA} + 1.5514\,\text{Rank}_1 + .876\,\text{Rank}_2 + .2112\,\text{Rank}_3)}}$$
Using the probability formula above, the predicted probabilities for four observations are:

X        b        obs 62   obs 64   obs 66   obs 79
GRE      0.00226  560      680      600      540
GPA      0.804    3.32     3.85     3.59     3.12
Rank 1   1.5514   0        0        0        1
Rank 2   0.876    0        0        1        0
Rank 3   0.2112   0        1        0        0
p(y=1)            0.1671   0.3323   0.3958   0.4351
Yhat              0        0        0        0
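The predicted probabilities can also be computed by hand from the coefficients; here is a minimal sketch, with the coefficients taken from the estimated equation above rather than re-estimated.

import numpy as np

# Coefficients from the estimated equation above.
b0, b_gre, b_gpa, b_r1, b_r2, b_r3 = -5.5414, 0.00226, 0.804, 1.5514, 0.876, 0.2112

def predict_prob(gre, gpa, rank1, rank2, rank3):
    logit = b0 + b_gre * gre + b_gpa * gpa + b_r1 * rank1 + b_r2 * rank2 + b_r3 * rank3
    return 1 / (1 + np.exp(-logit))

# Observations 62, 64, 66 and 79 from the table: (GRE, GPA, Rank1, Rank2, Rank3).
observations = [(560, 3.32, 0, 0, 0), (680, 3.85, 0, 0, 1),
                (600, 3.59, 0, 1, 0), (540, 3.12, 1, 0, 0)]
for obs in observations:
    p = predict_prob(*obs)
    # Classify at the usual 0.5 cutoff, which reproduces the Yhat row.
    print(f"p(y=1) = {p:.4f}, yhat = {int(p >= 0.5)}")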
Variable Selection for Multiple Regression
• Forward Selection
– Forward selection starts with a set of candidate input variables; variables are added to the model one at a time.
– The first step is to create a separate regression model for each input variable; if there are n input variables, then the first step considers n different models with one input variable. The variable whose model scores best on some test becomes the first variable included in the forward selection model.
– At each subsequent step, each variable that is NOT yet in the model is tested for inclusion. The most significant of these variables is added to the model.
Variable Selection for Multiple Regression
• Backward Elimination
– The backward elimination approach to variable selection begins by creating a multiple regression model using all n input variables (it starts by fitting a model with all the input variables).
– Then, using a statistical test, the least significant variable is dropped from the model, and the model is refit without it. This process continues until all remaining variables in the model are statistically significant OR some stopping criterion, such as a minimum number of variables desired, is reached.
Variable Selection for Multiple Regression
• Stepwise Selection
– The stepwise selection method combines forward selection and backward elimination. It allows a variable added earlier in the process to be dropped from the model, and a variable dropped at one point to be added back in. A sketch of the forward-selection step is shown below.
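As an illustration of the forward step, here is a minimal sketch of greedy forward selection for a logistic model, scored by AIC; the DataFrame df, the target column name, and the candidate list are assumed inputs, and a real implementation might instead use significance tests or cross-validation.

import statsmodels.formula.api as smf

def forward_select(df, target, candidates):
    """Greedy forward selection for a logistic model, scored by AIC."""
    selected = []
    remaining = list(candidates)
    current_aic = float("inf")
    while remaining:
        # Fit one candidate model per remaining variable and keep the best.
        scores = []
        for var in remaining:
            formula = target + " ~ " + " + ".join(selected + [var])
            fit = smf.logit(formula, data=df).fit(disp=0)
            scores.append((fit.aic, var))
        best_aic, best_var = min(scores)
        if best_aic >= current_aic:  # no improvement: stop
            break
        selected.append(best_var)
        remaining.remove(best_var)
        current_aic = best_aic
    return selected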
Example
Epping-Jordan, Compas, & Howell (1994)
We were interested in looking at cancer outcomes as a function of psychological variables, specifically intrusions and avoidance behavior.

Variables:
Outcome : 1 = Improved, 0 = Worse
SurvRate : higher scores = better prognosis
Intrus : intrusive thoughts
Avoid : avoidance behavior
Model Generation
• Using all inputs.
• After removing Intrus (not significant, p > .05); a sketch of these two steps is shown below.
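Here is a minimal sketch of this two-step fit, assuming the data are in a local CSV file (cancer.csv is a placeholder) with the columns named in the slides.

import pandas as pd
import statsmodels.formula.api as smf

# Placeholder file name; assumes columns Outcome, SurvRate, Intrus, Avoid.
df = pd.read_csv("cancer.csv")

# Step 1: fit using all inputs and inspect the p-values.
full = smf.logit("Outcome ~ SurvRate + Intrus + Avoid", data=df).fit(disp=0)
print(full.pvalues)

# Step 2: Intrus is not significant (p > .05), so refit without it.
reduced = smf.logit("Outcome ~ SurvRate + Avoid", data=df).fit(disp=0)
print(reduced.summary())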


Interpret the coefficients:

X         b      e^b    e^b - 1   % change in odds
SurvRate  -.082  0.92   -0.08     -8
Avoid     .133   1.14   0.14      14

• A one-unit change in SurvRate will decrease the odds of Improved by 8%.
• A one-unit change in Avoid will increase the odds of Improved by 14%.
Write down the estimated equation:

$$p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X)}}$$

$$p = \frac{1}{1 + e^{-(1.196 - .082\,\text{SURVRATE} + .133\,\text{AVOID})}}$$
Estimate the status of the cancer outcome for 3 new observations:

$$p = \frac{1}{1 + e^{-(1.196 - .082\,\text{SURVRATE} + .133\,\text{AVOID})}}$$

X                  b      Obs 114   Obs 120   Obs 118
SurvRate           -.082  15        91        14
Avoid              .133   19        17        23
P(Y=1) (Improved)         0.9236    0.01789   0.9572
Y (predict)               1         0         1
Another Example
Mean of our response variable: attending a self-help group (FYI)
• The sample mean of Y is the number of successes (yes to attend) divided by the sample size, n.
• The sample mean is therefore the proportion of successful outcomes.
• Here, 44 said yes and n = 400, so the mean proportion of yes is 44/400 = .11, or 11%.
Odds ratio and % change in odds by age
• Age: β = -.0586 with p-value (0.0072) < .01, and β is negative. Thus the log odds of attending a self-help group decrease as a person gets older.
• exp(β) = .9431 is the odds ratio; since exp(β) < 1, the odds decrease.
• The % change (in this case a reduction) in the odds of attending for each additional year of age is 100(exp(β) - 1) = 100(.9431 - 1) = -5.69%, i.e. 5.69% less likely for each year one ages.
Predicted probability of attending by age

$$p = \frac{e^{\beta_1 X_1}}{1 + e^{\beta_1 X_1}}$$

• A point estimate for age 80 would be:
  $p = e^{(-.0586)(80)} / (1 + e^{(-.0586)(80)}) = .00912$
  The probability of those 80 years of age attending a help group is about 0.9%, roughly 1%.
• A point estimate for age 40 would be:
  $p = e^{(-.0586)(40)} / (1 + e^{(-.0586)(40)}) = .0875$
  The probability of those 40 years of age attending a help group is 8.75%, roughly 9%.
Odds ratio and % change in odds by gender
• Gender: β = 1.2540 with p-value (0.0163) < .05. Thus the log odds of attending a self-help group are greater among females (the reference category is male and β is positive).
• exp(β) = 3.5043 is the odds ratio; the odds of attending are 3.5 times as large for females as for males (exp(β) > 1).
• The % change (in this case an increase) in the odds of attending when a person is female is 100(exp(β) - 1) = 100(3.50 - 1) = 250% compared to males.
Predicted probability of attending by gender
• A point estimate for females would be:
  $p = e^{(1.254)(1)} / (1 + e^{(1.254)(1)}) = .77$
  Thus, the probability of attending among females is 77%.
• A point estimate for males would be:
  $p = e^{(1.254)(0)} / (1 + e^{(1.254)(0)}) = .50$
  Thus, the probability of attending among males is 50%.
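These point estimates are easy to verify; here is a minimal sketch using the coefficients from the slides, with the intercept omitted as in the hand calculations above.

import numpy as np

def point_estimate(b, x):
    """p = e^(b*x) / (1 + e^(b*x)), matching the point estimates above."""
    z = b * x
    return np.exp(z) / (1 + np.exp(z))

print(point_estimate(-0.0586, 80))  # ~0.0091: attending at age 80
print(point_estimate(-0.0586, 40))  # ~0.0875: attending at age 40
print(point_estimate(1.2540, 1))    # ~0.78 (slide rounds to .77): female
print(point_estimate(1.2540, 0))    # 0.50: male (reference)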
