National Economic University
Chapter 3
             Logit and Probit Models
                   Dr. Phung Minh Duc
  Contents
1. Logit and Probit Model
2. Practice
 1                                             Logit and Probit Model
❖ The logit/probit model is suitable for the case where the dependent
  variable 𝑌 is a binary variable.
      $$Y = \begin{cases} 1 & \text{if yes} \\ 0 & \text{if no} \end{cases}$$
❖ Examples of binary variables:
     ▪ Consumer economics: Whether a consumer makes a purchase or not?
     ▪ Labor economics: Whether an individual participates in the labor market or not?
     ▪ Agricultural economics: Whether a farmer adopts or uses organic practices,
       marketing/production contracts,… or not
❖ OLS regression with binary dependent variable
     ▪ OLS does not fit the data well:   $Y = \beta_0 + \beta_1 X_1 + \cdots + \beta_k X_k + u$
     ▪ The predicted value of Y can be less than 0 or greater than 1
     ▪ A predicted value of Y strictly between 0 and 1 is not a possible outcome of Y
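To see the problem concretely, the sketch below fits a one-regressor OLS line to a small binary sample and prints the fitted values. The six data points are invented for illustration only; they are not from the course data.

```python
# Toy data (invented for illustration): binary outcome y, one regressor x.
x = [0, 1, 2, 3, 4, 5]
y = [0, 0, 0, 1, 1, 1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Closed-form OLS with one regressor: slope = Sxy / Sxx.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Fitted values at the ends of the x range leave the [0, 1] interval.
fitted = [b0 + b1 * xi for xi in x]
for xi, fi in zip(x, fitted):
    print(f"x = {xi}: fitted Y = {fi:.3f}")
```

In this toy sample the fitted value is negative at the smallest x and above 1 at the largest x, which is the behaviour the bullets above describe.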
❖ The logit and probit models are often chosen to answer the question:
    What factors determine the probability that the dependent variable takes
                 a particular value in its set of values?
❖ Binary outcome models
    ▪ The OLS model:     $Y = X'\beta + u$
      in which $X'\beta = \beta_0 + \beta_1 X_1 + \cdots + \beta_k X_k$.
    ▪ Binary outcome models estimate the probability that $Y = 1$ as a function
      of the independent variables:
                     $p = \Pr(Y = 1 \mid X) = G(X'\beta)$
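    There are three different models, depending on the functional form of $G(X'\beta)$
    in $P(Y = 1 \mid X) = G(X'\beta)$:
      ▪ Linear Probability Model:  $P(Y = 1 \mid X) = X'\beta$
      ▪ Logit Model:               $P(Y = 1 \mid X) = \dfrac{\exp(X'\beta)}{1 + \exp(X'\beta)}$
      ▪ Probit Model:              $P(Y = 1 \mid X) = \Phi(X'\beta)$, with $\Phi$ the standard normal CDF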
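The three choices of G can be compared numerically. The following is a minimal sketch using only Python's standard library; the function names `g_linear`, `g_logit` and `g_probit` are made up for this illustration, and the argument `z` stands for the index X'β.

```python
import math

def g_linear(z):
    # Linear probability model: G(z) = z, not restricted to [0, 1].
    return z

def g_logit(z):
    # Logistic CDF, algebraically equal to exp(z) / (1 + exp(z)).
    return 1.0 / (1.0 + math.exp(-z))

def g_probit(z):
    # Standard normal CDF, written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

for z in (-3.0, 0.0, 3.0):
    print(z, g_linear(z), round(g_logit(z), 4), round(g_probit(z), 4))
```

Only the linear form returns values outside the unit interval; the logit and probit forms always stay strictly between 0 and 1.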
❖ Linear probability model (LPM)
    ▪ The linear probability model has the form:
                   $p = P(Y = 1 \mid X) = G(X'\beta) = X'\beta$
    ▪ A problem with the LPM is that the predicted probabilities are not
      constrained to lie between 0 and 1.
    => We therefore do not use the linear probability model with binary outcome data.
    ❖ Logit Model
       ▪ The logit model is used when the dependent variable takes the two values
         0 and 1.
       ▪ The general logit model has the form
                    $p = P(Y = 1 \mid X) = G(X'\beta) = \dfrac{\exp(X'\beta)}{1 + \exp(X'\beta)}$     (*)
       ▪ Linearizing (*) we get:
                    $\ln\dfrac{p}{1-p} = X'\beta = \beta_0 + \beta_1 X_1 + \cdots + \beta_k X_k$
       ▪ The odds of $Y = 1$ are $\dfrac{p}{1-p}$.
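The linearization step can be written out explicitly. A short derivation, where $z$ is shorthand for $X'\beta$ (the symbol $z$ is introduced here only for brevity):

```latex
\begin{aligned}
p &= \frac{e^{z}}{1 + e^{z}}, \qquad z = X'\beta \\
1 - p &= \frac{1}{1 + e^{z}} \\
\frac{p}{1 - p} &= e^{z} \\
\ln\frac{p}{1 - p} &= z = \beta_0 + \beta_1 X_1 + \cdots + \beta_k X_k
\end{aligned}
```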
❖ Note:
                  $\ln\dfrac{p}{1-p} = X'\beta = \beta_0 + \beta_1 X_1 + \cdots + \beta_k X_k$
    ▪ The odds $\dfrac{p}{1-p}$ measure the probability that
      $Y = 1$ relative to the probability that $Y = 0$.
    ▪ The larger the odds, the greater the probability of $Y = 1$.
    ▪ Odds of 2 mean that the outcome $Y = 1$ is twice as likely as the
      outcome $Y = 0$.
    ▪ The odds ratio compares the odds at two values of a regressor; for a
      one-unit increase in $X_j$ it equals $\exp(\beta_j)$.
    ▪ If $\beta_j$ is positive and statistically significant => the variable $X_j$
      positively affects the probability that $Y = 1$ occurs (that is, it
      increases that probability).
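The link between the ratio p/(1−p) and the probability p can be checked with a small conversion sketch. The helper names `odds_from_prob` and `prob_from_odds` are hypothetical and used only for this illustration.

```python
def odds_from_prob(p):
    # Ratio p / (1 - p): probability of Y = 1 relative to Y = 0.
    return p / (1.0 - p)

def prob_from_odds(odds):
    # Inverse mapping: p = odds / (1 + odds).
    return odds / (1.0 + odds)

# A ratio of 2 (Y = 1 twice as likely as Y = 0) corresponds to p = 2/3.
print(prob_from_odds(2.0))
# Equal chances (p = 0.5) give a ratio of 1.
print(odds_from_prob(0.5))
```

Because the mapping is increasing, a larger ratio always corresponds to a larger probability of Y = 1.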
    ❖ Probit Model
       ▪ The probit model is used when the dependent variable takes the two
         values 0 and 1.
       ▪ The general probit model has the form
        $p = P(Y = 1 \mid X) = \Phi(X'\beta) = \Phi(\beta_0 + \beta_1 X_1 + \cdots + \beta_k X_k)$
       in which $\Phi(\cdot)$ is the standard normal cumulative distribution function.
❖ Estimation Method
   ▪ Logit and probit models cannot be estimated with the OLS method.
   ▪ Both models are estimated using Maximum Likelihood Estimation (MLE).
   ▪ The log-likelihood contribution of observation $(X_i, Y_i)$, as a function of
     the parameters, has the form:
             $L_i(\beta) = Y_i \log[G(X_i'\beta)] + (1 - Y_i) \log[1 - G(X_i'\beta)]$.
     The sample log-likelihood is the sum of $L_i(\beta)$ over all observations.
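As a rough illustration of what MLE maximizes, the sketch below evaluates the log-likelihood of a logit model that contains only a constant. The four-observation sample is invented, and the constant-only case is chosen because its maximizer has a closed form (the fitted probability equals the sample mean of Y); a real estimation with regressors would use a numerical optimizer, as Stata does.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(beta0, y):
    # Sum over observations of Y_i*log(G) + (1 - Y_i)*log(1 - G),
    # with G the logistic function and a constant-only index.
    p = logistic(beta0)
    return sum(yi * math.log(p) + (1 - yi) * math.log(1 - p) for yi in y)

y = [1, 1, 1, 0]                              # invented sample
y_bar = sum(y) / len(y)
beta0_mle = math.log(y_bar / (1 - y_bar))     # closed form in this special case

ll_at_zero = log_likelihood(0.0, y)
ll_at_mle = log_likelihood(beta0_mle, y)
print(ll_at_zero, ll_at_mle)
```

The log-likelihood at the maximizer is higher than at an arbitrary value such as 0, and moving away from the maximizer in either direction lowers it.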
    ❖ Marginal Effects
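       ▪ Marginal effects for the OLS model:
                    $\dfrac{\partial p}{\partial X_j} = \beta_j$
       ▪ Marginal effects for the logit model:
                    $\dfrac{\partial p}{\partial X_j} = G'(X'\beta)\,\beta_j = \dfrac{\exp(X'\beta)}{[1 + \exp(X'\beta)]^2}\,\beta_j$
       ▪ Marginal effects for the probit model:
                    $\dfrac{\partial p}{\partial X_j} = G'(X'\beta)\,\beta_j = \phi(X'\beta)\,\beta_j$
         in which $\phi(\cdot)$ is the standard normal density.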
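The logit and probit formulas can be evaluated directly. The sketch below is a minimal illustration; the index value `z` (standing for X'β) and the coefficient `beta_j` are hypothetical numbers, and the function names are made up for this example.

```python
import math

def logit_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))

def logit_me(z, beta_j):
    # exp(z) / [1 + exp(z)]^2 * beta_j
    ez = math.exp(z)
    return ez / (1.0 + ez) ** 2 * beta_j

def probit_me(z, beta_j):
    # Standard normal density at z, times beta_j.
    phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return phi * beta_j

z, beta_j = 0.0, 1.0      # hypothetical index value and coefficient
print(logit_me(z, beta_j))
print(probit_me(z, beta_j))
```

Both effects scale the coefficient by a positive density term that changes with the index, which is why a marginal effect has to be evaluated at a chosen value of X.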
❖ Marginal Effects
    ▪ When estimating logit and probit models, it is common to report the
      marginal effects after reporting the coefficients.
    ▪ The marginal effects reflect the approximate change in the probability of $Y = 1$
      given a one-unit change in an independent variable $X$.
    ▪ The marginal effects depend on $X$, so we need to estimate the
      marginal effects at a specific value of $X$ (typically the means).
    ▪ Coefficients and marginal effects have the same signs.
❖ Goodness of fit
    ▪ Assume the observations are classified into $J$ groups. Let $n_j$ be the
      number of observations in group $j$, $Y(j)$ the number of observations with
      $Y = 1$ in group $j$, and $\hat{p}(j)$ the mean predicted probability in group $j$,
      where
                    $Y(j) = \sum_{i \in j} y_i$,   $\hat{p}(j) = \sum_{i \in j} \dfrac{\hat{p}_i}{n_j}$
    ▪ The Hosmer-Lemeshow test statistic:
                    $HL = \sum_{j=1}^{J} \dfrac{\left(Y(j) - n_j \hat{p}(j)\right)^2}{n_j \hat{p}(j)\left(1 - \hat{p}(j)\right)}$
    ▪ Hosmer and Lemeshow (1989) showed that, if the model is correctly
      specified, the HL statistic has a $\chi^2$ distribution with $J - 2$ degrees
      of freedom.
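The arithmetic of the HL statistic can be sketched as follows. The three groups and their values are invented for illustration, and the function name `hosmer_lemeshow` is hypothetical; in practice the groups come from sorting observations by predicted probability.

```python
def hosmer_lemeshow(n, y_sum, p_mean):
    # n[j]: group size, y_sum[j]: number of Y = 1 in the group,
    # p_mean[j]: mean predicted probability in the group.
    return sum(
        (yj - nj * pj) ** 2 / (nj * pj * (1.0 - pj))
        for nj, yj, pj in zip(n, y_sum, p_mean)
    )

# Invented grouped data with J = 3 groups of 10 observations each.
n = [10, 10, 10]
y_sum = [2, 5, 8]
p_mean = [0.3, 0.5, 0.7]

hl = hosmer_lemeshow(n, y_sum, p_mean)
print(round(hl, 4))
```

With J = 3 the statistic would be compared with a chi-squared distribution with 1 degree of freedom; a small value relative to that distribution indicates that observed and expected counts are close.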
❖ R2 and Pseudo R2
    ▪ R2 is used to evaluate the fit of the model in the OLS method. In the
      logit model, the fit is represented by the Pseudo R2, which is
      determined as follows:
                    $\text{Pseudo } R^2 = 1 - \dfrac{L^*}{L_0}$
    in which $L^*$ is the maximized log-likelihood value of the estimated model, and $L_0$
    is the maximized log-likelihood value of the model that has only a constant.
    ▪ In the logit model, the most important things are the expected signs of the
      regression coefficients, their statistical significance, and their
      practical significance. The R-squared is only of secondary interest.
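The Pseudo R2 formula is a one-line computation once the two log-likelihood values are known. In the sketch below the log-likelihood values are hypothetical numbers chosen for illustration, and `pseudo_r2` is a made-up helper name.

```python
def pseudo_r2(ll_model, ll_null):
    # 1 - L*/L0, with both log-likelihoods negative for binary outcome models.
    return 1.0 - ll_model / ll_null

# Hypothetical values: constant-only model at -100, estimated model at -80.
print(pseudo_r2(-80.0, -100.0))
```

If the regressors add nothing, the two log-likelihoods coincide and the measure is 0; it rises toward 1 as the estimated model's log-likelihood approaches 0.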
❖ Commands on Stata
    ▪ logit Y X => The effect of the variables X on the log odds, ln(odds)
    ▪ logit Y X, or => The effect of the variables X on the odds, reported as odds ratios
    ▪ mfx or margins, dydx(*) atmeans: The marginal effect at the mean of X
    ▪ margins, dydx(*) at(…): The marginal effect at specified values of X
❖ Commands on Stata
    ▪ probit Y X
    ▪ mfx or margins, dydx(*) atmeans: The marginal effect at the mean of X
    ▪ margins, dydx(*) at(…): The marginal effect at specified values of X
❖ Testing
    ▪ linktest:   the functional form test
        (If the p-value of _hatsq is large, there is no evidence that the functional form is misspecified)
    ▪ estat gof (or estat gof, group(10)): the goodness-of-fit test
        (If the p-value is large, there is no evidence of a lack of fit)
❖ Logit or Probit?
▪ The choice between logit and probit is based on the assumed distribution of
  the error term:
    ➢ If the distribution is logistic => use the logit model
    ➢ If the distribution is normal => use the probit model
▪ The two models usually give similar results, but their coefficients are
  not directly comparable.
▪ The choice is up to you.
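One way to see why the two models give similar results while their coefficients differ in scale is to compare the two link functions after rescaling the index. The sketch below assumes a scale factor of 1.6, a common rule of thumb for relating logit to probit coefficients; it is an approximation introduced here for illustration, not an exact constant.

```python
import math

def logit_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))

def probit_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

SCALE = 1.6   # rule-of-thumb assumption: logit index is about 1.6 x probit index

# Compare the two curves on a grid from -3 to 3 in steps of 0.25.
grid = [i / 4 for i in range(-12, 13)]
max_gap = max(abs(logit_cdf(SCALE * z) - probit_cdf(z)) for z in grid)
print(max_gap)
```

On this grid the rescaled logistic curve stays within about two percentage points of the normal curve, so predicted probabilities are close even though the raw coefficients are on different scales.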
Practice
    To get the data, type:
      use "https://dss.princeton.edu/training/logit.dta"
    To run a logit model, type:
      logit y_bin x1 x2 x3 i.opinion
    Interpretation:
    • In this estimation result, none of the coefficients except the one for Agree
       significantly affects the log odds of the dependent variable. The
       coefficient for Agree is significant at the 5% level.
    • The Coefficient column shows the coefficients in log-odds form.
       For example, when x1 increases by one unit, the expected change in
       the log odds is 1.133556 (an increase), holding all other variables in
       the model constant. However, this increase is not statistically significant
       because the p-value is not below 0.05.
    To get odds ratios rather than logit coefficients, type:
    logit y_bin x1 x2 x3 i.opinion, or
    Interpretation:
    • Odds ratio: the factor by which the odds of $Y = 1$ are multiplied when X increases by 1 unit.
       It equals exp(logit coefficient).
        – If OR > 1, the odds of $Y = 1$ increase
        – If OR < 1, the odds of $Y = 1$ decrease
    • The odds ratio for x1 is 3.106685, which means that if x1 increases by
      one unit, the odds of $Y = 1$ are multiplied by about 3.1,
      keeping all other predictors constant.
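The relation between the two outputs can be checked by exponentiating the logit coefficient reported for x1 in the earlier interpretation (1.133556). A minimal sketch; the variable names are made up for this illustration.

```python
import math

coef_x1 = 1.133556                 # log-odds coefficient for x1 from the logit output
odds_ratio_x1 = math.exp(coef_x1)  # odds ratio = exp(coefficient)

print(round(odds_ratio_x1, 4))
```

The result agrees with the reported odds ratio of 3.106685 up to the rounding of the coefficient.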
    To calculate marginal effects after logit, type:
       quietly logit y_bin x1 x2 x3 i.opinion
       margins, dydx(*) atmeans post
    Interpretation:
    • x1=.1384634 The change in probability for an instantaneous change in x1 is
       about 13.8 percentage points (however, the change is not statistically significant)
    • x2=.036904 The change in probability for an instantaneous change in x2 is
       about 3.7 percentage points (however, the change is not statistically significant)
    • Agree=-.3656898 When opinion goes from ‘strongly agree’ to ‘agree’, the
       probability decreases by about 36.6 percentage points (-0.366).
       This change is statistically significant, because the p-value is
       0.029, which is <0.05.
    Interpretation:
    • Disag=.0312784 When opinion goes from ‘strongly agree’ to ‘disagree’,
       the probability increases by about 3.1 percentage points (0.031).
       However, the change is not statistically significant.
    • Str Disag=.0574484 When opinion goes from ‘strongly agree’ to
       ‘strongly disagree’, the probability increases by about
       5.7 percentage points (0.057). However, the change is not statistically
       significant.
    Estimating predicted probabilities after logit
       quietly logit y_bin x1 x2 x3 i.opinion
       margins, atmeans post
    Interpretation:
     The probability of y_bin=1 is 85%, given that all predictors are set to their mean values.
    Estimating predicted probabilities after logit
       quietly logit y_bin x1 x2 x3 i.opinion
       margins opinion, atmeans post
    Interpretation:
     Holding all variables at their mean values, the predicted probability of y_bin = 1 is:
       ▪ 87% among those who “strongly agree”,
       ▪ 51% among those who “agree”,
       ▪ 91% among those who “disagree” and
       ▪ 93% among those who “strongly disagree”