Economics 210
Handout #6: The Probit, Logit, Tobit, and Linear Probability Models
The Linear Probability Model (LPM)
Problems with the LPM
1) Heteroscedasticity: the variance of the error term, p(1 − p), varies with the regressors.
2) Predicted "probabilities" can be greater than 1 or less than 0, which are difficult to interpret.
3) Constant marginal effects: the estimated effect of a regressor on the probability is the same at
every value of the regressors.
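All three problems can be seen in a small simulation. The sketch below (Python, synthetic data — not the dataset used in the example later in this handout) fits an LPM by OLS to a binary outcome:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: one regressor x, binary y with true P(y=1) rising in x
n = 1000
x = rng.uniform(0, 10, n)
p_true = 1 / (1 + np.exp(-(x - 5)))          # true probabilities (logistic curve)
y = (rng.uniform(size=n) < p_true).astype(float)

# LPM: regress the 0/1 outcome on x by OLS
X = np.column_stack([np.ones(n), x])
b = np.linalg.lstsq(X, y, rcond=None)[0]
p_hat = X @ b

# Problem 2: fitted "probabilities" fall outside [0, 1] for extreme x
print("fitted values outside [0,1]:", ((p_hat < 0) | (p_hat > 1)).sum())

# Problem 1: the error variance p(1-p) changes with x -> heteroscedasticity
print("Var(e) at p=.1 vs p=.5:", 0.1 * 0.9, "vs", 0.5 * 0.5)

# Problem 3: the LPM marginal effect is the single slope b[1] at every x
print("constant LPM marginal effect:", b[1])
```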
The Probit Model
Index function: y*_i = β'x_i + ε_i, where the index y*_i is unobserved; we observe only
y_i = 1 if y*_i > 0 and y_i = 0 otherwise.
Interpretation of y*_i
I. Utility Interpretation: y*_i is the additional utility that individual i would get by choosing
y_i = 1 rather than y_i = 0. The utility gain is partly random across individuals.
II. Threshold Interpretation: −ε_i represents a threshold such that if β'x_i > −ε_i,
then y*_i > 0 and hence y_i = 1. The threshold is distributed across the
population according to a known distribution.
The Probit Model
If we assume that the probability density function (pdf) of the error term ε_i is the standard
normal distribution, ε_i ~ N(0,1), the model is called the probit model, and
Pr(y_i = 1) = Pr(ε_i > −β'x_i) = Φ(β'x_i),
where Φ is the standard normal CDF. The probit model is estimated by Maximum Likelihood
Estimation.
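The likelihood multiplies Φ(β'x_i) over the y_i = 1 observations and 1 − Φ(β'x_i) over the y_i = 0 observations. A minimal sketch of the estimation (Python with numpy/scipy, synthetic data — not the handout's dataset):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(1)

# Synthetic data generated from a true probit model
n = 2000
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
beta_true = np.array([0.5, 1.0])
y = (X @ beta_true + rng.normal(size=n) > 0).astype(float)

def neg_loglik(beta):
    """Negative probit log-likelihood: -sum[y ln Phi(Xb) + (1-y) ln(1-Phi(Xb))]."""
    p = norm.cdf(X @ beta)
    p = np.clip(p, 1e-10, 1 - 1e-10)   # guard the logs
    return -np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

res = minimize(neg_loglik, x0=np.zeros(2), method="BFGS")
print("probit MLE:", res.x)            # should be close to beta_true
```

Stata's `probit` command maximizes the same log-likelihood; the iteration log it prints (below) tracks this function.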
Marginal Effects:
The coefficients from the probit model are difficult to interpret because they measure the
change in the unobservable y* associated with a change in one of the explanatory
variables. A more useful measure is the marginal effect on the probability itself,
∂Pr(y_i = 1)/∂x_k = φ(β'x_i)β_k,
where φ is the standard normal pdf. Because φ(β'x_i) depends on x_i, marginal effects must
be evaluated at particular values of the regressors (for example, at the sample means).
Example
The Linear Probability Model (LPM)
(Here lfp is a 0/1 indicator for labor force participation; we, hhrs, kl6, and k618 are the wife's
education, the husband's hours, and the numbers of children under 6 and aged 6–18, in what
appears to be the standard 753-observation Mroz labor-supply sample.)
. reg lfp we hhrs kl6 k618
Source | SS df MS Number of obs = 753
---------+------------------------------ F( 4, 748) = 20.29
Model | 18.0827189 4 4.52067973 Prob > F = 0.0000
Residual | 166.645037 748 .222787482 R-squared = 0.0979
---------+------------------------------ Adj R-squared = 0.0931
Total | 184.727756 752 .245648611 Root MSE = .472
------------------------------------------------------------------------------
lfp | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
we | .0482551 .0076384 6.317 0.000 .0332599 .0632504
hhrs | -.0000672 .0000292 -2.305 0.021 -.0001244 -9.97e-06
kl6 | -.2262515 .0331852 -6.818 0.000 -.2913987 -.1611043
k618 | .0145722 .0131891 1.105 0.270 -.0113198 .0404642
_cons | .1619111 .1123544 1.441 0.150 -.0586563 .3824785
------------------------------------------------------------------------------
Probit
probit lfp we hhrs kl6 k618
Iteration 0: log likelihood = -514.8732
Iteration 1: log likelihood = -476.38262
Iteration 2: log likelihood = -476.00012
Iteration 3: log likelihood = -476
Probit estimates Number of obs = 753
LR chi2(4) = 77.75
Prob > chi2 = 0.0000
Log likelihood = -476 Pseudo R2 = 0.0755
------------------------------------------------------------------------------
lfp | Coef. Std. Err. z P>|z| [95% Conf. Interval]
---------+--------------------------------------------------------------------
we | .1351213 .0220863 6.118 0.000 .091833 .1784096
hhrs | -.0001862 .0000803 -2.319 0.020 -.0003436 -.0000288
kl6 | -.6343051 .0984215 -6.445 0.000 -.8272077 -.4414025
k618 | .0404934 .0361599 1.120 0.263 -.0303787 .1113655
_cons | -.9628781 .3146603 -3.060 0.002 -1.579601 -.3461551
------------------------------------------------------------------------------
mfx compute
Marginal effects after probit
y = Pr(lfp) (predict)
= .5710825
------------------------------------------------------------------------------
variable | dy/dx Std. Err. z P>|z| [ 95% C.I. ] X
---------+--------------------------------------------------------------------
we | .0530477 .00866 6.12 0.000 .036067 .070028 12.2869
hhrs | -.0000731 .00003 -2.32 0.020 -.000135 -.000011 2267.27
kl6 | -.2490236 .03875 -6.43 0.000 -.32497 -.173077 .237716
k618 | .0158974 .0142 1.12 0.263 -.011928 .043723 1.35325
------------------------------------------------------------------------------
mat A = (12,2000,2,0,1)
mfx compute, at (A)
Marginal effects after probit
y = Pr(lfp) (predict)
= .16293159
------------------------------------------------------------------------------
variable | dy/dx Std. Err. z P>|z| [ 95% C.I. ] X
---------+--------------------------------------------------------------------
we | .0332682 .00756 4.40 0.000 .018458 .048079 12.0000
hhrs | -.0000459 .00002 -2.08 0.038 -.000089 -2.7e-06 2000.00
kl6 | -.1561719 .01127 -13.86 0.000 -.178256 -.134088 2.00000
k618 | .0099699 .00845 1.18 0.238 -.006589 .026528 0.00000
------------------------------------------------------------------------------
mfx compute, at (mean kl6=5)
Marginal effects after probit
y = Pr(lfp) (predict)
= .00224432
------------------------------------------------------------------------------
variable | dy/dx Std. Err. z P>|z| [ 95% C.I. ] X
---------+--------------------------------------------------------------------
we | .0009511 .00126 0.75 0.451 -.00152 .003422 12.2869
hhrs | -1.31e-06 .00000 -0.72 0.474 -4.9e-06 2.3e-06 2267.27
kl6 | -.0044648 .0053 -0.84 0.400 -.014858 .005928 5.00000
k618 | .000285 .00044 0.65 0.514 -.00057 .00114 1.35325
------------------------------------------------------------------------------
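The `mfx` results above follow directly from the formula dPr/dx_k = φ(x'β)β_k evaluated at the chosen x. A quick check in Python (using scipy and the probit coefficients and X values reported above):

```python
import numpy as np
from scipy.stats import norm

# Probit coefficients from the output above, ordered [_cons, we, hhrs, kl6, k618]
beta = np.array([-0.9628781, 0.1351213, -0.0001862, -0.6343051, 0.0404934])

def probit_at(we, hhrs, kl6, k618):
    """Predicted probability and marginal effects at a given x."""
    x = np.array([1.0, we, hhrs, kl6, k618])
    xb = x @ beta
    return norm.cdf(xb), norm.pdf(xb) * beta[1:]   # Pr(lfp=1), dPr/dx

# At the sample means: matches Pr = .5710825 and dy/dx for we = .0530477
p, me = probit_at(12.2869, 2267.27, 0.237716, 1.35325)
print(f"{p:.4f} {me[0]:.4f}")

# At A = (12, 2000, 2, 0): matches Pr = .16293159
p_A, _ = probit_at(12, 2000, 2, 0)
print(f"{p_A:.4f}")

# At the means except kl6 = 5: matches Pr = .00224432
p_k, _ = probit_at(12.2869, 2267.27, 5, 1.35325)
print(f"{p_k:.4f}")
```

Note how steeply the level of the probability matters: with five young children the predicted participation probability is essentially zero, so every marginal effect shrinks toward zero as well.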
The Logit Model
The logit model is very similar to the probit model. In the probit model we assumed that
ε_i ~ N(0,1); in the logit model we assume that ε_i has what is known as a logistic
distribution. The pdf of ε_i is given by
f(ε) = e^(−ε) / (1 + e^(−ε))².
The model is estimated by MLE. It can readily be verified that the CDF of ε_i is
Λ(ε) = 1 / (1 + e^(−ε)).
From the above it can readily be seen that
Pr(y_i = 1) = Λ(β'x_i) = e^(β'x_i) / (1 + e^(β'x_i)).
Examples
logit lfp we kl6 k618 hhrs
Iteration 0: log likelihood = -514.8732
Iteration 1: log likelihood = -476.50292
Iteration 2: log likelihood = -476.1179
Iteration 3: log likelihood = -476.11756
Logit estimates Number of obs = 753
LR chi2(4) = 77.51
Prob > chi2 = 0.0000
Log likelihood = -476.11756 Pseudo R2 = 0.0753
------------------------------------------------------------------------------
lfp | Coef. Std. Err. z P>|z| [95% Conf. Interval]
---------+--------------------------------------------------------------------
we | .2202259 .0369616 5.958 0.000 .1477826 .2926692
kl6 | -1.03135 .1647924 -6.258 0.000 -1.354337 -.7083623
k618 | .0657628 .0595517 1.104 0.269 -.0509563 .182482
hhrs | -.0003026 .0001316 -2.299 0.022 -.0005605 -.0000446
_cons | -1.573647 .5193776 -3.030 0.002 -2.591608 -.5556858
------------------------------------------------------------------------------
mfx compute
Marginal effects after logit
y = Pr(lfp) (predict)
= .57201064
------------------------------------------------------------------------------
variable | dy/dx Std. Err. z P>|z| [ 95% C.I. ] X
---------+--------------------------------------------------------------------
we | .0539145 .00903 5.97 0.000 .036218 .071611 12.2869
hhrs | -.0000741 .00003 -2.30 0.022 -.000137 -.000011 2267.27
kl6 | -.2524893 .04052 -6.23 0.000 -.331903 -.173076 .237716
k618 | .0160997 .01458 1.10 0.269 -.012475 .044674 1.35325
------------------------------------------------------------------------------
Note: Despite the differences in the coefficients, the marginal effects are
very similar.
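The similarity is no accident: the logit marginal effect is dPr/dx_k = Λ(x'β)[1 − Λ(x'β)]β_k, and near p ≈ .57 the scale factor p(1 − p) ≈ .245 roughly offsets the larger logit coefficients. A check using the logit estimates and means reported above:

```python
import numpy as np

# Logit coefficients from the output above, ordered [_cons, we, kl6, k618, hhrs]
beta = np.array([-1.573647, 0.2202259, -1.03135, 0.0657628, -0.0003026])

# Sample means of (we, kl6, k618, hhrs), with a 1 for the constant
x = np.array([1.0, 12.2869, 0.237716, 1.35325, 2267.27])

xb = x @ beta
p = 1 / (1 + np.exp(-xb))        # Pr(lfp=1) at the means: matches .57201064
me = p * (1 - p) * beta[1:]      # dPr/dx: matches .0539145 for we, etc.
print(f"{p:.4f} {me[0]:.4f}")
```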
The Censored Regression (Tobit) Model
The tobit model arises when the y variable is limited (or censored) from above or below. For
example, if we want to measure the demand for concert tickets, we might use data on the number
of tickets sold at previous concerts. However, if concerts occasionally sell out, then the
y variable is limited from above by the size of the auditorium. Similarly, if we are estimating a
wage function, the wage may be limited from below by the minimum wage; or if we are estimating
the demand for a particular good using sales as our dependent variable, the minimum value that
sales can take on is zero. There are numerous other examples where the dependent variable in a
regression equation is similarly limited.
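For right-censoring at a known limit c, the tobit log-likelihood is Σ_unc [ln φ((y_i − β'x_i)/σ) − ln σ] over the uncensored observations plus Σ_cen ln[1 − Φ((c − β'x_i)/σ)] over the censored ones. A sketch of the MLE (Python with scipy, synthetic data — not the concert data):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(2)

# Synthetic data: latent y* = a + b*x + e, observed y censored from above at c
n, c = 500, 2.0
x = rng.uniform(0, 4, n)
y_star = 3.0 - 1.0 * x + rng.normal(size=n)   # true a = 3, b = -1, sigma = 1
y = np.minimum(y_star, c)
cens = y_star >= c                            # right-censored observations

def neg_loglik(theta):
    """Negative tobit log-likelihood with right-censoring at c."""
    a, b, log_s = theta
    s = np.exp(log_s)                         # parameterize so sigma > 0
    mu = a + b * x
    ll_unc = (norm.logpdf((y - mu) / s) - np.log(s))[~cens]  # density, uncensored
    ll_cen = norm.logsf((c - mu) / s)[cens]                  # Pr(y* >= c), censored
    return -(ll_unc.sum() + ll_cen.sum())

res = minimize(neg_loglik, x0=np.zeros(3), method="BFGS")
a_hat, b_hat, s_hat = res.x[0], res.x[1], np.exp(res.x[2])
print("tobit MLE (a, b, sigma):", a_hat, b_hat, s_hat)   # close to (3, -1, 1)
```

Stata's `tobit` with the `ul` (upper-limit) or `ll` (lower-limit) option maximizes a likelihood of this form; the `_se` line in its output is the estimate of σ.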
Examples
reg tickets price
Number of obs = 50
R-squared = 0.5630
------------------------------------------------------------------------------
tickets | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
price | -8.318243 1.057751 -7.864 0.000 -10.44499 -6.191492
_cons | 1120.203 31.77058 35.259 0.000 1056.324 1184.082
------------------------------------------------------------------------------
reg tickets price if tickets < 1000
Number of obs = 16
R-squared = 0.5876
------------------------------------------------------------------------------
tickets | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
price | -24 5.374079 -4.466 0.001 -35.52625 -12.47375
_cons | 1720 231.1596 7.441 0.000 1224.212 2215.788
------------------------------------------------------------------------------
tobit tickets price , ul
Tobit estimates Number of obs = 50
LR chi2(1) = 72.27
Prob > chi2 = 0.0000
Log likelihood = -100.56717 Pseudo R2 = 0.2643
------------------------------------------------------------------------------
tickets | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
price | -34.97672 4.717039 -7.415 0.000 -44.45597 -25.49748
_cons | 2217.171 197.6837 11.216 0.000 1819.911 2614.432
---------+--------------------------------------------------------------------
_se | 103.9596 18.72889 (Ancillary parameter)
------------------------------------------------------------------------------
Obs. summary: 16 uncensored observations
34 right-censored observations at tickets>=1000
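The pattern above (OLS slope −8.3 on all data, −24 after dropping the sold-out concerts, tobit −35) is typical: censoring flattens the observed price–sales relationship, so OLS is biased toward zero, while discarding the censored observations trades censoring bias for truncation bias. A simulation in the same spirit (synthetic data with hypothetical numbers, not the actual concert data):

```python
import numpy as np

rng = np.random.default_rng(3)

# True demand: tickets = 2200 - 35*price + noise, capacity-censored at 1000
n = 5000
price = rng.uniform(20, 60, n)
tickets_star = 2200 - 35 * price + rng.normal(0, 100, n)
tickets = np.minimum(tickets_star, 1000)
sold_out = tickets_star >= 1000

def ols_slope(x, y):
    """OLS slope from a regression of y on a constant and x."""
    X = np.column_stack([np.ones(len(x)), x])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

s_all = ols_slope(price, tickets)                        # biased toward zero
s_trunc = ols_slope(price[~sold_out], tickets[~sold_out])  # still biased
print("OLS, all obs:       ", round(s_all, 1))
print("OLS, uncensored only:", round(s_trunc, 1))
print("true slope:          -35")
```

Both OLS slopes land between 0 and the true −35; the tobit MLE, which models the censoring explicitly, is what recovers the latent slope.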
reg insure income
Number of obs = 200
R-squared = 0.7269
------------------------------------------------------------------------------
insure | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
income | .9704973 .0422701 22.959 0.000 .8871398 1.053855
_cons | -23.42143 2.490869 -9.403 0.000 -28.33346 -18.50939
------------------------------------------------------------------------------
reg insure income if insure > 0
Number of obs = 107
R-squared = 0.8958
------------------------------------------------------------------------------
insure | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
income | 1.940817 .0646011 30.043 0.000 1.812725 2.068909
_cons | -95.09566 4.887191 -19.458 0.000 -104.7861 -85.40526
------------------------------------------------------------------------------
tobit insure income ,ll
Tobit estimates Number of obs = 200
LR chi2(1) = 468.22
Prob > chi2 = 0.0000
Log likelihood = -416.40595 Pseudo R2 = 0.3599
------------------------------------------------------------------------------
insure | Coef. Std. Err. t P>|t| [95% Conf. Interval]
---------+--------------------------------------------------------------------
income | 2.0489 .0597651 34.283 0.000 1.931046 2.166754
_cons | -104.2937 4.41718 -23.611 0.000 -113.0042 -95.58318
---------+--------------------------------------------------------------------
_se | 10.73973 .7259328 (Ancillary parameter)
------------------------------------------------------------------------------
Obs. summary: 93 left-censored observations at insure<=0
107 uncensored observations