Ch3 Multiple Regression

Uploaded by

La Land

Chapter 3

Multiple Regression
Outline
1. Multiple Regression Equation
2. The Three-Variable Model: Notation and Assumptions
3. OLS Estimation for the Three-Variable Model
4. Properties of OLS Estimators
5. Goodness of Fit: R² and Adjusted R²
6. More on Functional Form
7. Hypothesis Testing in Multiple Regression
1. Multiple regression equation
$$Y_i = \beta_1 + \beta_2 X_{2i} + \dots + \beta_k X_{ki} + u_i$$
• Y = the dependent variable (criterion)
• X2, ..., Xk = two or more independent variables (predictor variables)
• ui = the stochastic disturbance term
• β1 is the intercept
• βk measures the change in Y with respect to Xk, holding other factors fixed
Motivation for multiple regression
– Incorporate more explanatory factors into the model
– Explicitly hold fixed other factors that otherwise would be in u
– Allow for more flexible functional forms
• Example: Wage equation
$$wage = \beta_1 + \beta_2\, educ + \beta_3\, exper + u$$
where wage = hourly wage, educ = years of education, exper = labor market experience, and u = all other factors.
β2 now measures the effect of education explicitly holding experience fixed.
Motivation for multiple regression
• Example: Average test scores and per student spending
$$avgscore = \beta_1 + \beta_2\, expend + \beta_3\, avginc + u$$
where avgscore = average standardized test score of the school, expend = per student spending at this school, avginc = average family income of students at this school, and u = other factors.
– Per student spending is likely to be correlated with average family income at a given high school because of school financing
– Omitting average family income from the regression would lead to a biased estimate of the effect of spending on average test scores
– In a simple regression model, the effect of per student spending would partly include the effect of family income on test scores
Motivation for multiple regression
• Example: Family income and family consumption
$$cons = \beta_1 + \beta_2\, inc + \beta_3\, inc^2 + u$$
where cons = family consumption, inc = family income, inc² = family income squared, and u = other factors.
– The model has two explanatory variables: income and income squared
– Consumption is explained as a quadratic function of income
– One has to be very careful when interpreting the coefficients: by how much consumption increases if income is increased by one unit depends on how much income is already there
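A short worked equation may help show why the effect of income depends on the income level. It assumes the quadratic model is written as cons = β1 + β2 inc + β3 inc² + u, with the intercept called β1 as elsewhere in this chapter (the numbering of the coefficients is an assumption, since the slide equation itself is not reproduced in the text). Differentiating with respect to income gives the approximate effect of one more unit of income:

```latex
\frac{\partial\, cons}{\partial\, inc} = \beta_2 + 2\beta_3\, inc
\qquad\Longrightarrow\qquad
\Delta cons \approx (\beta_2 + 2\beta_3\, inc)\,\Delta inc
```

Neither β2 nor β3 alone is "the" effect of income. The marginal effect has to be evaluated at a chosen income level.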
Motivation for multiple regression
• Example: CEO salary, sales and CEO tenure
$$\log(salary) = \beta_1 + \beta_2 \log(sales) + \beta_3\, ceoten + \beta_4\, ceoten^2 + u$$
where log(salary) = log of CEO salary, log(sales) = log of sales, and the ceoten terms form a quadratic function of CEO tenure with the firm.
– The model assumes a constant elasticity relationship between CEO salary and the sales of his or her firm
– The model assumes a quadratic relationship between CEO salary and his or her tenure with the firm
• Meaning of "linear" regression
– The model has to be linear in the parameters (not in the variables)
2. The Three-Variable Model: Notation and Assumptions
$$Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + u_i$$
Assumptions
1. Linear regression model, that is, linear in the parameters.
2. Zero mean value of the disturbance ui: E(ui | X2i, X3i) = 0
3. No serial correlation between the disturbances: Cov(ui, uj) = 0, i ≠ j
4. Homoscedasticity, or constant variance of ui: Var(ui) = σ²
5. Zero covariance between ui and each X variable: Cov(ui, X2i) = Cov(ui, X3i) = 0
6. No specification bias, that is, the model is correctly specified.
7. No exact collinearity between the X variables.
3. OLS Estimation for the three-variable model
• To find the OLS estimators, let us first write the sample regression function (SRF) as follows:
$$Y_i = \hat\beta_1 + \hat\beta_2 X_{2i} + \hat\beta_3 X_{3i} + \hat u_i$$
• OLS chooses the estimates so that the residual sum of squares (RSS), $\sum \hat u_i^2$, is as small as possible:
$$\sum \hat u_i^2 = \sum (Y_i - \hat Y_i)^2 \to \min$$
$$\sum \hat u_i^2 = \sum (Y_i - \hat\beta_1 - \hat\beta_2 X_{2i} - \hat\beta_3 X_{3i})^2 \to \min$$
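The minimization can be made concrete with the standard textbook first-order conditions for the three-variable model. This is a sketch that is not taken from the slides themselves. Lowercase letters denote deviations from sample means, for example $y_i = Y_i - \bar Y$. Setting the partial derivatives of the RSS with respect to each estimate to zero gives the normal equations and their solution:

```latex
\begin{aligned}
\bar Y &= \hat\beta_1 + \hat\beta_2 \bar X_2 + \hat\beta_3 \bar X_3 \\
\sum Y_i X_{2i} &= \hat\beta_1 \sum X_{2i} + \hat\beta_2 \sum X_{2i}^2 + \hat\beta_3 \sum X_{2i} X_{3i} \\
\sum Y_i X_{3i} &= \hat\beta_1 \sum X_{3i} + \hat\beta_2 \sum X_{2i} X_{3i} + \hat\beta_3 \sum X_{3i}^2 \\[6pt]
\hat\beta_2 &= \frac{\left(\sum y_i x_{2i}\right)\left(\sum x_{3i}^2\right) - \left(\sum y_i x_{3i}\right)\left(\sum x_{2i} x_{3i}\right)}
                    {\left(\sum x_{2i}^2\right)\left(\sum x_{3i}^2\right) - \left(\sum x_{2i} x_{3i}\right)^2} \\
\hat\beta_3 &= \frac{\left(\sum y_i x_{3i}\right)\left(\sum x_{2i}^2\right) - \left(\sum y_i x_{2i}\right)\left(\sum x_{2i} x_{3i}\right)}
                    {\left(\sum x_{2i}^2\right)\left(\sum x_{3i}^2\right) - \left(\sum x_{2i} x_{3i}\right)^2} \\
\hat\beta_1 &= \bar Y - \hat\beta_2 \bar X_2 - \hat\beta_3 \bar X_3
\end{aligned}
```

The common denominator is zero exactly when X2 and X3 are perfectly collinear, which is why the no-exact-collinearity assumption is needed for the estimates to exist.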
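Example: Stata output
• Model: wage = f(educ, exper)

. reg wage educ exper

      Source |       SS           df       MS      Number of obs   =       526
-------------+----------------------------------   F(2, 523)       =     75.99
       Model |   1612.2545         2  806.127251   Prob > F        =    0.0000
    Residual |  5548.15979       523  10.6083361   R-squared       =    0.2252
-------------+----------------------------------   Adj R-squared   =    0.2222
       Total |  7160.41429       525  13.6388844   Root MSE        =     3.257

------------------------------------------------------------------------------
        wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .6442721   .0538061    11.97   0.000     .5385695    .7499747
       exper |   .0700954   .0109776     6.39   0.000     .0485297    .0916611
       _cons |  -3.390539   .7665661    -4.42   0.000    -4.896466   -1.884613
------------------------------------------------------------------------------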
4. Properties of OLS estimators
• The sample regression line (surface) passes through the means $(\bar Y, \bar X_2, \dots, \bar X_k)$
• The mean value of the estimated Yi is equal to the mean value of the actual Yi: $\bar{\hat Y} = \bar Y$
• The sum of the residuals is equal to 0: $\sum_{i=1}^{n} \hat u_i = 0$
• The residuals are uncorrelated with Xki: $\sum_{i=1}^{n} X_{ki} \hat u_i = 0$
• The residuals are uncorrelated with $\hat Y_i$: $\sum_{i=1}^{n} \hat Y_i \hat u_i = 0$
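These algebraic properties can be checked numerically. The sketch below fits the three-variable model with the standard deviation-form OLS formulas and then evaluates each sum. The data are made up purely for illustration, and the function name `ols_three_variable` is hypothetical.

```python
# Sketch: OLS for Y = b1 + b2*X2 + b3*X3 + u, then a check of the
# algebraic properties of the residuals. Data are invented for illustration.

def ols_three_variable(y, x2, x3):
    """Return (b1, b2, b3) using the deviation-form OLS formulas."""
    n = len(y)
    y_bar, x2_bar, x3_bar = sum(y) / n, sum(x2) / n, sum(x3) / n
    # Deviations from the sample means.
    dy = [v - y_bar for v in y]
    d2 = [v - x2_bar for v in x2]
    d3 = [v - x3_bar for v in x3]
    s22 = sum(a * a for a in d2)
    s33 = sum(a * a for a in d3)
    s23 = sum(a * b for a, b in zip(d2, d3))
    sy2 = sum(a * b for a, b in zip(dy, d2))
    sy3 = sum(a * b for a, b in zip(dy, d3))
    det = s22 * s33 - s23 ** 2  # zero only under exact collinearity
    b2 = (sy2 * s33 - sy3 * s23) / det
    b3 = (sy3 * s22 - sy2 * s23) / det
    b1 = y_bar - b2 * x2_bar - b3 * x3_bar
    return b1, b2, b3


y = [3.1, 4.0, 5.2, 5.9, 7.4, 8.1, 8.8, 10.5]
x2 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x3 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0]

b1, b2, b3 = ols_three_variable(y, x2, x3)
fitted = [b1 + b2 * a + b3 * b for a, b in zip(x2, x3)]
resid = [yi - fi for yi, fi in zip(y, fitted)]

sum_resid = sum(resid)                                      # sum of residuals
sum_x2_resid = sum(a * e for a, e in zip(x2, resid))        # sum of X2 * residual
sum_x3_resid = sum(a * e for a, e in zip(x3, resid))        # sum of X3 * residual
sum_fit_resid = sum(f * e for f, e in zip(fitted, resid))   # sum of fitted * residual
mean_gap = sum(fitted) / len(y) - sum(y) / len(y)           # mean(fitted) - mean(Y)

for name, value in [("sum of residuals", sum_resid),
                    ("sum of X2*residual", sum_x2_resid),
                    ("sum of X3*residual", sum_x3_resid),
                    ("sum of fitted*residual", sum_fit_resid),
                    ("mean(fitted) - mean(Y)", mean_gap)]:
    print(f"{name}: {value:.2e}")
```

All five quantities are zero up to floating-point rounding. They follow from the first-order conditions of the minimization, so they hold for any data set without exact collinearity.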
4. Properties of OLS estimators
Gauss-Markov Theorem: $\hat\beta_1, \hat\beta_2, \dots, \hat\beta_k$ are the best linear unbiased estimators (BLUEs) of $\beta_1, \beta_2, \dots, \beta_k$.
• An estimator $\tilde\beta_j$ is an unbiased estimator of βj if $E(\tilde\beta_j) = \beta_j$
• An estimator $\tilde\beta_j$ of βj is linear if and only if it can be expressed as a linear function of the data on the dependent variable: $\tilde\beta_j = \sum_{i=1}^{n} w_{ij} y_i$
• "Best" is defined as smallest variance.
4. Properties of OLS estimators
Standard errors of the OLS estimators
• An unbiased estimator of σ² = E(u²) would be $\sum_{i=1}^{n} u_i^2 / n$
→ This is not a true estimator because we cannot observe the ui.
• The unbiased estimator of σ²:
$$\hat\sigma^2 = \frac{\sum \hat u_i^2}{n-k} = \frac{RSS}{n-k}$$
• With normally distributed errors, RSS/σ² follows a χ² distribution with df = number of observations − number of estimated parameters = n − k
• The positive square root $\hat\sigma$ is called the standard error of the regression (SER), or Root MSE. SER is an estimator of the standard deviation of the error term.
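As a quick check of the formula, the sketch below recomputes the error variance estimate and the SER from the residual sum of squares reported in the wage regression output earlier in the chapter (RSS = 5548.15979, n = 526, k = 3 estimated parameters). The variable names are illustrative.

```python
import math

# Values read off the Stata output for: reg wage educ exper
rss = 5548.15979   # residual sum of squares
n = 526            # number of observations
k = 3              # estimated parameters: _cons, educ, exper

sigma2_hat = rss / (n - k)     # unbiased estimator of the error variance
ser = math.sqrt(sigma2_hat)    # standard error of the regression (Root MSE)

print(f"sigma^2 hat = {sigma2_hat:.4f}")
print(f"SER         = {ser:.4f}")
```

The two numbers match the residual MS (10.6083361) and the Root MSE (3.257) in the Stata table.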
4. Properties of OLS estimators
$$Var(\hat\beta_j) = \frac{\sigma^2}{TSS_j (1 - R_j^2)}$$
• where $TSS_j = \sum_{i=1}^{n} (x_{ij} - \bar x_j)^2$ is the total sample variation in xj, and $R_j^2$ is the R-squared from regressing xj on all other independent variables (including an intercept).
• Since σ is unknown, we replace it with its estimator $\hat\sigma$. Standard error:
$$se(\hat\beta_j) = \hat\sigma / [TSS_j (1 - R_j^2)]^{1/2}$$
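For the three-variable model the formula takes a familiar special form. With only two regressors, the R-squared from regressing X2 on X3 is the squared sample correlation between them, written $r_{23}^2$ here (the symbol is introduced only for this sketch):

```latex
Var(\hat\beta_2) = \frac{\sigma^2}{\sum (X_{2i} - \bar X_2)^2 \,(1 - r_{23}^2)},
\qquad
Var(\hat\beta_3) = \frac{\sigma^2}{\sum (X_{3i} - \bar X_3)^2 \,(1 - r_{23}^2)}
```

As $r_{23}^2$ approaches 1 the variances grow without bound, which is the multicollinearity problem. With $r_{23} = 0$ each variance equals the simple-regression variance.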
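5. A measure of "Goodness of fit"
• Decomposition of total variation
$$TSS = ESS + RSS$$
that is, total variation = explained part + unexplained part, where $TSS = \sum (Y_i - \bar Y)^2$, $ESS = \sum (\hat Y_i - \bar Y)^2$ and $RSS = \sum \hat u_i^2$.
• Goodness-of-fit measure (R-squared)
$$R^2 = \frac{ESS}{TSS} = 1 - \frac{RSS}{TSS}$$
R-squared measures the fraction of the total variation that is explained by the regression. → 0 ≤ R² ≤ 1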
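Example: Goodness of fit
• Determinants of college GPA:

. use "D:\Bai giang\Kinh te luong\datasets\GPA1.DTA", clear

. reg colGPA hsGPA ACT

      Source |       SS           df       MS      Number of obs   =       141
-------------+----------------------------------   F(2, 138)       =     14.78
       Model |  3.42365506         2  1.71182753   Prob > F        =    0.0000
    Residual |  15.9824444       138  .115814814   R-squared       =    0.1764
-------------+----------------------------------   Adj R-squared   =    0.1645
       Total |  19.4060994       140  .138614996   Root MSE        =    .34032

------------------------------------------------------------------------------
      colGPA |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       hsGPA |   .4534559   .0958129     4.73   0.000     .2640047    .6429071
         ACT |    .009426   .0107772     0.87   0.383    -.0118838    .0307358
       _cons |   1.286328   .3408221     3.77   0.000      .612419    1.960237
------------------------------------------------------------------------------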
5. Goodness-of-fit or coefficient of determination R²
• Note that R² lies between 0 and 1.
o If it is 1, the fitted regression line explains 100 percent of the variation in Y.
o If it is 0, the model does not explain any of the variation in Y.
• The fit of the model is said to be "better" the closer R² is to 1.
• As the number of independent variables increases, R² almost invariably increases and never decreases.
R² and the adjusted R²
• An alternative coefficient of determination:
$$\bar R^2 = 1 - \frac{RSS/(n-k)}{TSS/(n-1)}$$
$$\bar R^2 = 1 - (1 - R^2)\,\frac{n-1}{n-k}$$
where k = the number of parameters in the model including the intercept term.
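Both versions of the formula can be verified against the college GPA regression reported in this chapter (RSS = 15.9824444, TSS = 19.4060994, n = 141, k = 3). The sketch below uses only those reported sums of squares. The variable names are illustrative.

```python
# Values read off the Stata output for: reg colGPA hsGPA ACT
rss = 15.9824444   # residual sum of squares
tss = 19.4060994   # total sum of squares
n = 141            # number of observations
k = 3              # parameters including the intercept

r2 = 1 - rss / tss

# Adjusted R-squared, computed both ways shown above.
adj_r2 = 1 - (rss / (n - k)) / (tss / (n - 1))
adj_r2_alt = 1 - (1 - r2) * (n - 1) / (n - k)

print(f"R-squared          = {r2:.4f}")
print(f"Adjusted R-squared = {adj_r2:.4f}")
print(f"Alternative form   = {adj_r2_alt:.4f}")
```

The results agree with the R-squared (0.1764) and Adj R-squared (0.1645) lines of the Stata output, and the two adjusted forms are algebraically identical.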
R² and the adjusted R²
• It is good practice to use the adjusted R² rather than R², because R² tends to give an overly optimistic picture of the fit of the regression, particularly when the number of explanatory variables is not very small compared with the number of observations.
The game of maximizing adjusted R²
• Researchers sometimes play the game of maximizing adjusted R², that is, choosing the model that gives the highest adjusted R².
→ This may be dangerous.
• Our objective is not to obtain a high adjusted R² per se but rather to obtain dependable estimates of the true population regression coefficients and draw statistical inferences about them.
• Researchers should be more concerned about the logical or theoretical relevance of the explanatory variables to the dependent variable and their statistical significance.
• Even if R-squared is small (as in the given example), regression may still provide good estimates of ceteris paribus effects.
Comparing Coefficients of Determination R²
• It is crucial to note that in comparing two models on the basis of the coefficient of determination, whether adjusted or not:
– the sample size n must be the same
– the dependent variable must be the same
– the explanatory variables may take any form.
Thus for the models
lnYi = β1 + β2X2i + β3X3i + ui (1)
Yi = α1 + α2X2i + α3X3i + ui (2)
the computed R² terms cannot be compared, because the dependent variables differ.
6. More on Functional Form
The Cobb–Douglas Production Function
• The Cobb–Douglas production function, in its stochastic form, may be expressed as:
$$Y_i = \beta_1 X_{2i}^{\beta_2} X_{3i}^{\beta_3} e^{u_i} \qquad (1)$$
where Y = output
X2 = labor input
X3 = capital input
u = stochastic disturbance term
e = base of natural logarithm
• If we log-transform this model, we obtain:
ln Yi = ln β1 + β2 lnX2i + β3 lnX3i + ui
= β0 + β2 lnX2i + β3 lnX3i + ui (2)
where β0 = ln β1.
6. More on Functional Form
Polynomial Regression Models
• The U-shaped marginal cost curve shows that the relationship between marginal cost (MC) and output is nonlinear. The parabola is represented by the following equation:
Y = β0 + β1X + β2X² (4)
which is called a quadratic function.
6. More on Functional Form
Polynomial Regression Models
• The general kth degree polynomial regression may be written as
$$Y_i = \beta_0 + \beta_1 X_i + \beta_2 X_i^2 + \dots + \beta_k X_i^k + u_i \qquad (5)$$
7. Hypothesis Testing in Multiple Regression
7.1. Testing hypotheses about an individual partial regression coefficient
7.2. Testing the overall significance of the estimated multiple regression model, that is, finding out if all the partial slope coefficients are simultaneously equal to zero
7.3. Testing that two or more coefficients are equal to one another
7.4. Testing that the partial regression coefficients satisfy certain restrictions
7.5. Testing for structural or parameter stability of regression models: the Chow test
7.1. Hypothesis testing about individual coefficients
• A hypothesis about any individual partial regression coefficient:
H0: βj = 0
H1: βj ≠ 0
• Under H0, Xj has no effect on the expected value of Y
→ This is the null hypothesis in most applications.
• Compute the t statistic and compare |t| with the critical value:
$$t = \frac{\hat\beta_j - 0}{se(\hat\beta_j)}$$
Testing Hypotheses on the coefficients

Test         Null hypothesis H0   Alternative hypothesis H1   Rejection region
Two-tail     βj = 0               βj ≠ 0                      |t0| > t(n−k), α/2
Right-tail   βj ≤ 0               βj > 0                      t0 > t(n−k), α
Left-tail    βj ≥ 0               βj < 0                      t0 < −t(n−k), α
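Example 2: Determinants of college GPA

. use "D:\Bai giang\Kinh te luong\datasets\GPA1.DTA", clear

. reg colGPA hsGPA ACT

      Source |       SS           df       MS      Number of obs   =       141
-------------+----------------------------------   F(2, 138)       =     14.78
       Model |  3.42365506         2  1.71182753   Prob > F        =    0.0000
    Residual |  15.9824444       138  .115814814   R-squared       =    0.1764
-------------+----------------------------------   Adj R-squared   =    0.1645
       Total |  19.4060994       140  .138614996   Root MSE        =    .34032

------------------------------------------------------------------------------
      colGPA |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       hsGPA |   .4534559   .0958129     4.73   0.000     .2640047    .6429071
         ACT |    .009426   .0107772     0.87   0.383    -.0118838    .0307358
       _cons |   1.286328   .3408221     3.77   0.000      .612419    1.960237
------------------------------------------------------------------------------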
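The t column of this output can be reproduced by dividing each coefficient by its standard error. The sketch below does that and applies a two-tail test at the 5% level. One simplification is an assumption of this sketch: the critical value is taken from the standard normal distribution (about 1.96) as a large-sample stand-in for the t distribution with 138 degrees of freedom, because the Python standard library has no t quantile function.

```python
from statistics import NormalDist

# (coefficient, standard error) pairs read off the Stata output
coefs = {
    "hsGPA": (0.4534559, 0.0958129),
    "ACT": (0.009426, 0.0107772),
    "_cons": (1.286328, 0.3408221),
}

# Normal approximation to the two-tail 5% critical value of t(138).
crit = NormalDist().inv_cdf(0.975)

# t statistic for H0: beta_j = 0, and the two-tail decision.
t_stats = {name: b / se for name, (b, se) in coefs.items()}
reject = {name: abs(t) > crit for name, t in t_stats.items()}

for name, t in t_stats.items():
    decision = "reject H0" if reject[name] else "fail to reject H0"
    print(f"{name:>6}: t = {t:6.2f} -> {decision}")
```

hsGPA and the intercept are significant at the 5% level while ACT is not, which agrees with the P>|t| column (0.000, 0.000 and 0.383).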
A reminder on the language of classical hypothesis testing
• When H0 is not rejected, say "We fail to reject H0 at the x% level"; do not say "H0 is accepted at the x% level".
• Statistical significance vs. economic significance: statistical significance is determined by the size of the t statistic, whereas economic significance is related to the size and sign of the estimate.
Guidelines for discussing economic and statistical
significance
– If a variable is statistically significant, discuss the
magnitude of the coefficient to get an idea of its
economic or practical importance
– The fact that a coefficient is statistically significant
does not necessarily mean it is economically or
practically significant!
– If a variable is statistically and economically
important but has the wrong sign, the regression
model might be misspecified.
7.2. Testing the Overall Significance of the Sample Regression
For Yi = β1 + β2X2i + β3X3i + ... + βkXki + ui
→ To test the hypothesis
H0: β2 = β3 = ... = βk = 0 (all slope coefficients are simultaneously zero)
(this is also a test of significance of R²)
H1: Not all slope coefficients are simultaneously zero
$$F = \frac{R^2 (n-k)}{(1 - R^2)(k-1)}$$
(k = total number of parameters to be estimated, including the intercept)
If F > F critical = Fα(k−1, n−k), reject H0; otherwise you do not reject it.
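The F formula can be checked against the college GPA regression reported in this chapter, where Stata prints F(2, 138) = 14.78. The sketch below rebuilds R² from the reported sums of squares and applies the formula. The function name `overall_f` is illustrative.

```python
def overall_f(r2, n, k):
    """F statistic for H0: all slope coefficients are zero.

    k counts every estimated parameter, including the intercept.
    """
    return (r2 * (n - k)) / ((1 - r2) * (k - 1))


# Values read off the Stata output for: reg colGPA hsGPA ACT
ess = 3.42365506    # model (explained) sum of squares
tss = 19.4060994    # total sum of squares
n, k = 141, 3

r2 = ess / tss
f_stat = overall_f(r2, n, k)

print(f"R-squared = {r2:.4f}")
print(f"F({k - 1}, {n - k}) = {f_stat:.2f}")
```

The same number is obtained as the ratio of the Model MS to the Residual MS in the ANOVA table (1.71182753 / .115814814), which is how Stata reports it.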
7.3. Testing the Equality of Two Regression Coefficients
• Suppose in the multiple regression
Yi = β1 + β2X2i + β3X3i + β4X4i + ui
we want to test the hypotheses
H0: β3 = β4 or (β3 − β4) = 0
H1: β3 ≠ β4 or (β3 − β4) ≠ 0
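that is, under H0 the two slope coefficients β3 and β4 are equal.
Method 1: t-test
$$t = \frac{(\hat\beta_3 - \hat\beta_4) - (\beta_3 - \beta_4)}{se(\hat\beta_3 - \hat\beta_4)}$$
• If |t| exceeds the critical value, then you can reject the null hypothesis; otherwise, you do not reject it.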
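Example: Stata output
• Model: wage = f(educ, exper, tenure)

. reg wage educ exper tenure

      Source |       SS           df       MS      Number of obs   =       526
-------------+----------------------------------   F(3, 522)       =     76.87
       Model |   2194.1116         3  731.370532   Prob > F        =    0.0000
    Residual |  4966.30269       522  9.51398984   R-squared       =    0.3064
-------------+----------------------------------   Adj R-squared   =    0.3024
       Total |  7160.41429       525  13.6388844   Root MSE        =    3.0845

------------------------------------------------------------------------------
        wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .5989651   .0512835    11.68   0.000     .4982176    .6997126
       exper |   .0223395   .0120568     1.85   0.064    -.0013464    .0460254
      tenure |   .1692687   .0216446     7.82   0.000     .1267474    .2117899
       _cons |  -2.872735   .7289643    -3.94   0.000    -4.304799   -1.440671
------------------------------------------------------------------------------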
Example: Stata output
• Model: wage = f(educ, exper, tenure)

. estat vce

Covariance matrix of coefficients of regress model

        e(V) |       educ       exper      tenure       _cons
-------------+------------------------------------------------
        educ |     .00263
       exper |  .00019406   .00014537
      tenure |  -.0001254  -.00013218   .00046849
       _cons | -.03570219   -.0042369   .00143314   .53138894
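Example: Stata output
• With β3 the coefficient on exper and β4 the coefficient on tenure, we have
$$se(\hat\beta_3 - \hat\beta_4) = \sqrt{var(\hat\beta_3) + var(\hat\beta_4) - 2\,cov(\hat\beta_3, \hat\beta_4)} = 0.029635$$
$$t = \frac{\hat\beta_3 - \hat\beta_4}{se(\hat\beta_3 - \hat\beta_4)} = -4.958, \qquad t_{0.025,\,522} \approx 2$$
→ |t| exceeds the critical value: reject H0.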
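The arithmetic behind this result can be traced with the coefficient estimates and the covariance matrix reported by Stata for the wage regression on educ, exper and tenure. The sketch below is only that arithmetic. The variable names are illustrative.

```python
import math

# Coefficient estimates from: reg wage educ exper tenure
b_exper = 0.0223395
b_tenure = 0.1692687

# Entries of the coefficient covariance matrix from: estat vce
var_exper = 0.00014537
var_tenure = 0.00046849
cov_exper_tenure = -0.00013218

# Standard error of the difference between the two estimates.
se_diff = math.sqrt(var_exper + var_tenure - 2 * cov_exper_tenure)

# t statistic for H0: beta_exper = beta_tenure, and its square.
t_stat = (b_exper - b_tenure) / se_diff
f_stat = t_stat ** 2

print(f"se(difference) = {se_diff:.6f}")
print(f"t              = {t_stat:.3f}")
print(f"t squared      = {f_stat:.2f}")
```

The squared t statistic reproduces the F(1, 522) = 24.58 that Stata reports for `test exper=tenure`, since a single linear restriction gives F = t².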
7.3. Testing the Equality of Two Regression Coefficients
Method 2: F-test
• If the F statistic exceeds the critical value, then you can reject the null hypothesis; otherwise, you do not reject it.
$$F_{1,\,n-k} = \left[ \frac{(\hat\beta_3 - \hat\beta_4) - (\beta_3 - \beta_4)}{se(\hat\beta_3 - \hat\beta_4)} \right]^2$$
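7.3. Testing the Equality of Two Regression Coefficients
Method 3
• Example: Return to education at 2-year vs. at 4-year colleges
$$\log(wage) = \beta_0 + \beta_1\, jc + \beta_2\, univ + \beta_3\, exper + u$$
where jc = years of education at 2-year colleges and univ = years of education at 4-year colleges.
Test $H_0: \beta_1 - \beta_2 = 0$ against $H_1: \beta_1 - \beta_2 < 0$.
A possible test statistic would be:
$$t = \frac{\hat\beta_1 - \hat\beta_2}{se(\hat\beta_1 - \hat\beta_2)}$$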
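twoyear.dta

      Source |       SS           df       MS      Number of obs   =     6,763
-------------+----------------------------------   F(3, 6759)      =    644.53
       Model |  357.752575         3  119.250858   Prob > F        =    0.0000
    Residual |  1250.54352     6,759  .185019014   R-squared       =    0.2224
-------------+----------------------------------   Adj R-squared   =    0.2221
       Total |  1608.29609     6,762  .237843255   Root MSE        =    .43014

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          jc |   .0666967   .0068288     9.77   0.000     .0533101    .0800833
        univ |   .0768762   .0023087    33.30   0.000     .0723504    .0814021
       exper |   .0049442   .0001575    31.40   0.000     .0046355    .0052529
       _cons |   1.472326   .0210602    69.91   0.000     1.431041     1.51361
------------------------------------------------------------------------------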
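7.3. Testing the Equality of Two Regression Coefficients
• Method 3: the standard error of the difference between two estimated coefficients is usually not available in regression output, so the model is reparametrized.
Let β1 and β2 be the coefficients on jc and univ. Define $\theta_1 = \beta_1 - \beta_2$ and test $H_0: \theta_1 = 0$ against $H_1: \theta_1 < 0$.
Insert $\beta_1 = \theta_1 + \beta_2$ into the original regression:
$$\log(wage) = \beta_0 + \theta_1\, jc + \beta_2 (jc + univ) + \beta_3\, exper + u$$
where jc + univ is a new regressor (= total years of college). The usual t statistic on jc in this regression tests H0 directly.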

7.3. Testing the Equality of Two Regression Coefficients
Stata output: F-test

. test exper=tenure

 ( 1)  exper - tenure = 0

       F(  1,   522) =   24.58
            Prob > F =    0.0000

→ We reject the hypothesis that the two effects are equal.
7.4. Restricted Least Squares: Testing Linear Equality Restrictions
• Now consider the Cobb–Douglas production function:
$$Y_i = \beta_1 X_{2i}^{\beta_2} X_{3i}^{\beta_3} e^{u_i} \qquad (1)$$
where Y = output
X2 = labor input
X3 = capital input
• Written in log form, the equation becomes
ln Yi = ln β1 + β2 lnX2i + β3 lnX3i + ui
= β0 + β2 lnX2i + β3 lnX3i + ui (2)
where β0 = ln β1.
7.4. Restricted Least Squares: Testing Linear Equality Restrictions
• If there are constant returns to scale, economic theory would suggest that:
β2 + β3 = 1
which is an example of a linear equality restriction.
• How do we find out whether the restriction is valid? There are two approaches:
– The t-Test Approach
– The F-Test Approach
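The t-Test Approach
$$t = \frac{(\hat\beta_2 + \hat\beta_3) - 1}{se(\hat\beta_2 + \hat\beta_3)}, \qquad se(\hat\beta_2 + \hat\beta_3) = \sqrt{var(\hat\beta_2) + var(\hat\beta_3) + 2\,cov(\hat\beta_2, \hat\beta_3)}$$
• If |t| exceeds the critical value, we reject the hypothesis of constant returns to scale. Otherwise we do not reject it.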
The F-Test Approach
• If the restriction is true: β2 = 1 − β3
• We can then write the log form of the Cobb–Douglas production function as
lnYi = β0 + (1 − β3) ln X2i + β3 ln X3i + ui
= β0 + ln X2i + β3(ln X3i − ln X2i) + ui
or (lnYi − lnX2i) = β0 + β3(lnX3i − lnX2i) + ui
or ln(Yi/X2i) = β0 + β3 ln(X3i/X2i) + ui (3)
where (Yi/X2i) = output/labor ratio
(X3i/X2i) = capital/labor ratio
Eq. (2), the log form of Eq. (1): unrestricted model
Eq. (3): restricted model.
The F-Test Approach
• We want to test the hypothesis
H0: β2 + β3 = 1 (the restriction H0 is valid)
$$F = \frac{(RSS_R - RSS_{UR})/m}{RSS_{UR}/(n-k)}$$
RSSUR = RSS of the unrestricted regression
RSSR = RSS of the restricted regression
m = number of linear restrictions (1 in the present example)
k = number of parameters in the unrestricted regression
n = number of observations
• If the F statistic > the critical F value at the chosen level of significance, we reject the hypothesis H0
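The slides report no residual sums of squares for the Cobb–Douglas example, so the sketch below illustrates the same F formula with the two wage regressions shown earlier in the chapter. The unrestricted model is wage on educ, exper and tenure (RSS = 4966.30269, k = 4). Imposing the single restriction that the tenure coefficient is zero gives the restricted model, wage on educ and exper (RSS = 5548.15979). The function name `restricted_f` is illustrative.

```python
def restricted_f(rss_r, rss_ur, m, n, k):
    """F statistic for m linear restrictions.

    rss_r  : RSS of the restricted regression
    rss_ur : RSS of the unrestricted regression
    k      : number of parameters in the unrestricted regression
    """
    return ((rss_r - rss_ur) / m) / (rss_ur / (n - k))


# Both regressions use the same 526 observations and the same dependent variable.
f_stat = restricted_f(rss_r=5548.15979, rss_ur=4966.30269, m=1, n=526, k=4)

# With one restriction, F equals the squared t statistic on tenure.
t_tenure = 0.1692687 / 0.0216446

print(f"F(1, 522)  = {f_stat:.3f}")
print(f"t_tenure^2 = {t_tenure ** 2:.3f}")
```

The two printed values agree to rounding, which is the F = t² relation for a single restriction. It works here only because the restricted and unrestricted models share the same dependent variable and sample.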
A Cautionary Note
• Keep in mind that if the dependent variable in the restricted and unrestricted models is not the same, R²(unrestricted) and R²(restricted) are not directly comparable.
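Testing multiple linear restrictions: The F-test
• Testing exclusion restrictions (\MLB1.DTA)
$$\log(salary) = \beta_0 + \beta_1\, years + \beta_2\, gamesyr + \beta_3\, bavg + \beta_4\, hrunsyr + \beta_5\, rbisyr + u$$
where salary = salary of major league baseball player, years = years in the league, gamesyr = average number of games per year, bavg = batting average, hrunsyr = home runs per year, and rbisyr = runs batted in per year.
$H_0: \beta_3 = 0,\ \beta_4 = 0,\ \beta_5 = 0$ against $H_1$: H0 is not true.
Test whether the performance measures have no effect, that is, whether they can be excluded from the regression.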
Testing multiple linear restrictions: The F-test
• Estimation of the unrestricted model
None of these variables is statistically significant when tested individually.
Idea: How would the model fit be if these variables were dropped from the regression?
Testing multiple linear restrictions: The F-test
• Estimation of the restricted model
The sum of squared residuals necessarily increases, but is the increase statistically significant?
• Test statistic
$$F = \frac{(RSS_R - RSS_{UR})/m}{RSS_{UR}/(n-k)} \sim F_{m,\,n-k}$$
where m is the number of restrictions and, under H0, the statistic follows an F-distribution with (m, n − k) degrees of freedom.
• Rejection rule
An F-distributed variable only takes on positive values. This corresponds to the fact that the sum of squared residuals can only increase if one moves from H1 to H0.
Choose the critical value so that the null hypothesis is rejected in, for example, 5% of the cases, although it is true.
Testing multiple linear restrictions: The F-test
• Test decision in example: the F statistic is compared with the critical value of the F distribution, whose numerator degrees of freedom equal the number of restrictions to be tested and whose denominator degrees of freedom are those of the unrestricted model.
The null hypothesis is overwhelmingly rejected (even at very small significance levels).
• Discussion
– The three variables are "jointly significant"
– They were not significant when tested individually
– The likely reason is multicollinearity between them
• Further example: test the hypothesis that, after controlling for cigs, parity, and faminc, parents' education has no effect on birth weight
7.5. Testing for Structural or Parameter Stability of Regression Models: The Chow Test
• Now we have three possible regressions:
Time period 1970–1981: Yt = λ1 + λ2Xt + u1t (1)
Time period 1982–1995: Yt = γ1 + γ2Xt + u2t (2)
Time period 1970–1995: Yt = α1 + α2Xt + ut (3)
• Regression (3) assumes that there is no difference between the two time periods. The mechanics of the Chow test are as follows:
1. Estimate regression (3) and obtain RSS3 with df = (n1 + n2 − k).
We call RSS3 the restricted residual sum of squares (RSSR) because it is obtained by imposing the restrictions that λ1 = γ1 and λ2 = γ2, that is, the subperiod regressions are not different.
2. Estimate Eq. (1) and obtain its residual sum of squares, RSS1, with df = (n1 − k).
3. Estimate Eq. (2) and obtain its residual sum of squares, RSS2, with df = (n2 − k).
7.5. Testing for Structural or Parameter Stability of Regression Models: The Chow Test
4. Obtain the unrestricted residual sum of squares (RSSUR), that is,
RSSUR = RSS1 + RSS2 with df = (n1 + n2 − 2k)
5. F ratio:
$$F = \frac{(RSS_R - RSS_{UR})/k}{RSS_{UR}/(n_1 + n_2 - 2k)} \sim F_{k,\,n_1 + n_2 - 2k}$$
6. If the computed F value exceeds the critical F value, we reject the hypothesis of parameter stability and conclude that the regressions (1) and (2) are different.
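The steps above can be sketched as a small function. The sample sizes follow from the periods given in the text (12 annual observations for 1970–1981, 14 for 1982–1995) and k = 2 parameters per regression. The three RSS values are hypothetical numbers chosen only to illustrate the arithmetic, and the function name `chow_f` is illustrative.

```python
def chow_f(rss_pooled, rss_1, rss_2, n1, n2, k):
    """Chow test F ratio.

    rss_pooled   : RSS from the pooled regression (restricted, RSS_R)
    rss_1, rss_2 : RSS from the two subperiod regressions
    k            : number of parameters in each regression
    Returns (F, numerator df, denominator df).
    """
    rss_ur = rss_1 + rss_2                 # unrestricted RSS
    df_num = k
    df_den = n1 + n2 - 2 * k
    f_ratio = ((rss_pooled - rss_ur) / df_num) / (rss_ur / df_den)
    return f_ratio, df_num, df_den


# HYPOTHETICAL residual sums of squares, for illustration only.
f_ratio, df_num, df_den = chow_f(
    rss_pooled=1200.0, rss_1=300.0, rss_2=500.0, n1=12, n2=14, k=2
)

print(f"F({df_num}, {df_den}) = {f_ratio:.2f}")
```

The computed ratio is then compared with the critical value of F(2, 22) at the chosen significance level. A value above it leads to rejecting parameter stability.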
• END OF CHAPTER 3
