Week 4
The multiple linear regression model (MLRM)
(part 1)
Generalising the simple model to
multiple linear regression model (MLRM)
• Before, we have used the model
$$y_t = \alpha + \beta x_t + u_t, \quad t = 1,2,\dots,T$$
• But what if our dependent (y) variable depends on more than one
independent variable?
For example the number of cars sold might plausibly depend on
1. the price of cars
2. the price of public transport
3. the price of petrol
4. the extent of the public’s concern about global warming
• Similarly, stock returns might depend on several factors.
• Having just one independent variable is no good in this case - we want to
have more than one x variable. It is very easy to generalise the simple
model to one with k regressors (independent variables).
Multiple regression and the constant term
• Now we write
$$y_t = \beta_1 + \beta_2 x_{2t} + \beta_3 x_{3t} + \dots + \beta_k x_{kt} + u_t, \quad t = 1,2,\dots,T$$
Each coefficient is now known as a partial regression coefficient,
interpreted as representing the partial effect of the given explanatory
variable on the explained variable, after holding constant, or eliminating
the effect of, all other explanatory variables.
E.g., $\hat{\beta}_2$ measures the effect of $x_2$ on $y$ after eliminating the effects of
$x_3, x_4, \dots, x_k$.
Each coefficient measures the average change in the dependent variable
per unit change in a given independent variable, holding all other
independent variables constant at their average values.
Multiple regression and the constant term
• Now we write
$$y_t = \beta_1 + \beta_2 x_{2t} + \beta_3 x_{3t} + \dots + \beta_k x_{kt} + u_t, \quad t = 1,2,\dots,T$$
• Where is x1? It is the constant term. In fact the constant term is usually
represented by a column of ones of length T:
$$x_1 = \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix}$$
• There is a variable implicitly hiding next to β1, which is a column vector of
ones, the length of which is the number of observations in the sample.
• $\beta_1$ is the coefficient attached to the constant term (which we called $\alpha$ before).
Different ways of expressing the MLRM
• We could write out a separate equation for every value of t:
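$$y_1 = \beta_1 + \beta_2 x_{21} + \beta_3 x_{31} + \dots + \beta_k x_{k1} + u_1$$
$$y_2 = \beta_1 + \beta_2 x_{22} + \beta_3 x_{32} + \dots + \beta_k x_{k2} + u_2$$
$$\vdots$$
$$y_T = \beta_1 + \beta_2 x_{2T} + \beta_3 x_{3T} + \dots + \beta_k x_{kT} + u_T$$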
• We can write this in matrix form
$$y = X\beta + u$$
where $y$ is $T \times 1$
$X$ is $T \times k$
$\beta$ is $k \times 1$
$u$ is $T \times 1$
Inside the matrices of the MLRM
• e.g. if k is 2, we have 2 regressors, one of which is a column of ones, i.e., the
constant term (yt = α + βxt + ut):
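$$\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_T \end{bmatrix} = \begin{bmatrix} 1 & x_{21} \\ 1 & x_{22} \\ \vdots & \vdots \\ 1 & x_{2T} \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix} + \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_T \end{bmatrix}$$
with dimensions $T \times 1$, $T \times 2$, $2 \times 1$ and $T \times 1$ respectively.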
• Notice that the matrices written in this way are conformable.
How do we calculate the parameters (the $\beta$s) in this
generalised case?
• Previously, we took the residual sum of squares, and minimised it
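w.r.t. $\alpha$ and $\beta$.
• In the matrix notation, we have
$$\hat{u} = \begin{bmatrix} \hat{u}_1 \\ \hat{u}_2 \\ \vdots \\ \hat{u}_T \end{bmatrix}$$
• The RSS would be given by:
$$\hat{u}'\hat{u} = \begin{bmatrix} \hat{u}_1 & \hat{u}_2 & \cdots & \hat{u}_T \end{bmatrix} \begin{bmatrix} \hat{u}_1 \\ \hat{u}_2 \\ \vdots \\ \hat{u}_T \end{bmatrix} = \hat{u}_1^2 + \hat{u}_2^2 + \dots + \hat{u}_T^2 = \sum_{t=1}^{T} \hat{u}_t^2$$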
The OLS estimator for the multiple regression model
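• In order to obtain the parameter estimates, $\hat{\beta}_1, \hat{\beta}_2, \dots, \hat{\beta}_k$, we would
minimise the RSS with respect to all the $\beta$s.
• It can be shown that
$$\hat{\beta} = \begin{bmatrix} \hat{\beta}_1 \\ \hat{\beta}_2 \\ \vdots \\ \hat{\beta}_k \end{bmatrix} = (X'X)^{-1} X'y$$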
• $(X'X)^{-1}$ is the inverse of the $X'X$ matrix.
• X' is the transpose of the X matrix.
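The "it can be shown" step can be sketched as follows. This is the standard textbook derivation rather than anything specific to these slides, and it assumes that $X$ has full column rank so that $X'X$ is invertible (an assumption the slides do not state explicitly).

```latex
\begin{aligned}
RSS &= \hat{u}'\hat{u} = (y - X\hat{\beta})'(y - X\hat{\beta})
     = y'y - 2\hat{\beta}'X'y + \hat{\beta}'X'X\hat{\beta} \\
\frac{\partial RSS}{\partial \hat{\beta}} &= -2X'y + 2X'X\hat{\beta} = 0 \\
X'X\hat{\beta} &= X'y \quad\Rightarrow\quad \hat{\beta} = (X'X)^{-1}X'y
\end{aligned}
```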
Numerical example
X      Y
4.0    33
4.5    42
5.0    45
5.5    51
6.0    53
6.5    61
7.0    62
Fitted regression line: Y = -2.68 + 9.500X
Numerical example
        X       Y      X*Y      X^2
        4.0     33     132.0    16.00
        4.5     42     189.0    20.25
        5.0     45     225.0    25.00
        5.5     51     280.5    30.25
        6.0     53     318.0    36.00
        6.5     61     396.5    42.25
        7.0     62     434.0    49.00
Sum     38.5    347    1975.0   218.75
Determinant of A
Numerical example
$$\hat{\beta} = \begin{bmatrix} \hat{\beta}_1 \\ \hat{\beta}_2 \\ \vdots \\ \hat{\beta}_k \end{bmatrix} = (X'X)^{-1} X'y$$
Y = -2.68 + 9.500X
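As a rough check on the numerical example, the matrix formula can be evaluated directly. The sketch below is a minimal plain-Python illustration, not the method used to produce the slides; the variable names (`xtx`, `xty`, `beta_hat`) are illustrative. It assumes the design matrix X consists of a column of ones and the X column, so that X'X and X'y reduce to the column sums in the table, and it inverts the 2 x 2 matrix via its determinant.

```python
# Data from the numerical example
x = [4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0]
y = [33.0, 42.0, 45.0, 51.0, 53.0, 61.0, 62.0]
T = len(x)

# X = [column of ones, x], so X'X and X'y are built from simple sums
xtx = [[T, sum(x)],
       [sum(x), sum(v * v for v in x)]]
xty = [sum(y), sum(a * b for a, b in zip(x, y))]

# Inverse of a 2 x 2 matrix via its determinant
det = xtx[0][0] * xtx[1][1] - xtx[0][1] * xtx[1][0]
xtx_inv = [[xtx[1][1] / det, -xtx[0][1] / det],
           [-xtx[1][0] / det, xtx[0][0] / det]]

# beta_hat = (X'X)^{-1} X'y: intercept first, then slope
beta_hat = [sum(a * b for a, b in zip(row, xty)) for row in xtx_inv]
print([round(b, 2) for b in beta_hat])  # [-2.68, 9.5]
```

The determinant here works out to 49, and the two elements of `beta_hat` match the intercept and slope of the fitted line Y = -2.68 + 9.500X.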
Calculating the standard errors for the multiple regression
model
• Check the dimensions: $\hat{\beta}$ is $k \times 1$ as required.
• But how do we calculate the standard errors of the coefficient estimates?
• Previously, to estimate the variance of the errors, $\sigma^2$, we used
$$s^2 = \frac{\sum \hat{u}_t^2}{T - 2}$$
• Now using the matrix notation, we use
$$s^2 = \frac{\hat{u}'\hat{u}}{T - k}$$
• where k = number of regressors including a constant. It can be proved that
the OLS estimator of the variance of $\hat{\beta}$ is given by the diagonal elements
of $s^2 (X'X)^{-1}$, so that the variance of $\hat{\beta}_1$ is the first element, the variance of
$\hat{\beta}_2$ is the second element, …, and the variance of $\hat{\beta}_k$ is the kth
diagonal element.
Calculating parameter and standard error estimates for
multiple regression models: An example
• Example: The following model with k = 3 is estimated over 15 observations:
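$$y = \beta_1 + \beta_2 x_2 + \beta_3 x_3 + u$$
and the following data have been calculated from the original X’s.
$$(X'X)^{-1} = \begin{bmatrix} 2.0 & 3.5 & -1.0 \\ 3.5 & 1.0 & 6.5 \\ -1.0 & 6.5 & 4.3 \end{bmatrix}, \quad (X'y) = \begin{bmatrix} -3.0 \\ 2.2 \\ 0.6 \end{bmatrix}, \quad \hat{u}'\hat{u} = 10.96$$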
Calculate the coefficient estimates and their standard errors.
• To calculate the coefficients, just multiply the matrix by the vector to obtain
$$\hat{\beta} = (X'X)^{-1} X'y$$
• To calculate the standard errors, we need an estimate of $\sigma^2$.
$$s^2 = \frac{RSS}{T - k} = \frac{10.96}{15 - 3} = 0.91$$
Calculating parameter and standard error estimates
for multiple regression models: An example (cont’d)
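• The variance-covariance matrix of $\hat{\beta}$ is given by
$$s^2 (X'X)^{-1} = 0.91 (X'X)^{-1} = \begin{bmatrix} 1.83 & 3.20 & -0.91 \\ 3.20 & 0.91 & 5.94 \\ -0.91 & 5.94 & 3.93 \end{bmatrix}$$
• The variances are on the leading diagonal:
$$Var(\hat{\beta}_1) = 1.83 \quad\Leftrightarrow\quad SE(\hat{\beta}_1) = 1.35$$
$$Var(\hat{\beta}_2) = 0.91 \quad\Leftrightarrow\quad SE(\hat{\beta}_2) = 0.96$$
$$Var(\hat{\beta}_3) = 3.93 \quad\Leftrightarrow\quad SE(\hat{\beta}_3) = 1.98$$
• We write (standard errors in parentheses):
$$\hat{y}_t = \underset{(1.35)}{1.10} - \underset{(0.96)}{4.40}\, x_{2t} + \underset{(1.98)}{19.88}\, x_{3t}$$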
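The arithmetic in this example can be reproduced with a short plain-Python sketch. The variable names are illustrative. The minus signs on the (1,3) and (3,1) elements of $(X'X)^{-1}$ and on the first element of $X'y$ are an assumption: they are the signs needed to reproduce the reported coefficient magnitudes 1.10, 4.40 and 19.88 from the stated figures.

```python
# Inputs from the worked example (k = 3 regressors, T = 15 observations).
# Signs of the -1.0 and -3.0 entries are assumed, as explained in the text.
xtx_inv = [[2.0, 3.5, -1.0],
           [3.5, 1.0, 6.5],
           [-1.0, 6.5, 4.3]]
xty = [-3.0, 2.2, 0.6]
rss, T, k = 10.96, 15, 3

# Coefficient estimates: beta_hat = (X'X)^{-1} X'y
beta_hat = [sum(a * b for a, b in zip(row, xty)) for row in xtx_inv]

# Estimated error variance: s^2 = RSS / (T - k)
s2 = rss / (T - k)

# Variance-covariance matrix of beta_hat: s^2 (X'X)^{-1}
var_cov = [[s2 * a for a in row] for row in xtx_inv]

# Standard errors: square roots of the leading diagonal
std_errors = [var_cov[i][i] ** 0.5 for i in range(k)]

print([round(b, 2) for b in beta_hat])     # [1.1, -4.4, 19.88]
print(round(s2, 2))                        # 0.91
print([round(se, 2) for se in std_errors]) # [1.35, 0.96, 1.98]
```

Only the diagonal of `var_cov` is needed for the standard errors; the off-diagonal elements are the covariances between pairs of coefficient estimates.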
Testing multiple hypotheses: The F-test
• We used the t-test to test single hypotheses, i.e., hypotheses involving only one
coefficient. But what if we want to test more than one coefficient
simultaneously?
• E.g., what if a researcher wanted to determine whether a restriction that the
coefficient values for β2 and β3 are both unity could be imposed, so that an
increase in either one of the two variables x2 or x3 would cause y to rise by one
unit?
• We do this using the F-test. The F-test involves estimating 2 regressions.
• The unrestricted regression is the one in which the coefficients are freely
determined by the data, as we have done before.
• The restricted regression is the one in which the coefficients are restricted, i.e.,
the restrictions are imposed on some $\beta$s.
Calculating the F-test statistic
• The RSS from each regression are determined, and the two residual sums of squares are
‘compared’ in the test statistic. The F-test statistic for testing multiple hypotheses about
the coefficient estimates is given by:
$$\text{test statistic} = \frac{RRSS - URSS}{URSS} \times \frac{T - k}{m}$$
where URSS = RSS from unrestricted regression
RRSS = RSS from restricted regression
m = number of restrictions
T = number of observations
k = number of regressors in the unrestricted regression, including a
constant (i.e., the total number of parameters to be estimated).
• If the residual sum of squares increases considerably after the restrictions are imposed,
the restrictions are not supported by the data and the null hypothesis should be rejected.
If it increases only slightly, the restrictions are not rejected.
The F-test:
Restricted and unrestricted regressions
• Example
The general regression is:
$$y_t = \beta_1 + \beta_2 x_{2t} + \beta_3 x_{3t} + \beta_4 x_{4t} + u_t \qquad (1)$$
• We want to test the restriction that $\beta_3 + \beta_4 = 1$ (we have some hypothesis
from theory which suggests that this would be an interesting hypothesis to
study). The unrestricted regression is (1) above, but what is the restricted
regression?
$$y_t = \beta_1 + \beta_2 x_{2t} + \beta_3 x_{3t} + \beta_4 x_{4t} + u_t \quad \text{s.t.} \quad \beta_3 + \beta_4 = 1 \qquad (2)$$
• We substitute the restriction ($\beta_3 + \beta_4 = 1$) into the regression so that it is
automatically imposed on the data. Make either $\beta_3$ or $\beta_4$ the subject of (2),
e.g.:
$$\beta_3 + \beta_4 = 1 \;\Rightarrow\; \beta_4 = 1 - \beta_3 \qquad (3)$$
The F-test: Forming the restricted regression
• Then substitute into (1) for β4
$$y_t = \beta_1 + \beta_2 x_{2t} + \beta_3 x_{3t} + (1 - \beta_3) x_{4t} + u_t \qquad (4)$$
$$y_t = \beta_1 + \beta_2 x_{2t} + \beta_3 x_{3t} + x_{4t} - \beta_3 x_{4t} + u_t \qquad (5)$$
• Gather terms in $\beta$s together and rearrange
$$(y_t - x_{4t}) = \beta_1 + \beta_2 x_{2t} + \beta_3 (x_{3t} - x_{4t}) + u_t \qquad (6)$$
• Any variables without coefficients attached (e.g., $x_4$ in (5)) are taken over to
the LHS and are then combined with y. This is the restricted regression. We
actually estimate it by creating two new variables, call them, say, $P_t$ and $Q_t$.
$$P_t = y_t - x_{4t}$$
$$Q_t = x_{3t} - x_{4t}$$
so
$$P_t = \beta_1 + \beta_2 x_{2t} + \beta_3 Q_t + u_t \qquad (7)$$
is the restricted regression we actually estimate.
The F-distribution
• The test statistic follows the F-distribution under the null hypothesis.
• The F-distribution has 2 degrees of freedom parameters (recall that the t-distribution had
only 1 degree of freedom parameter, equal to T − k).
• The values of the degrees of freedom parameters for the F-test are m, the number of
restrictions imposed on the model, and (T − k), the number of observations less the number
of regressors for the unrestricted regression, respectively.
• Note that the order of the degree of freedom parameters is important.
• The appropriate critical value will be in column m, row (T − k) of the F-distribution tables.
• The F-distribution has only positive values and is not symmetrical.
• We therefore only reject the null if the test statistic > critical F-value.
The relationship between the t and the F-distributions
• Any hypothesis which could be tested with a t-test could have been
tested using an F-test, but not the other way around.
For example, consider the hypothesis
H0: $\beta_2 = 0.5$
H1: $\beta_2 \neq 0.5$
We could have tested this using the usual t-test:
$$\text{test stat} = \frac{\hat{\beta}_2 - 0.5}{SE(\hat{\beta}_2)}$$
or it could be tested in the framework above for the F-test.
• Note that the two tests always give the same result since the t-
distribution is just a special case of the F-distribution.
• For example, if we have some random variable Z, and $Z \sim t(T-k)$, then
also $Z^2 \sim F(1, T-k)$.
Determining the number of restrictions in an F-test
• Examples :
H0: hypothesis                                     No. of restrictions, m
$\beta_1 + \beta_2 = 2$                            1
$\beta_2 = 1$ and $\beta_3 = -1$                   2
$\beta_2 = 0$, $\beta_3 = 0$ and $\beta_4 = 0$     3
• If the model is $y_t = \beta_1 + \beta_2 x_{2t} + \beta_3 x_{3t} + \beta_4 x_{4t} + u_t$,
then the null hypothesis
H0: $\beta_2 = 0$ and $\beta_3 = 0$ and $\beta_4 = 0$ is tested by the regression F-statistic. It tests the null
hypothesis that all of the coefficients except the intercept coefficient are zero. If this null
hypothesis cannot be rejected, it would imply that none of the independent variables in the
model was able to explain variations in y.
• Note the form of the alternative hypothesis for all tests when more than one restriction is
involved: H1: $\beta_2 \neq 0$ or $\beta_3 \neq 0$ or $\beta_4 \neq 0$
• ‘and’ occurs under the null hypothesis and ‘or’ under the alternative, so that it takes only one
part of a joint null hypothesis to be wrong for the null hypothesis as a whole to be rejected.
What we cannot test with either an F or a t-test
• We cannot test using this framework hypotheses which are not linear
or which are multiplicative, e.g.,
H0: $\beta_2 \beta_3 = 2$ or H0: $\beta_2^2 = 1$
cannot be tested.
F-test example
• Question: Suppose a researcher wants to test whether the returns on a company stock (y) show
unit sensitivity to two factors (factor x2 and factor x3) among three considered. The regression is
carried out on 144 monthly observations. The regression is:
$$y_t = \beta_1 + \beta_2 x_{2t} + \beta_3 x_{3t} + \beta_4 x_{4t} + u_t$$
- What are the restricted and unrestricted regressions?
- If the two RSS are 436.1 and 397.2 respectively, perform the test.
• Solution:
Unit sensitivity implies H0: $\beta_2 = 1$ and $\beta_3 = 1$. The unrestricted regression is the one in the
question.
Impose the restriction:
$$y_t = \beta_1 + \beta_2 x_{2t} + \beta_3 x_{3t} + \beta_4 x_{4t} + u_t \quad \text{s.t.} \quad \beta_2 = 1 \text{ and } \beta_3 = 1$$
Replacing $\beta_2$ and $\beta_3$ by their values under the null hypothesis:
$$y_t = \beta_1 + x_{2t} + x_{3t} + \beta_4 x_{4t} + u_t$$
Rearrange. The restricted regression is:
$$(y_t - x_{2t} - x_{3t}) = \beta_1 + \beta_4 x_{4t} + u_t$$
or letting
$z_t = y_t - x_{2t} - x_{3t}$, the restricted regression is $z_t = \beta_1 + \beta_4 x_{4t} + u_t$
F-test example
In the F-test formula,
$$\text{test statistic} = \frac{RRSS - URSS}{URSS} \times \frac{T - k}{m}$$
• The following inputs to the formula are available: T=144, k=4, m=2, RRSS=436.1,
URSS=397.2
• F-test statistic = (436.1 − 397.2)/397.2 × (144 − 4)/2 = 6.86. The critical values from an F(2,140) distribution are 3.07 (5%) and 4.79 (1%).
• Conclusion: Reject H0; the restriction is not supported by the data.
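The calculation for this example can be written out as a small Python sketch. The function name `f_statistic` is illustrative, and the critical values are simply the ones quoted on the slide rather than computed from the F-distribution.

```python
def f_statistic(rrss, urss, T, k, m):
    """F-test statistic: (RRSS - URSS) / URSS * (T - k) / m."""
    return (rrss - urss) / urss * (T - k) / m

# Inputs quoted in the example
f_stat = f_statistic(rrss=436.1, urss=397.2, T=144, k=4, m=2)

# Critical values as quoted on the slide for F(2, 140)
crit_5pct, crit_1pct = 3.07, 4.79

# The F-test is one-sided: reject H0 only if the statistic exceeds the critical value
reject_at_5pct = f_stat > crit_5pct
reject_at_1pct = f_stat > crit_1pct

print(round(f_stat, 2), reject_at_5pct, reject_at_1pct)  # 6.86 True True
```

With these inputs the statistic evaluates to roughly 6.86, which lies above both quoted critical values, so the null hypothesis is rejected at the 5% and the 1% level.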
Sample EViews output for multiple hypothesis tests
erford = rford - ustb3m; ersandp=rsandp-ustb3m
• The result of F-test is exactly the same as the t-test for the beta coefficient
(since there is only one slope coefficient).
• Thus, the F-test statistic is equal to the square of the slope t-ratio.
Sample EViews output for multiple hypothesis tests
• The F-version is adjusted for small sample bias and should be used when the
regression is estimated using a small sample.
• Both statistics asymptotically yield the same result, and in this case the p-values are
very similar.
• The conclusion is that the joint null hypothesis, H0: β1 = 1 and β2 = 1, is not rejected.
Multiple regression in EViews using an APT-style model
• The question is whether the monthly returns on Microsoft stock can be explained by reference to unexpected
changes in a set of macroeconomic and financial variables.
• Microsoft stock price → dependent variable
• S&P500 index value
• consumer price index (CPI)
• Industrial production index (IPI)
• Treasury bill yields for the following maturities:
– three months
– six months
– one year
– three years
– five years
– ten years
• measure of ‘narrow’ money supply
• consumer credit
• ‘credit spread’ (difference in annualised average yields between a portfolio of bonds rated AAA
and a portfolio of bonds rated BAA).
Multiple regression in EViews using an APT-style
model
• Generate a set of changes or differences for each of the variables, since the APT posits that
the stock returns can be explained by reference to the unexpected changes in the macroeconomic
variables rather than their levels.
• The unexpected value of a variable can be defined as the difference between the actual
(realised) value of the variable and its expected value.
• How might investors have formed their expectations? There are many ways to construct
measures of expectations:
– the easiest is to assume that investors have naive expectations, i.e., that the next-period
value of the variable is expected to equal the current value.
• This being the case, the entire change in the variable from one period to the next is the
unexpected change (because investors are assumed to expect no change).
– It is an interesting question as to whether the differences should be taken on the levels
of the variables or their logarithms.
– If the former, we have absolute changes in the variables, whereas the latter would lead
to proportionate changes.
– We assume that the former is chosen.
Multiple regression in EViews using an APT-style model
• mustb3m = ustb3m/12
• rmsoft = 100*dlog(microsoft)
– ermsoft = rmsoft - mustb3m
• rsandp = 100*dlog(sandp)
– ersandp = rsandp - mustb3m
• dprod = industrial production - industrial production(-1)
• dcredit = consumer credit - consumer credit(-1)
• inflation = 100*dlog(cpi)
– dinflation = inflation - inflation(-1)
• dmoney = m1money supply - m1money supply(-1)
• dspread = baa_aaa_spread - baa_aaa_spread(-1)
• rterm = term - term(-1)
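The transformations listed above can be mimicked outside EViews. The following is a minimal Python sketch, assuming that `dlog(x)` means the first difference of the natural log and that `x(-1)` means the previous period's value. The helper names `dlog` and `diff` and all of the data values are hypothetical, made up purely for illustration.

```python
import math

def dlog(series):
    """First difference of natural logs: log(x_t) - log(x_{t-1})."""
    return [math.log(b) - math.log(a) for a, b in zip(series, series[1:])]

def diff(series):
    """First difference in levels: x_t - x_{t-1}."""
    return [b - a for a, b in zip(series, series[1:])]

# Hypothetical monthly observations (not real data)
microsoft = [100.0, 102.0, 101.0, 105.0]
ustb3m = [1.2, 1.2, 1.5, 1.8]        # annualised three-month T-bill yield, in percent
cpi = [200.0, 200.5, 201.2, 201.6]

mustb3m = [r / 12 for r in ustb3m]            # monthly risk-free rate
rmsoft = [100 * r for r in dlog(microsoft)]   # percentage log returns
# Differencing drops the first observation, so align with mustb3m[1:]
ermsoft = [r - rf for r, rf in zip(rmsoft, mustb3m[1:])]

inflation = [100 * r for r in dlog(cpi)]
dinflation = diff(inflation)                  # change in inflation

print(len(rmsoft), len(ermsoft), len(dinflation))  # 3 3 2
```

Each differencing step costs one observation, which is why the number of usable observations in the final regression is smaller than the number of raw data points.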
Multiple regression in EViews using an APT-style model
• F(3, 244) distribution; 3 restrictions; 252 usable observations; 8 parameters to estimate in the unrestricted
regression.
• The F-statistic value suggests that the null hypothesis cannot be rejected.
• The parameters on DINFLATION and DMONEY are almost significant at the 10% level, so the variables are
retained.
Multiple regression in EViews using an APT-style model
• Stepwise regression is an automatic variable selection procedure which
chooses the jointly most ‘important’ (variously defined) explanatory variables
from a set of candidate variables.
• There are a number of different stepwise regression procedures, but the
simplest is the uni-directional forwards method.
• This starts with no variables in the regression (or only those variables that are
always required by the researcher to be in the regression) and then it selects
first the variable with the lowest p-value (largest absolute t-ratio) if it were included,
then the variable with the second lowest p-value conditional upon the first
variable already being included, and so on.
• The procedure continues until the lowest p-value among the remaining candidates,
given the variables already included, is larger than some specified threshold value.
Selection then stops, with no more variables being incorporated into the model.
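The uni-directional forwards procedure described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the EViews implementation: the stopping rule uses an absolute t-ratio threshold (`t_threshold`, hypothetical) instead of a p-value, since the standard library has no t-distribution, and the function names and data are made up.

```python
def invert(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [v / piv for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def ols_t_ratios(columns, y):
    """OLS with an intercept; returns t-ratios (intercept first)."""
    T = len(y)
    X = [[1.0] + [col[t] for col in columns] for t in range(T)]
    k = len(X[0])
    xtx = [[sum(X[t][i] * X[t][j] for t in range(T)) for j in range(k)] for i in range(k)]
    xty = [sum(X[t][i] * y[t] for t in range(T)) for i in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [y[t] - sum(X[t][i] * beta[i] for i in range(k)) for t in range(T)]
    s2 = sum(e * e for e in resid) / (T - k)          # s^2 = RSS / (T - k)
    return [beta[i] / (s2 * inv[i][i]) ** 0.5 for i in range(k)]

def forward_stepwise(candidates, y, t_threshold=2.0):
    """Add, one at a time, the candidate with the largest |t| until none passes."""
    selected, remaining = [], dict(candidates)
    while remaining:
        best_name, best_t = None, 0.0
        for name, col in remaining.items():
            t = ols_t_ratios([candidates[n] for n in selected] + [col], y)
            if abs(t[-1]) > best_t:
                best_name, best_t = name, abs(t[-1])
        if best_name is None or best_t < t_threshold:
            break
        selected.append(best_name)
        del remaining[best_name]
    return selected

# Hypothetical data: y depends on x1 but not on x2
x1 = [float(i) for i in range(1, 13)]
x2 = [1.0, 0.0] * 6
noise = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.3, -0.3, 0.2, -0.2, 0.1]
y = [1.0 + 2.0 * a + e for a, e in zip(x1, noise)]
selected = forward_stepwise({"x1": x1, "x2": x2}, y)
print(selected)
```

Because y is constructed from `x1`, that variable is picked first; whether any further variable enters depends on the chosen threshold.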
Multiple regression in EViews using an APT-style model
• ‘Forwards’ will start with the list of required regressors (the intercept only in this case)
and will sequentially add to them.
• ‘Backwards’ will start by including all of the variables and will sequentially delete
variables from the regression.