Chapter 9: Multiple Linear Regression Analysis
Week 13
L1 Multiple Linear Regression
Lecture 3
Learning Outcomes:
At the end of the lesson, the student should be able to:
- Use the least squares method to estimate a multiple linear regression model
- Carry out tests to determine whether the fitted model is an adequate fit to the data
- Carry out tests for inference on the regression parameters
- Find the confidence interval (CI) for the slope
MULTIPLE LINEAR REGRESSION
- an extension of the simple linear regression model
- allows the dependent variable y to be modeled as a linear function of more than one input variable x_i

Consider the following data consisting of n sets of values:
$(y_1, x_{11}, x_{21}, \ldots, x_{k1})$
$(y_2, x_{12}, x_{22}, \ldots, x_{k2})$
$\vdots$
$(y_n, x_{1n}, x_{2n}, \ldots, x_{kn})$

The value of the dependent variable y_i is modeled as
$Y = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k + \epsilon$
where $\epsilon$ is a random error term.
- the dependent variable is related to k independent (regressor) variables
- the multiple linear regression model can provide a rich variety of functional relationships by allowing some of the input variables x_i to be functions of other input variables
- as in simple linear regression, the parameters $\beta_0, \beta_1, \ldots, \beta_k$ are estimated using the method of least squares
- it would be tedious to find these values by hand, so we use the computer to handle the computations
- ANOVA is used to test for significance of regression
- the t-test is used for inference on individual regression coefficients
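For reference, the least squares estimates that the computer produces are usually written in matrix form. The symbols $\mathbf{X}$, $\mathbf{y}$ and $\hat{\boldsymbol{\beta}}$ below are standard textbook notation and are an assumption here, since the slides do not define them. The indexing follows the data layout above (first subscript = variable, second subscript = observation).

```latex
\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon},
\qquad
\mathbf{X} =
\begin{bmatrix}
1 & x_{11} & x_{21} & \cdots & x_{k1} \\
1 & x_{12} & x_{22} & \cdots & x_{k2} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & x_{1n} & x_{2n} & \cdots & x_{kn}
\end{bmatrix},
\qquad
\mathbf{y} =
\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}

% Minimising the sum of squared errors leads to the normal equations
\mathbf{X}^{\top}\mathbf{X}\,\hat{\boldsymbol{\beta}} = \mathbf{X}^{\top}\mathbf{y}
\quad\Longrightarrow\quad
\hat{\boldsymbol{\beta}} = (\mathbf{X}^{\top}\mathbf{X})^{-1}\mathbf{X}^{\top}\mathbf{y}
```

The inverse exists only when the columns of $\mathbf{X}$ are linearly independent, which is one reason the computation is normally left to software.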
Example 1 (pg. 310):

Observation   Pull Strength   Wire Length   Die Height
Number        y               x1            x2
 1             9.95            2             50
 2            24.45            8            110
 3            31.75           11            120
 4            35.00           10            550
 5            25.02            8            295
 6            16.86            4            200
 7            14.38            2            375
 8             9.60            2             52
 9            24.35            9            100
10            27.50            8            300
11            17.08            4            412
12            37.00           11            400
13            41.95           12            500
14            11.66            2            360
15            21.65            4            205
16            17.89            4            400
17            69.00           20            600
18            10.30            1            585
19            34.93           10            540
20            46.59           15            250
21            44.88           15            290
22            54.12           16            510
23            56.63           17            590
24            22.13            6            100
25            21.15            5            400
Using these estimated model parameters, the fitted regression equation is
$\hat{y} = 2.26379 + 2.74427 x_1 + 0.01253 x_2$
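As a rough sketch of how the computer arrives at the estimates for the wire bond data, the script below builds the normal equations (X'X) b = X'y and solves them by Gaussian elimination. The function names `solve` and `fit_least_squares` are illustrative choices, not part of the lecture material, and statistical software would normally use a more numerically robust method.

```python
# Wire bond data from the table above: (pull strength y, wire length x1, die height x2).
DATA = [
    (9.95, 2, 50), (24.45, 8, 110), (31.75, 11, 120), (35.00, 10, 550),
    (25.02, 8, 295), (16.86, 4, 200), (14.38, 2, 375), (9.60, 2, 52),
    (24.35, 9, 100), (27.50, 8, 300), (17.08, 4, 412), (37.00, 11, 400),
    (41.95, 12, 500), (11.66, 2, 360), (21.65, 4, 205), (17.89, 4, 400),
    (69.00, 20, 600), (10.30, 1, 585), (34.93, 10, 540), (46.59, 15, 250),
    (44.88, 15, 290), (54.12, 16, 510), (56.63, 17, 590), (22.13, 6, 100),
    (21.15, 5, 400),
]


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix [a | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_least_squares(data):
    """Return [b0, b1, ..., bk] from the normal equations (X'X) b = X'y."""
    rows = [[1.0] + [float(v) for v in obs[1:]] for obs in data]  # leading 1 = intercept
    y = [float(obs[0]) for obs in data]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


b0, b1, b2 = fit_least_squares(DATA)
print(f"y_hat = {b0:.5f} + {b1:.5f} x1 + {b2:.5f} x2")
```

Because each row of the design matrix starts with a 1, the first returned value is the intercept and the remaining values are the coefficients of x1 and x2 in that order.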
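MULTIPLE LINEAR REGRESSION ANALYSIS
- testing for the significance of regression.

Hypotheses:
$H_0: \beta_1 = \beta_2 = \beta_3 = \cdots = \beta_k = 0$
$H_1: \text{at least one } \beta_j \neq 0$

Test statistic:
$F_0 = \dfrac{MS_R}{MS_E}$
where $MS_R = \dfrac{SS_R}{k}$ and $MS_E = \dfrac{SS_E}{n - p}$, with $p = k + 1$ the number of parameters in the model.

Rejection criterion:
$F_0 = \dfrac{MS_R}{MS_E} > f_{\alpha,\, k,\, n-p}$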
ANOVA Table for multiple linear regression

Source of     Sum of     Degrees of       Mean Square                  Computed F
variation     Squares    freedom (df)     (Sum of squares / df)
Regression    SSR        k                MSR = SSR / k                F = MSR / MSE
Error         SSE        n - (k + 1)      MSE = SSE / (n - (k + 1))
Total         SST        n - 1
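The arithmetic behind the ANOVA table can be sketched in a few lines. The helper names `anova_table` and `reject_h0` are made up for illustration, the sums of squares in the usage lines are hypothetical numbers chosen only to show the calculation, and the critical value is assumed to be read from a standard F table rather than computed.

```python
def anova_table(ssr, sse, n, k):
    """Return the ANOVA quantities for a model with k regressors and n observations."""
    p = k + 1                # number of parameters, including the intercept
    msr = ssr / k            # regression mean square
    mse = sse / (n - p)      # error mean square
    return {
        "df_regression": k,
        "df_error": n - p,
        "df_total": n - 1,
        "SST": ssr + sse,
        "MSR": msr,
        "MSE": mse,
        "F0": msr / mse,
    }


def reject_h0(f0, f_critical):
    """Reject H0 (all slopes zero) when F0 exceeds the tabulated critical value."""
    return f0 > f_critical


# Hypothetical sums of squares, used only to illustrate the steps.
table = anova_table(ssr=120.0, sse=30.0, n=18, k=2)
f_critical = 3.68  # f(0.05; 2, 15) as read from a standard F table
print(table["F0"], reject_h0(table["F0"], f_critical))
```

Note that the degrees of freedom add up: k + (n - (k + 1)) = n - 1, matching the Total row.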
Inferences on the model parameters in multiple regression.
The hypotheses are
$H_0: \beta_j = \beta_{j,0}$
$H_1: \beta_j \neq \beta_{j,0}$
Test statistic:
$T_0 = \dfrac{\hat{\beta}_j - \beta_{j,0}}{se(\hat{\beta}_j)}$
Reject $H_0$ if $|T_0| > t_{\alpha/2,\, n-p}$
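A minimal sketch of the t-test on a single coefficient follows. It also includes the usual two-sided confidence interval, estimate ± t × se, which the learning outcomes mention but the slides do not spell out, so treat that formula as an added assumption. The function names and the numbers in the usage lines are hypothetical, and the critical value is assumed to come from a standard t table.

```python
def t_statistic(b_j, se_j, beta_j0=0.0):
    """T0 = (estimate - hypothesised value) / standard error of the estimate."""
    return (b_j - beta_j0) / se_j


def reject_h0(t0, t_critical):
    """Two-sided test: reject H0 when |T0| exceeds t(alpha/2, n - p)."""
    return abs(t0) > t_critical


def confidence_interval(b_j, se_j, t_critical):
    """Two-sided interval b_j +/- t(alpha/2, n - p) * se(b_j)."""
    half_width = t_critical * se_j
    return (b_j - half_width, b_j + half_width)


# Hypothetical estimate and standard error, used only to illustrate the steps.
t_crit = 2.131  # t(0.025; 15) as read from a standard t table
t0 = t_statistic(b_j=1.8, se_j=0.6)
print(t0, reject_h0(t0, t_crit), confidence_interval(1.8, 0.6, t_crit))
```

When the hypothesised value is 0, rejecting H0 is equivalent to the confidence interval not containing 0.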
Example 2:
A set of experimental runs was made to determine a way of predicting cooking time y at various levels of oven width x1 and temperature x2. The data were recorded as follows:

y         x1       x2
6.4       1.32     1.15
15.05     2.69     3.4
18.75     3.56     4.1
30.25     4.41     8.75
44.86     5.35     14.82
48.94     6.3      15.15
51.55     7.12     15.32
61.5      8.87     18.18
100.44    9.8      35.19
111.42    10.65    40.4

a) Carry out an analysis to determine the regression equation.
Solution:
         Y         X1       X2        X1^2        X2^2        X1*X2       X1*Y        X2*Y
         6.4       1.32     1.15      1.7424      1.3225      1.518       8.448       7.36
         15.05     2.69     3.4       7.2361      11.56       9.146       40.4845     51.17
         18.75     3.56     4.1       12.6736     16.81       14.596      66.75       76.875
         30.25     4.41     8.75      19.4481     76.5625     38.5875     133.4025    264.6875
         44.86     5.35     14.82     28.6225     219.6324    79.287      240.001     664.8252
         48.94     6.3      15.15     39.69       229.5225    95.445      308.322     741.441
         51.55     7.12     15.32     50.6944     234.7024    109.0784    367.036     789.746
         61.5      8.87     18.18     78.6769     330.5124    161.2566    545.505     1118.07
         100.44    9.8      35.19     96.04       1238.336    344.862     984.312     3534.484
         111.42    10.65    40.4      113.4225    1632.16     430.26      1186.623    4501.368
TOTAL    489.16    60.07    156.46    448.2465    3991.121    1284.037    3880.884    11750.03
Solution:
Using the computer for computations, the following results were
observed.
Regression Analysis: cooking time versus width, temperature
The regression equation is ?
Predictor Coef SE Coef T
Constant 0.568 0.585 0.970
Width 2.706 0.194 ? (2.706/0.194)
Temp 2.051 0.046 ? (2.051/0.046)
S=? R-Sq = ? R-Sq(adj) = 100%
Analysis of Variance
Source DF SS MS F P
Regression(SSR) ? 10953.334 5476.667 13647.872 0.000
Residual Error (SSE) 7 2.809 0.401
Total ? 10956.143
Solution:
Using the computer for computations, the following results were
observed.
Regression Analysis: cooking time versus width, temperature
The regression equation is
Cooking time = 0.568 + 2.706 width + 2.051 temperature
Predictor Coef SE Coef T
Constant 0.568 0.585 0.970
Width 2.706 0.194 13.935
Temp 2.051 0.046 44.380
S = 0.6334 R-Sq = 99.9% R-Sq(adj) = 100%
Analysis of Variance
Source DF SS MS F
Regression 2 10953.334 5476.667 13647.872
Residual Error 7 2.809 0.401
Total 9 10956.143
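The computer output above can be checked with a short script. This is only a sketch under stated assumptions: it uses the textbook formulas b = (X'X)^(-1) X'y and se(b_j) = sqrt(MSE × C_jj), where C = (X'X)^(-1), and the helper names `invert` and `regression_summary` are illustrative rather than taken from any statistics package. Small differences in the last printed digit relative to the package output can arise from rounding.

```python
import math

# Cooking-time data from Example 2: (y, x1 = oven width, x2 = temperature).
DATA = [
    (6.4, 1.32, 1.15), (15.05, 2.69, 3.4), (18.75, 3.56, 4.1),
    (30.25, 4.41, 8.75), (44.86, 5.35, 14.82), (48.94, 6.3, 15.15),
    (51.55, 7.12, 15.32), (61.5, 8.87, 18.18), (100.44, 9.8, 35.19),
    (111.42, 10.65, 40.4),
]


def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [1.0 if i == j else 0.0 for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        scale = m[col][col]
        m[col] = [v / scale for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return [row[n:] for row in m]


def regression_summary(data):
    """Coefficients, standard errors, t ratios and ANOVA quantities for y on x1, x2."""
    rows = [[1.0, x1, x2] for _, x1, x2 in data]  # design matrix with intercept column
    y = [obs[0] for obs in data]
    n, p = len(rows), len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    c = invert(xtx)                                # C = (X'X)^(-1)
    coef = [sum(c[i][j] * xty[j] for j in range(p)) for i in range(p)]
    fitted = [sum(b * v for b, v in zip(coef, r)) for r in rows]
    y_bar = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    ssr = sst - sse
    mse = sse / (n - p)
    se = [math.sqrt(mse * c[j][j]) for j in range(p)]
    return {
        "coef": coef, "se": se, "t": [b / s for b, s in zip(coef, se)],
        "SSR": ssr, "SSE": sse, "SST": sst,
        "F0": (ssr / (p - 1)) / mse, "S": math.sqrt(mse), "R2": ssr / sst,
    }


summary = regression_summary(DATA)
for name, b, s, t in zip(("Constant", "Width", "Temp"),
                         summary["coef"], summary["se"], summary["t"]):
    print(f"{name:<9}{b:8.3f}{s:8.3f}{t:9.3f}")
print(f"S = {summary['S']:.4f}  SSE = {summary['SSE']:.3f}  F = {summary['F0']:.1f}")
```

The T column is each coefficient divided by its standard error, which is the test statistic for H0: β_j = 0.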
Example 2 (continued):
b) Test whether the regression explained by the model obtained in
part (a) is significant at the 0.01 level of significance.
Solution:
We use ANOVA to test for significance of regression.
The hypotheses are
$H_0: \beta_1 = \beta_2 = 0$
$H_1: \beta_1 \text{ and } \beta_2 \text{ are not both zero}$
The test statistic is
$F_0 = \dfrac{MSR}{MSE}$
The following ANOVA table is obtained:
Analysis of Variance
Source DF SS MS F
Regression 2 10953.334 5476.667 13647.872
Residual Error 7 2.809 0.401
Total 9 10956.143
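Taking the significance level $\alpha = 1\% = 0.01$:
$f_{\alpha,\, k,\, n-(k+1)} = f_{0.01,\, 2,\, 7} = 9.55$
$F_0 = 13647.87 > 9.55$
Decision: Reject $H_0$. At the 0.01 level of significance, at least one of the two regressors contributes significantly, so the fitted linear model is significant.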