Multiple Linear Regression

This document introduces multiple linear regression models and discusses estimating regression coefficients, testing the significance of individual coefficients and the overall model fit, and calculating partial regression coefficients and multiple correlation coefficients from correlation coefficients. Examples are provided to demonstrate calculating regression coefficients using different methods and testing the overall significance of regression models.

Uploaded by

Ruchika Motwani
Copyright
© All Rights Reserved

Module 4

Introduction to Multiple Linear Regression
Contents
• Multiple Linear Regression Model
• Partial Regression Coefficients
• Testing overall significance (overall fit of the model)
• Testing for Individual Regression Coefficients
MLR Equation
• $X_1 = F(X_2, X_3, \ldots)$
• Regression equation of $X_1$ on $X_2$ and $X_3$:
• $X_{1.23} = a_{1.23} + b_{12.3} X_2 + b_{13.2} X_3$
• Partial regression coefficients: $b_{12.3}$, $b_{13.2}$
• $b_{12.3}$: the amount by which a unit change in $X_2$ is
expected to affect $X_1$ when $X_3$ is held constant
• $X_1$ varies partially because of variation in $X_2$
and partially because of variation in $X_3$
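Normal Equations
Simple regression, $y = a + bx$:
$\sum y = Na + b \sum x$
$\sum xy = a \sum x + b \sum x^2$

Two predictors, $y = a + b_1 x_1 + b_2 x_2$:
$\sum y = Na + b_1 \sum x_1 + b_2 \sum x_2$
$\sum x_1 y = a \sum x_1 + b_1 \sum x_1^2 + b_2 \sum x_1 x_2$
$\sum x_2 y = a \sum x_2 + b_1 \sum x_1 x_2 + b_2 \sum x_2^2$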
Example
Method 1- Normal Equations

          y    x1    x2   x1y   x2y  x1x2   x1²   x2²
          4    15    30    60   120   450   225   900
          6    12    24    72   144   288   144   576
          7     8    20    56   140   160    64   400
          9     6    14    54   126    84    36   196
         13     4    10    52   130    40    16   100
         15     3     4    45    60    12     9    16
Total    54    48   102   339   720  1034   494  2188

$54 = 6a + 48b_1 + 102b_2$
$339 = 48a + 494b_1 + 1034b_2$
$720 = 102a + 1034b_1 + 2188b_2$

Solution: $y = 16.47 + 0.38x_1 - 0.62x_2$
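The three normal equations above can also be solved numerically. The following is a minimal sketch in plain Python that rebuilds the sums from the raw data and applies Cramer's rule; the helper names (`dot`, `det3`, `solve_normal_equations`) are illustrative and do not come from the slides. The result agrees with the slide's coefficients up to rounding in the last digit.

```python
# Data from the worked example (y, x1, x2).
y = [4, 6, 7, 9, 13, 15]
x1 = [15, 12, 8, 6, 4, 3]
x2 = [30, 24, 20, 14, 10, 4]


def dot(u, v):
    """Sum of element-wise products, e.g. the sum of x1*y."""
    return sum(a * b for a, b in zip(u, v))


def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve_normal_equations(y, x1, x2):
    """Solve the three normal equations for (a, b1, b2) by Cramer's rule."""
    n = len(y)
    lhs = [
        [n, sum(x1), sum(x2)],
        [sum(x1), dot(x1, x1), dot(x1, x2)],
        [sum(x2), dot(x1, x2), dot(x2, x2)],
    ]
    rhs = [sum(y), dot(x1, y), dot(x2, y)]
    d = det3(lhs)
    if d == 0:
        raise ValueError("normal equations are singular (collinear predictors)")
    coeffs = []
    for col in range(3):
        m = [row[:] for row in lhs]      # copy, then swap one column for rhs
        for r in range(3):
            m[r][col] = rhs[r]
        coeffs.append(det3(m) / d)
    return tuple(coeffs)


a, b1, b2 = solve_normal_equations(y, x1, x2)
print(f"a = {a:.4f}, b1 = {b1:.4f}, b2 = {b2:.4f}")
```

Cramer's rule is used here only because the system is 3x3; for more predictors a linear-algebra routine would be the usual choice.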
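Example
          y    x1    x2   x1y   x2y  x1x2   x1²   x2²
          2     3     4     6     8    12     9    16
          4     5     6    20    24    30    25    36
          6     7     8    42    48    56    49    64
          8     9    10    72    80    90    81   100
Total    20    24    28   140   160   188   164   216

$20 = 4a + 24b_1 + 28b_2$
$140 = 24a + 164b_1 + 188b_2$
$160 = 28a + 188b_1 + 216b_2$

One solution: $y = 0 + 2x_1 - 1x_2$
Here $x_2 = x_1 + 1$ in every row, so the three equations are not independent and the solution is not unique; $y = x_1 - 1$ fits the data equally well.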
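In this second data set every $x_2$ value equals $x_1 + 1$, so the two predictors are perfectly collinear and the normal equations cannot pin down a unique solution. The sketch below checks this numerically; the function name `sse` is illustrative, not from the slides.

```python
# Data from the four-row example (y, x1, x2).
y = [2, 4, 6, 8]
x1 = [3, 5, 7, 9]
x2 = [4, 6, 8, 10]

# True when x2 is exactly x1 + 1 in every row (perfect collinearity).
collinear = all(b == a + 1 for a, b in zip(x1, x2))


def sse(a, b1, b2):
    """Sum of squared errors of y = a + b1*x1 + b2*x2 on this data."""
    return sum((yi - (a + b1 * u + b2 * v)) ** 2
               for yi, u, v in zip(y, x1, x2))


print(collinear)
print(sse(0, 2, -1))   # the slide's solution, y = 2*x1 - x2
print(sse(-1, 1, 0))   # a different equation, y = x1 - 1
```

Both coefficient sets reproduce the data exactly, which is why a second predictor that is a linear function of the first adds no information.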
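MLR
Method 2- Deviations taken from mean

$b_1 = \dfrac{(\sum x_2^2)(\sum x_1 y) - (\sum x_1 x_2)(\sum x_2 y)}{(\sum x_1^2)(\sum x_2^2) - (\sum x_1 x_2)^2}$

$b_2 = \dfrac{(\sum x_1^2)(\sum x_2 y) - (\sum x_1 x_2)(\sum x_1 y)}{(\sum x_1^2)(\sum x_2^2) - (\sum x_1 x_2)^2}$

$b_0 = \bar{y} - b_1 \bar{x}_1 - b_2 \bar{x}_2$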
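As a sketch of Method 2, the function below takes deviations from the means and applies the formulas for $b_1$, $b_2$ and $b_0$ directly. It is run on the six-row data set from the Method 1 example; the function and variable names are illustrative assumptions.

```python
def fit_two_predictors(y, x1, x2):
    """Return (b0, b1, b2) using sums of deviations from the means."""
    n = len(y)
    y_bar, x1_bar, x2_bar = sum(y) / n, sum(x1) / n, sum(x2) / n
    dy = [v - y_bar for v in y]
    d1 = [v - x1_bar for v in x1]
    d2 = [v - x2_bar for v in x2]

    s11 = sum(a * a for a in d1)              # sum of x1^2 (deviations)
    s22 = sum(a * a for a in d2)              # sum of x2^2
    s12 = sum(a * b for a, b in zip(d1, d2))  # sum of x1*x2
    s1y = sum(a * b for a, b in zip(d1, dy))  # sum of x1*y
    s2y = sum(a * b for a, b in zip(d2, dy))  # sum of x2*y

    denom = s11 * s22 - s12 ** 2
    if denom == 0:
        raise ValueError("predictors are perfectly collinear")
    b1 = (s22 * s1y - s12 * s2y) / denom
    b2 = (s11 * s2y - s12 * s1y) / denom
    b0 = y_bar - b1 * x1_bar - b2 * x2_bar
    return b0, b1, b2


y = [4, 6, 7, 9, 13, 15]
x1 = [15, 12, 8, 6, 4, 3]
x2 = [30, 24, 20, 14, 10, 4]
b0, b1, b2 = fit_two_predictors(y, x1, x2)
print(f"b0 = {b0:.4f}, b1 = {b1:.4f}, b2 = {b2:.4f}")
```

Because the deviations remove the intercept from the system, only a 2x2 problem remains, which is what makes the closed-form expressions practical by hand.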
MLR - two independent variables
𝑌 = 40.96 − 6.30𝑋1 + 24.77𝑋2
Method 3
(Yule’s Notation)
Finding regression coefficients from correlation
coefficients (deviations from mean)

$b_{12.3}$ = partial regression coefficient of $X_1$ on $X_2$, holding $X_3$ constant

$b_{13.2}$ = partial regression coefficient of $X_1$ on $X_3$, holding $X_2$ constant
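The formulas for this method did not survive on the slide. For reference, the standard textbook expressions in Yule's notation are sketched below, assuming $\sigma_1, \sigma_2, \sigma_3$ denote the standard deviations of $X_1, X_2, X_3$ and $x_1, x_2, x_3$ their deviations from the means; these symbols are assumptions, not taken from the slides.

```latex
x_1 = b_{12.3}\, x_2 + b_{13.2}\, x_3

b_{12.3} = \frac{\sigma_1}{\sigma_2} \cdot \frac{r_{12} - r_{13}\, r_{23}}{1 - r_{23}^2}
\qquad
b_{13.2} = \frac{\sigma_1}{\sigma_3} \cdot \frac{r_{13} - r_{12}\, r_{23}}{1 - r_{23}^2}
```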


F-test of overall significance in
regression analysis
The F-Test of overall significance in regression is
a test of whether or not your linear regression
model provides a better fit to a dataset than a
model with no predictor variables.
• The F-statistic is calculated as regression MS / residual MS.
• This statistic indicates whether the regression model provides a
better fit to the data than a model that contains no independent
variables.
• In essence, it tests whether the regression model as a whole is useful.
• If the p-value is less than the significance level, there is sufficient
evidence to conclude that the regression model fits the data better than
the model with no predictor variables.
• This finding is good because it means that the predictor variables
in the model actually improve the fit of the model.
• In general, if none of the predictor variables in the model is
statistically significant, the overall F statistic is also not statistically
significant.
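The calculation behind the F-statistic can be sketched in a few lines of plain Python. The example below reuses the six-row data set from the normal-equations example (an assumption made for illustration; the slides do not compute F for that data) and uses $k = 2$ predictors. All variable names are illustrative.

```python
y = [4, 6, 7, 9, 13, 15]
x1 = [15, 12, 8, 6, 4, 3]
x2 = [30, 24, 20, 14, 10, 4]
n, k = len(y), 2


def centered(v):
    """Return the deviations from the mean and the mean itself."""
    m = sum(v) / len(v)
    return [a - m for a in v], m


def sp(u, v):
    """Sum of products of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


dy, y_bar = centered(y)
d1, x1_bar = centered(x1)
d2, x2_bar = centered(x2)

# Least-squares fit using the deviation formulas.
denom = sp(d1, d1) * sp(d2, d2) - sp(d1, d2) ** 2
b1 = (sp(d2, d2) * sp(d1, dy) - sp(d1, d2) * sp(d2, dy)) / denom
b2 = (sp(d1, d1) * sp(d2, dy) - sp(d1, d2) * sp(d1, dy)) / denom
a = y_bar - b1 * x1_bar - b2 * x2_bar
fitted = [a + b1 * u + b2 * v for u, v in zip(x1, x2)]

# Sums of squares and the F ratio.
ss_total = sum((yi - y_bar) ** 2 for yi in y)
ss_reg = sum((fi - y_bar) ** 2 for fi in fitted)
ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
ms_reg = ss_reg / k               # regression mean square, k df
ms_res = ss_res / (n - k - 1)     # residual mean square, n - k - 1 df
f_stat = ms_reg / ms_res
r_squared = ss_reg / ss_total
print(f"F = {f_stat:.2f}, R^2 = {r_squared:.4f}")
```

The resulting F would then be compared with a tabulated critical value for $(k, N - k - 1) = (2, 3)$ degrees of freedom at the chosen significance level.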
Example
Estimating the output (Y) of a physiotherapist from his or her score on the
aptitude test (X1) and years of experience (X2) in a hospital
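Regression Picture
[Figure: scatter plot with the fitted line $\hat{y}_i = \beta x_i + \alpha$ and the naïve mean $\bar{y}$. For each observation, A is the distance from $y_i$ to $\bar{y}$, B is the distance from $\hat{y}_i$ to $\bar{y}$, and C is the distance from $y_i$ to $\hat{y}_i$.]

Least squares estimation gave us the line (β) that minimized $C^2$.

$\sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 + \sum_{i=1}^{n} (\hat{y}_i - y_i)^2$

$A^2 = B^2 + C^2$, that is, $SS_{total} = SS_{reg} + SS_{residual}$

• $SS_{total}$: total squared distance of observations from the naïve mean of y (total variation)
• $SS_{reg}$: distance from the regression line to the naïve mean of y (variability due to x, the regression)
• $SS_{residual}$: variance around the regression line (additional variability not explained by x, which the least squares method aims to minimize)

$R^2 = SS_{reg} / SS_{total}$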
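The residual degrees of freedom in a multiple regression equal N - k - 1, where N is the number of observations and k is the number of predictor variables.
Since the computed F (285.75) exceeds the critical value (4.74), we reject the null
hypothesis and conclude that the model with the predictor variables is
significant.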
t-test for individual regression coefficients
• t tests are used to conduct hypothesis tests
on the regression coefficients obtained in simple
linear regression. A statistic based on the t
distribution is used to test the two-sided
hypothesis that the true slope, $\beta_1$, equals some
constant value, $\beta_{1,0}$. The hypotheses
are:

• $H_0: \beta_1 = \beta_{1,0}$
• $H_1: \beta_1 \neq \beta_{1,0}$
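The test statistic for such a hypothesis is $t = (b - \beta_{0}) / se(b)$, with $N - k - 1$ degrees of freedom. The sketch below extends the idea to the two-predictor case and tests each slope against zero on the six-row data set from the normal-equations example. Both the choice of data and the hypothesised value 0 are assumptions for illustration, and the variable names are not from the slides.

```python
from math import sqrt

y = [4, 6, 7, 9, 13, 15]
x1 = [15, 12, 8, 6, 4, 3]
x2 = [30, 24, 20, 14, 10, 4]
n, k = len(y), 2

y_bar, x1_bar, x2_bar = sum(y) / n, sum(x1) / n, sum(x2) / n
dy = [v - y_bar for v in y]
d1 = [v - x1_bar for v in x1]
d2 = [v - x2_bar for v in x2]


def sp(u, v):
    """Sum of products of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


s11, s22, s12 = sp(d1, d1), sp(d2, d2), sp(d1, d2)
denom = s11 * s22 - s12 ** 2
b1 = (s22 * sp(d1, dy) - s12 * sp(d2, dy)) / denom
b2 = (s11 * sp(d2, dy) - s12 * sp(d1, dy)) / denom
a = y_bar - b1 * x1_bar - b2 * x2_bar

# Residual mean square with n - k - 1 degrees of freedom.
ss_res = sum((yi - (a + b1 * u + b2 * v)) ** 2
             for yi, u, v in zip(y, x1, x2))
ms_res = ss_res / (n - k - 1)

# Standard errors of the slopes from the inverse of the centered X'X matrix.
se_b1 = sqrt(ms_res * s22 / denom)
se_b2 = sqrt(ms_res * s11 / denom)

beta_0 = 0.0                      # hypothesised value under H0
t_b1 = (b1 - beta_0) / se_b1
t_b2 = (b2 - beta_0) / se_b2
print(f"t(b1) = {t_b1:.3f}, t(b2) = {t_b2:.3f}, df = {n - k - 1}")
```

Each |t| is compared with the two-sided critical value of the t distribution for the residual degrees of freedom.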
• Coefficient of multiple correlation
= sqrt(coefficient of determination)
Conditions: an intercept is included and the best
possible linear predictors are used.
• The coefficient of determination is the more general
case: it also covers non-linear predictions and
predicted values not derived from a model-fitting
approach.
$R_{1.23}$ = multiple correlation coefficient
of $X_1$ on $X_2$ and $X_3$
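• $R^2_{1.23} = \dfrac{r_{12}^2 + r_{13}^2 - 2 r_{12} r_{13} r_{23}}{1 - r_{23}^2}$

$R^2_{2.13} = \dfrac{r_{12}^2 + r_{23}^2 - 2 r_{12} r_{13} r_{23}}{1 - r_{13}^2}$

$R^2_{3.12} = \dfrac{r_{13}^2 + r_{23}^2 - 2 r_{12} r_{13} r_{23}}{1 - r_{12}^2}$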
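• In a trivariate distribution, $r_{12} = 0.7$, $r_{13} = 0.61$ and $r_{23} = 0.4$.
Find all multiple correlation coefficients.
$R^2_{1.23} = \dfrac{r_{12}^2 + r_{13}^2 - 2 r_{12} r_{13} r_{23}}{1 - r_{23}^2} = \dfrac{0.49 + 0.3721 - 0.3416}{1 - 0.16}$

$R^2_{1.23} = 0.6196$, so $R_{1.23} = 0.7872$
$R^2_{2.13} = 0.4912$, so $R_{2.13} = 0.7008$
$R^2_{3.12} = 0.3735$, so $R_{3.12} = 0.6112$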
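The three coefficients for $r_{12} = 0.7$, $r_{13} = 0.61$, $r_{23} = 0.4$ can be checked with a short sketch. The function name `multiple_r_squared` and its argument order are illustrative assumptions; it returns the squared coefficient, and the coefficient itself is the square root.

```python
from math import sqrt


def multiple_r_squared(r_ab, r_ac, r_bc):
    """Squared multiple correlation of variable a on b and c.

    r_ab, r_ac: correlations of a with each predictor.
    r_bc: correlation between the two predictors.
    """
    return (r_ab ** 2 + r_ac ** 2 - 2 * r_ab * r_ac * r_bc) / (1 - r_bc ** 2)


r12, r13, r23 = 0.7, 0.61, 0.4

r2_1_23 = multiple_r_squared(r12, r13, r23)   # X1 on X2 and X3
r2_2_13 = multiple_r_squared(r12, r23, r13)   # X2 on X1 and X3
r2_3_12 = multiple_r_squared(r13, r23, r12)   # X3 on X1 and X2

for label, value in [("1.23", r2_1_23), ("2.13", r2_2_13), ("3.12", r2_3_12)]:
    print(f"R^2_{label} = {value:.4f}, R_{label} = {sqrt(value):.4f}")
```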
