11-Simple Linear Regression

The document describes simple linear regression. Simple linear regression is used to predict a dependent variable (Y) based on an independent variable (X). It generates a regression equation of the form Y = βX + α, where β is the slope and α is the intercept. Residual plots are examined to check if the data meets the assumptions of linearity, normality and homoscedasticity. An example uses data to predict academic score based on number of free meals and checks if the assumptions are met before conducting the linear regression analysis. The results show number of free meals significantly predicts academic score and explains about 67% of variance in academic scores.

SIMPLE LINEAR REGRESSION
SRT605/SRT666
LINEAR REGRESSION
 Is based on correlation and used when we want to
predict the value of a variable (DV; outcome variable)
based on the value of another variable (IV;
explanatory variable)
 Example of RQ:
 Can exam performance be predicted based on
revision time?
 Can alcohol consumption be predicted based on
smoking duration?
SIMPLE LINEAR REGRESSION
 Regression equation:
Y = βX + α

 Regression coefficient (β)
- the slope of the line is the β coefficient
- measures the change in the average value of Y
(DV) for a unit change in X value (IV)
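
As a minimal sketch of how β and α can be estimated, the snippet below applies the ordinary least-squares formulas with NumPy. The function name `fit_simple_regression` and the revision-time data are made up for illustration and are not taken from the course dataset.

```python
import numpy as np

def fit_simple_regression(x, y):
    """Return (beta, alpha) for the least-squares line Y = beta * X + alpha."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_dev = x - x.mean()
    # Slope: co-variation of X and Y divided by the variation of X
    beta = np.sum(x_dev * (y - y.mean())) / np.sum(x_dev ** 2)
    # Intercept: forces the line through the point (mean of X, mean of Y)
    alpha = y.mean() - beta * x.mean()
    return beta, alpha

# Hypothetical data: revision time in hours (IV) and exam mark (DV)
revision_hours = [1, 2, 3, 4, 5]
exam_marks = [52, 55, 61, 64, 68]

beta, alpha = fit_simple_regression(revision_hours, exam_marks)
print(f"Y = {beta:.2f}X + {alpha:.2f}")
```

Here β is the expected change in exam mark for one extra hour of revision, which matches the "unit change in X" reading of the slope.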
RESIDUALS SCATTERPLOTS
 Generated as part of linear regression procedure
 Residuals are the differences between the obtained
and the predicted DV scores
Residual = observed value – predicted value

[Figure: regression line with the observed value, the predicted value, and the residual marked]
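
The definition "residual = observed value – predicted value" can be sketched in a few lines. The line Y = 4.1X + 47.7 and the data points are hypothetical values chosen only to illustrate the subtraction.

```python
import numpy as np

def compute_residuals(x, y, beta, alpha):
    """Residual = observed DV score minus the score predicted by the line."""
    predicted = beta * np.asarray(x, dtype=float) + alpha
    return np.asarray(y, dtype=float) - predicted

# Hypothetical IV and DV scores, with an assumed fitted line Y = 4.1X + 47.7
x = [1, 2, 3, 4, 5]
y = [52, 55, 61, 64, 68]
res = compute_residuals(x, y, beta=4.1, alpha=47.7)
print(np.round(res, 2))
```

A positive residual means the observed score lies above the line, a negative one below it. For a least-squares line with an intercept, the residuals sum to zero.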
 The residuals scatterplots allow you to check:-
 ➢ normality : the residuals should be normally
distributed about the predicted DV scores

[Figure: Normal P-P Plot of Regression Standardized Residual]
 ➢ linearity : The residuals should have a straight-
line relationship with predicted DV scores
 ➢ homoscedasticity : The variance of the residuals
around predicted DV scores should be the same
for all predicted scores
ASSUMPTIONS
 Assumption 1 : Two continuous variables
 Assumption 2 : The relationship between the two
variables should be linear (scatterplot)
 Assumption 3 : Independence of residuals (Durbin-Watson test)
 Assumption 4 : The residuals should be normally
distributed about the predicted DV scores
 Assumption 5 : The data needs to show
homoscedasticity (i.e. equal/similar variances)
 Assumption 6 : There should be no significant outliers

RQ: CAN WE PREDICT ACADEMIC SCORE BASED
ON THE NUMBER OF FREE MEALS RECEIVED?
 Dataset : https://bit.ly/2E1xFWL
 Assumption 1 : Two continuous variables
 Variable 1 (IV) : number of free meals received
 Variable 2 (DV) : academic score
Academic score = β(free meals) + α

Where,
Academic score is the dependent (response) variable
Free meals is the independent (explanatory) variable
α is a constant (the intercept)
β is the regression coefficient (the slope)
 Assumption 2 : The relationship between the two
variables should be linear (scatterplot)
 Generate unstandardized predicted values (PRE) and
unstandardized residuals (RES)
 To investigate assumptions 3 (independence), 4
(normality), 5 (homoscedasticity), & 6 (outliers)
 Analyze → Regression → Linear
 Select DV & IV, click Statistics button and tick Durbin-
Watson & Casewise Diagnostics in Residuals box
 Click Plots button
 ➢ move *ZRESID into the Y box
 ➢ move *ZPRED into the X box
 ➢ In the Standardized Residual Plots, tick the Normal
probability plot and histogram options
 Assumption 3 : Independence of residuals
 If there is no autocorrelation in the residuals, the
Durbin-Watson statistic should be between 1.5 and
2.5
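
For readers who want to see what the Durbin-Watson statistic measures, here is a minimal sketch of its standard formula: the sum of squared differences between successive residuals divided by the sum of squared residuals. The residual values are made up, and `looks_independent` is a hypothetical helper that simply applies the 1.5 to 2.5 rule of thumb stated above.

```python
import numpy as np

def durbin_watson(residuals):
    """DW = sum((e_t - e_{t-1})^2) / sum(e_t^2); values near 2 suggest no autocorrelation."""
    e = np.asarray(residuals, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

def looks_independent(dw, low=1.5, high=2.5):
    """Rule of thumb from the slide: DW between 1.5 and 2.5."""
    return low <= dw <= high

# Hypothetical residuals, listed in case order
example_residuals = [0.2, -0.9, 1.0, -0.1, -0.2]
dw = durbin_watson(example_residuals)
print(round(dw, 3), looks_independent(dw))
```

The statistic ranges from 0 to 4. Values well below 2 point to positive autocorrelation and values well above 2 to negative autocorrelation, so this made-up series would fail the check.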
 Assumption 4 : The residuals should be normally
distributed about the predicted DV scores
 Assumption 5 : The data needs to show
homoscedasticity (i.e. equal/similar variances)
 Assumption 6 : There should be no significant outliers
 Outliers : cases that have a standardised residual of
more than 3.3 or less than –3.3
 Casewise Diagnostics shows cases that have
standardised residual values above 3 or below –3
 In a normally distributed sample, we would expect
fewer than 1% of cases (about 0.3%) to fall outside ±3
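
The outlier screen can be sketched as follows. It assumes that a standardised residual is the raw residual divided by the standard error of the estimate, which is how SPSS is understood to compute *ZRESID; that assumption, the function names, and the residual values are illustrative only.

```python
import numpy as np

def standardized_residuals(residuals, n_predictors=1):
    """Divide each residual by the standard error of the estimate (assumed definition)."""
    e = np.asarray(residuals, dtype=float)
    dof = e.size - n_predictors - 1          # n - 2 for simple linear regression
    std_error = np.sqrt(np.sum(e ** 2) / dof)
    return e / std_error

def flag_outliers(residuals, cutoff=3.0):
    """Return the positions of cases whose |standardised residual| exceeds the cutoff."""
    z = standardized_residuals(residuals)
    return np.flatnonzero(np.abs(z) > cutoff)

# Hypothetical residuals: 30 cases, one of them far from the line
example_residuals = np.array([1.0, -1.0] * 15)
example_residuals[7] = 10.0

outlier_cases = flag_outliers(example_residuals)
print(outlier_cases)
```

Passing `cutoff=3.3` applies the stricter outlier definition instead of the Casewise Diagnostics threshold of 3.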
 As all assumptions are met, we can proceed to SLR
Coefficients(a)
Model 1           B (Unstandardized)   Std. Error   Beta (Standardized)   t         Sig.   95.0% CI for B: Lower Bound   Upper Bound
(Constant)        866.482              11.309                             76.621    .000   844.231                       888.733
free meal given   -3.762               .149         -.819                 -25.280   .000   -4.055                        -3.469
a. Dependent Variable: academic performance

 In the Coefficients table, check whether the variable shows a
Sig. value less than .05
 A Sig. value < .05 indicates that the variable is
making a statistically significant contribution to the
equation
 Academic score = -3.762(free meal) + 866.48
Model Summary
Model   R         R Square   Adjusted R Square   Std. Error of the Estimate   R Square Change   F Change   df1   df2   Sig. F Change
1       .819(a)   .671       .670                64.299                       .671              639.098    1     313   .000
a. Predictors: (Constant), free meal given

 The coefficient of determination (r²) tells you how
much of the variance in the DV (academic score) is
explained by the IV (free meal)
 Obtained by squaring the r value
 Can only take on values from 0 to 1
 When expressed as a percentage (multiply r² by
100), the model explains 67.1% of the variance in
academic score
 The closer the r² is to 1, the greater the ability of the
model to predict a trend
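
The figures in the Model Summary can be cross-checked with a little arithmetic. This sketch squares the reported R and applies the usual adjusted R² formula. The sample size n = 315 is not stated in the output; it is inferred here from df2 = 313, assuming the residual degrees of freedom are n − 2.

```python
r = 0.819            # R from the Model Summary
n = 315              # inferred: df2 (313) + 2, an assumption not stated in the output
n_predictors = 1     # simple linear regression has one IV

r_squared = r ** 2
percent_explained = r_squared * 100
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - n_predictors - 1)

print(f"R Square          = {r_squared:.3f}")
print(f"Variance explained = {percent_explained:.1f}%")
print(f"Adjusted R Square = {adjusted_r_squared:.3f}")
```

Both rounded values agree with the SPSS table (.671 and .670), which supports the inferred sample size.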

Academic score = -3.762(free meal) + 866.48
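
As a short illustration of how the fitted equation is used, the sketch below plugs values into it. The coefficients come from the Coefficients table; the function name and the input of 50 free meals are hypothetical, and the unit in which free meals are recorded in the dataset is not stated here.

```python
B_CONSTANT = 866.482   # (Constant) row, column B
B_FREE_MEAL = -3.762   # free meal given row, column B

def predict_academic_score(free_meals):
    """Predicted academic score for a given free meals value."""
    return B_FREE_MEAL * free_meals + B_CONSTANT

# Hypothetical input value, chosen only to show the calculation
score_at_50 = predict_academic_score(50)
print(round(score_at_50, 3))

# A one-unit increase in free meals changes the prediction by the slope
change_per_unit = predict_academic_score(51) - predict_academic_score(50)
print(round(change_per_unit, 3))
```

Predictions are most trustworthy for free meals values inside the range observed in the dataset.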
