Wachemo University
School of Public Health
Abriham S. Areba (Assistant Professor)
Simple Linear Regression and Correlation
October, 2024
Hossana, Ethiopia
Variable: a characteristic that takes on different values for different individuals or units.
Types of Variables
Quantitative (Numerical) variables
Example: Number of children in a family, Weight, height,
BP,...
Qualitative (Categorical) variables
Example: Marital status, religion, Education status, patient
satisfaction, …
Variables can again be classified into two broad categories:
Dependent Variable
o Also called the response/regressand/endogenous/outcome/effect/explained variable
o It is the focus of the research
o Affected by other (independent) variables
Independent Variables
o Also called explanatory/regressor/exogenous/predictor/covariate/causal variables
o Affects the outcome variable
Simple Linear Regression and Correlation
o Regression analysis is concerned with describing and evaluating
the relationship between a given variable (often called the
dependent variable) and one or more variables which are assumed
to influence the given variable (often called explanatory variables).
o Predict the value of a dependent variable based on the value of at
least one independent variable.
o Explain the impact of changes in an independent variable on the
dependent variable.
Linear Regression Model
o When we observe pairs (X, Y), we would like to write a statistical relation with uniformly small error.
o Since we do not know Y exactly for every X, we often approximate the relation between X and Y.
o The relationship between X and Y is described by a linear function.
o Changes in Y are assumed to be caused by changes in X.
Simple Linear Regression Model
o The goal is to determine how the average value of the continuous outcome y varies with the value of a single predictor x.
o The model is linear in the parameters: no parameter appears as an exponent or is multiplied or divided by another parameter.
Consider the following two models:
Model 1: $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
Model 2: $Y_i = \beta_0 + \beta_1 X_i + \beta_2 X_i^2 + \varepsilon_i$
Models 1 and 2 are both linear in the parameters, and can thus both be considered linear models.
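As an aside, "linear in the parameters" means both models can be fitted by ordinary least squares once the appropriate columns (1, x, and x squared) are placed in the design matrix. A minimal sketch in Python, using numpy only; the data values are made up purely for illustration:

import numpy as np

# hypothetical example data (for illustration only)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

# Model 1: columns for the intercept and x
X1 = np.column_stack([np.ones_like(x), x])
b1, *_ = np.linalg.lstsq(X1, y, rcond=None)    # beta0_hat, beta1_hat

# Model 2: adding an x^2 column; the model is still linear in the parameters
X2 = np.column_stack([np.ones_like(x), x, x ** 2])
b2, *_ = np.linalg.lstsq(X2, y, rcond=None)    # beta0_hat, beta1_hat, beta2_hat

print(b1)
print(b2)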
Error term (ε)
In this context, error does not mean mistake; it is a statistical term representing random fluctuations, measurement error, or the effect of factors outside of our control.
$\varepsilon \sim N(0, \sigma^2)$
The true model cannot be observed since β0 and β1 are not
known. We must estimate them from the data.
This gives the estimated or fitted regression line:
$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$
Where: $\hat{\beta}_0$ is the estimate of $\beta_0$,
$\hat{\beta}_1$ is the estimate of $\beta_1$, and
$\hat{y}_i$ is the estimated (fitted) value of $y_i$.
Assumptions of Simple Linear Regression Model
1. Linearity
2. Normality
3. Homogeneity of variance
4. Independence of error
Linearity: the relationship between the predictor and the outcome variable should be linear
Normality: the errors should be normally distributed
Homogeneity of Variance: the error variance should be constant
Independence: the errors associated with one observation are
not correlated with the errors of any other observation.
Assumption 3: the variance of y is the same for any x; that is, the spread of the values of y at each level of x remains approximately constant.
o The magnitude of the residuals is the vertical distance
between the actual observed points and the estimating line.
o The estimating line will have a ‘good fit’ if it minimizes the
error between the estimated points on the line and the actual
observed points that were used to draw it.
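One way the assumptions above might be checked in practice is by examining these residuals. The sketch below is illustrative only (Python with numpy and scipy; the data are hypothetical, not from the lecture):

import numpy as np
from scipy import stats

# hypothetical data (not from the lecture)
x = np.array([62.0, 63.0, 64.0, 65.0, 67.0, 68.0])
y = np.array([66.0, 66.0, 65.0, 68.0, 68.0, 69.0])

# fit the simple linear regression line
slope, intercept, r, p, se = stats.linregress(x, y)
fitted = intercept + slope * x
residuals = y - fitted               # vertical distances from the fitted line

# Normality of the errors: Shapiro-Wilk test on the residuals
w_stat, w_p = stats.shapiro(residuals)
print("Shapiro-Wilk p-value:", w_p)

# Linearity and constant variance: plot the residuals against the fitted
# values and look for the absence of any pattern (plotting code not shown)
print("Residuals:", residuals)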
The parameters $\hat{\beta}_0$ and $\hat{\beta}_1$ are calculated as:
$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$
$\hat{\beta}_1 = \dfrac{n\sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}$
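A small Python helper implementing exactly these two formulas (numpy only; x and y are assumed to be numeric arrays of equal length):

import numpy as np

def ols_estimates(x, y):
    """Least-squares estimates for the simple linear regression y = b0 + b1*x."""
    n = len(x)
    # slope: (n*sum(x*y) - sum(x)*sum(y)) / (n*sum(x^2) - (sum(x))^2)
    beta1_hat = (n * np.sum(x * y) - np.sum(x) * np.sum(y)) / (
        n * np.sum(x ** 2) - np.sum(x) ** 2)
    # intercept: y_bar - beta1_hat * x_bar
    beta0_hat = np.mean(y) - beta1_hat * np.mean(x)
    return beta0_hat, beta1_hat

# example: b0, b1 = ols_estimates(np.array([1., 2., 3.]), np.array([2., 4., 5.]))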
o The regression model assumes that, for each value of the independent variable x, the corresponding dependent variable y is normally distributed with:
❖ Mean, $\beta_0 + \beta_1 x$
and
❖ Variance, $\sigma^2$
o If $\sigma^2$ were 0, then every point would fall exactly on the regression line, whereas
o the larger $\sigma^2$ is, the more scatter occurs about the regression line.
$\beta_0$ and $\beta_1$ are not known, so we must estimate them.
This gives the estimated or fitted regression line:
$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$
$\hat{\beta}_0$ is the estimated mean response when X = 0.
$\hat{\beta}_1$ is the estimated change in the mean response for a unit increase in X.
$\hat{\beta}_1 > 0$ indicates a direct (positive) linear relationship between x and y.
$\hat{\beta}_1 < 0$ indicates an inverse (negative) linear relationship between x and y.
$\hat{\beta}_1 = 0$ indicates no linear relationship between x and y.
Tests of Significance of Regression Coefficients
The null hypothesis, that there is no linear relationship between X and Y, is expressed as:
Ho : β1 = 0
The alternative hypothesis is that there is a significant relationship
between X and Y, that is,
HA : β1 ≠ 0
In order to reject or fail to reject H0, we calculate the test statistic:
$t = \dfrac{\hat{\beta}_1 - 0}{Se(\hat{\beta}_1)} = \dfrac{\hat{\beta}_1}{Se(\hat{\beta}_1)}$
and compare it with the Student's t distribution with (n − 2) df for a given significance level α.
Decision rule:
If $|t| > t_{\alpha/2}(n-2)$,
then we reject the null hypothesis and conclude that there is a significant relationship between X and Y.
A $(1-\alpha)100\%$ CI for $\beta_1$ is given by:
$\hat{\beta}_1 \pm t_{\alpha/2}(n-2)\,Se(\hat{\beta}_1)$
(and analogously for $\beta_0$, using $Se(\hat{\beta}_0)$).
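A sketch of this test and confidence interval in Python (scipy's linregress reports the estimated slope and its standard error; the data below are hypothetical):

import numpy as np
from scipy import stats

# hypothetical data
x = np.array([62.0, 63.0, 64.0, 65.0, 67.0, 68.0])
y = np.array([66.0, 66.0, 65.0, 68.0, 68.0, 69.0])
n = len(x)
alpha = 0.05

res = stats.linregress(x, y)                      # fits y = b0 + b1*x
t_stat = res.slope / res.stderr                   # t = beta1_hat / Se(beta1_hat)
p_value = 2 * stats.t.sf(abs(t_stat), df=n - 2)   # two-sided p-value

# (1 - alpha)100% confidence interval for the slope
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)
ci_low = res.slope - t_crit * res.stderr
ci_high = res.slope + t_crit * res.stderr

print(t_stat, p_value, (ci_low, ci_high))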
Correlation
Correlation analysis: deals with measuring the closeness of the relationship described by the regression equation.
❖ Correlation: measures the relative strength of the linear
relationship between two variables
❖ Unit-less
❖ Ranges between –1 and 1
❖ r = -1 implies perfect negative linear correlation between the
variables under consideration
❖ r = +1 implies perfect positive linear correlation between the
variables under consideration
❖ The closer to –1, the stronger the negative linear relationship
❖ The closer to 1, the stronger the positive linear relationship
❖ The closer to 0, the weaker any linear relationship
The correlation coefficient, r, between x and y is given by:
$r = \dfrac{Cov(x, y)}{\sqrt{Var(x)\,Var(y)}}$
$r = \dfrac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2 \sum_{i=1}^{n}(y_i - \bar{y})^2}}$
$r = \dfrac{n\sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{\sqrt{\left[n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2\right]\left[n\sum_{i=1}^{n} y_i^2 - \left(\sum_{i=1}^{n} y_i\right)^2\right]}}$
$r = \dfrac{SS_{xy}}{\sqrt{SS_{xx}\,SS_{yy}}}$
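The computational formula above can be coded directly; the following Python sketch (numpy only) should agree with np.corrcoef(x, y)[0, 1] whenever x and y have non-zero variance:

import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient via the computational formula."""
    n = len(x)
    ss_xy = n * np.sum(x * y) - np.sum(x) * np.sum(y)
    ss_xx = n * np.sum(x ** 2) - np.sum(x) ** 2
    ss_yy = n * np.sum(y ** 2) - np.sum(y) ** 2
    return ss_xy / np.sqrt(ss_xx * ss_yy)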
[Figure: scatter plots of data with various correlation coefficients]
Hypothesis Testing for Correlation
H0: ρ = 0 (no correlation between two variables)
HA: ρ ≠ 0 (correlation exists between two variables)
Test statistic: $t = r\sqrt{\dfrac{n-2}{1-r^2}}$
has a t distribution with (n − 2) degrees of freedom.
Conclusion: if p < 0.05, reject H0 and conclude that there is evidence of a correlation between the two variables.
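A sketch of this test in Python; scipy.stats.pearsonr already returns the two-sided p-value, and the manual t statistic is computed alongside for comparison (the data are hypothetical):

import numpy as np
from scipy import stats

# hypothetical data
x = np.array([62.0, 63.0, 64.0, 65.0, 67.0, 68.0])
y = np.array([66.0, 66.0, 65.0, 68.0, 68.0, 69.0])
n = len(x)

r, p = stats.pearsonr(x, y)                     # r and its two-sided p-value

# manual version: t = r * sqrt((n - 2) / (1 - r^2)) with n - 2 df
t_stat = r * np.sqrt((n - 2) / (1 - r ** 2))
p_manual = 2 * stats.t.sf(abs(t_stat), df=n - 2)

print(r, p, p_manual)    # p and p_manual agree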
Coefficient of Determination (R2)
o The coefficient of determination is the portion of the total
variation in the dependent variable that is explained by
variation in the independent variable
o It is an indicator of how well the model fits the data.
o Adding more predictors to a model never decreases R2, so a higher R2 alone does not mean a better model.
$R^2 = \dfrac{SSR}{SST} = 1 - \dfrac{SSE}{SST}, \qquad 0 \le R^2 \le 1$
The proportion of the total variation in the dependent variable (y) that is explained by changes in the independent variable (x), i.e. by the regression line, is equal to $R^2 \times 100\%$.
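A minimal Python sketch of R² from the sums of squares (numpy only; y holds the observed values and y_hat the fitted values from the regression line):

import numpy as np

def r_squared(y, y_hat):
    """Coefficient of determination: R^2 = 1 - SSE/SST."""
    sse = np.sum((y - y_hat) ** 2)          # error (residual) sum of squares
    sst = np.sum((y - np.mean(y)) ** 2)     # total sum of squares
    return 1 - sse / sst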
Example: A researcher wants to find out if there is any relationship between the heights of sons and their fathers. He took a random sample of 6 fathers and their sons and recorded the heights in inches; the summary statistics are:
$\sum X_i = 392, \quad \sum X_i^2 = 25628, \quad \sum X_i Y_i = 26476, \quad \sum Y_i = 405, \quad \sum Y_i^2 = 27355$
A. Estimate the parameters $\beta_0$ and $\beta_1$
B. Fit a simple linear regression line and interpret the estimates
C. What would be the height of the son if his father's height is 70 inches?
D. Calculate the coefficient of correlation and interpret it
E. Calculate the coefficient of determination
A. $\hat{\beta}_1 = \dfrac{n\sum_{i=1}^{n} X_i Y_i - \sum_{i=1}^{n} X_i \sum_{i=1}^{n} Y_i}{n\sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2} = \dfrac{6 \times 26476 - 392 \times 405}{6 \times 25628 - 392^2} = 0.92$
$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} = \dfrac{405}{6} - 0.92 \times \dfrac{392}{6} = 7.2$
B. The fitted regression line of Y on X is:
$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x = 7.2 + 0.92x$
$\hat{\beta}_0 = 7.2$ is the estimated value of Y (son's height) when X = 0.
$\hat{\beta}_1 = 0.92$ indicates that for every one-inch increase in the father's height, the mean height of the son increases by 0.92 inches.
$\hat{\beta}_1 > 0$: there is a direct (positive) relationship between the father's height and the son's height.
C. $\hat{y} = 7.2 + 0.92x$; using the unrounded estimates, $\hat{y} = 7.19 + 0.923(70) \approx 71.8$, so the predicted height of the son is about 71.8 inches.
D. $r = \dfrac{n\sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{\sqrt{\left[n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2\right]\left[n\sum_{i=1}^{n} y_i^2 - \left(\sum_{i=1}^{n} y_i\right)^2\right]}}$
$r = \dfrac{6 \times 26476 - 392 \times 405}{\sqrt{(6 \times 25628 - 392^2)(6 \times 27355 - 405^2)}} = 0.92$
There is a strong positive correlation between the height of the father and the height of the son.
E.
$r^2 = 0.92^2 \approx 0.846 = 84.6\%$
About 84.6% of the variation in the height of the son is explained by variation in the height of the father.
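As a check, the whole worked example can be reproduced from the given summary statistics alone; a short Python verification (numpy only):

import numpy as np

n = 6
sum_x, sum_x2 = 392, 25628
sum_y, sum_y2 = 405, 27355
sum_xy = 26476

beta1_hat = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)   # about 0.92
beta0_hat = sum_y / n - beta1_hat * (sum_x / n)                        # about 7.2

y_at_70 = beta0_hat + beta1_hat * 70                                   # about 71.8

r = (n * sum_xy - sum_x * sum_y) / np.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))             # about 0.92
r2 = r ** 2                                                            # about 0.84

print(beta1_hat, beta0_hat, y_at_70, r, r2)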