05 Correlation Regression1
Correlation and
Regression I
Day 5
Trends · Correlation · Causation · Spurious correlation · Pearson ·
Spearman · Linear regression · General model · 1 Dependent variable
· Strength · Model · Least squares method · Explained / remaining
variation · (Adjusted) R square · Effect size · Normality of residuals ·
Linearity · Homoscedasticity · GLM · Coefficients · ANCOVA
Introduction · Correlation · Regression · Calculations · Assumptions · Comparisons
2. Different distributions
Different response curves
3. Multiple regression
More than 1 independent variable
4. Zero-inflated models
Lots of zeros
1. Correlation
2. Linear regression
Basics of regression
Linear model
Calculations
Assumptions
Regression compared to …
EM Regression 1 2022
Correlation
Null hypothesis: there is no relationship
Not necessarily causal relation!
[Figure: scatter plot of variable 2 against variable 1]
Causation
Once X has happened, Y will follow
[Diagram: X and Y, and X and Y with a third variable Z]
Spurious correlations
Examples
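One way a spurious correlation can arise is a third variable Z that drives both X and Y. The sketch below uses simulated data only (the variable names, sample size, and noise levels are assumptions chosen for illustration, not course data). X and Y correlate strongly, yet the correlation largely disappears once the part explained by Z is removed.

```r
# Hypothetical simulated data: Z drives both X and Y; X has no effect on Y.
set.seed(1)
n <- 200
z <- rnorm(n)
x <- z + rnorm(n, sd = 0.5)
y <- z + rnorm(n, sd = 0.5)

# X and Y are correlated although neither causes the other
r_xy <- cor(x, y)

# Remove the variation explained by Z, then correlate what is left
x_res <- residuals(lm(x ~ z))
y_res <- residuals(lm(y ~ z))
r_given_z <- cor(x_res, y_res)

cat("r(X, Y)            =", round(r_xy, 2), "\n")
cat("r(X, Y) given Z    =", round(r_given_z, 2), "\n")
```

In real observational data Z is often unmeasured, so this check is only possible when the confounder is known and recorded.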
Correlation or not?
Significant correlation?
[Figure: two scatter plots of Weight against Length]
r = 0.978, P < 0.001, N = 25
r = 0.494, P = 0.12, N = 25
Strong/weak correlation
[Figure: scatter plots of Weight against Length for r = 0, r = 0.7 and r = -0.7]
0 < r < 0.3 (positive or negative): weak correlation
0.3 < r < 0.6: moderate correlation
Pearson correlation coefficient, r_p (parametric test)
$$r_p = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}}$$
Worked example with x = 12, 18, 13 and y = 1, 5, 6 (means 14.3 and 4):
$$r_p = \frac{(12 - 14.3)(1 - 4) + (18 - 14.3)(5 - 4) + (13 - 14.3)(6 - 4)}{\sqrt{\left[(12 - 14.3)^2 + (18 - 14.3)^2 + (13 - 14.3)^2\right] \times \left[(1 - 4)^2 + (5 - 4)^2 + (6 - 4)^2\right]}} = 0.47$$
df = total number of pairs of observations - 2
Spearman rank correlation coefficient, r_s (non-parametric test)
$$r_s = 1 - \frac{6 \sum d^2}{n^3 - n}$$
Can always be used: not-normally distributed data
For relationship between ranks
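The two coefficients can be recomputed by hand in R as a check on the formulas. This is a minimal sketch that uses the three (x, y) pairs from the worked example; the variable names are chosen here for illustration, and d is taken as the difference between the rank of x and the rank of y within each pair.

```r
# Data from the worked example: three pairs of observations
x <- c(12, 18, 13)
y <- c(1, 5, 6)

# Pearson: sum of cross-products divided by the root of the
# product of the two sums of squares
dx <- x - mean(x)
dy <- y - mean(y)
r_p <- sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))

# Spearman: based on the difference d between the ranks of each pair
# (this shortcut formula assumes there are no tied ranks)
d <- rank(x) - rank(y)
n <- length(x)
r_s <- 1 - 6 * sum(d^2) / (n^3 - n)

cat("Pearson  r_p =", round(r_p, 2), "\n")   # 0.47
cat("Spearman r_s =", round(r_s, 2), "\n")   # 0.5

# Built-in equivalents for comparison
cor(x, y, method = "pearson")
cor(x, y, method = "spearman")
```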
Pearson correlation coefficient, r_p (parametric test)
Standard error of r:
$$s_r = \sqrt{\frac{1 - r^2}{n - 2}}$$
Assumption: normal distribution of both variables (bivariate normal distribution)
[Figure: two scatter plots of Weight against Height]
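The standard error of r leads to a t-test of the null hypothesis of no relationship: t = r / s_r with n - 2 degrees of freedom. The sketch below assumes the three-pair data set from the worked Pearson example (far too small for a real analysis, used here only because its values are known) and compares the hand calculation with `cor.test()`.

```r
# Three pairs from the worked Pearson example (illustration only)
x <- c(12, 18, 13)
y <- c(1, 5, 6)
n <- length(x)

r <- cor(x, y)

# Standard error of r and the resulting t statistic
s_r <- sqrt((1 - r^2) / (n - 2))
t_stat <- r / s_r

# Two-sided P value with n - 2 degrees of freedom
p_value <- 2 * pt(-abs(t_stat), df = n - 2)

# cor.test() reports the same t and P
ct <- cor.test(x, y, method = "pearson")

cat("by hand : t =", round(t_stat, 3), " P =", round(p_value, 3), "\n")
cat("cor.test: t =", round(unname(ct$statistic), 3),
    " P =", round(ct$p.value, 3), "\n")
```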
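Spearman rank correlation coefficient (non-parametric test)
$$r_s = 1 - \frac{6 \sum d^2}{n^3 - n}$$
d = difference between the two ranks of a pair, n = number of pairs
[Table fragment: x = 13, y = 6, rank of x = 2, rank of y = 3, d = -1]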
Example R
Packages: readxl, tidyverse
Make scatter plot
ggplot(data = flycatcher, mapping = aes(x = length, y = weight)) +
  geom_point()
Compute correlation coefficient
cor.test(flycatcher$weight, flycatcher$length, method = "pearson")
cor.test(flycatcher$weight, flycatcher$length, method = "spearman")
Today’s topics
1. Correlation
2. Linear regression
Basics of regression
Linear model
Calculations
Assumptions
Regression compared to …
Relationship between 2 or more variables
Results often come from observational studies
[Figure: four panels of species abundance against pH]
Model (expected) and data (measured)
[Figure: panels of species abundance against pH]
2. Linear regression
Basics of regression
Linear model
Calculations
Assumptions
Remaining variation normally distributed
[Figure: line of Y against X with intercept b0 at X = 0 and slope b1 per unit of X]
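Residuals
Least squares method: minimise the sum of squared residuals
[Figure: abundance against pH, with residuals as vertical distances between the points and the line]
Minimise unexplained variation (residuals)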
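The least squares idea can be checked numerically: the slope and intercept that minimise the sum of squared residuals have a closed form, and any other line gives a larger sum. The sketch below uses simulated pH and abundance values (hypothetical data and parameter values, not the course data set).

```r
# Hypothetical simulated data: abundance increases linearly with pH
set.seed(42)
ph <- seq(4, 8, length.out = 20)
abundance <- 2 + 1.5 * ph + rnorm(20, sd = 1)

# Closed-form least squares estimates
b1 <- sum((ph - mean(ph)) * (abundance - mean(abundance))) /
  sum((ph - mean(ph))^2)
b0 <- mean(abundance) - b1 * mean(ph)

# Sum of squared residuals for any candidate line
ss_res <- function(intercept, slope) {
  sum((abundance - (intercept + slope * ph))^2)
}

# lm() finds the same line
fit <- lm(abundance ~ ph)
print(coef(fit))
cat("by hand: b0 =", round(b0, 3), " b1 =", round(b1, 3), "\n")

# Shifting the slope or intercept away from the estimate increases the sum
cat("SS at estimate      :", round(ss_res(b0, b1), 3), "\n")
cat("SS with slope + 0.1 :", round(ss_res(b0, b1 + 0.1), 3), "\n")
```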
1. Correlation
2. Linear regression
Basics of regression
Linear model
Calculations
Assumptions
Regression compared to …
ANOVA calculation
abundance
F-ratio and ANOVA output for regression
y SS d.f. M.S. F p
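The columns of the ANOVA output can be rebuilt step by step: the total variation is split into a part explained by the regression and a remaining (residual) part, each sum of squares is divided by its degrees of freedom, and the F-ratio is the ratio of the two mean squares. This is a sketch on simulated pH and abundance values (hypothetical data), with the row labels chosen here for illustration.

```r
# Hypothetical simulated data
set.seed(7)
ph <- seq(4, 8, length.out = 20)
abundance <- 2 + 1.5 * ph + rnorm(20, sd = 1)
fit <- lm(abundance ~ ph)
n <- length(abundance)

# Partition the variation
ss_total <- sum((abundance - mean(abundance))^2)
ss_resid <- sum(residuals(fit)^2)      # remaining variation
ss_regr  <- ss_total - ss_resid        # explained variation

# Degrees of freedom, mean squares, F-ratio and P value
df_regr  <- 1
df_resid <- n - 2
ms_regr  <- ss_regr / df_regr
ms_resid <- ss_resid / df_resid
f_ratio  <- ms_regr / ms_resid
p_value  <- pf(f_ratio, df_regr, df_resid, lower.tail = FALSE)

anova_table <- data.frame(
  source = c("Regression", "Residual"),
  SS     = c(ss_regr, ss_resid),
  df     = c(df_regr, df_resid),
  MS     = c(ms_regr, ms_resid),
  F      = c(f_ratio, NA),
  p      = c(p_value, NA)
)
print(anova_table)

# Built-in table for comparison
print(anova(fit))
```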
Null hypothesis
No effect of independent variable on dependent variable
= Regression coefficient b1 is zero
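The null hypothesis b1 = 0 can also be tested directly on the coefficient: t = b1 / SE(b1) with n - 2 degrees of freedom. With a single independent variable this t-test and the F-test of the regression are equivalent (F = t squared). A minimal sketch on simulated data (hypothetical values and names):

```r
# Hypothetical simulated data
set.seed(7)
ph <- seq(4, 8, length.out = 20)
abundance <- 2 + 1.5 * ph + rnorm(20, sd = 1)
fit <- lm(abundance ~ ph)
n <- length(abundance)

# Slope, its standard error, and the t statistic for H0: b1 = 0
b1 <- unname(coef(fit)["ph"])
ms_resid <- sum(residuals(fit)^2) / (n - 2)
se_b1 <- sqrt(ms_resid / sum((ph - mean(ph))^2))
t_b1 <- b1 / se_b1
p_b1 <- 2 * pt(-abs(t_b1), df = n - 2)

# Same values as reported in the coefficient table of summary()
coefs <- summary(fit)$coefficients
print(coefs)

# In simple linear regression the F-ratio equals t squared
f_ratio <- unname(summary(fit)$fstatistic["value"])
cat("t^2 =", round(t_b1^2, 3), " F =", round(f_ratio, 3), "\n")
```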
Clones of Daphnia
Linear regression
Fraction of explained variance
$$y = b_0 + b_1 x$$
Size = 0.865 + 0.247 × Pred_density
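Two things follow from a fitted line: predictions for new x values, and the fraction of variation it explains. The sketch below first applies the fitted equation from the slide, then computes R square and adjusted R square by hand on simulated data. The simulated `pred_dens` and `size` values and the helper `predict_size()` are hypothetical, introduced only for this illustration.

```r
# Prediction from the fitted equation: Size = 0.865 + 0.247 * Pred_density
predict_size <- function(pred_density) 0.865 + 0.247 * pred_density
predict_size(2)   # 1.359

# Hypothetical simulated data to illustrate the fraction of explained variance
set.seed(11)
pred_dens <- runif(30, min = 0, max = 5)
size <- 0.9 + 0.25 * pred_dens + rnorm(30, sd = 0.2)
fit <- lm(size ~ pred_dens)
n <- length(size)

# R square = explained variation / total variation
ss_total <- sum((size - mean(size))^2)
ss_resid <- sum(residuals(fit)^2)
r2 <- 1 - ss_resid / ss_total

# Adjusted R square corrects for the number of estimated parameters
# (one independent variable here, so n - 2 residual degrees of freedom)
r2_adj <- 1 - (1 - r2) * (n - 1) / (n - 2)

cat("R square          =", round(r2, 3), "\n")
cat("Adjusted R square =", round(r2_adj, 3), "\n")
```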
Example R
Packages: readxl, tidyverse, ggh4x
Compute regression
regr.model <- lm(size ~ pred_dens, data = daphnia)
summary(regr.model)
Retrieve residuals
daphnia$residuals <- residuals(regr.model)
daphnia$predicted <- predict(regr.model)
Test for normality
shapiro.test(regr.model$residuals)
What variation can be explained?
Strength of relation
R²adj = 0.90: less variation, P << 0.001
R²adj = 0.08: lot of variation, P > 0.05
[Figure: two panels with Size on the y-axis]
Lot of variation: R²adj = 0.08, small sample size: P > 0.05
Lot of variation: R²adj = 0.08, large sample size: P < 0.01
[Figure: panels of Size against Pred dens]
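The same weak relationship can be non-significant in a small sample and significant in a large one, because the F-ratio grows with the residual degrees of freedom. The sketch below makes this explicit for simple linear regression, where F = R² / (1 - R²) × (n - 2). As simplifying assumptions it uses the plain R square rather than the adjusted one, and the sample sizes 10 and 200 are hypothetical.

```r
# P value of a simple linear regression, given R square and sample size n
p_from_r2 <- function(r2, n) {
  f_ratio <- (r2 / (1 - r2)) * (n - 2)
  pf(f_ratio, df1 = 1, df2 = n - 2, lower.tail = FALSE)
}

# Same fraction of explained variation, different sample sizes
p_small <- p_from_r2(0.08, n = 10)
p_large <- p_from_r2(0.08, n = 200)

cat("n = 10 : P =", signif(p_small, 2), "\n")   # above 0.05
cat("n = 200: P =", signif(p_large, 2), "\n")   # below 0.01
```

This is why the P value alone says little about the strength of a relation: it should be read together with R square and the effect size.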
Effect size?
[Figure: four panels with Size on the y-axis]
Today’s topics
1. Correlation
2. Linear regression
Basics of regression
Linear model
Calculations
Assumptions
Regression compared to …
Model selection
Multiple regression
Zero-inflated models
Assumptions regression
Today
Normal distribution (residuals)
Independence of errors
Independence of errors
Test by eye
Transformation?
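A possible workflow for checking the assumptions is sketched below: fit the model, test the residuals for normality, inspect residuals against fitted values by eye, and compare with a model on a transformed response. The data are simulated with multiplicative errors so that a log transformation is appropriate; the variable names, parameter values, and the choice of a log transformation are assumptions for illustration only.

```r
# Hypothetical simulated data with multiplicative (right-skewed) errors
set.seed(3)
n <- 100
pred_dens <- runif(n, min = 0, max = 5)
size <- exp(0.2 + 0.3 * pred_dens + rnorm(n, sd = 0.4))

# Model on the raw scale and on the log scale
fit_raw <- lm(size ~ pred_dens)
fit_log <- lm(log(size) ~ pred_dens)

# Normality of residuals (a small P value suggests deviation from normality)
sw_raw <- shapiro.test(residuals(fit_raw))
sw_log <- shapiro.test(residuals(fit_log))
cat("Shapiro-Wilk P, raw scale:", signif(sw_raw$p.value, 2), "\n")
cat("Shapiro-Wilk P, log scale:", signif(sw_log$p.value, 2), "\n")

# Test by eye: residuals should scatter evenly around zero, without a
# curve (linearity) or a funnel shape (homoscedasticity)
if (interactive()) {
  op <- par(mfrow = c(1, 2))
  plot(fitted(fit_raw), residuals(fit_raw),
       xlab = "Fitted", ylab = "Residuals", main = "Raw scale")
  abline(h = 0, lty = 2)
  plot(fitted(fit_log), residuals(fit_log),
       xlab = "Fitted", ylab = "Residuals", main = "Log scale")
  abline(h = 0, lty = 2)
  par(op)
}
```

Independence of errors cannot be read from these plots alone; it mainly depends on how the data were collected.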
Assumptions
Regression compared to …
[Figure: Humid and Arid groups along a Rainfall axis]
Tomorrow
General linear model: errors follow a normal distribution