Introduction to Probability
Lecture 7: Independence, Covariance and Correlation
Mateja Jamnik, Thomas Sauerwald
University of Cambridge, Department of Computer Science and Technology
email: {mateja.jamnik,thomas.sauerwald}@cl.cam.ac.uk
Independence of Random Variables
This definition covers the discrete and continuous case!
Definition of Independence
Two random variables X and Y are independent if for all values a, b:
P [ X ≤ a, Y ≤ b ] = P [ X ≤ a ] · P [ Y ≤ b ] .
For two discrete random variables, an equivalent definition is:
P [ X = a, Y = b ] = P [ X = a ] · P [ Y = b ] .
This is useless for continuous random variables, since there P [ X = a ] = 0 for every a.
Remark
Using the joint distribution function, the above is equivalent to: for all a, b,
F(a, b) = FX(a) · FY(b).
All these definitions extend in the natural way to more than two variables!
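As a brief aside (not part of the original slides), here is a minimal Python sketch of the discrete product rule: for two independent fair dice we estimate the joint pmf by simulation and compare it to the product of the empirical marginals. The sample size and all names are our own choices.

```python
import random
from itertools import product

random.seed(0)
n = 200_000

# Simulate two independent fair dice and count each outcome pair.
counts = {}
for _ in range(n):
    a, b = random.randint(1, 6), random.randint(1, 6)
    counts[(a, b)] = counts.get((a, b), 0) + 1

# Empirical marginal pmfs of X and Y.
px = {a: sum(counts.get((a, b), 0) for b in range(1, 7)) / n for a in range(1, 7)}
py = {b: sum(counts.get((a, b), 0) for a in range(1, 7)) / n for b in range(1, 7)}

# For independent X, Y the empirical joint pmf should be close to the
# product of the empirical marginals (up to sampling noise).
worst = max(abs(counts.get((a, b), 0) / n - px[a] * py[b])
            for a, b in product(range(1, 7), repeat=2))
print(f"max |P[X=a,Y=b] - P[X=a]*P[Y=b]| ≈ {worst:.4f}")
```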
Factorisation
Factorisation
The definition of independence of X and Y implies the following factorisation formula: for any “suitable” sets A and B,
P [ X ∈ A, Y ∈ B ] = P [ X ∈ A ] · P [ Y ∈ B ] .
For continuous distributions, differentiating both sides of the formula for the joint distribution gives:
fX,Y(x, y) = fX(x) · fY(y) .
Example
Let X and Y be two independent random variables. Let I = (a, b] be any interval and define U := 1X∈I and V := 1Y∈I. Prove U and V are independent.
Answer
Applying the factorisation formula with A = B = I gives
P [ U = 1, V = 1 ] = P [ X ∈ I, Y ∈ I ] = P [ X ∈ I ] · P [ Y ∈ I ] = P [ U = 1 ] · P [ V = 1 ] .
The same argument with the complement I^c in place of I (for X, for Y, or for both) covers the remaining three value pairs, hence U and V are independent.
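A minimal simulation sketch of this example (our own illustration; the interval I = (0.2, 0.6] and X, Y ~ Uniform(0, 1) are arbitrary choices): empirically, P [ U = 1, V = 1 ] matches P [ U = 1 ] · P [ V = 1 ].

```python
import random

random.seed(1)
n = 200_000
a, b = 0.2, 0.6                                # the interval I = (a, b]

hits_u = hits_v = hits_uv = 0
for _ in range(n):
    x, y = random.random(), random.random()    # independent X, Y ~ Uniform(0, 1)
    u = a < x <= b                             # U = 1_{X in I}
    v = a < y <= b                             # V = 1_{Y in I}
    hits_u += u
    hits_v += v
    hits_uv += u and v

p_u, p_v, p_uv = hits_u / n, hits_v / n, hits_uv / n
# Independence of U and V: P[U=1, V=1] should match P[U=1] * P[V=1].
print(f"P[U=1,V=1] ≈ {p_uv:.4f}   P[U=1]·P[V=1] ≈ {p_u * p_v:.4f}")
```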
Buffon’s Needle Problem (1/2)
[Figure: needle on a ruled table; source: Ross, Probability, 8th ed. Portrait: Georges-Louis Leclerc de Buffon, 1707–1788; source: Wikipedia.]
A table is ruled with equidistant, parallel lines a distance D apart.
A needle of length L is thrown randomly on the table.
What is the probability that the needle will intersect one of the two lines?
Let X be the distance from the middle point of the needle to the closest parallel line. The needle intersects a line if the hypotenuse of the triangle is less than L/2, i.e.,
X / cos(θ) < L/2  ⇔  X < (L/2) · cos(θ).
We assume that X ∈ [0, D/2] and θ ∈ [0, π/2] are independent and uniform.
This can be thought of as: 1. sample the middle point of the needle; 2. sample the angle.
Buffon’s Needle Problem (2/2)
Let us compute the probability that the needle intersects a line. Since fX(x) = 2/D on [0, D/2] and fθ(y) = 2/π on [0, π/2], we get:
P [ X < (L/2) · cos(θ) ] = ∫∫_{x < (L/2) cos(y)} fX,θ(x, y) dx dy
= ∫∫_{x < (L/2) cos(y)} fX(x) · fθ(y) dx dy
= (4/(πD)) · ∫_0^{π/2} ∫_0^{(L/2) cos(y)} dx dy
= (4/(πD)) · ∫_0^{π/2} (L/2) · cos(y) dy
= 2L/(πD).
This gives us a method to estimate π!
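A minimal Monte Carlo sketch of this estimator (our own illustration; D, L and the sample size are arbitrary, with L ≤ D): simulate throws using the intersection condition X < (L/2) · cos(θ), then invert p = 2L/(πD) to recover π.

```python
import math
import random

random.seed(2)
D, L = 2.0, 1.0                            # line spacing D, needle length L (L <= D)
n = 1_000_000

hits = 0
for _ in range(n):
    x = random.uniform(0, D / 2)           # distance of midpoint to closest line
    theta = random.uniform(0, math.pi / 2) # acute angle between needle and lines
    if x < (L / 2) * math.cos(theta):      # intersection condition from the slide
        hits += 1

p_hat = hits / n                 # estimates 2L / (pi * D)
pi_hat = 2 * L / (p_hat * D)     # invert the formula to estimate pi
print(f"estimated P ≈ {p_hat:.4f}   estimated pi ≈ {pi_hat:.4f}")
```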
Covariance
Definition of Covariance
Let X and Y be two random variables. The covariance is defined as:
Cov [ X , Y ] = E [ (X − E [ X ]) · (Y − E [ Y ]) ] .
Interpretation:
If Cov [ X , Y ] > 0 and X has a realisation larger (smaller) than E [ X ],
then Y will likely have a realisation larger (smaller) than E [ Y ].
If Cov [ X , Y ] < 0, then it is the other way around.
Alternative Formula
Using linearity of expectation, one obtains the equivalent formula:
Cov [ X , Y ] = E [ X · Y ] − E [ X ] · E [ Y ] .
Note that Cov [ X , X ] = V [ X ].
Two variables X , Y with Cov [ X , Y ] > 0 are positively correlated.
Two variables X , Y with Cov [ X , Y ] < 0 are negatively correlated.
Two variables X , Y with Cov [ X , Y ] = 0 are uncorrelated.
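As an illustration (not from the slides), a short Python sketch computing the covariance of a small, made-up discrete joint pmf with both the definition and the alternative formula; the two results agree.

```python
# A small joint pmf on {0,1} x {0,1}, chosen so that X and Y are
# positively correlated (values are illustrative, not from the slides).
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

ex  = sum(x * p for (x, _), p in joint.items())          # E[X]
ey  = sum(y * p for (_, y), p in joint.items())          # E[Y]
exy = sum(x * y * p for (x, y), p in joint.items())      # E[X*Y]

# Definition: Cov[X,Y] = E[(X - E[X]) * (Y - E[Y])]
cov_def = sum((x - ex) * (y - ey) * p for (x, y), p in joint.items())
# Alternative formula: Cov[X,Y] = E[XY] - E[X] * E[Y]
cov_alt = exy - ex * ey

print(cov_def, cov_alt)   # both give 0.15: the two formulas agree
```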
Illustration of 3 Cases for Cov [ X , Y ]
[Figure: 500 outcomes of randomly generated pairs of RVs (X, Y) with different joint distributions. Source: textbook by Dekking et al.]
1. What is the covariance (positive, negative, neutral)?
2. Where is the covariance the largest (in magnitude)?
Independence implies Uncorrelated
Example
Let X and Y be two independent random variables. Then X and Y are
uncorrelated, i.e., Cov [ X , Y ] = 0.
Answer
We give a proof for the discrete case. By independence,
E [ X · Y ] = Σ_a Σ_b a · b · P [ X = a, Y = b ] = Σ_a Σ_b a · b · P [ X = a ] · P [ Y = b ]
= ( Σ_a a · P [ X = a ] ) · ( Σ_b b · P [ Y = b ] ) = E [ X ] · E [ Y ],
and hence Cov [ X , Y ] = E [ X · Y ] − E [ X ] · E [ Y ] = 0.
Uncorrelated may not imply Independence
Example
Find a (simple) example of two random variables X and Y which are uncorrelated but dependent.
Answer
Let X be uniformly sampled from {−1, 0, +1} and Y := 1X =0 .
⇒ X · Y = 0 (for all outcomes), and thus
E [ X · Y ] = 0.
Further, E [ X ] = 0 (and E [ Y ] = 1/3), and hence:
Cov [ X , Y ] = E [ X · Y ] − E [ X ] · E [ Y ] = 0.
On the other hand, P [ X = 0 ] = 1/3 and P [ Y = 0 ] = 2/3, and thus
Intro to Probability 9
Uncorrelated may not imply Independence
Example
Find a (simple) example of two random variables X and Y which are un-
correlated but dependent.
Answer
Let X be uniformly sampled from {−1, 0, +1} and Y := 1X =0 .
⇒ X · Y = 0 (for all outcomes), and thus
E [ X · Y ] = 0.
Further, E [ X ] = 0 (and E [ Y ] = 1/3), and hence:
Cov [ X , Y ] = E [ X · Y ] − E [ X ] · E [ Y ] = 0.
On the other hand, P [ X = 0 ] = 1/3 and P [ Y = 0 ] = 2/3, but X = 0 forces Y = 1, and thus
0 = P [ X = 0, Y = 0 ] ≠ P [ X = 0 ] · P [ Y = 0 ] = 2/9,
so X and Y are dependent.
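A short sketch verifying this example exactly with rational arithmetic (our own illustration): Cov [ X , Y ] = 0, yet P [ X = 0, Y = 0 ] ≠ P [ X = 0 ] · P [ Y = 0 ].

```python
from fractions import Fraction

# The slide's example: X uniform on {-1, 0, +1} and Y = 1_{X = 0}.
pmf = {x: Fraction(1, 3) for x in (-1, 0, 1)}
y = {x: int(x == 0) for x in pmf}               # Y as a function of X

ex  = sum(x * p for x, p in pmf.items())        # E[X]  = 0
ey  = sum(y[x] * p for x, p in pmf.items())     # E[Y]  = 1/3
exy = sum(x * y[x] * p for x, p in pmf.items()) # E[XY] = 0

print("Cov[X,Y] =", exy - ex * ey)              # 0  -> uncorrelated

# Dependence: P[X=0, Y=0] = 0, while P[X=0] * P[Y=0] = 1/3 * 2/3 = 2/9.
p_x0y0 = sum(p for x, p in pmf.items() if x == 0 and y[x] == 0)  # = 0
p_y0   = sum(p for x, p in pmf.items() if y[x] == 0)             # = 2/3
print("P[X=0,Y=0] =", p_x0y0, " P[X=0]*P[Y=0] =", pmf[0] * p_y0)
```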
Variance of Sums and Covariances
Variance of Sum Formula
For any two random variables X , Y ,
V [ X + Y ] = V [ X ] + V [ Y ] + 2 · Cov [ X , Y ] .
Hence if X and Y are uncorrelated variables,
V [ X + Y ] = V [ X ] + V [ Y ].
(This generalises the corresponding formula for the case where X and Y are even independent!)
For any random variables X1, X2, . . . , Xn:
V [ X1 + · · · + Xn ] = Σ_{i=1}^{n} V [ Xi ] + 2 · Σ_{i=1}^{n} Σ_{j=i+1}^{n} Cov [ Xi , Xj ].
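A quick numerical sketch of the variance-of-sum formula (our own illustration; the correlated pair Y = X + noise is an arbitrary choice): the empirical V [ X + Y ] matches V [ X ] + V [ Y ] + 2 · Cov [ X , Y ].

```python
import random

random.seed(3)
n = 200_000

# Sample a positively correlated pair: Y = X + noise, so Cov[X, Y] > 0.
xs, ys = [], []
for _ in range(n):
    x = random.gauss(0, 1)
    xs.append(x)
    ys.append(x + random.gauss(0, 1))

def mean(v):
    return sum(v) / len(v)

def var(v):
    m = mean(v)
    return sum((t - m) ** 2 for t in v) / len(v)

def cov(u, v):
    mu, mv = mean(u), mean(v)
    return sum((s - mu) * (t - mv) for s, t in zip(u, v)) / len(u)

# The identity V[X+Y] = V[X] + V[Y] + 2*Cov[X,Y] holds exactly for
# these empirical (population-style) moments.
lhs = var([x + y for x, y in zip(xs, ys)])
rhs = var(xs) + var(ys) + 2 * cov(xs, ys)
print(f"V[X+Y] ≈ {lhs:.3f}   V[X]+V[Y]+2·Cov ≈ {rhs:.3f}")
```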
Computing Variances of Sums of Uncorrelated Variables
Example
Recall the example where X ∈ {−1, 0, +1} uniformly and Y := 1X=0. Compute V [ X + Y ].
Answer
Since X and Y are uncorrelated, V [ X + Y ] = V [ X ] + V [ Y ]. Now
V [ X ] = E [ X² ] − E [ X ]² = 2/3 − 0 = 2/3 and V [ Y ] = E [ Y² ] − E [ Y ]² = 1/3 − (1/3)² = 2/9,
hence V [ X + Y ] = 2/3 + 2/9 = 8/9.
Correlation Coefficient: Normalising the Covariance
The definition of covariance is not scaling invariant:
If X increases by a factor of α, then Cov [ X , Y ] increases by a factor of α.
⇒ Even if X and Y both increase by a factor of α, Cov [ X , Y ] will change.
(Exercise: by how much?)
Correlation Coefficient
Let X and Y be two random variables. The correlation coefficient ρ(X , Y )
is defined as:
ρ(X, Y) = Cov [ X , Y ] / √( V [ X ] · V [ Y ] ).
If V [ X ] = 0 or V [ Y ] = 0, then it is defined as 0.
Properties:
1. The correlation coefficient is scaling-invariant, i.e.,
ρ(X , Y ) = ρ(α · X , β · Y ) for any α, β > 0.
2. For any two random variables X , Y , ρ(X , Y ) ∈ [−1, 1].
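A minimal sketch of the correlation coefficient and property 1 (our own illustration; the data-generating choices are arbitrary): ρ is unchanged when X and Y are rescaled by positive factors.

```python
import random

random.seed(4)
n = 100_000
xs = [random.gauss(0, 1) for _ in range(n)]
ys = [x + random.gauss(0, 2) for x in xs]    # a positively correlated pair

def mean(v):
    return sum(v) / len(v)

def cov(u, v):
    mu, mv = mean(u), mean(v)
    return sum((s - mu) * (t - mv) for s, t in zip(u, v)) / len(u)

def rho(u, v):
    vu, vv = cov(u, u), cov(v, v)            # V[U] = Cov[U, U]
    if vu == 0 or vv == 0:                   # degenerate case from the slide
        return 0.0
    return cov(u, v) / (vu * vv) ** 0.5

# Property 1 (scaling invariance): for alpha, beta > 0, rho is unchanged.
print(rho(xs, ys))
print(rho([5.0 * x for x in xs], [0.1 * y for y in ys]))   # same value
```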
Range of the Correlation Coefficient
Example
Verify that the correlation coefficient’s range satisfies ρ(X, Y) ∈ [−1, 1].
Answer
We will only prove ρ(X, Y) ≥ −1 (the other direction follows in an analogous way).
Let σX² and σY² denote the variances of X and Y, and σX and σY their standard deviations.
Then:
0 ≤ V [ X/σX + Y/σY ]
= V [ X/σX ] + V [ Y/σY ] + 2 · Cov [ X/σX , Y/σY ]
= V [ X ] / V [ X ] + V [ Y ] / V [ Y ] + 2 · Cov [ X , Y ] / (σX · σY)
= 2 · (1 + ρ(X, Y)).
Rearranging gives ρ(X, Y) ≥ −1.
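A final numerical sketch of the proof’s key identity (our own illustration with arbitrary data): empirically, V [ X/σX + Y/σY ] ≈ 2 · (1 + ρ(X, Y)) ≥ 0.

```python
import random
import statistics as st

random.seed(5)
n = 50_000
xs = [random.gauss(0, 1) for _ in range(n)]
ys = [-x + random.gauss(0, 0.3) for x in xs]   # strongly negatively correlated

mx, my = st.fmean(xs), st.fmean(ys)
cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / n
sx, sy = st.pstdev(xs), st.pstdev(ys)          # population standard deviations
r = cov / (sx * sy)                            # correlation coefficient rho

# Key quantity of the proof: V[X/sx + Y/sy] = 2 * (1 + rho) >= 0.
zs = [a / sx + b / sy for a, b in zip(xs, ys)]
print(f"rho ≈ {r:.3f}   V[X/σX + Y/σY] ≈ {st.pvariance(zs):.3f}   "
      f"2(1+rho) ≈ {2 * (1 + r):.3f}")
```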