Random Variables II
This lecture note is adapted and expanded from Prof. Changsik Kim (SKKU)’s original
lecture notes.
1. Random Vectors and Joint Distribution
(1) A Random Vector
• Random Vector: We define random variables X1, . . . , Xn on the probability space (Ω, F, P); the random vector X can then be defined as
$$X(\omega) = \begin{pmatrix} X_1(\omega) \\ \vdots \\ X_n(\omega) \end{pmatrix},$$
so the random vector is a function such that
$$X(\omega) : \Omega \to \mathbb{R}^n.$$
• Example 1: Consider tossing a coin, and let X1 (H) = 1 and X1 (T ) = 0, and X2 (H) = 0
and X2 (T ) = 1, and denote X = (X1 , X2 )′ , then
$$X(H) = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad X(T) = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.$$
The random vector is a function from Ω → R2 .
• Example 2: Consider tossing two coins, and let X1 (H) = 1 and X1 (T ) = 0 for the first
coin, and X2 (H) = 1 and X2 (T ) = 0 for the second coin. Now denote X = (X1 , X2 )′ ,
then
$$X(HH) = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \quad X(HT) = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad X(TH) = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \quad X(TT) = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$
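To make the mapping concrete, here is a minimal Python sketch of Example 2, writing X out as a literal function from outcomes to vectors. The outcome labels and the use of NumPy arrays are illustrative choices, not part of the original notes.

```python
import numpy as np

# Sample space for two coin tosses and the random vector X = (X1, X2)',
# where Xi = 1 if the i-th coin lands heads and 0 otherwise.
omega = ["HH", "HT", "TH", "TT"]

def X(outcome):
    """Random vector X: Omega -> R^2, applied to a single outcome."""
    x1 = 1 if outcome[0] == "H" else 0   # first coin
    x2 = 1 if outcome[1] == "H" else 0   # second coin
    return np.array([x1, x2])

for w in omega:
    print(w, X(w))   # HH [1 1], HT [1 0], TH [0 1], TT [0 0]
```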
(2) Joint Distribution
• The distribution PX of an n-dimensional random vector X = (X1 , . . . , Xn )′ can be
defined as
$$P_X(A) = P\{\omega \mid X(\omega) \in A\}$$
for A ⊂ Rn; PX itself is a probability measure on Rn. Sometimes we call PX the joint distribution, since it is defined for a random vector involving two or more random variables. The distribution of a subvector is called a marginal distribution.
• Example : For a random vector X = (X1 , X2 )′ , denote the distribution of X1 as PX1 .
Then for A1 ⊂ R, it follows that
$$P_{X_1}(A_1) = P\{\omega \mid X_1(\omega) \in A_1\} = P\bigl(X^{-1}(A_1 \times \mathbb{R})\bigr).$$
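For instance, with the two-coin vector of Example 2 and equally likely outcomes (a fairness assumption on my part, not stated in the notes), the marginal distribution of X1 can be recovered by summing the joint distribution over the second coordinate, as in this Python sketch:

```python
# Joint distribution of (X1, X2) from Example 2, assuming fair coins.
joint = {(1, 1): 0.25, (1, 0): 0.25, (0, 1): 0.25, (0, 0): 0.25}

# P_{X1}(A1) = P(X^{-1}(A1 x R)): sum the joint probabilities over x2.
def P_X1(A1):
    return sum(p for (x1, x2), p in joint.items() if x1 in A1)

print(P_X1({1}))       # 0.5
print(P_X1({0, 1}))    # 1.0
```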
(3) Joint Distribution Function
For X = (X1 , · · · , Xn ), the (cumulative) distribution function FX is given by
$$F_X(x_1, \ldots, x_n) = P\{\omega \mid X_1(\omega) \le x_1, \ldots, X_n(\omega) \le x_n\}.$$
Therefore, FX is a real-valued function on Rn .
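As a small illustration, the sketch below evaluates the joint distribution function of the two-coin vector from Example 2 by enumerating the four outcomes; the assumption of equally likely outcomes is mine.

```python
# Joint distribution function of the two-coin vector X = (X1, X2)',
# computed by enumerating the (assumed equally likely) outcomes.
outcomes = {"HH": (1, 1), "HT": (1, 0), "TH": (0, 1), "TT": (0, 0)}

def F_X(x1, x2):
    """F_X(x1, x2) = P{X1 <= x1, X2 <= x2}."""
    hits = [w for w, (a, b) in outcomes.items() if a <= x1 and b <= x2]
    return len(hits) / len(outcomes)

print(F_X(0.5, 1.0))   # P(X1 = 0) = 0.5
print(F_X(1.0, 1.0))   # 1.0
print(F_X(0.5, 0.5))   # P(X1 = 0, X2 = 0) = 0.25
```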
2. Joint Probability Density Function
• Joint density function: If
$$P(A) = \int_A f(x_1, \ldots, x_n)\, dx_1 \cdots dx_n, \qquad A \subset \mathbb{R}^n,$$
then f(x1, . . . , xn) is a joint density function. For a discrete random vector, we have
$$P(A) = \sum_{(x_1, \ldots, x_n) \in A} p(x_1, \ldots, x_n), \qquad A \subset \mathbb{R}^n,$$
where p is the joint probability mass function.
• Marginal density function : marginal density functions can be naturally obtained
from the joint density function. For example,
$$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\, dy \qquad \text{or} \qquad f_X(x) = \sum_y p(x, y).$$
Example : For random variables X and Y , f (x, y) is given by
$$f(x, y) = 8xy\, I\{0 < x < y < 1\}.$$
If 0 < x < 1,
$$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\, dy = \int_x^1 8xy\, dy = 4x\left(1 - x^2\right).$$
Moreover,
$$P\left\{\omega \,\middle|\, X(\omega) + Y(\omega) \le \tfrac{1}{2}\right\} = \int_0^{1/4} \int_x^{1/2 - x} 8xy\, dy\, dx = \frac{1}{96}.$$
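These closed forms are easy to double-check numerically. The following sketch uses SciPy quadrature; the test point x = 0.3 and the use of scipy.integrate are my own choices for illustration.

```python
from scipy import integrate

# Joint density f(x, y) = 8xy on 0 < x < y < 1, and zero elsewhere.
def f(x, y):
    return 8.0 * x * y if 0.0 < x < y < 1.0 else 0.0

# Marginal density at a test point: f_X(x0) = integral of f(x0, y) over y.
x0 = 0.3
fx_numeric, _ = integrate.quad(lambda y: f(x0, y), 0.0, 1.0, points=[x0])
print(fx_numeric, 4 * x0 * (1 - x0**2))        # both ~ 1.092

# P{X + Y <= 1/2}: dblquad integrates func(y, x) with x in [0, 1/4]
# and y between x and 1/2 - x, matching the limits in the equation above.
p, _ = integrate.dblquad(lambda y, x: f(x, y), 0.0, 0.25,
                         lambda x: x, lambda x: 0.5 - x)
print(p, 1 / 96)                               # both ~ 0.0104167
```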
4. Conditional Density and Independence
(1) The definition of conditional density
Assume that X2 = x2 is given; then the conditional density of X1 is given as
$$f_1(x_1 \mid x_2) = \frac{f(x_1, x_2)}{f_2(x_2)}.$$
• As a function of x1, f1(x1|x2) is proportional to f(x1, x2); dividing by f2(x2) normalizes it so that the conditional density integrates to one.
• Example : The joint density is given by
$$f(x, y) = (x + y)\, I\{0 \le x, y \le 1\},$$
then, for 0 ≤ y ≤ 1, the conditional density f(x|y) is
$$f(x \mid y) = \frac{f(x, y)}{f_Y(y)} = \frac{f(x, y)}{\int_{-\infty}^{\infty} f(x, y)\, dx} = \frac{x + y}{\int_0^1 (x + y)\, dx} = \frac{x + y}{\tfrac{1}{2} + y}, \qquad 0 \le x \le 1.$$
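As a quick sanity check (not part of the original notes), the sketch below verifies numerically that this conditional density integrates to one in x for a fixed y; the test value y = 0.4 is arbitrary.

```python
from scipy import integrate

# Conditional density from the example: f(x | y) = (x + y) / (1/2 + y) for 0 <= x <= 1.
def f_cond(x, y):
    return (x + y) / (0.5 + y)

y0 = 0.4                                   # arbitrary test value in [0, 1]
total, _ = integrate.quad(lambda x: f_cond(x, y0), 0.0, 1.0)
print(total)                               # ~ 1.0: a conditional density integrates to one
```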
(2) Statistical Independence of random variables
Consider
$$\sigma(X) = \{X^{-1}(A) \mid A \in \mathcal{B}(\mathbb{R})\},$$
which is the σ-field generated by the random variable X. This is the collection of events (sets) generated by X. In a similar way, we can define σ(Y).
• X and Y are independent iff
$$P\bigl(X^{-1}(A) \cap Y^{-1}(B)\bigr) = P\bigl(X^{-1}(A)\bigr)\, P\bigl(Y^{-1}(B)\bigr)$$
for all A, B ∈ B(R); equivalently, P(E ∩ F) = P(E)P(F) for all E ∈ σ(X) and F ∈ σ(Y). When the densities exist, X and Y are therefore independent iff
$$f(x, y) = f_X(x)\, f_Y(y).$$
• Therefore, X and Y are independent iff
$$f(x \mid y) = f_X(x).$$
• Example: For events E and F, define the random variables X = I(E) and Y = I(F). Then, we have
$$\sigma(X) = \{X^{-1}(A) \mid A \in \mathcal{B}(\mathbb{R})\} = \{\emptyset, \Omega, E, E^c\},$$
and
$$\sigma(Y) = \{Y^{-1}(A) \mid A \in \mathcal{B}(\mathbb{R})\} = \{\emptyset, \Omega, F, F^c\}.$$
If E and F are independent, then X and Y are independent.
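A small computational illustration of the factorization criterion follows; the numerical values for P(E), P(F), and P(E ∩ F) are purely for demonstration and are not from the notes.

```python
import itertools

# X = I(E), Y = I(F). The joint pmf of (X, Y) is determined by
# P(E), P(F) and P(E ∩ F); these numbers are illustrative choices.
def joint_pmf(pE, pF, pEF):
    return {(1, 1): pEF,
            (1, 0): pE - pEF,
            (0, 1): pF - pEF,
            (0, 0): 1 - pE - pF + pEF}

def is_independent(pmf):
    """Check the factorization p(x, y) = pX(x) * pY(y) at every point."""
    pX = {x: sum(v for (a, _), v in pmf.items() if a == x) for x in (0, 1)}
    pY = {y: sum(v for (_, b), v in pmf.items() if b == y) for y in (0, 1)}
    return all(abs(pmf[(x, y)] - pX[x] * pY[y]) < 1e-12
               for x, y in itertools.product((0, 1), repeat=2))

print(is_independent(joint_pmf(0.3, 0.6, 0.18)))  # P(E ∩ F) = P(E)P(F): True
print(is_independent(joint_pmf(0.3, 0.6, 0.25)))  # not a product: False
```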
(3) Mean and Variance of a random vector
• For any random vector X = (X1 , . . . , Xn )′ , we define
$$E(X) = \begin{pmatrix} E(X_1) \\ \vdots \\ E(X_n) \end{pmatrix} \qquad \text{and} \qquad \mathrm{Var}(X) = E\left[(X - E(X))\,(X - E(X))'\right].$$
• The diagonal elements of Var(X) are the variances of X1, . . . , Xn, respectively, and the off-diagonal elements are the covariances; for this reason, Var(X) is often called the variance-covariance matrix of the random vector X.
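To tie the definition back to Example 2, the sketch below computes E(X) and Var(X) for the two-coin random vector by direct enumeration; the fair-coin probabilities are assumed for illustration.

```python
import numpy as np

# Mean vector and variance-covariance matrix of the two-coin random vector
# X = (X1, X2)', computed over the four (assumed equally likely) outcomes.
support = np.array([[1, 1], [1, 0], [0, 1], [0, 0]], dtype=float)
probs = np.full(4, 0.25)

mean = probs @ support                           # E(X) = (0.5, 0.5)'
centered = support - mean
var = (centered * probs[:, None]).T @ centered   # E[(X - E(X))(X - E(X))']
print(mean)
print(var)   # diagonal: Var(X1) = Var(X2) = 0.25; off-diagonal: Cov(X1, X2) = 0
```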