Linear Statistical Models: Random vectors
Notes by Yao-ban Chan and Owen Jones
Random vectors
The theory of linear algebra provides us with a good grounding to analyse our linear models. However, we must still do some more groundwork. Once we have done this, the theoretical results come out quite easily!
Previously, we were thinking of matrices and vectors simply as a bunch of numbers. However, there is no reason why we can't think of them as a bunch of random variables!
We can then extend the traditional concepts of expectation, variance, etc. to random vectors.
Expectation
Although traditionally random variables are denoted with capital
letters, in keeping with our linear algebra notation, we will denote
them by lowercase.
We define the expectation of a random vector y to be the vector
of expectations of its components:
$$\text{If } y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_k \end{pmatrix} \text{ then } E[y] = \begin{pmatrix} E[y_1] \\ E[y_2] \\ \vdots \\ E[y_k] \end{pmatrix}.$$
Expectation properties
- If $a$ is a vector of constants, then $E[a] = a$.
- If $a$ is a vector of constants, then $E[a^T y] = a^T E[y]$.
- If $A$ is a matrix of constants, then $E[Ay] = A\,E[y]$.
Example. Let
$$A = \begin{pmatrix} 2 & 3 \\ 1 & 4 \end{pmatrix}, \quad y = \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}$$
and assume that $E[y_1] = 10$ and $E[y_2] = 20$. Then
$$A\,E[y] = \begin{pmatrix} 2 & 3 \\ 1 & 4 \end{pmatrix} \begin{pmatrix} 10 \\ 20 \end{pmatrix} = \begin{pmatrix} 80 \\ 90 \end{pmatrix}.$$
On the other hand,
$$E[Ay] = E\begin{pmatrix} 2y_1 + 3y_2 \\ y_1 + 4y_2 \end{pmatrix} = \begin{pmatrix} E[2y_1 + 3y_2] \\ E[y_1 + 4y_2] \end{pmatrix} = \begin{pmatrix} 2E[y_1] + 3E[y_2] \\ E[y_1] + 4E[y_2] \end{pmatrix} = \begin{pmatrix} 80 \\ 90 \end{pmatrix} = A\,E[y].$$
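As a quick numerical check of this example (a minimal sketch: the means 10 and 20 come from the example above, while the normal marginals are an arbitrary assumption, since only the means matter here):

> set.seed(1)
> A <- matrix(c(2, 1, 3, 4), 2, 2)                          # the matrix A above
> y <- rbind(rnorm(1e5, mean = 10), rnorm(1e5, mean = 20))  # columns are draws of y
> rowMeans(A %*% y)                                         # sample estimate of E[Ay]; approximately (80, 90)
> A %*% c(10, 20)                                           # A E[y] = (80, 90)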
Variance
Defining the variance of a random vector is slightly trickier. We want to capture not just the variances of the variables themselves, but also how the variables vary together.
Recall that the variance of a random variable $Y$ with mean $\mu$ is defined to be $E[(Y - \mu)^2]$. Now let $y$ be as before, a $k \times 1$ vector of random variables. We define the variance of $y$ (sometimes known as the covariance matrix) to be
$$\operatorname{var} y = E[(y - \mu)(y - \mu)^T]$$
where $\mu = E[y]$.
The diagonal elements of the covariance matrix are just the variances of the individual elements of $y$:
$$[\operatorname{var} y]_{ii} = \operatorname{var} y_i, \quad i = 1, 2, \ldots, k.$$
The off-diagonal elements of the covariance matrix are the covariances of the individual elements:
$$[\operatorname{var} y]_{ij} = \operatorname{cov}(y_i, y_j) = E[(y_i - \mu_i)(y_j - \mu_j)].$$
Since $\operatorname{cov}(y_i, y_j) = \operatorname{cov}(y_j, y_i)$, all covariance matrices are symmetric.
Variance properties
Suppose that $y$ is a random vector with $\operatorname{var} y = V$. Then
- If $a$ is a vector of constants, then $\operatorname{var}(a^T y) = a^T V a$.
- If $A$ is a matrix of constants, then $\operatorname{var}(Ay) = A V A^T$.
These can be derived from first principles quite easily.
It follows that any covariance matrix is symmetric and positive semidefinite, since $a^T V a = \operatorname{var}(a^T y) \geq 0$ for every constant vector $a$.
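These properties are easy to check by simulation; here is a minimal sketch (the particular $A$ and $V$ are arbitrary choices, and mvrnorm is from the MASS package):

> library(MASS)
> V <- matrix(c(2, 1, 1, 5), 2, 2)            # an arbitrary covariance matrix
> A <- matrix(c(1, 0, 2, 1), 2, 2)            # an arbitrary constant matrix
> y <- mvrnorm(1e5, mu = c(0, 0), Sigma = V)  # rows are draws of y
> cov(y %*% t(A))                             # sample covariance of Ay
> A %*% V %*% t(A)                            # theoretical A V A^T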
Example. Let
$$y = \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}$$
be a random vector, such that $\operatorname{var} y_i = \sigma^2$ for all $i$, and that the elements of $y$ are independent. This means that $\operatorname{cov}(y_i, y_j) = 0$ for $i \neq j$, so the covariance matrix of $y$ is
$$\operatorname{var} y = V = \begin{pmatrix} \sigma^2 & 0 & 0 \\ 0 & \sigma^2 & 0 \\ 0 & 0 & \sigma^2 \end{pmatrix} = \sigma^2 I.$$
Example continued. Assume that $X$ is a matrix of full rank (with more rows than columns), which implies that $X^T X$ is nonsingular. Let
$$z = (X^T X)^{-1} X^T y = Ay,$$
then
$$\begin{aligned}
\operatorname{var} z = AVA^T &= [(X^T X)^{-1} X^T]\, \sigma^2 I\, [(X^T X)^{-1} X^T]^T \\
&= (X^T X)^{-1} X^T (X^T)^T [(X^T X)^{-1}]^T \sigma^2 \\
&= (X^T X)^{-1} X^T X [(X^T X)^T]^{-1} \sigma^2 \\
&= (X^T X)^{-1} \sigma^2.
\end{aligned}$$
We will be using this quite a bit later on!
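A small simulation sketch of this identity (the design matrix $X$ and the value of $\sigma$ are arbitrary assumptions for illustration):

> set.seed(42)
> X <- cbind(1, 1:10)                  # an arbitrary full-rank 10 x 2 design matrix
> sigma <- 2
> XtXinv <- solve(t(X) %*% X)
> z <- replicate(1e4, as.vector(XtXinv %*% t(X) %*% rnorm(10, sd = sigma)))
> cov(t(z))                            # sample covariance of z
> XtXinv * sigma^2                     # theoretical (X^T X)^{-1} sigma^2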
Matrix square root
A square matrix $A$ has a square root if there exists a matrix $B$, the same size, such that $B^2 = A$. In general the square root is not unique.
For a symmetric positive semidefinite matrix $A$, there is a unique symmetric positive semidefinite square root, called the principal root, denoted $A^{1/2}$.
Suppose that $P$ diagonalises $A$, that is $P^T A P = \Lambda$. Then (using $P^T P = I$)
$$A = P \Lambda P^T = P \Lambda^{1/2} P^T P \Lambda^{1/2} P^T,$$
so
$$A^{1/2} = P \Lambda^{1/2} P^T.$$
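In R the principal root can be computed directly from the eigendecomposition (a sketch; the matrix V is an arbitrary symmetric positive definite example):

> V <- matrix(c(2, 1, 1, 5), 2, 2)
> e <- eigen(V)
> B <- e$vectors %*% diag(sqrt(e$values)) %*% t(e$vectors)   # P Lambda^{1/2} P^T
> B %*% B                                                    # recovers V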
Multivariate normal
Definition
Let $z$ be a $k \times 1$ vector of i.i.d. standard normal r.v.s, $A$ an $n \times k$ matrix, and $b$ an $n \times 1$ vector; then we say that
$$x = Az + b$$
is (an $n$-dimensional) multivariate normal, with mean $\mu = E[x] = b$ and covariance matrix $\Sigma = \operatorname{var} x = AA^T$.
We write $x \sim \mathrm{MVN}(\mu, \Sigma)$ or just $x \sim N(\mu, \Sigma)$.
For any $\mu$ and any symmetric positive semidefinite matrix $\Sigma$, let $z$ be a vector of i.i.d. standard normals; then
$$\mu + \Sigma^{1/2} z \sim \mathrm{MVN}(\mu, \Sigma).$$
If $x \sim \mathrm{MVN}(\mu, \Sigma)$ and $\Sigma$ is $k \times k$ positive definite, then $x$ has the density
$$f(x) = \frac{1}{(2\pi)^{k/2} |\Sigma|^{1/2}}\, e^{-\frac{1}{2}(x - \mu)^T \Sigma^{-1} (x - \mu)}.$$
Note that a symmetric positive definite matrix is necessarily
invertible. Also, be aware that some authors require the covariance
matrix to be positive definite, rather than just positive
semi-definite.
If $x \sim \mathrm{MVN}(\mu, \Sigma)$ is $k \times 1$, $A$ is $n \times k$, and $b$ is $n \times 1$, then
$$y = Ax + b \sim \mathrm{MVN}(A\mu + b,\, A \Sigma A^T).$$
To see why, put $x = \Sigma^{1/2} z + \mu$; then
$$y = A\Sigma^{1/2} z + A\mu + b.$$
If the random vector $z = (z_1, z_2)^T$ is multivariate normal, then $z_1$ and $z_2$ are independent if and only if they are uncorrelated.
In general, if $z_1$ and $z_2$ are normal random variables, $z = (z_1, z_2)^T$ does not have to be multivariate normal. Moreover, $z_1$ and $z_2$ can be uncorrelated but not independent.
For example, suppose that $z_1 \sim N(0, 1)$ and $u \sim U(-1, 1)$, then $z_2 = z_1 \operatorname{sign}(u) \sim N(0, 1)$, but $z = (z_1, z_2)^T$ is not multivariate normal. (Consider its support.) Moreover $z_1$ and $z_2$ are uncorrelated, but clearly dependent.
R example: multivariate normal
To generate a sample of size 100 with distribution
$$\mathrm{MVN}\left( \begin{pmatrix} 3 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 & 0.8 \\ 0.8 & 1 \end{pmatrix} \right)$$
> library(MASS)
> a <- matrix(c(3, 1), 2, 1)
> V <- matrix(c(1, .8, .8, 1), 2, 2)
> y <- mvrnorm(100, mu = a, Sigma = V)
> plot(y[,1], y[,2])
[Figure: scatter plot of y[,2] against y[,1] for the simulated sample.]
Alternatively, starting with standard normals:
> P <- eigen(V)$vectors
> sqrtV <- P %*% diag(sqrt(eigen(V)$values)) %*% t(P)
> z <- matrix(rnorm(200), 2, 100)
> y_new <- sqrtV %*% z + rep(a, 100)
> points(y_new[1,], y_new[2,], col = "red")
[Figure: the same scatter plot with the second sample, generated via the principal root, added in red.]
Random quadratic forms
Just as we can consider vectors and matrices to be composed of
random variables, we can see what happens when these random
vectors are combined into quadratic forms. The result is a function
of random variables which is scalar (not vector), and so it is itself a
random variable.
Quadratic forms will pop up regularly in our analysis of linear
models. To fully analyse our models, we will want to know the
distribution of these forms, under the assumptions that we make
on the distribution of the variables in the model.
Theorem
Let $y$ be a random vector with $E[y] = \mu$ and $\operatorname{var} y = V$, and let $A$ be a matrix of constants. Then
$$E[y^T A y] = \operatorname{tr}(AV) + \mu^T A \mu.$$
Example. Let $y$ be a $2 \times 1$ random vector with
$$\mu = \begin{pmatrix} 1 \\ 3 \end{pmatrix}, \quad V = \begin{pmatrix} 2 & 1 \\ 1 & 5 \end{pmatrix}.$$
Let
$$A = \begin{pmatrix} 4 & 1 \\ 1 & 2 \end{pmatrix}.$$
Consider the quadratic form
$$y^T A y = 4y_1^2 + 2y_1 y_2 + 2y_2^2.$$
The expectation of this form is
$$E[y^T A y] = 4E[y_1^2] + 2E[y_1 y_2] + 2E[y_2^2].$$
From the definition of variance and the given covariance matrix,
$$2 = \operatorname{var} y_1 = E[y_1^2] - E[y_1]^2 = E[y_1^2] - 1$$
$$5 = \operatorname{var} y_2 = E[y_2^2] - E[y_2]^2 = E[y_2^2] - 9$$
so $E[y_1^2] = 3$ and $E[y_2^2] = 14$.
From the definition of covariance and the given covariance matrix,
$$1 = \operatorname{cov}(y_1, y_2) = E[y_1 y_2] - E[y_1]E[y_2] = E[y_1 y_2] - 3$$
so $E[y_1 y_2] = 4$. This gives
$$E[y^T A y] = 4 \cdot 3 + 2 \cdot 4 + 2 \cdot 14 = 48.$$
From the theorem,
$$\begin{aligned}
E[y^T A y] &= \operatorname{tr}(AV) + \mu^T A \mu \\
&= \operatorname{tr}\left( \begin{pmatrix} 4 & 1 \\ 1 & 2 \end{pmatrix} \begin{pmatrix} 2 & 1 \\ 1 & 5 \end{pmatrix} \right) + \begin{pmatrix} 1 & 3 \end{pmatrix} \begin{pmatrix} 4 & 1 \\ 1 & 2 \end{pmatrix} \begin{pmatrix} 1 \\ 3 \end{pmatrix} \\
&= \operatorname{tr} \begin{pmatrix} 9 & 9 \\ 4 & 11 \end{pmatrix} + \begin{pmatrix} 1 & 3 \end{pmatrix} \begin{pmatrix} 7 \\ 7 \end{pmatrix} \\
&= 9 + 11 + 7 + 21 = 48.
\end{aligned}$$
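A quick simulation check of this example (a sketch; mvrnorm is from MASS):

> library(MASS)
> mu <- c(1, 3); V <- matrix(c(2, 1, 1, 5), 2, 2)
> A <- matrix(c(4, 1, 1, 2), 2, 2)
> y <- mvrnorm(1e5, mu, V)
> mean(apply(y, 1, function(yi) t(yi) %*% A %*% yi))   # approximately 48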
Noncentral $\chi^2$ distribution
Definition
Let $y = (y_i)$ be a $k \times 1$ normally distributed random vector with mean $\mu$ and variance $I$. Then $x = y^T y = \sum_{i=1}^k y_i^2$ follows a noncentral $\chi^2$ distribution with $k$ degrees of freedom and noncentrality parameter $\lambda = \frac{1}{2} \mu^T \mu$. We write $x \sim \chi^2_{k, \lambda}$.
Warning: some authors define $\lambda$ to be $\mu^T \mu$.
Note that the distribution of $x$ depends on $\mu$ only through $\lambda$.
Suppose $y \sim \mathrm{MVN}(\mu, I_k)$ and $x = y^T y \sim \chi^2_{k, \lambda}$. Then
$$E[x] = \operatorname{tr}(I_k) + \mu^T \mu = k + 2\lambda.$$
The noncentrality parameter $\lambda = \frac{1}{2} \mu^T \mu$ is zero if and only if $\mu = 0$, in which case $x$ is just the sum of squares of $k$ i.i.d. standard normals. That is, $x$ has an ordinary (central) $\chi^2$ distribution with $k$ degrees of freedom.
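A simulation sketch of the mean formula (the mean vector is an arbitrary choice; note that R's ncp argument for the chi-squared functions equals $\mu^T \mu = 2\lambda$ in our notation):

> mu <- c(1, 2)
> y <- matrix(rnorm(2e5, mean = mu), 2)   # columns are draws of y ~ MVN(mu, I)
> x <- colSums(y^2)                       # y^T y for each draw
> c(mean(x), 2 + sum(mu^2))               # E[x] = k + 2*lambda = k + mu^T mu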
[Figure: noncentral $\chi^2_4$ densities for $\lambda = 0, 1, 2$.]
Theorem
Let $X^2_{k_1, \lambda_1}, X^2_{k_2, \lambda_2}, \ldots, X^2_{k_n, \lambda_n}$ be a collection of $n$ independent noncentral $\chi^2$ random variables, with $k_1, k_2, \ldots, k_n$ degrees of freedom respectively and noncentrality parameters $\lambda_1, \lambda_2, \ldots, \lambda_n$ respectively. Then
$$\sum_{i=1}^n X^2_{k_i, \lambda_i}$$
has a noncentral $\chi^2$ distribution with $\sum_{i=1}^n k_i$ degrees of freedom and noncentrality parameter $\sum_{i=1}^n \lambda_i$.
If we set $\lambda_i = 0$ for all $i$, we get the result that the sum of independent $\chi^2$ variables is another $\chi^2$ variable.
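A simulation sketch of the additivity (the df and ncp values are arbitrary; recall that R's ncp is $2\lambda$ in our notation):

> x <- rchisq(1e5, df = 3, ncp = 2) + rchisq(1e5, df = 5, ncp = 4)
> qqplot(x, rchisq(1e5, df = 8, ncp = 6)); abline(0, 1)   # points fall near the line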
Distribution of quadratic forms
Theorem
Let $y$ be an $n \times 1$ normally distributed random vector with mean $\mu$ and variance $I$, and let $A$ be an $n \times n$ symmetric matrix. Then $y^T A y$ has a noncentral $\chi^2$ distribution with $k$ degrees of freedom and noncentrality parameter $\lambda = \frac{1}{2} \mu^T A \mu$ if and only if $A$ is idempotent and has rank $k$.
Corollary
Let $y$ be an $n \times 1$ normally distributed random vector with mean $0$ and variance $I$, and let $A$ be an $n \times n$ symmetric matrix. Then $y^T A y$ has an (ordinary) $\chi^2$ distribution with $k$ degrees of freedom if and only if $A$ is idempotent and has rank $k$.
Corollary
Let $y$ be an $n \times 1$ normally distributed random vector with mean $\mu$ and variance $\sigma^2 I$, and let $A$ be an $n \times n$ symmetric matrix. Then $\frac{1}{\sigma^2} y^T A y$ has a noncentral $\chi^2$ distribution with $k$ degrees of freedom and noncentrality parameter $\lambda = \frac{1}{2\sigma^2} \mu^T A \mu$ if and only if $A$ is idempotent and has rank $k$.
Example. Let $y_1$ and $y_2$ be independent normal random variables with means 3 and $-2$ respectively and common variance 1. Let
$$A = \frac{1}{2} \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}.$$
It is easy to verify that $A$ is symmetric and idempotent, and has rank 1. Therefore
$$y^T A y = \frac{1}{2} \begin{pmatrix} y_1 & y_2 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} = \frac{1}{2} y_1^2 + y_1 y_2 + \frac{1}{2} y_2^2$$
has a noncentral $\chi^2$ distribution with 1 degree of freedom and noncentrality parameter
$$\lambda = \frac{1}{2} \cdot \frac{1}{2} \begin{pmatrix} 3 & -2 \end{pmatrix} \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 3 \\ -2 \end{pmatrix} = \frac{1}{4}.$$
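A simulation sketch of this example:

> y <- matrix(rnorm(2e5, mean = c(3, -2)), 2)   # columns are draws of (y1, y2)
> A <- matrix(1/2, 2, 2)
> q <- colSums(y * (A %*% y))                   # y^T A y for each column
> c(mean(q), 1 + 2 * (1/4))                     # E[q] = df + 2*lambda = 1.5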
What happens if y does not have variance I ?
Theorem
Let $y$ be an $n \times 1$ normal random vector with mean $\mu$ and variance $V$, and let $A$ be an $n \times n$ symmetric matrix. Then $y^T A y$ has a noncentral $\chi^2$ distribution with $k$ degrees of freedom and noncentrality parameter $\lambda = \frac{1}{2} \mu^T A \mu$ if and only if $AV$ is idempotent and has rank $k$.
Corollary
Let $y$ be an $n \times 1$ normal random vector with mean $0$ and variance $V$, and let $A$ be an $n \times n$ symmetric matrix. Then $y^T A y$ has an (ordinary) $\chi^2$ distribution with $k$ degrees of freedom if and only if $AV$ is idempotent and has rank $k$.
Corollary
Let $y$ be an $n \times 1$ normal random vector with mean $\mu$ and variance $V$ of full rank. Then $y^T V^{-1} y$ has a noncentral $\chi^2$ distribution with $n$ degrees of freedom and noncentrality parameter $\lambda = \frac{1}{2} \mu^T V^{-1} \mu$.
R example: noncentral chi-squared
Consider the quadratic form $y^T A y$ with
$$y \sim \mathrm{MVN}\left( a = \begin{pmatrix} 3 \\ 1 \end{pmatrix}, V = \begin{pmatrix} 1 & 0.8 \\ 0.8 & 1 \end{pmatrix} \right), \quad A = \frac{1}{3.6} \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}.$$
> A <- matrix(1/3.6, 2, 2)
> A %*% V
[,1] [,2]
[1,] 0.5 0.5
[2,] 0.5 0.5
> library(Matrix)
> (df <- rankMatrix(A %*% V)[1])
[1] 1
> (lambda <- t(a) %*% A %*% a / 2)
[,1]
[1,] 2.222222
> quadform <- function(y, A) t(y) %*% A %*% y
> x <- apply(y, 1, quadform, A = A)
> mean(x)
[1] 5.198274
> df + 2*lambda
[,1]
[1,] 5.444444
> hist(x, freq=F)
> curve(dchisq(x, df, 2*lambda), add = TRUE)
[Figure: histogram of x with the noncentral $\chi^2$ density overlaid.]
Example. Let $y_1$ and $y_2$ follow a multivariate normal distribution with means $-1$ and $4$ respectively, and covariance matrix
$$V = \begin{pmatrix} 3 & 2 \\ 2 & 2 \end{pmatrix}.$$
Then
$$V^{-1} = \frac{1}{3 \cdot 2 - 2 \cdot 2} \begin{pmatrix} 2 & -2 \\ -2 & 3 \end{pmatrix} = \begin{pmatrix} 1 & -1 \\ -1 & 3/2 \end{pmatrix},$$
and the quadratic form
$$y^T V^{-1} y = y_1^2 - 2 y_1 y_2 + \frac{3}{2} y_2^2$$
has a noncentral $\chi^2$ distribution with 2 degrees of freedom and noncentrality parameter
$$\lambda = \frac{1}{2} \begin{pmatrix} -1 & 4 \end{pmatrix} \begin{pmatrix} 1 & -1 \\ -1 & 3/2 \end{pmatrix} \begin{pmatrix} -1 \\ 4 \end{pmatrix} = \frac{33}{2}.$$
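A simulation sketch of this example (mvrnorm is from MASS):

> library(MASS)
> V <- matrix(c(3, 2, 2, 2), 2, 2)
> mu <- c(-1, 4)
> y <- t(mvrnorm(1e5, mu, V))               # columns are draws of y
> q <- colSums(y * (solve(V) %*% y))        # y^T V^{-1} y for each column
> c(mean(q), 2 + 33)                        # E[q] = df + 2*lambda = 35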
Independence of quadratic forms
Sometimes we will want to know when two quadratic forms are
independent. The next theorem tells us when this happens.
Theorem
Let $y$ be an $n \times 1$ normal random vector with mean $\mu$ and variance $V$ of full rank, and let $A$ and $B$ be symmetric $n \times n$ matrices. Then $y^T A y$ and $y^T B y$ are independent if and only if $AVB = 0$.
Example. Let $y_1$ and $y_2$ follow a multivariate normal distribution with covariance matrix
$$V = \begin{pmatrix} 1 & c \\ c & 1 \end{pmatrix}.$$
Consider the symmetric matrices
$$A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \quad B = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}.$$
It is obvious that
$$y^T A y = y_1^2, \quad y^T B y = y_2^2.$$
Now these quadratic forms will be independent if and only if
$$AVB = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 1 & c \\ c & 1 \end{pmatrix} \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & c \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & c \\ 0 & 0 \end{pmatrix}$$
is the 0 matrix. But this happens if and only if $c = 0$, i.e. if $y_1$ and $y_2$ have zero covariance.
Corollary
Let $y$ be a normal random vector with mean $\mu$ and variance $\sigma^2 I$, and let $A$ and $B$ be symmetric matrices. Then $y^T A y$ and $y^T B y$ are independent if and only if $AB = 0$.
Next we consider when a quadratic form is independent of a random vector. Firstly, we define a random variable to be independent of a random vector if and only if it is independent of all elements of that vector.
Theorem
Let $y$ be an $n \times 1$ normal random vector with mean $\mu$ and variance $V$, and let $A$ be an $n \times n$ symmetric matrix and $B$ an $m \times n$ matrix. Then $y^T A y$ and $By$ are independent if and only if $BVA = 0$.
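A familiar special case (a sketch): with $V = I$, $B = \frac{1}{n}\mathbf{1}^T$ (so $By$ is the sample mean) and $A = I - \frac{1}{n}J$ (so $y^T A y$ is the sum of squared deviations), we get $BVA = 0$, which is why the sample mean and the sample variance of i.i.d. normals are independent:

> n <- 5
> B <- matrix(1/n, 1, n)                    # B y = sample mean
> A <- diag(n) - matrix(1/n, n, n)          # y^T A y = sum((y - ybar)^2)
> B %*% diag(n) %*% A                       # B V A with V = I: zero (up to rounding)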
Lastly, we can combine several of the theorems we have seen
before to tell when a group of quadratic forms (more than two) are
independent.
Theorem
Let $y$ be a normal random vector with mean $\mu$ and variance $I$, and let $A_1, A_2, \ldots, A_m$ be a collection of $m$ symmetric matrices. If any two of the following statements are true:
- all $A_i$ are idempotent;
- $\sum_{i=1}^m A_i$ is idempotent;
- $A_i A_j = 0$ for all $i \neq j$;
then so is the third, and
- for all $i$, $y^T A_i y$ has a noncentral $\chi^2$ distribution with $r(A_i)$ degrees of freedom and noncentrality parameter $\lambda_i = \frac{1}{2} \mu^T A_i \mu$;
- $y^T A_i y$ and $y^T A_j y$ are independent for $i \neq j$; and
- $\sum_{i=1}^m r(A_i) = r\left(\sum_{i=1}^m A_i\right)$.
When $\sum_i A_i = I$, the previous result can be seen as a special case of the following result (which we will not prove):
Theorem (Cochran-Fisher Theorem)
Let $y$ be an $n \times 1$ normal random vector with mean $\mu$ and variance $\sigma^2 I$. Decompose the sum of squares of $y/\sigma$ into the quadratic forms
$$\frac{1}{\sigma^2} y^T y = \sum_{i=1}^m \frac{1}{\sigma^2} y^T A_i y.$$
Then the quadratic forms are independent and have noncentral $\chi^2$ distributions with parameters $r(A_i)$ and $\frac{1}{2\sigma^2} \mu^T A_i \mu$, respectively, if and only if
$$\sum_{i=1}^m r(A_i) = n.$$
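The classic instance decomposes a sum of squares into a mean part and a residual part: $A_1 = \frac{1}{n}J$ has rank 1 and $A_2 = I - \frac{1}{n}J$ has rank $n - 1$, so the ranks sum to $n$ and the two forms are independent noncentral $\chi^2$ variables. A minimal sketch ($n = 5$ is arbitrary):

> n <- 5
> A1 <- matrix(1/n, n, n)              # projection onto the constant vector, rank 1
> A2 <- diag(n) - A1                   # residual projection, rank n - 1
> y <- rnorm(n)
> c(sum(y^2), t(y) %*% A1 %*% y + t(y) %*% A2 %*% y)   # the decomposition is exact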
Example
> A <- matrix(1, 2, 2)
> B <- matrix(c(1,-1,-1,1), 2, 2)
> A %*% B
     [,1] [,2]
[1,]    0    0
[2,]    0    0
> y <- mvrnorm(200, c(0, 0), diag(c(2, 2)))
> x1 <- apply(y, 1, quadform, A = A)
> x2 <- apply(y, 1, quadform, A = B)
> cor(x1, x2)
[1] 0.0662571
> plot(x1, x2)
[Figure: scatter plot of x2 against x1.]