Analysis of Variance and Design of Experiments - I
MODULE - I
LECTURE - 3
SOME RESULTS ON LINEAR ALGEBRA, MATRIX THEORY AND DISTRIBUTIONS
Dr. Shalabh
Department of Mathematics and Statistics
Indian Institute of Technology Kanpur
Linear model
Suppose there are n observations. In the linear model, we assume that these observations are the values taken by n
random variables Y1 , Y2 ,.., Yn satisfying the following conditions:
• E (Yi ) is a linear combination of p unknown parameters β1 , β 2 ,..., β p with
E (Yi ) = xi1β1 + xi 2 β 2 + ... + xip β p , i = 1, 2,..., n
where xij ' s are known constants.
• Y1, Y2, ..., Yn are uncorrelated and normally distributed with variance Var(Yi) = σ².
The linear model can be rewritten by introducing independent normal random variables εi following N(0, σ²), as
Yi = xi1β1 + xi2β2 + ... + xipβp + εi, i = 1, 2, ..., n.
These equations can be written in matrix notation as
Y = Xβ +ε
where Y is an n × 1 vector of observations, X is an n × p matrix of the n observations (xij's) on each of the variables X1, X2, ..., Xp,
β is a p × 1 vector of parameters and ε is an n × 1 vector of random error components with ε ~ N(0, σ²In). Here Y
is called the study or dependent variable, X1, X2, ..., Xp are called explanatory or independent variables and
β1, β2, ..., βp are called regression coefficients.
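As a concrete illustration of this matrix formulation, the short sketch below simulates data from Y = Xβ + ε for small, arbitrarily chosen n, p, β and σ (these numerical values are assumptions made only for the example).

```python
import numpy as np

# Simulate Y = X beta + eps with eps ~ N(0, sigma^2 I).
# n, p, beta and sigma are arbitrary illustrative choices.
rng = np.random.default_rng(0)
n, p = 20, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])  # n x p design matrix
beta = np.array([2.0, -1.0, 0.5])                               # p x 1 parameter vector
sigma = 1.5

eps = rng.normal(scale=sigma, size=n)   # independent N(0, sigma^2) errors
Y = X @ beta + eps                      # n x 1 vector of observations

print(Y[:5])                            # E(Y) = X beta, Var(Y) = sigma^2 I
```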
Alternatively, since Y ~ N(Xβ, σ²I), the linear model can also be expressed in expectation form as a normal
random variable Y with
E(Y) = Xβ
Var(Y) = σ²I.
Note that β and σ² are unknown but X is known.
Estimable function
A linear parametric function λ'β of the parameters is said to be an estimable parametric function (or estimable) if there
exists a linear function A'Y of the random variables Y = (Y1, Y2, ..., Yn)' such that
E (A ' Y ) = λ ' β
with A = (A1, A2, ..., An)' and λ = (λ1, λ2, ..., λp)' being vectors of known scalars.
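When X has full column rank (the case treated in the least squares discussion below), every λ'β is estimable; a minimal numerical sketch of the definition takes A = X(X'X)⁻¹λ, so that E(A'Y) = A'Xβ = λ'β. The particular X, β and λ used here are assumptions for illustration only.

```python
import numpy as np

# Illustrative check that lambda'beta is estimable when rank(X) = p:
# with A = X (X'X)^{-1} lambda, E(A'Y) = A'X beta = lambda'beta.
rng = np.random.default_rng(1)
n, p = 10, 3
X = rng.normal(size=(n, p))            # assumed to have full column rank
beta = np.array([1.0, 2.0, -0.5])      # arbitrary "true" parameters
lam = np.array([1.0, -1.0, 0.0])       # coefficients of the parametric function

A = X @ np.linalg.solve(X.T @ X, lam)  # A = X (X'X)^{-1} lambda
print(A @ X @ beta, lam @ beta)        # both equal lambda'beta
```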
Best Linear Unbiased Estimates (BLUE)
The unbiased minimum variance linear estimate A'Y of an estimable function λ'β is called the best linear unbiased
estimate of λ'β.
• Suppose A1'Y and A2'Y are the BLUEs of λ1'β and λ2'β respectively.
Then (a1A1 + a2A2)'Y is the BLUE of (a1λ1 + a2λ2)'β.
• If λ'β is estimable, its best estimate is λ'β̂ where β̂ is any solution of the equations X'Xβ = X'Y.
Least squares estimation
The least squares estimate of β in Y = X β + ε is the value of β which minimizes the error sum of squares ε ' ε .
Let S = ε ' ε = (Y − X β ) '(Y − X β )
= Y 'Y − 2β ' X 'Y + β ' X ' X β .
Minimizing S with respect to β involves
∂S/∂β = 0
⇒ X'Xβ = X'Y
which is termed as the normal equation.
This normal equation has a unique solution given by
β̂ = (X'X)⁻¹X'Y
assuming rank(X) = p. Note that ∂²S/∂β∂β' = X'X is a positive definite matrix. So β̂ = (X'X)⁻¹X'Y is the value of β
which minimizes ε'ε and is termed as the ordinary least squares estimator of β.
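A minimal computational sketch of this estimator follows (the data-generating numbers are assumptions; numerically it is preferable to solve the normal equation rather than to form (X'X)⁻¹ explicitly).

```python
import numpy as np

# Ordinary least squares via the normal equation X'X beta = X'Y.
rng = np.random.default_rng(2)
n, p = 50, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta_true = np.array([2.0, -1.0, 0.5])
Y = X @ beta_true + rng.normal(scale=1.5, size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)      # solves X'X beta = X'Y
print(beta_hat)

# Numerically more stable alternative working directly with X.
beta_hat_lstsq, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(beta_hat_lstsq)
```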
• In this case, β1, β2, ..., βp are estimable and consequently all linear parametric functions are estimable.
• E(β̂) = (X'X)⁻¹X'E(Y) = (X'X)⁻¹X'Xβ = β
• Var(β̂) = (X'X)⁻¹X'Var(Y)X(X'X)⁻¹ = σ²(X'X)⁻¹
• If λ'β̂ and μ'β̂ are the estimates of λ'β and μ'β respectively, then (verified numerically in the sketch after this list)
Var(λ'β̂) = λ'Var(β̂)λ = σ²[λ'(X'X)⁻¹λ]
Cov(λ'β̂, μ'β̂) = σ²[μ'(X'X)⁻¹λ].
• Y − Xβ̂ is called the residual vector and
E(Y − Xβ̂) = 0.
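The sketch below checks the variance and covariance expressions empirically by repeated simulation (all numerical settings are assumed for the illustration).

```python
import numpy as np

# Empirical check of Var(beta_hat) = sigma^2 (X'X)^{-1} and
# Var(lambda'beta_hat) = sigma^2 lambda'(X'X)^{-1} lambda.
rng = np.random.default_rng(3)
n, p, sigma = 30, 3, 1.0
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta = np.array([1.0, 0.5, -2.0])
lam = np.array([0.0, 1.0, -1.0])
XtX_inv = np.linalg.inv(X.T @ X)

estimates = []
for _ in range(5000):
    Y = X @ beta + rng.normal(scale=sigma, size=n)
    estimates.append(np.linalg.solve(X.T @ X, X.T @ Y))
estimates = np.array(estimates)

print(np.cov(estimates, rowvar=False))           # ~ sigma^2 (X'X)^{-1}
print(sigma**2 * XtX_inv)
print(np.var(estimates @ lam),                   # ~ sigma^2 lambda'(X'X)^{-1} lambda
      sigma**2 * lam @ XtX_inv @ lam)
```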
Linear model with correlated observations
In the linear model
Y = Xβ +ε
with E (ε ) = 0, Var (ε ) = Σ and ε is normally distributed, we find
E (Y ) = X β , Var (Y ) = Σ.
Assuming Σ to be positive definite, we can write
Σ = P'P
where P is a nonsingular matrix. Premultiplying Y = X β + ε by P, we get
PY = PX β + Pε
or Y* = X * β + ε *
where Y * = PY , X * = PX and ε * = Pε .
Note that β and σ² are unknown but X is known.
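A minimal sketch of this transformation, assuming P is chosen so that the transformed errors have identity covariance; one such choice, used here purely for illustration, is P = L⁻¹ where Σ = LL' is the Cholesky factorization, since then PΣP' = I. The covariance matrix below is likewise an assumed example.

```python
import numpy as np

# Transform a correlated-error model to one with identity error covariance.
# Here P = L^{-1} with Sigma = L L' (Cholesky), so that P Sigma P' = I.
rng = np.random.default_rng(4)
n, p = 20, 2
X = rng.normal(size=(n, p))
beta = np.array([1.0, -1.0])

# An assumed positive definite covariance (AR(1)-like) for illustration.
Sigma = 0.7 ** np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
Y = rng.multivariate_normal(X @ beta, Sigma)

L = np.linalg.cholesky(Sigma)                 # Sigma = L L'
P = np.linalg.inv(L)                          # whitening matrix
Y_star, X_star = P @ Y, P @ X                 # Y* = X* beta + eps*, Var(eps*) = I

beta_gls, *_ = np.linalg.lstsq(X_star, Y_star, rcond=None)
print(beta_gls)                               # generalized least squares estimate
```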
Distribution of A 'Y
In the linear model Y = Xβ + ε, ε ~ N(0, σ²I), consider a linear function A'Y which is normally distributed with
E(A'Y) = A'Xβ,
Var(A'Y) = σ²(A'A).
Then
A'Y / (σ√(A'A)) ~ N(A'Xβ / (σ√(A'A)), 1).
Further, (A'Y)² / (σ²A'A) has a noncentral Chi-square distribution with one degree of freedom and noncentrality parameter
(A'Xβ)² / (σ²A'A).
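A simulation check of these two distributional facts (the design, β, σ and A below are all assumed): the standardized statistic should have unit variance, and the mean of (A'Y)²/(σ²A'A) should match that of a noncentral χ² with one degree of freedom, namely 1 plus the noncentrality parameter.

```python
import numpy as np
from scipy import stats

# Distribution of A'Y under Y = X beta + eps, eps ~ N(0, sigma^2 I).
rng = np.random.default_rng(5)
n, p, sigma = 15, 2, 2.0
X = rng.normal(size=(n, p))
beta = np.array([1.0, 0.5])
A = rng.normal(size=n)

scale = sigma * np.sqrt(A @ A)
ncp = (A @ X @ beta) ** 2 / (sigma**2 * (A @ A))     # noncentrality parameter

vals = np.array([A @ (X @ beta + rng.normal(scale=sigma, size=n))
                 for _ in range(20000)])

print(np.var(vals / scale))                          # ~ 1
print(np.mean(vals**2 / scale**2),                   # empirical mean of (A'Y)^2 / (sigma^2 A'A)
      stats.ncx2(1, ncp).mean())                     # theoretical mean = 1 + ncp
```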
Degrees of freedom
A linear function A'Y of the observations (A ≠ 0) is said to carry one degree of freedom. A set of r linear functions L'Y,
where L is an n × r matrix, is said to have M degrees of freedom if there exist M linearly independent functions in the set
and no more. Alternatively, the degrees of freedom carried by the set L'Y equals rank(L). When the set L'Y consists of
the estimates of Λ'β, the degrees of freedom of the set L'Y will also be called the degrees of freedom for the
estimates of Λ'β.
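In computations the degrees of freedom of such a set can be read off as a matrix rank; the matrix L below is an assumed toy example whose third function is the sum of the first two.

```python
import numpy as np

# Degrees of freedom of the set L'Y equals rank(L).
# Columns of L define r = 3 linear functions of n = 4 observations;
# the third column is the sum of the first two, so only 2 are independent.
L = np.array([[1.0, 0.0, 1.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [0.0, 1.0, 1.0]])
print(np.linalg.matrix_rank(L))   # 2 degrees of freedom
```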
Sum of squares
If A'Y is a linear function of the observations, then the projection of Y on A is the vector (Y'A / A'A)A. The square of this
projection is called the sum of squares (SS) due to A'Y and is given by (A'Y)² / (A'A). Since A'Y has one degree of freedom,
the SS due to A'Y has one degree of freedom.
The sums of squares and the degrees of freedom arising out of mutually orthogonal sets of functions can be added
together to give the sum of squares and degrees of freedom for the set of all the functions taken together, and vice versa.
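A small numeric sketch of this additivity (the data and vectors are assumed): for two mutually orthogonal vectors A1 and A2, the individual sums of squares add up to the sum of squares of the projection of Y onto the space they span.

```python
import numpy as np

# SS due to A'Y is (A'Y)^2 / (A'A); for orthogonal A1, A2 the SS add.
rng = np.random.default_rng(6)
Y = rng.normal(size=5)
A1 = np.array([1.0, 1.0, 1.0, 1.0, 1.0])
A2 = np.array([1.0, -1.0, 0.0, 0.0, 0.0])     # orthogonal to A1

ss1 = (A1 @ Y) ** 2 / (A1 @ A1)
ss2 = (A2 @ Y) ** 2 / (A2 @ A2)

L = np.column_stack([A1, A2])
P = L @ np.linalg.inv(L.T @ L) @ L.T          # projector onto span{A1, A2}
print(ss1 + ss2, Y @ P @ Y)                   # the two quantities agree (2 df in total)
```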
Let X = (X1, X2, ..., Xn)' have a multivariate normal distribution with mean vector μ and positive definite covariance
matrix Σ. Suppose the two quadratic forms are such that
• X ' A1 X is distributed as χ 2 with n1 degrees of freedom and noncentrality parameter μ ' A1μ and
• X ' A2 X is distributed as χ 2 with n2 degrees of freedom and noncentrality parameter μ ' A2 μ .
Then X ' A1 X and X ' A2 X are independently distributed if A1ΣA2 = 0.
Fisher-Cochran theorem
Let X = (X1, X2, ..., Xn)' have a multivariate normal distribution with mean vector μ and positive definite covariance
matrix Σ, and suppose
X'Σ⁻¹X = Q1 + Q2 + ... + Qk
where Qi = X'AiX with rank(Ai) = Ni, i = 1, 2, ..., k. Then the Qi's are independently distributed as noncentral Chi-square
variables with Ni degrees of freedom and noncentrality parameters μ'Aiμ if and only if N1 + N2 + ... + Nk = n, in which case
μ'Σ⁻¹μ = μ'A1μ + μ'A2μ + ... + μ'Akμ.
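A standard special case (with Σ = I, so X'Σ⁻¹X = X'X) takes A1 = J/n, which picks out the overall mean, and A2 = I − J/n, which picks out deviations about the mean; their ranks are 1 and n − 1 and sum to n, so the theorem applies. The sketch below simply verifies the rank condition and the decomposition of the noncentrality numerically (the mean vector is an assumed example).

```python
import numpy as np

# Fisher-Cochran illustration with Sigma = I:
#   X'X = X'A1 X + X'A2 X,  A1 = J/n (mean part), A2 = I - J/n (deviations).
n = 6
mu = np.arange(1.0, n + 1.0)                   # an assumed mean vector
A1 = np.ones((n, n)) / n
A2 = np.eye(n) - A1

print(np.linalg.matrix_rank(A1),               # 1
      np.linalg.matrix_rank(A2))               # n - 1, so the ranks sum to n
print(mu @ mu, mu @ A1 @ mu + mu @ A2 @ mu)    # mu'mu splits into the two noncentralities
```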
Derivatives of quadratic and linear forms
Let X = ( x1 , x2 ,..., xn ) ' and f(X) be any function of n independent variables x1 , x2 ,..., xn ,
then ∂f(X)/∂X = (∂f(X)/∂x1, ∂f(X)/∂x2, ..., ∂f(X)/∂xn)', the column vector of partial derivatives.
If K = (k1, k2, ..., kn)' is a vector of constants, then ∂(K'X)/∂X = K.
If A is an n × n matrix, then ∂(X'AX)/∂X = (A + A')X, which reduces to 2AX when A is symmetric.
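A quick finite-difference check of these gradient formulas (K, A and X below are arbitrary assumed values):

```python
import numpy as np

# Numerical check of d(K'X)/dX = K and d(X'AX)/dX = (A + A')X.
rng = np.random.default_rng(7)
n = 4
K = rng.normal(size=n)
A = rng.normal(size=(n, n))          # not necessarily symmetric
X = rng.normal(size=n)
h = 1e-6

def num_grad(f, x):
    """Central finite-difference gradient of a scalar function f at x."""
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2 * h)
    return g

print(num_grad(lambda x: K @ x, X), K)                   # gradient of K'X
print(num_grad(lambda x: x @ A @ x, X), (A + A.T) @ X)   # gradient of X'AX
```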
Independence of linear and quadratic forms
• Let Y be an n × 1 vector having the multivariate normal distribution N(μ, I) and let B be an m × n matrix. Then the m × 1
linear form BY is independent of the quadratic form Y'AY if BA = 0, where A is a symmetric matrix of known
elements. A concrete instance appears in the sketch after this list.
• Let Y be an n × 1 vector having the multivariate normal distribution N(μ, Σ) with rank(Σ) = n. If BΣA = 0, then the
quadratic form Y'AY is independent of the linear form BY, where B is an m × n matrix.
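A classical instance of the first result: with B = 1'/n (so BY is the sample mean) and A = I − J/n (so Y'AY is the corrected sum of squares), BA = 0, which is the matrix-level reason the sample mean and sample variance of N(μ, I) data are independent. The tiny check below only verifies the condition BA = 0 (n is an assumed value).

```python
import numpy as np

# Check BA = 0 for B = 1'/n (sample mean) and A = I - J/n
# (so Y'AY is the sum of squared deviations about the mean).
n = 5
B = np.ones((1, n)) / n
A = np.eye(n) - np.ones((n, n)) / n
print(np.allclose(B @ A, 0))          # True, so BY and Y'AY are independent
```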