
Principal Components Using SAS

Prof. Dr. Mudassir Uddin


Department of Statistics
University of Karachi

Principal Components Analysis

A. The Basic Principle

We wish to explain/summarize the underlying variance-covariance structure of a large set of variables through a few linear combinations of these variables. The objectives of principal components analysis are

- data reduction
- interpretation

The results of principal components analysis are often used as inputs to

- regression analysis
- cluster analysis
B. Population Principal Components

Suppose we have a population measured on p random variables X1,…,Xp. Note that these random variables represent the p axes of the Cartesian coordinate system in which the population resides. Our goal is to develop a new set of p axes (linear combinations of the original p axes) in the directions of greatest variability:

[Figure: a scatter of the population in the (X1, X2) plane, with the new axes drawn through the directions of greatest variability.]

This is accomplished by rotating the axes.


Consider our random vector

$$\mathbf{X} = \begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_p \end{pmatrix}$$

with covariance matrix $\boldsymbol{\Sigma}$ and eigenvalues $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_p$.

We can construct p linear combinations

$$\begin{aligned}
Y_1 &= \mathbf{a}_1'\mathbf{X} = a_{11}X_1 + a_{12}X_2 + \cdots + a_{1p}X_p \\
Y_2 &= \mathbf{a}_2'\mathbf{X} = a_{21}X_1 + a_{22}X_2 + \cdots + a_{2p}X_p \\
&\ \ \vdots \\
Y_p &= \mathbf{a}_p'\mathbf{X} = a_{p1}X_1 + a_{p2}X_2 + \cdots + a_{pp}X_p
\end{aligned}$$

It is easy to show that

$$\mathrm{Var}(Y_i) = \mathbf{a}_i'\boldsymbol{\Sigma}\mathbf{a}_i, \quad i = 1, \ldots, p$$

$$\mathrm{Cov}(Y_i, Y_k) = \mathbf{a}_i'\boldsymbol{\Sigma}\mathbf{a}_k, \quad i, k = 1, \ldots, p$$

The principal components are those uncorrelated linear combinations Y1,…,Yp whose variances are as large as possible.
Thus the first principal component is the linear combination of maximum variance, i.e., we wish to solve the nonlinear optimization problem

$$\max_{\mathbf{a}_1} \ \mathbf{a}_1'\boldsymbol{\Sigma}\mathbf{a}_1 \qquad \text{(the quadratic objective is the source of nonlinearity)}$$

$$\text{s.t. } \mathbf{a}_1'\mathbf{a}_1 = 1 \qquad \text{(restricts the search to coefficient vectors of unit length)}$$
The second principal component is the linear combination of maximum variance that is uncorrelated with the first principal component, i.e., we wish to solve the nonlinear optimization problem

$$\max_{\mathbf{a}_2} \ \mathbf{a}_2'\boldsymbol{\Sigma}\mathbf{a}_2$$

$$\text{s.t. } \mathbf{a}_2'\mathbf{a}_2 = 1, \qquad \mathbf{a}_1'\boldsymbol{\Sigma}\mathbf{a}_2 = 0 \quad \text{(restricts the covariance with } Y_1 \text{ to zero)}$$
The third principal component is the solution to the nonlinear optimization problem

$$\max_{\mathbf{a}_3} \ \mathbf{a}_3'\boldsymbol{\Sigma}\mathbf{a}_3$$

$$\text{s.t. } \mathbf{a}_3'\mathbf{a}_3 = 1, \qquad \mathbf{a}_1'\boldsymbol{\Sigma}\mathbf{a}_3 = 0, \qquad \mathbf{a}_2'\boldsymbol{\Sigma}\mathbf{a}_3 = 0 \quad \text{(restricts the covariances to zero)}$$
Generally, the ith principal component is the linear combination of maximum variance that is uncorrelated with all previous principal components, i.e., we wish to solve the nonlinear optimization problem

$$\max_{\mathbf{a}_i} \ \mathbf{a}_i'\boldsymbol{\Sigma}\mathbf{a}_i$$

$$\text{s.t. } \mathbf{a}_i'\mathbf{a}_i = 1, \qquad \mathbf{a}_k'\boldsymbol{\Sigma}\mathbf{a}_i = 0, \quad k < i$$
We can show that, for random vector X with covariance matrix $\boldsymbol{\Sigma}$ and eigenvalues $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_p \geq 0$, the ith principal component is given by

$$Y_i = \mathbf{e}_i'\mathbf{X} = e_{i1}X_1 + e_{i2}X_2 + \cdots + e_{ip}X_p, \quad i = 1, \ldots, p$$

where $\mathbf{e}_i$ is the eigenvector associated with $\lambda_i$. Note that the principal components are not unique if some eigenvalues are equal.
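
A minimal PROC IML sketch of this result (an illustration, not part of the derivation), using as input the example covariance matrix Σ from the worked example below:

PROC IML;
   /* example population covariance matrix (the Sigma of the worked example below) */
   Sigma = {1.50 2.50 1.00,
            2.50 6.00 3.50,
            1.00 3.50 5.25};
   CALL EIGEN(lambda, E, Sigma);   /* eigenvalues in descending order; eigenvectors in the columns of E */
   PRINT lambda E;                 /* the ith principal component is Y_i = (ith column of E)` * X */
   prop = lambda / SUM(lambda);    /* proportion of total variance explained by each component */
   PRINT prop;
QUIT;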
We can also show for random vector X with covariance matrix $\boldsymbol{\Sigma}$ and eigenvalue-eigenvector pairs $(\lambda_1, \mathbf{e}_1), \ldots, (\lambda_p, \mathbf{e}_p)$, where $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_p$, that

$$\sigma_{11} + \cdots + \sigma_{pp} = \sum_{i=1}^{p} \mathrm{Var}(X_i) = \lambda_1 + \cdots + \lambda_p = \sum_{i=1}^{p} \mathrm{Var}(Y_i)$$

so we can assess how well a subset of the principal components Yi summarizes the original random variables Xi. One common method of doing so is

$$\frac{\lambda_k}{\sum_{i=1}^{p} \lambda_i} = \text{proportion of total population variance due to the kth principal component}$$
If a large proportion of the total population variance can be attributed to relatively few principal components, we can replace the original p variables with these principal components without much loss of information!
We can also easily find the correlations between the original random variables Xk and the principal components Yi:

$$\rho_{Y_i,X_k} = \frac{e_{ik}\sqrt{\lambda_i}}{\sqrt{\sigma_{kk}}}$$

where $e_{ik}$ is the kth element of $\mathbf{e}_i$. These values are often used in interpreting the principal components Yi.
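
A short PROC IML sketch of this formula (again assuming the example Σ from the next slides as input); rho[i,k] holds the correlation between Yi and Xk:

PROC IML;
   Sigma = {1.50 2.50 1.00,
            2.50 6.00 3.50,
            1.00 3.50 5.25};
   CALL EIGEN(lambda, E, Sigma);
   rho = J(3, 3, 0);                                     /* rho[i,k] = corr(Y_i, X_k) */
   DO i = 1 TO 3;
      DO k = 1 TO 3;
         rho[i,k] = E[k,i] * SQRT(lambda[i]) / SQRT(Sigma[k,k]);
      END;
   END;
   PRINT rho[FORMAT=8.5];
QUIT;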
Example: Suppose we have the following population of four observations made on three random variables X1, X2, and X3:

 X1    X2    X3
1.0   6.0   9.0
4.0  12.0  10.0
3.0  12.0  15.0
4.0  10.0  12.0

Find the three population principal components Y1, Y2, and Y3.

First we need the covariance matrix Σ:

$$\boldsymbol{\Sigma} = \begin{pmatrix} 1.50 & 2.50 & 1.00 \\ 2.50 & 6.00 & 3.50 \\ 1.00 & 3.50 & 5.25 \end{pmatrix}$$

and the corresponding eigenvalue-eigenvector pairs:


 0.2910381
 
λ1 = 9.9145474, e1 =  0.7342493
 0.6133309
 0.4150386
 
λ2 = 2.5344988, e2 =  0.4807165
-0.7724340
 0.8619976
 
λ3 = 0.3009542, e3 = -0.4793640 
 0.1648350
so the principal components are:

$$Y_1 = \mathbf{e}_1'\mathbf{X} = 0.2910381X_1 + 0.7342493X_2 + 0.6133309X_3$$

$$Y_2 = \mathbf{e}_2'\mathbf{X} = 0.4150386X_1 + 0.4807165X_2 - 0.7724340X_3$$

$$Y_3 = \mathbf{e}_3'\mathbf{X} = 0.8619976X_1 - 0.4793640X_2 + 0.1648350X_3$$

Note that

$$\sigma_{11} + \sigma_{22} + \sigma_{33} = 1.50 + 6.00 + 5.25 = 12.75 = 9.9145474 + 2.5344988 + 0.3009542 = \lambda_1 + \lambda_2 + \lambda_3$$

and the proportion of total population variance due to each principal component is

$$\frac{\lambda_1}{\sum_{i=1}^{p}\lambda_i} = \frac{9.9145474}{12.75} = 0.777611529$$

$$\frac{\lambda_2}{\sum_{i=1}^{p}\lambda_i} = \frac{2.5344988}{12.75} = 0.198784220$$

$$\frac{\lambda_3}{\sum_{i=1}^{p}\lambda_i} = \frac{0.3009542}{12.75} = 0.023604251$$

Note that the third principal component is relatively irrelevant!
Next we obtain the correlations between the original random variables Xk and the principal components Yi:

$$\rho_{Y_1,X_1} = \frac{e_{11}\sqrt{\lambda_1}}{\sqrt{\sigma_{11}}} = \frac{0.2910381\sqrt{9.9145474}}{\sqrt{1.50}} = 0.74824$$

$$\rho_{Y_1,X_2} = \frac{e_{12}\sqrt{\lambda_1}}{\sqrt{\sigma_{22}}} = \frac{0.7342493\sqrt{9.9145474}}{\sqrt{6.00}} = 0.94385$$

$$\rho_{Y_1,X_3} = \frac{e_{13}\sqrt{\lambda_1}}{\sqrt{\sigma_{33}}} = \frac{0.6133309\sqrt{9.9145474}}{\sqrt{5.25}} = 0.84285$$

$$\rho_{Y_2,X_1} = \frac{e_{21}\sqrt{\lambda_2}}{\sqrt{\sigma_{11}}} = \frac{0.4150386\sqrt{2.5344988}}{\sqrt{1.50}} = 0.53950$$

$$\rho_{Y_2,X_2} = \frac{e_{22}\sqrt{\lambda_2}}{\sqrt{\sigma_{22}}} = \frac{0.4807165\sqrt{2.5344988}}{\sqrt{6.00}} = 0.31243$$

$$\rho_{Y_2,X_3} = \frac{e_{23}\sqrt{\lambda_2}}{\sqrt{\sigma_{33}}} = \frac{-0.7724340\sqrt{2.5344988}}{\sqrt{5.25}} = -0.53670$$

$$\rho_{Y_3,X_1} = \frac{e_{31}\sqrt{\lambda_3}}{\sqrt{\sigma_{11}}} = \frac{0.8619976\sqrt{0.3009542}}{\sqrt{1.50}} = 0.38611$$

$$\rho_{Y_3,X_2} = \frac{e_{32}\sqrt{\lambda_3}}{\sqrt{\sigma_{22}}} = \frac{-0.4793640\sqrt{0.3009542}}{\sqrt{6.00}} = -0.10736$$

$$\rho_{Y_3,X_3} = \frac{e_{33}\sqrt{\lambda_3}}{\sqrt{\sigma_{33}}} = \frac{0.1648350\sqrt{0.3009542}}{\sqrt{5.25}} = 0.03947$$

We can display these results in a correlation matrix:

        X1        X2        X3
Y1   0.74824   0.94385   0.84285
Y2   0.53950   0.31243  -0.53670
Y3   0.38611  -0.10736   0.03947

Here we can easily see that

- the first principal component (Y1) is a mixture of all three random variables (X1, X2, and X3)
- the second principal component (Y2) is a trade-off between X1 and X3
- the third principal component (Y3) is a residual of X1
When the principal components are derived from an X ~ Np(μ, Σ) distributed population, the density of X is constant on the μ-centered ellipsoids

$$\left(\mathbf{x} - \boldsymbol{\mu}\right)'\boldsymbol{\Sigma}^{-1}\left(\mathbf{x} - \boldsymbol{\mu}\right) = c^2$$

which have axes

$$\pm c\sqrt{\lambda_i}\,\mathbf{e}_i, \quad i = 1, \ldots, p$$

where $(\lambda_i, \mathbf{e}_i)$ are the eigenvalue-eigenvector pairs of $\boldsymbol{\Sigma}$.
We can set μ = 0 without loss of generality; then, since $\boldsymbol{\Sigma}^{-1} = \sum_{i=1}^{p} \frac{1}{\lambda_i}\mathbf{e}_i\mathbf{e}_i'$ (the spectral decomposition), we can write

$$c^2 = \mathbf{x}'\boldsymbol{\Sigma}^{-1}\mathbf{x} = \frac{1}{\lambda_1}\left(\mathbf{e}_1'\mathbf{x}\right)^2 + \cdots + \frac{1}{\lambda_p}\left(\mathbf{e}_p'\mathbf{x}\right)^2$$

where the $\mathbf{e}_i'\mathbf{x}$ are the principal components of x.

Setting $y_i = \mathbf{e}_i'\mathbf{x}$ and substituting into the previous expression yields

$$c^2 = \frac{1}{\lambda_1}y_1^2 + \cdots + \frac{1}{\lambda_p}y_p^2$$

which defines an ellipsoid (note that $\lambda_i > 0$ for all i) in a coordinate system with axes y1,…,yp lying in the directions of e1,…,ep, respectively. The major axis lies in the direction determined by the eigenvector e1 associated with the largest eigenvalue λ1; the remaining minor axes lie in the directions determined by the other eigenvectors.
Example: For the principal components derived from the following population of four observations made on three random variables X1, X2, and X3:

 X1    X2    X3
1.0   6.0   9.0
4.0  12.0  10.0
3.0  12.0  15.0
4.0  10.0  12.0

plot the major and minor axes.

We will need the centroid μ:

$$\boldsymbol{\mu} = \begin{pmatrix} 3.0 \\ 10.0 \\ 11.5 \end{pmatrix}$$

The direction of the major axis is given by

$$\mathbf{e}_1'\mathbf{X} = 0.2910381X_1 + 0.7342493X_2 + 0.6133309X_3$$

while the directions of the two minor axes are given by

$$\mathbf{e}_2'\mathbf{X} = 0.4150386X_1 + 0.4807165X_2 - 0.7724340X_3$$

$$\mathbf{e}_3'\mathbf{X} = 0.8619976X_1 - 0.4793640X_2 + 0.1648350X_3$$
We first graph the centroid:

[Figure: the point (3.0, 10.0, 11.5) plotted in the (X1, X2, X3) coordinate system.]

…then use the first eigenvector to find a second point on the first principal axis. The line connecting these two points is the Y1 axis.

…then do the same thing with the second eigenvector: the line connecting these two points is the Y2 axis.

…and do the same thing with the third eigenvector: the line connecting these two points is the Y3 axis.

[Figures: the Y1, Y2, and Y3 axes drawn through the centroid in the (X1, X2, X3) coordinate system.]

What we have done is a rotation and a translation in p = 3 dimensions.

[Figure: the original (X1, X2, X3) axes together with the new (Y1, Y2, Y3) axes. Note that the rotated axes remain orthogonal!]
Note that we can also construct principal components for the standardized variables Zi:

$$Z_i = \frac{X_i - \mu_i}{\sqrt{\sigma_{ii}}}, \quad i = 1, \ldots, p$$

which in matrix notation is

$$\mathbf{Z} = \left(\mathbf{V}^{1/2}\right)^{-1}\left(\mathbf{X} - \boldsymbol{\mu}\right)$$

where $\mathbf{V}^{1/2}$ is the diagonal standard deviation matrix. Obviously

$$E\left(\mathbf{Z}\right) = \mathbf{0} \qquad \text{and} \qquad \mathrm{Cov}\left(\mathbf{Z}\right) = \left(\mathbf{V}^{1/2}\right)^{-1}\boldsymbol{\Sigma}\left(\mathbf{V}^{1/2}\right)^{-1} = \boldsymbol{\rho}$$

This suggests that the principal components for the standardized variables Zi may be obtained from the eigenvectors of the correlation matrix ρ! The operations are analogous to those used in conjunction with the covariance matrix.
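
A short PROC IML sketch of this relationship (assuming the example Σ from these notes as input): converting the covariance matrix to the correlation matrix and extracting its eigen pairs yields the principal components of the standardized variables.

PROC IML;
   Sigma = {1.50 2.50 1.00,
            2.50 6.00 3.50,
            1.00 3.50 5.25};
   D = SQRT(VECDIAG(Sigma));       /* standard deviations, i.e. the diagonal of V^(1/2) */
   rho = Sigma / (D * D`);         /* elementwise: sigma_ik / sqrt(sigma_ii * sigma_kk) */
   CALL EIGEN(lambda, E, rho);     /* eigen pairs of the correlation matrix */
   PRINT rho[FORMAT=8.4], lambda E;
QUIT;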

We can show that, for random vector Z of standardized variables with covariance matrix $\boldsymbol{\rho}$ and eigenvalues $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_p \geq 0$, the ith principal component is given by

$$Y_i = \mathbf{e}_i'\mathbf{Z} = \mathbf{e}_i'\left(\mathbf{V}^{1/2}\right)^{-1}\left(\mathbf{X} - \boldsymbol{\mu}\right), \quad i = 1, \ldots, p$$

Note again that the principal components are not unique if some eigenvalues are equal.
We can also show for random vector Z with covariance matrix $\boldsymbol{\rho}$ and eigenvalue-eigenvector pairs $(\lambda_1, \mathbf{e}_1), \ldots, (\lambda_p, \mathbf{e}_p)$, where $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_p$,

$$\sum_{i=1}^{p} \mathrm{Var}(Z_i) = \lambda_1 + \cdots + \lambda_p = \sum_{i=1}^{p} \mathrm{Var}(Y_i) = p$$

and we can again assess how well a subset of the principal components Yi summarizes the original random variables Xi by using

$$\frac{\lambda_k}{p} = \text{proportion of total population variance due to the kth principal component}$$

If a large proportion of the total population variance can be attributed to relatively few principal components, we can replace the original p variables with these principal components without much loss of information!
Example: Suppose we have the following population of four observations made on three random variables X1, X2, and X3:

 X1    X2    X3
1.0   6.0   9.0
4.0  12.0  10.0
3.0  12.0  15.0
4.0  10.0  12.0

Find the three population principal components Y1, Y2, and Y3 for the standardized random variables Z1, Z2, and Z3.

We could standardize the variables X1, X2, and X3 and then work with the resulting covariance matrix, but it is much easier to proceed directly with the correlation matrix ρ:

$$\boldsymbol{\rho} = \begin{pmatrix} 1.0000 & 0.8333 & 0.3563 \\ 0.8333 & 1.0000 & 0.6236 \\ 0.3563 & 0.6236 & 1.0000 \end{pmatrix}$$
and the corresponding eigenvalue-eigenvector pairs:

$$\lambda_1 = 2.2294570, \quad \mathbf{e}_1 = \begin{pmatrix} 0.581128 \\ 0.645363 \\ 0.495779 \end{pmatrix}$$

$$\lambda_2 = 0.6621181, \quad \mathbf{e}_2 = \begin{pmatrix} -0.562643 \\ -0.121542 \\ 0.817717 \end{pmatrix}$$

$$\lambda_3 = 0.1084249, \quad \mathbf{e}_3 = \begin{pmatrix} 0.587982 \\ -0.754145 \\ 0.292477 \end{pmatrix}$$

Note that these results differ from the covariance-based principal components!
so the principal components are:

$$Y_1 = \mathbf{e}_1'\mathbf{Z} = 0.581128Z_1 + 0.645363Z_2 + 0.495779Z_3$$

$$Y_2 = \mathbf{e}_2'\mathbf{Z} = -0.562643Z_1 - 0.121542Z_2 + 0.817717Z_3$$

$$Y_3 = \mathbf{e}_3'\mathbf{Z} = 0.587982Z_1 - 0.754145Z_2 + 0.292477Z_3$$

Note that

$$\mathrm{Var}(Z_1) + \mathrm{Var}(Z_2) + \mathrm{Var}(Z_3) = 1.0 + 1.0 + 1.0 = 3.0 = 2.2294570 + 0.6621181 + 0.1084249 = \lambda_1 + \lambda_2 + \lambda_3$$

and the proportion of total population variance due to each principal component is

$$\frac{\lambda_1}{p} = \frac{2.2294570}{3.0} = 0.7432$$

$$\frac{\lambda_2}{p} = \frac{0.6621181}{3.0} = 0.2207$$

$$\frac{\lambda_3}{p} = \frac{0.1084249}{3.0} = 0.0361$$

Note that the third principal component is again relatively irrelevant!
Next we obtain the correlations between the standardized variables Zk and the principal components Yi (since each Zk has unit variance, the denominator √σkk = 1 drops out):

$$\rho_{Y_1,Z_1} = e_{11}\sqrt{\lambda_1} = 0.581128\sqrt{2.2294570} = 0.86770$$

$$\rho_{Y_1,Z_2} = e_{12}\sqrt{\lambda_1} = 0.645363\sqrt{2.2294570} = 0.96362$$

$$\rho_{Y_1,Z_3} = e_{13}\sqrt{\lambda_1} = 0.495779\sqrt{2.2294570} = 0.74027$$

$$\rho_{Y_2,Z_1} = e_{21}\sqrt{\lambda_2} = -0.562643\sqrt{0.6621181} = -0.45783$$

$$\rho_{Y_2,Z_2} = e_{22}\sqrt{\lambda_2} = -0.121542\sqrt{0.6621181} = -0.09890$$

$$\rho_{Y_2,Z_3} = e_{23}\sqrt{\lambda_2} = 0.817717\sqrt{0.6621181} = 0.66538$$

$$\rho_{Y_3,Z_1} = e_{31}\sqrt{\lambda_3} = 0.587982\sqrt{0.1084249} = 0.19361$$

$$\rho_{Y_3,Z_2} = e_{32}\sqrt{\lambda_3} = -0.754145\sqrt{0.1084249} = -0.24832$$

$$\rho_{Y_3,Z_3} = e_{33}\sqrt{\lambda_3} = 0.292477\sqrt{0.1084249} = 0.09631$$

We can display these results in a correlation matrix:

        Z1        Z2        Z3
Y1   0.86770   0.96362   0.74027
Y2  -0.45783  -0.09890   0.66538
Y3   0.19361  -0.24832   0.09631

Here we can easily see that

- the first principal component (Y1) is a mixture of all three standardized variables (Z1, Z2, and Z3)
- the second principal component (Y2) is a trade-off between Z1 and Z3
- the third principal component (Y3) is a trade-off between Z1 and Z2
SAS code for Principal Components Analysis:
OPTIONS LINESIZE=72 NODATE PAGENO=1;
DATA stuff;
INPUT x1 x2 x3;
LABEL x1='Random Variable 1'
x2='Random Variable 2'
x3='Random Variable 3';
CARDS;
1.0 6.0 9.0
4.0 12.0 10.0
3.0 12.0 15.0
4.0 10.0 12.0
;
PROC PRINCOMP DATA=stuff OUT=pcstuff N=3;
VAR x1 x2 x3;
RUN;
PROC CORR DATA=pcstuff;
VAR x1 x2 x3;
WITH prin1 prin2 prin3;
RUN;
PROC FACTOR DATA=stuff SCREE;
VAR x1 x2 x3;
RUN;
Note that the SAS default is to use the correlation matrix
to perform this analysis!
SAS output for Principal Components Analysis:
The PRINCOMP Procedure
Observations 4
Variables 3

Simple Statistics
x1 x2 x3
Mean 3.000000000 10.00000000 11.50000000
StD 1.414213562 2.82842712 2.64575131

Correlation Matrix
x1 x2 x3
x1 Random Variable 1 1.0000 0.8333 0.3563
x2 Random Variable 2 0.8333 1.0000 0.6236
x3 Random Variable 3 0.3563 0.6236 1.0000

Eigenvalues of the Correlation Matrix


Eigenvalue Difference Proportion Cumulative
1 2.22945702 1.56733894 0.7432 0.7432
2 0.66211808 0.55369318 0.2207 0.9639
3 0.10842490 0.0361 1.0000

Eigenvectors
Prin1 Prin2 Prin3
x1 Random Variable 1 0.581128 -0.562643 0.587982
x2 Random Variable 2 0.645363 -0.121542 -0.754145
x3 Random Variable 3 0.495779 0.817717 0.292477
SAS output for Correlation Matrix – Original Random
Variables vs. Principal Components:
The CORR Procedure

3 With Variables: Prin1 Prin2 Prin3


3 Variables: x1 x2 x3

Simple Statistics
Variable N Mean Std Dev Sum Minimum Maximum
Prin1 4 0 1.49314 0 -2.20299 1.11219
Prin2 4 0 0.81371 0 -0.94739 0.99579
Prin3 4 0 0.32928 0 -0.28331 0.47104
x1 4 3.00000 1.41421 12.00000 1.00000 4.00000
x2 4 10.00000 2.82843 40.00000 6.00000 12.00000
x3 4 11.50000 2.64575 46.00000 9.00000 15.00000

Pearson Correlation Coefficients, N = 4


Prob > |r| under H0: Rho=0

x1 x2 x3

Prin1 0.86770 0.96362 0.74027


0.1323 0.0364 0.2597

Prin2 -0.45783 -0.09890 0.66538


0.5422 0.9011 0.3346

Prin3 0.19361 -0.24832 0.09631


0.8064 0.7517 0.9037
SAS output for Factor Analysis

PRINCIPAL COMPONENTS ANALYSIS
FOR QA 610
SPRING QUARTER 2001
Using PROC FACTOR to obtain a Scree Plot for Principal Components Analysis

The FACTOR Procedure
Initial Factor Method: Principal Components

Prior Communality Estimates: ONE

Eigenvalues of the Correlation Matrix: Total = 3 Average = 1

       Eigenvalue    Difference    Proportion    Cumulative
   1   2.22945702    1.56733894    0.7432        0.7432
   2   0.66211808    0.55369318    0.2207        0.9639
   3   0.10842490                  0.0361        1.0000

(Note that this is consistent with the results from PCA.)

1 factor will be retained by the MINEIGEN criterion.


SAS output for Factor Analysis

The FACTOR Procedure
Initial Factor Method: Principal Components

[Scree plot of the eigenvalues of the correlation matrix (2.229, 0.662, 0.108) against component number 1-3: the curve drops sharply after the first eigenvalue and flattens thereafter.]
SAS output for Factor Analysis

The FACTOR Procedure
Initial Factor Method: Principal Components

Factor Pattern
                              Factor1
x1   Random Variable 1       0.86770
x2   Random Variable 2       0.96362
x3   Random Variable 3       0.74027

(These are the Pearson correlation coefficients of the first principal component with the three original variables X1, X2, and X3.)

Variance Explained by Each Factor

Factor1
2.2294570     (the first eigenvalue, λ1)

Final Communality Estimates: Total = 2.229457

x1           x2           x3
0.75291032   0.92855392   0.54799278
SAS code for Principal Components Analysis:
OPTIONS LINESIZE=72 NODATE PAGENO=1;
DATA stuff;
INPUT x1 x2 x3;
LABEL x1='Random Variable 1'
x2='Random Variable 2'
x3='Random Variable 3';
CARDS;
1.0 6.0 9.0
4.0 12.0 10.0
3.0 12.0 15.0
4.0 10.0 12.0
;
PROC PRINCOMP DATA=stuff OUT=pcstuff N=3 COV;
VAR x1 x2 x3;
RUN;
PROC CORR DATA=pcstuff;
VAR x1 x2 x3;
WITH prin1 prin2 prin3;
RUN;
PROC FACTOR DATA=stuff SCREE COV;
VAR x1 x2 x3;
RUN;
Note that here we use SAS to derive the covariance
matrix based principal components!
SAS output for Principal Components Analysis:
The PRINCOMP Procedure
Observations 4
Variables 3

Simple Statistics
x1 x2 x3
Mean 3.000000000 10.00000000 11.50000000
StD 1.414213562 2.82842712 2.64575131

Covariance Matrix
x1 x2 x3
x1 Random Variable 1 2.000000000 3.333333333 1.333333333
x2 Random Variable 2 3.333333333 8.000000000 4.666666667
x3 Random Variable 3 1.333333333 4.666666667 7.000000000

Total Variance 17

Eigenvalues of the Covariance Matrix


Eigenvalue Difference Proportion Cumulative
1 13.2193960 9.8400643 0.7776 0.7776
2 3.3793317 2.9780594 0.1988 0.9764
3 0.4012723 0.0236 1.0000

Eigenvectors
Prin1 Prin2 Prin3
x1 Random Variable 1 0.291038 0.415039 0.861998
x2 Random Variable 2 0.734249 0.480716 -.479364
x3 Random Variable 3 0.613331 -.772434 0.164835
SAS output for Correlation Matrix – Original Random
Variables vs. Principal Components:
The CORR Procedure

3 With Variables: Prin1 Prin2 Prin3


3 Variables: x1 x2 x3

Simple Statistics
Variable N Mean Std Dev Sum Minimum Maximum
Prin1 4 0 3.63585 0 -5.05240 3.61516
Prin2 4 0 1.83830 0 -1.74209 2.53512
Prin3 4 0 0.63346 0 -0.38181 0.94442
x1 4 3.00000 1.41421 12.00000 1.00000 4.00000
x2 4 10.00000 2.82843 40.00000 6.00000 12.00000
x3 4 11.50000 2.64575 46.00000 9.00000 15.00000

Pearson Correlation Coefficients, N = 4


Prob > |r| under H0: Rho=0

x1 x2 x3

Prin1 0.74824 0.94385 0.84285


0.2518 0.0561 0.1571

Prin2 0.53950 0.31243 -0.53670


0.4605 0.6876 0.4633

Prin3 0.38611 -0.10736 0.03947


0.6139 0.8926 0.9605
SAS output for Factor Analysis

PRINCIPAL COMPONENTS ANALYSIS
FOR QA 610
SPRING QUARTER 2001
Using PROC FACTOR to obtain a Scree Plot for Principal Components Analysis

The FACTOR Procedure
Initial Factor Method: Principal Components

Prior Communality Estimates: ONE

Eigenvalues of the Covariance Matrix: Total = 17 Average = 5.66666667

       Eigenvalue    Difference    Proportion    Cumulative
   1   13.2193960    9.8400643     0.7776        0.7776
   2    3.3793317    2.9780594     0.1988        0.9764
   3    0.4012723                  0.0236        1.0000

(Note that this is consistent with the results from PCA.)

1 factor will be retained by the MINEIGEN criterion.


SAS output for Factor Analysis

The FACTOR Procedure
Initial Factor Method: Principal Components

[Scree plot of the eigenvalues of the covariance matrix (13.219, 3.379, 0.401) against component number 1-3: again a sharp drop after the first eigenvalue.]
SAS output for Factor Analysis

The FACTOR Procedure
Initial Factor Method: Principal Components

Factor Pattern
                              Factor1
x1   Random Variable 1       0.74824
x2   Random Variable 2       0.94385
x3   Random Variable 3       0.84285

(These are the Pearson correlation coefficients of the first principal component with the three original variables X1, X2, and X3.)

Variance Explained by Each Factor

Factor     Weighted       Unweighted
Factor1    13.2193960     2.16112149

(The weighted value is the first eigenvalue, λ1.)

Final Communality Estimates and Variable Weights
Total Communality: Weighted = 13.219396  Unweighted = 2.161121

Variable    Communality    Weight
x1          0.55986257     2.00000000
x2          0.89085847     8.00000000
x3          0.71040045     7.00000000
Covariance matrices with special structures yield particularly interesting principal components:

- Diagonal covariance matrices — suppose Σ is the diagonal matrix

$$\boldsymbol{\Sigma} = \begin{pmatrix} \sigma_{11} & 0 & \cdots & 0 \\ 0 & \sigma_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_{pp} \end{pmatrix}$$

Since the eigenvector $\mathbf{e}_i$ has a value of 1 in the ith position and 0 in all other positions, we have

$$\boldsymbol{\Sigma}\mathbf{e}_i = \sigma_{ii}\mathbf{e}_i$$

so $(\sigma_{ii}, \mathbf{e}_i)$ is the ith eigenvalue-eigenvector pair, and the linear combination

$$Y_i = \mathbf{e}_i'\mathbf{X} = X_i$$

demonstrates that the set of principal components and the original set of (uncorrelated) random variables are the same! Note that this result is also true if we work with the correlation matrix.
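
A quick PROC IML check of this fact, using a hypothetical diagonal covariance matrix (the entries 4, 2, 1 are chosen only for illustration):

PROC IML;
   Sigma = DIAG({4 2 1});          /* hypothetical diagonal covariance matrix */
   CALL EIGEN(lambda, E, Sigma);
   PRINT lambda E;                 /* lambda = the diagonal entries; E = the identity (up to order and sign) */
QUIT;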
- Constant variances and covariances — suppose Σ is the patterned matrix

$$\boldsymbol{\Sigma} = \begin{pmatrix} \sigma^2 & \rho\sigma^2 & \cdots & \rho\sigma^2 \\ \rho\sigma^2 & \sigma^2 & \cdots & \rho\sigma^2 \\ \vdots & \vdots & \ddots & \vdots \\ \rho\sigma^2 & \rho\sigma^2 & \cdots & \sigma^2 \end{pmatrix}$$

Here the resulting correlation matrix

$$\boldsymbol{\rho} = \begin{pmatrix} 1 & \rho & \cdots & \rho \\ \rho & 1 & \cdots & \rho \\ \vdots & \vdots & \ddots & \vdots \\ \rho & \rho & \cdots & 1 \end{pmatrix}$$

is also the covariance matrix of the standardized variables Z.
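
For this equicorrelation matrix the eigenstructure has a well-known closed form (the standard textbook result, stated here for completeness):

$$\lambda_1 = 1 + (p-1)\rho, \qquad \mathbf{e}_1 = \frac{1}{\sqrt{p}}\left(1, 1, \ldots, 1\right)'$$

with the remaining $p - 1$ eigenvalues all equal to $1 - \rho$. The first principal component is proportional to the simple average of the standardized variables, and for $\rho$ near 1 it accounts for nearly all of the total variance.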
C. Using Principal Components to Summarize Sample Variation

Suppose the data x1,…,xn represent n independent observations from a p-dimensional population with some mean vector μ and covariance matrix Σ. These data yield a sample mean vector x̄, a sample covariance matrix S, and a sample correlation matrix R.

As in the population case, our goal is to develop a new set of p axes (linear combinations of the original p axes) in the directions of greatest variability:

$$\begin{aligned}
y_1 &= \mathbf{a}_1'\mathbf{x} = a_{11}x_1 + a_{12}x_2 + \cdots + a_{1p}x_p \\
y_2 &= \mathbf{a}_2'\mathbf{x} = a_{21}x_1 + a_{22}x_2 + \cdots + a_{2p}x_p \\
&\ \ \vdots \\
y_p &= \mathbf{a}_p'\mathbf{x} = a_{p1}x_1 + a_{p2}x_2 + \cdots + a_{pp}x_p
\end{aligned}$$
Again it is easy to show that the linear combinations

$$\mathbf{a}_i'\mathbf{x}_j = a_{i1}x_{j1} + a_{i2}x_{j2} + \cdots + a_{ip}x_{jp}$$

have sample means $\mathbf{a}_i'\bar{\mathbf{x}}$ and

$$\widehat{\mathrm{Var}}\left(\mathbf{a}_i'\mathbf{x}\right) = \mathbf{a}_i'\mathbf{S}\mathbf{a}_i, \quad i = 1, \ldots, p$$

$$\widehat{\mathrm{Cov}}\left(\mathbf{a}_i'\mathbf{x}, \mathbf{a}_k'\mathbf{x}\right) = \mathbf{a}_i'\mathbf{S}\mathbf{a}_k, \quad i, k = 1, \ldots, p$$

The sample principal components are those uncorrelated linear combinations ŷ1,…,ŷp whose sample variances are as large as possible.
Thus the first sample principal component is the linear combination of maximum sample variance, i.e., we wish to solve the nonlinear optimization problem

$$\max_{\mathbf{a}_1} \ \mathbf{a}_1'\mathbf{S}\mathbf{a}_1 \qquad \text{(the quadratic objective is the source of nonlinearity)}$$

$$\text{s.t. } \mathbf{a}_1'\mathbf{a}_1 = 1 \qquad \text{(restricts the search to coefficient vectors of unit length)}$$
The second sample principal component is the linear combination of maximum sample variance that is uncorrelated with the first principal component, i.e., we wish to solve the nonlinear optimization problem

$$\max_{\mathbf{a}_2} \ \mathbf{a}_2'\mathbf{S}\mathbf{a}_2$$

$$\text{s.t. } \mathbf{a}_2'\mathbf{a}_2 = 1, \qquad \mathbf{a}_1'\mathbf{S}\mathbf{a}_2 = 0 \quad \text{(restricts the sample covariance to zero)}$$
The third sample principal component is the solution to the nonlinear optimization problem

$$\max_{\mathbf{a}_3} \ \mathbf{a}_3'\mathbf{S}\mathbf{a}_3$$

$$\text{s.t. } \mathbf{a}_3'\mathbf{a}_3 = 1, \qquad \mathbf{a}_1'\mathbf{S}\mathbf{a}_3 = 0, \qquad \mathbf{a}_2'\mathbf{S}\mathbf{a}_3 = 0 \quad \text{(restricts the sample covariances to zero)}$$
Generally, the ith sample principal component is the linear combination of maximum sample variance that is uncorrelated with all previous principal components, i.e., we wish to solve the nonlinear optimization problem

$$\max_{\mathbf{a}_i} \ \mathbf{a}_i'\mathbf{S}\mathbf{a}_i$$

$$\text{s.t. } \mathbf{a}_i'\mathbf{a}_i = 1, \qquad \mathbf{a}_k'\mathbf{S}\mathbf{a}_i = 0, \quad k < i$$
We can show that, for a random sample with sample covariance matrix S and eigenvalues $\hat{\lambda}_1 \geq \hat{\lambda}_2 \geq \cdots \geq \hat{\lambda}_p \geq 0$, the ith sample principal component is given by

$$\hat{y}_i = \hat{\mathbf{e}}_i'\mathbf{x} = \hat{e}_{i1}x_1 + \hat{e}_{i2}x_2 + \cdots + \hat{e}_{ip}x_p, \quad i = 1, \ldots, p$$

Note that the principal components are not unique if some eigenvalues are equal.
We can also show for a random sample with sample covariance matrix S and eigenvalue-eigenvector pairs $(\hat{\lambda}_1, \hat{\mathbf{e}}_1), \ldots, (\hat{\lambda}_p, \hat{\mathbf{e}}_p)$, where $\hat{\lambda}_1 \geq \hat{\lambda}_2 \geq \cdots \geq \hat{\lambda}_p$,

$$s_{11} + \cdots + s_{pp} = \sum_{i=1}^{p} s_{ii} = \hat{\lambda}_1 + \cdots + \hat{\lambda}_p = \sum_{i=1}^{p} \widehat{\mathrm{Var}}(y_i)$$

so we can assess how well a subset of the principal components yi summarizes the original sample. One common method of doing so is

$$\frac{\hat{\lambda}_k}{\sum_{i=1}^{p} \hat{\lambda}_i} = \text{proportion of total sample variance due to the kth principal component}$$

If a large proportion of the total sample variance can be attributed to relatively few principal components, we can replace the original p variables with these principal components without much loss of information!
We can also easily find the correlations between the original variables xk and the sample principal components yi:

$$r_{y_i,x_k} = \frac{\hat{e}_{ik}\sqrt{\hat{\lambda}_i}}{\sqrt{s_{kk}}}$$

These values are often used in interpreting the principal components yi.
Note that

- the approach for standardized data (i.e., principal components derived from the sample correlation matrix R) is analogous to the population approach
- when principal components are derived from sample data, the data are frequently centered,

$$\mathbf{x}_j - \bar{\mathbf{x}}$$

which has no effect on the sample covariance matrix S and yields the derived principal components

$$\hat{y}_i = \hat{\mathbf{e}}_i'\left(\mathbf{x} - \bar{\mathbf{x}}\right)$$
Under these circumstances, the mean value of the ith principal component taken across all n observations in the data set is

$$\bar{\hat{y}}_i = \frac{1}{n}\sum_{j=1}^{n} \hat{\mathbf{e}}_i'\left(\mathbf{x}_j - \bar{\mathbf{x}}\right) = \hat{\mathbf{e}}_i'\left[\frac{1}{n}\sum_{j=1}^{n}\left(\mathbf{x}_j - \bar{\mathbf{x}}\right)\right] = \hat{\mathbf{e}}_i'\mathbf{0} = 0$$
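
A minimal PROC IML sketch of the sample case, using the four observations from the examples in these notes: center the data, project onto the eigenvectors of S, and verify that every sample principal component has mean zero.

PROC IML;
   X = {1  6  9,
        4 12 10,
        3 12 15,
        4 10 12};
   n = NROW(X);
   xbar = X[:,];                        /* row vector of column means */
   Xc = X - REPEAT(xbar, n, 1);         /* centered data */
   S = Xc` * Xc / (n - 1);              /* sample covariance matrix */
   CALL EIGEN(lambda, E, S);
   Y = Xc * E;                          /* sample principal component scores */
   PRINT (Y[:,])[LABEL="mean of each sample principal component (all zero)"];
QUIT;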
Example: Suppose we have the following sample of four observations made on three random variables X1, X2, and X3:

 X1    X2    X3
1.0   6.0   9.0
4.0  12.0  10.0
3.0  12.0  15.0
4.0  10.0  12.0

Find the three sample principal components y1, y2, and y3 based on the sample covariance matrix S.

First we need the sample covariance matrix S:

$$\mathbf{S} = \begin{pmatrix} 2.00 & 3.33 & 1.33 \\ 3.33 & 8.00 & 4.67 \\ 1.33 & 4.67 & 7.00 \end{pmatrix}$$

and the corresponding eigenvalue-eigenvector pairs:


 0.291000
ˆ = 13.21944, e
ˆ  
λ 1 1 =  0.734253 
 0.613345
 0.415126
ˆ = 3.37916, eˆ  
λ 2 2 =  0.480690 
-0.772403
 0.861968
ˆ =
λ ˆ3 = -0.479385
0.40140, e
3  
 0.164927
so the sample principal components are:

$$\hat{y}_1 = \hat{\mathbf{e}}_1'\mathbf{x} = 0.291000x_1 + 0.734253x_2 + 0.613345x_3$$

$$\hat{y}_2 = \hat{\mathbf{e}}_2'\mathbf{x} = 0.415126x_1 + 0.480690x_2 - 0.772403x_3$$

$$\hat{y}_3 = \hat{\mathbf{e}}_3'\mathbf{x} = 0.861968x_1 - 0.479385x_2 + 0.164927x_3$$

Note that

$$s_{11} + s_{22} + s_{33} = 2.0 + 8.0 + 7.0 = 17.0 = 13.21944 + 3.37916 + 0.40140 = \hat{\lambda}_1 + \hat{\lambda}_2 + \hat{\lambda}_3$$

and the proportion of total sample variance due to each principal component is

$$\frac{\hat{\lambda}_1}{\sum_{i=1}^{p}\hat{\lambda}_i} = \frac{13.21944}{17.0} = 0.777613814$$

$$\frac{\hat{\lambda}_2}{\sum_{i=1}^{p}\hat{\lambda}_i} = \frac{3.37916}{17.0} = 0.198774404$$

$$\frac{\hat{\lambda}_3}{\sum_{i=1}^{p}\hat{\lambda}_i} = \frac{0.40140}{17.0} = 0.023611782$$

Note that the third principal component is relatively irrelevant!
Next we obtain the correlations between the observed values xk of the original variables and the sample principal components yi:

$$r_{y_1,x_1} = \frac{\hat{e}_{11}\sqrt{\hat{\lambda}_1}}{\sqrt{s_{11}}} = \frac{0.291000\sqrt{13.21944}}{\sqrt{2.0}} = 0.7481$$

$$r_{y_1,x_2} = \frac{\hat{e}_{12}\sqrt{\hat{\lambda}_1}}{\sqrt{s_{22}}} = \frac{0.734253\sqrt{13.21944}}{\sqrt{8.0}} = 0.9439$$

$$r_{y_1,x_3} = \frac{\hat{e}_{13}\sqrt{\hat{\lambda}_1}}{\sqrt{s_{33}}} = \frac{0.613345\sqrt{13.21944}}{\sqrt{7.0}} = 0.8429$$

$$r_{y_2,x_1} = \frac{\hat{e}_{21}\sqrt{\hat{\lambda}_2}}{\sqrt{s_{11}}} = \frac{0.415126\sqrt{3.37916}}{\sqrt{2.0}} = 0.5396$$

$$r_{y_2,x_2} = \frac{\hat{e}_{22}\sqrt{\hat{\lambda}_2}}{\sqrt{s_{22}}} = \frac{0.480690\sqrt{3.37916}}{\sqrt{8.0}} = 0.3124$$

$$r_{y_2,x_3} = \frac{\hat{e}_{23}\sqrt{\hat{\lambda}_2}}{\sqrt{s_{33}}} = \frac{-0.772403\sqrt{3.37916}}{\sqrt{7.0}} = -0.5367$$

$$r_{y_3,x_1} = \frac{\hat{e}_{31}\sqrt{\hat{\lambda}_3}}{\sqrt{s_{11}}} = \frac{0.861968\sqrt{0.40140}}{\sqrt{2.0}} = 0.3862$$

$$r_{y_3,x_2} = \frac{\hat{e}_{32}\sqrt{\hat{\lambda}_3}}{\sqrt{s_{22}}} = \frac{-0.479385\sqrt{0.40140}}{\sqrt{8.0}} = -0.1074$$

$$r_{y_3,x_3} = \frac{\hat{e}_{33}\sqrt{\hat{\lambda}_3}}{\sqrt{s_{33}}} = \frac{0.164927\sqrt{0.40140}}{\sqrt{7.0}} = 0.0395$$

We can display these results in a correlation matrix:

        X1       X2       X3
y1   0.7481   0.9439   0.8429
y2   0.5396   0.3124  -0.5367
y3   0.3862  -0.1074   0.0395

How would we interpret these results?

Note that results based on the sample correlation matrix R will not differ from results based on the population correlation matrix ρ (why?).
SAS code for Principal Components Analysis:

OPTIONS LINESIZE=72 NODATE PAGENO=1;
DATA stuff;
INPUT x1 x2 x3;
LABEL x1='Random Variable 1'
x2='Random Variable 2'
x3='Random Variable 3';
CARDS;
1.0 6.0 9.0
4.0 12.0 10.0
3.0 12.0 15.0
4.0 10.0 12.0
;
PROC PRINCOMP DATA=stuff COV OUT=pcstuff;
VAR x1 x2 x3;
TITLE4 'Using PROC PRINCOMP for Principal Components Analysis';
RUN;
PROC CORR DATA=pcstuff;
VAR x1 x2 x3;
WITH prin1 prin2 prin3;
RUN;

The COV option is used to instruct SAS to perform the principal components analysis on the sample covariance matrix rather than the default correlation matrix.
SAS output for Principal Components Analysis:
The PRINCOMP Procedure
Observations 4
Variables 3

Simple Statistics
x1 x2 x3
Mean 3.000000000 10.00000000 11.50000000
StD 1.414213562 2.82842712 2.64575131

Covariance Matrix
x1 x2 x3
x1 Random Variable 1 2.000000000 3.333333333 1.333333333
x2 Random Variable 2 3.333333333 8.000000000 4.666666667
x3 Random Variable 3 1.333333333 4.666666667 7.000000000

Total Variance 17

Eigenvalues of the Covariance Matrix


Eigenvalue Difference Proportion Cumulative
1 13.2193960 9.8400643 0.7776 0.7776
2 3.3793317 2.9780594 0.1988 0.9764
3 0.4012723 0.0236 1.0000

Eigenvectors
Prin1 Prin2 Prin3
x1 Random Variable 1 0.291038 0.415039 0.861998
x2 Random Variable 2 0.734249 0.480716 -0.479364
x3 Random Variable 3 0.613331 -0.772434 0.164835
SAS output for Correlation Matrix – Original Random Variables vs. Principal Components:

The CORR Procedure

3 With Variables: Prin1 Prin2 Prin3
3 Variables: x1 x2 x3

Simple Statistics
Variable  N  Mean      Std Dev  Sum       Minimum   Maximum
Prin1     4  0         3.63585  0         -5.05240  3.61516
Prin2     4  0         1.83830  0         -1.74209  2.53512
Prin3     4  0         0.63346  0         -0.38181  0.94442
x1        4  3.00000   1.41421  12.00000  1.00000   4.00000
x2        4  10.00000  2.82843  40.00000  6.00000   12.00000
x3        4  11.50000  2.64575  46.00000  9.00000   15.00000

Pearson Correlation Coefficients, N = 4
Prob > |r| under H0: Rho=0

            x1        x2        x3

Prin1    0.74824   0.94385   0.84285
         0.2518    0.0561    0.1571

Prin2    0.53950   0.31243  -0.53670
         0.4605    0.6876    0.4633

Prin3    0.38611  -0.10736   0.03947
         0.6139    0.8926    0.9605
