Name: _________________________________

EE/ME 701 Advanced Linear Systems
Lecture Notes
Prof. Brian Armstrong
September 11, 2012

EE/ME 701: Advanced Linear Systems
Linear Algebra
State Space Modeling, Basics of Linear Algebra

Contents

1 Elements of Linear Algebra
  1.1  Scalars, Matrices and Vectors
  1.2  Transpose
  1.3  Basic arithmetic: +, - and * are well defined
       1.3.1  Multiplication
  1.4  Commutative, Associative, Distributive and Identity Properties
       1.4.1  Commutative property
       1.4.2  Associative property
       1.4.3  Distributive property
       1.4.4  Identity Matrix
       1.4.5  Doing algebra with vectors and matrices
  1.5  Linear Independence of Vectors
  1.6  Determinant
       1.6.1  Computing the determinant
       1.6.2  Some relations involving determinants
       1.6.3  The determinant of triangular and diagonal matrices
  1.7  Rank
  1.8  The norm of a vector
       1.8.1  Example norms
  1.9  Basic properties of the singular value decomposition (svd)
       1.9.1  SVD, condition number and rank
  1.10 The condition number of a matrix
       1.10.1  Condition number and error
       1.10.2  Example of a badly conditioned matrix
       1.10.3  How condition number is determined
  1.11 Eigenvalues and eigenvectors
       1.11.1  Some properties of Eigenvectors and Eigenvalues
       1.11.2  Additional notes on Eigenvectors and Eigenvalues
       1.11.3  One final fact about the eigensystem: V diagonalizes A

2 Two equations of interest
  2.1  Algebraic Equations, y = A b
       2.1.1  Case 1: Where we have n = p independent equations, the exactly constrained case
       2.1.2  Case 2: Where n > p, we have more equations than unknowns, the over-constrained case (Left pseudo-inverse case)
       2.1.3  Case 3: Where n < p, we have fewer equations than unknowns, the under-constrained case (Right pseudo-inverse case)
  2.2  Differential Equations

3 Summary
1 Elements of Linear Algebra
1.1 Scalars, Matrices and Vectors

Matrices are rectangular sets of numbers or functions, with one or more rows
and columns. Examples:

        1 0 7                v(t)
   A =  4 5 0 ,      x(t) =  w(t)
        2 3 6                z(t)

   A ∈ R^(3x3)

Vectors are special cases of matrices, with only one row or column:

   x is a column vector,
   w = [ 3 4 ] ∈ R^(1x2) is a row vector.

Scalar values (numbers or functions with one output variable) can also be
treated as matrices or vectors:

   3 = [3] ∈ R^(1x1)

Array is a synonym for Matrix.

1.2 Transpose

Transposing an array rearranges each column to a row:

        3 1                 3 4 5
   C =  4 1 ,       C^T  =  1 1 2
        5 2

   C ∈ R^(3x2) ,   C^T ∈ R^(2x3)

1.3 Basic arithmetic: +, - and * are well defined

Define:

        1 2         -1 1          2          3 1
   A =  3 4 ,   B =  2 3 ,   x =  3 ,   C =  4 1 ,   w = [ 3 4 ]
                                             5 2

Operations +, - and * are well defined. The dimensions of the operands must
be compatible.

For addition and subtraction, the operation is element-wise, and the
operands must be the same size:

   A + B =  0 3 ,       A - B =  2 1
            5 7                  1 1

If the operands are not the same size, there is no defined result (the
operation is impossible):

   A + C = undefined
1.3.1 Multiplication

For multiplication, the operation is by row on the left and column on the
right. To produce one element of the result, go across each row on the left
and multiply with the elements of the column on the right:

   A B = AB =  (1(-1) + 2*2)  (1*1 + 2*3)   =   3  7
               (3(-1) + 4*2)  (3*1 + 4*3)       5 15

A and B don't have to be the same size to multiply:

   A x =  1 2   2   =  (1*2 + 2*3)   =   8
          3 4   3      (3*2 + 4*3)      18

The number of columns in A must match the number of rows in B:

   For A B = C, given A ∈ R^(nxm) and B ∈ R^(jxk), to multiply A and B we
   must have m = j, and then C ∈ R^(nxk).

   The size of the result matrix is determined by the number of rows in A
   and the number of columns in B.

Examples:

   A C =  1 2   3 1
          3 4   4 1  = undefined
                5 2

          3 1               6 10
   C A =  4 1   1 2     =   7 12        (n = 3 , m = 2 , j = 2 , k = 2)
          5 2   3 4        11 18

1.4 Commutative, Associative, Distributive and Identity Properties

1.4.1 Commutative property

Like scalar algebra, addition commutes:  A + B = B + A

Like scalar algebra, subtraction commutes with a - sign:  A - B = -(B - A)

Unlike scalar algebra, multiplication does not generally commute:  A B ≠ B A

   A B =  3  7 ,       B A =   2  2
          5 15                11 16

Generally: there are many properties of linear algebra which may or may not
be true for some special cases. "Does not generally commute" means that
perhaps special matrices can be found that do commute, but that not all
matrices commute.

1.4.2 Associative property

Like scalar algebra, +, - and * have the associative property:

   (A + B) + C = A + (B + C)
   (A + B) - C = A + (B - C)
   (A B) C = A (B C)
1.4.3 Distributive property

Like scalar algebra, * distributes over + and - :

   (A + B) C = A C + B C
   C (A - B) = C A - C B

Note the left-to-right order of the multiplications in the second example.

1.4.4 Identity Matrix

Like scalar algebra, linear algebra has a multiplicative identity:

   I_L C = C I_R = C

Examples:

   1 0 0   3 1     3 1             3 1            3 1
   0 1 0   4 1  =  4 1    and      4 1   1 0   =  4 1
   0 0 1   5 2     5 2             5 2   0 1      5 2

If A is square, then I_L = I_R, and we just call it I, the identity matrix:

   1 0   1 2     1 2     1 2   1 0
   0 1   3 4  =  3 4  =  3 4   0 1

1.4.5 Doing algebra with vectors and matrices

Starting with an equation, we can add, subtract or multiply on the left or
right by any allowed term and get a new equation.

Examples, given:

   A + B = C

then

   (A + B) + D = C + D
   E (A + B) = E A + E B = E C
   (A + B) F = C F

where A, B, C, D, E and F are compatible sizes.

Matrices that are the appropriate size for an operation are called
commensurate.
1.5 Linear Independence of Vectors

Linear Dependence: a set of p n-dimensional vectors,

   v1 , v2 , ... , vp ,    vi ∈ R^n

is linearly dependent if there exists a set of scalars {ai} , i = 1...p ,
not all of which are zero, such that:

   a1 v1 + a2 v2 + ... + ap vp  =  Σ_{i=1}^p ai vi  =  0   (the zero vector)      (1)

Linear Independence: the set of vectors is said to be linearly independent
if there is no set of values {ai} which satisfies the conditions for Linear
Dependence. In other words,

   Σ_{i=1}^p ai vi = 0                                            (2)

implies that the {ai} are all zero.

Written another way, vectors {vi} are linearly independent if and only if
(iff)

   Σ_{i=1}^p ai vi = 0    ⟺    ai = 0 , ∀ i                       (3)

or, written in the form of Eqn (1):

   [ v1 v2 ... vp ] [ a1 ; a2 ; ... ; ap ] = 0    iff    [ a1 ; a2 ; ... ; ap ] = 0      (4)

1.6 Determinant

The determinant is a scalar measure of the size of a square matrix:

   det (A) = |A| ∈ R^1                                            (5)

The determinant is not defined for a non-square matrix.

The determinant of a matrix will be non-zero if and only if the rows (or,
equivalently, the columns) of the matrix are linearly independent.
Examples:

          1  2  3                     1  2  3
   (a)    4  5  6   = 0 ,      (b)    4  5  6    = 54             (6)
         10 14 18                     7  8 -9

In case (a), the third column is given by 2 x Col2 - 1 x Col1.
Notice also that in case (a), the third row is given by 2 x Row1 + 2 x Row2.

   Always, for a square matrix, if the columns are dependent the rows will
   be dependent.

In case (b), the three columns are independent.
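A quick MATLAB check of the two examples in Eqn (6), using det() and rank()
to count the independent columns:

   >> Aa = [1 2 3; 4 5 6; 10 14 18];    %% case (a): dependent columns
   >> Ab = [1 2 3; 4 5 6; 7  8  -9];    %% case (b): independent columns
   >> det(Aa), rank(Aa)                 %% det = 0 (up to roundoff), rank = 2
   >> det(Ab), rank(Ab)                 %% det = 54, rank = 3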
1.6.1 Computing the determinant

The determinant of a square matrix (the only kind !) is defined by Laplace's
expansion (following Franklin et al.):

   det A = Σ_{j=1}^n a_ij γ_ij      for any i = 1, 2, ..., n      (7)

where a_ij is the element from the ith row and jth column of A, and γ_ij is
called the cofactor, given by:

   γ_ij = (-1)^(i+j) det (M_ij)                                   (8)

where M_ij is called a minor. M_ij is the same matrix as A, with the ith row
and jth column removed.

For example, with i = 1,

         a b c             e f             d f             d e
   det   d e f  = a det    h i   - b det   g i   + c det   g h
         g h i
                          (j = 1)         (j = 2)          (j = 3)

Closed-form expressions for 1x1, 2x2 and 3x3 matrices are sometimes handy.
They are:

   det [a] = a

         a b
   det   c d   =  a d - b c

         a b c
   det   d e f  =  a e i - a h f - b d i + b g f + c d h - c g e
         g h i

1.6.2 Some relations involving determinants

1. det (I) = 1, where I is the identity matrix.

2. det (A B) = det (A) det (B)

3. Invertibility of a matrix: given M ∈ R^(nxn), M is invertible iff
   det (M) ≠ 0.

4. Given M, an invertible matrix,

   (a) det (M^-1) = 1 / det (M)

   (b) Similarity relation:
       det (M A M^-1) = det (M) det (A) (1 / det (M)) = det (A)

1.6.3 The determinant of triangular and diagonal matrices

1. If a matrix has the upper-triangular form of Au or the lower-triangular
   form of Al,

         d1      *                  d1      0
   Au =     d2            ,   Al =     d2
               ...                           ...
         0         dn               *          dn

   then det (Au) = Π_{k=1}^n dk  and  det (Al) = Π_{k=1}^n dk .

   Example:

      >> A = [ 1 2 3 ; 0 4 5 ; 0 0 6]
      A =  1  2  3
           0  4  5
           0  0  6
      >> det(A)
      ans = 24

2. A diagonal matrix is a special case of 1).
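The Laplace expansion of Eqn (7) maps directly onto a short recursive
function. This is a minimal MATLAB sketch for illustration only (the
function name and structure are my own, and the built-in det() is far more
efficient for anything but tiny matrices):

   function d = laplace_det(A)
   % LAPLACE_DET  Determinant by cofactor expansion along the first row, Eqn (7).
   n = size(A,1);
   if n == 1
       d = A(1,1);                          % det [a] = a
       return
   end
   d = 0;
   for j = 1:n
       M = A;                               % minor M_1j: remove row 1 and column j
       M(1,:) = [];
       M(:,j) = [];
       gamma = (-1)^(1+j) * laplace_det(M); % cofactor, Eqn (8)
       d = d + A(1,j) * gamma;
   end
   end

For the triangular example above, laplace_det([1 2 3; 0 4 5; 0 0 6]) returns
24, matching det().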
1.7 Rank

The rank of a matrix is the number of independent rows (or columns):

   r = rank (A) ,    A ∈ R^(nxm)                                  (9)

The number of independent rows is always equal to the number of independent
columns.

Example:

        r1^T      1 2  3  4
   A =  r2^T  =   5 6  7  8
        r3^T      6 8 10 12

r3 = 1.0 r1 + 1.0 r2, so the set of 3 row vectors is linearly dependent.
Because there are 2 independent rows,

   rank (A) = 2                                                   (10)

   The third row is the sum of the first two.
   Notice that the 3rd column is 2 x Col2 - Col1,
   and the 4th column is 3 x Col2 - 2 x Col1.

The rank of a matrix can not be greater than the number of rows or columns:

   rank (A) ≤ min (n , m)

A matrix is said to be full rank if rank (A) takes its maximum possible
value, that is rank (A) = min (n , m). Otherwise the matrix is said to be
rank deficient.

A square matrix is invertible if it is full rank.

The determinant of a square matrix is zero if the matrix is rank deficient.

1.8 The norm of a vector

The norm of a vector, written ||x||, is a measure of the size of the
vector. A norm is any function ||.|| : R^n → R with these properties, for
any vectors x, v and scalar a:

1. Positivity: the norm of any vector x is a non-negative real number,

      ||x|| ≥ 0

2. Triangle inequality:

      ||x + v|| ≤ ||x|| + ||v||

3. Positive homogeneity (or positive scalability):

      ||a x|| = |a| ||x||

   where |a| is the absolute value of a.

Actually, property 1 follows from properties 2 and 3, so 2 and 3 are
sufficient for the definition.

Additionally, the norms we will use have the property

4. Positive definiteness:

      ||x|| = 0   if and only if   x = 0

   That is, the norm is zero only if the vector is a vector of all zeros
   (called the null vector). Technically, a function satisfying only
   properties 1-3, but not property 4, is called a seminorm, but this
   distinction will not be important for us.
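A quick MATLAB check of the rank example, Eqns (9)-(10):

   >> A = [1 2 3 4; 5 6 7 8; 6 8 10 12];
   >> rank(A)
   ans = 2          %% only 2 independent rows (and columns)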
1.8.1 Example norms

The 2-norm (a.k.a. the Euclidean norm):

   ||x||_2 = ( Σ_{i=1}^n xi^2 )^(1/2)                             (11)

Example:

   x^T = [ 2 -3 -4 ] ,   ||x||_2 = sqrt( 2^2 + (-3)^2 + (-4)^2 ) = 5.39      (12)

Other common norms are the 1-norm and the ∞-norm; these are all special
cases of the p-norm.

We can write the 2-norm as:

   ||x||_2 = ( Σ_i |xi|^2 )^(1/2)                                 (13)

The p-norm is given as:

   ||x||_p = ( Σ_i |xi|^p )^(1/p)                                 (14)

The 1-norm is the sum of the absolute values (also called the Manhattan
metric):

   ||x||_1 = Σ_i |xi|                                             (15)

   Example:  || [ 2 -3 -4 ] ||_1 = 9

The ∞-norm is the maximum absolute value:

   ||x||_∞ = lim_{p→∞} ( Σ_i |xi|^p )^(1/p)  =  max_i |xi|        (16)

   Example:  || [ 2 -3 -4 ] ||_∞ = 4

1.9 Basic properties of the singular value decomposition (svd)

A matrix A ∈ R^(nxm) has p singular values, where p = min (n, m).

For example:

        1 2 3
   A =  4 5 6
        7 6 5
        2 1 1

   >> S = svd(A)
   S = 14.1542
        2.5515
        0.3857

σ_i, the singular values, are a measure of how a matrix scales a vector.
For example, for matrix A, there is a vector v1 so that A v1 is scaled by
14.1542, and another vector v3 so that A v3 is scaled by 0.3857.

With example matrix A, choosing v1 and v3 to illustrate the largest and
smallest singular values (more later on choosing v1 and v3) gives:

   v1 = -0.5763          v3 =  0.3724
        -0.5735               -0.8184
        -0.5822                0.4375

   >> y1 = A*v1               >> y3 = A*v3
   y1 =  -3.4700              y3 =  0.0482
         -8.6660                    0.0227
        -10.3862                   -0.1159
         -2.3083                    0.3640

Now ||y1|| = 14.1542 ||v1|| and ||y3|| = 0.3857 ||v3||.

Since σ1 = 14.1542 is the largest singular value, there is no vector v that
gives a y = A v larger than y1 (relative to ||v||).

Since σ3 = 0.3857 is the smallest singular value, there is no vector v that
gives a y = A v smaller than y3 (relative to ||v||).
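The norm examples of Section 1.8.1 and the scaling relation
||A v1|| = σ1 ||v1|| can both be checked in MATLAB; a short sketch:

   >> x = [2; -3; -4];
   >> [norm(x,1), norm(x,2), norm(x,inf)]
   ans = 9.0000   5.3852   4.0000
   >> A = [1 2 3; 4 5 6; 7 6 5; 2 1 1];
   >> [U,S,V] = svd(A);                  %% columns of V are the vectors v1, v2, v3
   >> norm(A*V(:,1)) / norm(V(:,1))      %% = S(1,1) = 14.1542
   >> norm(A*V(:,3)) / norm(V(:,3))      %% = S(3,3) = 0.3857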
1.9.1 SVD, condition number and rank

The SVD is used to determine the rank and condition number of a matrix.

   rank (A) = # singular values > 0 ;    for the example, rank (A) = 3
   (Read: # singular values significantly different from zero.)

   cond (A) = ratio of largest to smallest singular value;
   for the example, cond (A) = 36.70

   For square matrices, |det (A)| = Π_{i=1}^n σ_i ;
   for the example (A is not square), det (A) = undefined

We will be looking at the SVD in detail in several weeks (cf. Bay, Chap. 4).

1.10 The condition number of a matrix

The condition number of a matrix indicates how errors may be scaled when
the matrix inverse is used.

For example, if we have these two equations in two unknowns,

    7 =  3 x1 + 4 x2
   -9 = -1 x1 + 2 x2                                              (17)

writing them in the matrix-vector form y = A x gives

    7      3 4   x1
   -9  =  -1 2   x2                                               (18)

We can solve for the unknowns x1, x2 with x = A1^-1 y :

   x = A1^-1 y =   0.2 -0.4    7                                  (19)
                   0.1  0.3   -9

One answer is

   x =  5
       -2

1.10.1 Condition number and error

But suppose the ideal values ystar are unknown, and instead we have
measurements ybar given by:

   ybar = ystar + ytilde                                          (20)

where

   ybar   is a measurement of y,
   ystar  is the ideal value of y (generally unknown), and
   ytilde represents measurement noise.

For example, ytilde = N (0, σ_y) are samples with a normal (Gaussian)
random distribution, with zero mean and σ_y standard deviation.

Then what we can compute is

   xhat = A1^-1 ybar                                              (21)

where xhat is an estimate of xstar. Notation:

   x      : a vector in R^p (to be estimated)
   xstar  : the true value of x
   xhat   : the estimated value of x
   xtilde = xhat - xstar : misadjustment of the estimate

An important question is: how much error does measurement noise ytilde
introduce into the calculation of xhat ?

In the worst case, the error is amplified by the condition number of A1:

   max ( ||xtilde|| / ||xstar|| )  =  cond (A1) ( ||ytilde|| / ||ystar|| ) ,
   or
   ||xtilde||  ≤  ||xstar|| cond (A1) ( ||ytilde|| / ||ystar|| )            (22)
Example:

   >> A1 = [3 4 ;-1 2]
   A1 =  3  4
        -1  2
   >> ystar = [7; -9]
   ystar =  7
           -9
   >> xhat0 = A1 \ ystar        %% Notice the left-division operator
   xhat0 =  5                   %% Given by calculation with no noise
           -2
   >> cond(A1)
   ans = 2.6180                 %% This is the condition number

Example numerical values: if ||ytilde|| = 0.01, then

   max ( ||xtilde|| / ||xstar|| ) = 2.6180 (0.01 / 11.4) = 0.0023 ,   or
   ||xtilde|| ≤ 2.6180 (0.01 / 11.4) 5.39 = 0.012                 (23)

   >> ytilde1 = 0.01 * rand(2,1)        %% An example sample of noise
   ytilde1 =  0.0087
             -0.0093
   >> xhat1 = A1 \ (ystar + ytilde1)
   xhat1 =  5.0055               %% Errors of about 0.0023 * ||x||
           -2.0019               %%   = 0.0023 * 5.39 = 0.012

1.10.2 Example of a badly conditioned matrix

Rather than Eqn (17), suppose we have the data

     2 =   2 x1 +   4 x2
   198 = 200 x1 + 401 x2

Then x is estimated by  xhat2 = A2^-1 ybar2 :

   >> A2 = [2 4 ;200 401],
   A2 =    2    4
         200  401
   >> ystar2 = [2; 198]
   ystar2 =   2
            198

The condition number of A2 is large:

   >> cond(A2),
   ans = 1.0041e+05 = 100410.

   >> xstar2 = A2 \ ystar2       %% Calculating with ideal data still gives
   xstar2 =  5                   %% ideal results
            -2

Estimating xhat2 using A2:

   >> ytilde2 = 0.01 * rand(2,1)        %% An example sample of noise
   ytilde2 =  0.0121
             -0.0140
   >> ybar2 = ystar2 + ytilde2          %% An example measurement of y
   ybar2 =   2.0121
           197.9860
   >> xhat2 = A2 \ ybar2
   xhat2 =  7.4592               %% This is way off !
           -3.2266
1.10.3 How condition number is determined

Condition number is given as the ratio of the largest to smallest singular
value:

   cond (A) = σ_max(A) / σ_min(A)                                 (24)

where σ_max(A) is the largest singular value of A, and σ_min(A) is the
smallest singular value of A.

We will learn more about the singular values later in this course.

   >> SingularValues = svd(A1)
   SingularValues = 5.117
                    1.954          %% cond(A1) = 2.618
   >> SingularValues = svd(A2)
   SingularValues = 448.1306
                      0.0045       %% cond(A2) = 1.0041 x 10^5

What matrices have condition numbers:

   Singular values are defined for any matrix, and so the condition number
   can be computed for any matrix.

   A rank-deficient matrix B has σ_min(B) = 0, and so cond (B) = ∞.
   (The error multiplier goes to infinity !)

1.11 Eigenvalues and eigenvectors

A square matrix has eigenvectors and eigenvalues, making up the eigensystem
of the matrix. The main properties of eigenvectors and eigenvalues are
introduced here.

Consider: a vector has both a direction and a magnitude. For example:

   v1 =  1  ,    v2 =  1.5  ,    v3 =  1.58
         2             3.0             1.58

v1 has the same direction as v2, but a different magnitude.
v3 has the same magnitude as v1, but a different direction.

   (Figure: v1, v2 and v3 plotted in the (x1, x2) plane.)

In general, multiplying a vector v by a matrix A introduces both a change
of magnitude and a change of direction. For example:

   y4 = A v4 =  2.0 -0.5   -1   =  -2.5
                1.0  0.5    1      -0.5

   (Figure: v4 and y4 = A v4 plotted in the (x1, x2) plane.)
For any square matrix A, certain vectors have the property that the matrix
changes their magnitude only, not their direction. That is, writing

   y = A v

then if v is one of these special vectors, y and v have the same direction,
but possibly different magnitude. If the directions are the same,

   y = λ v    or    λ v = A v

These special vectors are called the eigenvectors.

For the example matrix A, the special vectors are:

   2.0 -0.5   1    =  1.5   =  1.5  1  ,   so  v1 =  1  ,  λ1 = 1.5
   1.0  0.5   1       1.5           1                1

   2.0 -0.5   1    =  1.0   =  1.0  1  ,   so  v2 =  1  ,  λ2 = 1.0
   1.0  0.5   2       2.0           2                2

   (Figure: an eigenvector is scaled by the matrix, not rotated.)

In general, we can write:

   λi vi = A vi                                                   (25)

where the λi ∈ C^1 are the eigenvalues and the vi are the eigenvectors.

Notice that in general a x ≠ A x for a scalar a, because multiplication by
matrix A will rotate a general vector x. (Choosing vectors x at random,
what is the probability of selecting an eigenvector ?)

1.11.1 Some properties of Eigenvectors and Eigenvalues

1. Only square matrices have an eigensystem.

2. The eigenvalues are the solutions to the equation

      det (A - λi I) = 0                                          (26)

   Proof: from Eqn (25),

      A vi - λi vi = 0 ,    so    (A - λi I) vi = 0

   But, given that vi ≠ 0, the second expression is only possible if
   (A - λi I) is rank deficient.                                  QED
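The eigensystem of the 2x2 example can be checked numerically with MATLAB's
eig() (introduced as property 7 below); a minimal sketch that verifies
Eqn (25) for each eigenpair:

   >> A = [2.0 -0.5; 1.0 0.5];
   >> [V,U] = eig(A);        %% columns of V: eigenvectors, diagonal of U: eigenvalues
   >> diag(U).'              %% 1.5 and 1.0 (possibly listed in either order)
   >> A*V - V*U              %% = zero matrix (to machine precision), i.e. A vi = λi vi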
Example using det (A - λ I) = 0 (Student Exercise): starting with

   det   a b   =  a d - b c
         c d

and plugging in

   det (  2.0 -0.5   - λ  1 0  )  =  det   2.0-λ   -0.5
          1.0  0.5        0 1              1.0     0.5-λ

   = (2.0 - λ)(0.5 - λ) - 1.0 (-0.5)  =  λ^2 - 2.5 λ + 1.5  =  0

The expression leads to a polynomial equation in λ.

3. The polynomial equation given by Eqn (26) is called the characteristic
   equation of the matrix A.

   (a) A 3x3 matrix gives an equation in λ^3, a 4x4 gives an equation in
       λ^4, etc.

   (b) Abel's theorem states that there is no closed-form solution for the
       roots of a polynomial of 5th order and above; therefore there is no
       closed-form solution for the eigenvalues of a 5x5 matrix or larger.

4. Special case: when A is upper-triangular, lower-triangular or diagonal,

      det (A - λ I) = Π_{k=1}^n (dk - λ)                          (27)

   where dk is the kth diagonal element. Eqn (27) shows that for
   upper-triangular, lower-triangular and diagonal matrices, the
   eigenvalues are the diagonal elements.

5. A complete set of eigenvectors exists when an n x n matrix has n
   linearly independent eigenvectors (so V, the matrix of eigenvectors, is
   invertible).

   For the example matrix

      A =  2.0 -0.5
           1.0  0.5

   the matrix of eigenvectors is

      V = [ v1 v2 ] =  1 1
                       1 2

   and the matrix of eigenvalues is

      U =  λ1  0   =  1.5   0
            0 λ2       0   1.0

6. A matrix that lacks a complete set of eigenvectors is said to be
   defective, but this name is meaningful only from a mathematical
   perspective. For many control systems, some combinations of parameter
   values will give a defective system matrix.

   A defective matrix can only arise when there are repeated eigenvalues.

   This case corresponds to the case of repeated roots in the study of
   ordinary differential equations, where solutions of the form
   y(t) = t e^(λ t) arise.

   In this case, a special tool called the Jordan Form is required to solve
   the equation ẋ(t) = A x(t) + B u(t).
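A minimal MATLAB illustration of a defective matrix (the particular 2x2
matrix is my own choice, not from the notes): it has the repeated
eigenvalue 2 but only one independent eigenvector, so V is not invertible.

   >> A = [2 1; 0 2]          %% repeated eigenvalue 2, only one independent eigenvector
   >> [V,U] = eig(A);
   >> diag(U).'               %% both eigenvalues equal 2
   >> rank(V)                 %% = 1: the columns of V come back (numerically) parallel,
                              %%   so V is not invertible and A is defective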
7. The eigensystem is computed in Matlab with the eig() command,

      >> [V,U] = eig(A)

   V is the matrix of eigenvectors, U is a matrix with the corresponding
   eigenvalues on the main diagonal.

      >> A = [ 1  2  3
              -2  4  1
               5  6  3 ]

      >> [V,U] = eig(A)
      V =  0.5276    0.5267 - 0.0250i    0.5267 + 0.0250i
           0.2873   -0.1294 + 0.1030i   -0.1294 - 0.1030i
          -0.7994    0.8335              0.8335

      U = -2.4563    0                   0
           0         5.2282 + 0.5919i    0
           0         0                   5.2282 - 0.5919i

8. Notice that eigenvalues can be real or complex.

   The characteristic equation corresponding to the 3x3 matrix A is

      >> Poly = poly(diag(U))
      Poly = 1.0000   -8.0000    2.0000   68.0000

      λ^3 - 8 λ^2 + 2 λ + 68 = 0                                  (28)

As with many things, wikipedia has a very nice article on eigenvectors and
eigenvalues: everything you wanted to know about the eigensystem of A, in
3 pages.

1.11.2 Additional notes on Eigenvectors and Eigenvalues

1. As mentioned, some matrices do not have a complete set of eigenvectors.
   (Student Exercise: what is the name for this ?)

2. Symmetric matrices are a special case.

   For a symmetric real matrix (or Hermitian complex matrix), it is
   guaranteed that all eigenvalues are real. Furthermore, it is guaranteed
   that a complete set of eigenvectors exists !

   System matrices (A in ẋ = A x) are rarely symmetric, but certain other
   important matrices are.

3. The determinant of a matrix is equal to the product of the eigenvalues
   of the matrix:

      det (A) = Π_{i=1}^n λi

   Example:

      >> aa = [ 1 2 3 ; 4 -1 -3; 5 2 1]
      aa =  1   2   3
            4  -1  -3
            5   2   1

      >> [Evecs, Evals] = eig(aa)            %% Find the eigensystem
      Evecs =  0.6128    0.4667    0.2146
               0.0136   -0.8751   -0.8542
               0.7901   -0.1276    0.4734

      Evals =  4.9126    0         0
               0        -3.5706    0
               0         0        -0.3421

      >> prod(diag(Evals))                   %% Product of the eigenvalues
      ans = 6.0000
      >> det(aa)                             %% Equals the determinant
      ans = 6.0000
4. The determinant equation

      det (A - λ I) = 0                                           (29)

   gives a polynomial in λ:

      λ^n + a_(n-1) λ^(n-1) + ... + a1 λ + a0 = 0                 (30)

   (for the example above, λ^3 - 8 λ^2 + 2 λ + 68 = 0).

   Eqn (30) is the characteristic equation of matrix A. The eigenvalues of
   A are the roots of Eqn (30).

   Eqn (29) is useful for theoretical results, but generally not a
   practical method for calculating the eigenvalues, for n > 2:

      Working with the determinant does not lead to a numerically stable
      algorithm.

      The determinant does not give the eigenvectors.

      Going from Eqn (29) to Eqn (30) by solving the determinant involves
      symbolic manipulation of matrix A. It is a lot of work.

      Once you get Eqn (30), how are you going to find the roots of the
      polynomial ? (Answer on the next page.)

5. Matlab does not create the characteristic equation using
   det (A - λ I) = 0 and then solve for the polynomial roots.

   Matlab actually goes the other way, and solves for the roots of a
   polynomial by forming a matrix and finding the eigenvalues of that !
   The polynomial

      λ^3 - 8 λ^2 + 2 λ + 68 = 0

   is represented in Matlab as the vector

      Poly = [ 1, -8, 2, 68]

   and the command roots() finds the roots:

      >> roots(Poly)
      ans = [ 5.2282 + 0.5919i
              5.2282 - 0.5919i
             -2.4563 ]

   Actually, under the hood Matlab forms a matrix (called the companion
   matrix) and applies the eigenvalue routine to that:

      >> compan(Poly)
      ans =  8.0000   -2.0000  -68.0000
             1.0000    0         0
             0         1.0000    0

      >> eig(compan(Poly))
      ans = [ 5.2282 + 0.5919i
              5.2282 - 0.5919i
             -2.4563 ]

   For λ^3 + a2 λ^2 + a1 λ + a0 = 0, the companion matrix is

          -a2  -a1  -a0
      C =   1    0    0
            0    1    0
Relationship:

   Roots of the Characteristic Equation  ⟷  Eigenvalues of the Companion Matrix

For a state-variable model, the poles of the system are the eigenvalues of
the A matrix. If the A matrix has poles in the right half plane, the system
is unstable.

Note: I haven't told you an algorithm for eig(A). The algorithm is: use
Matlab ! Matlab's eig() command uses one from a library of algorithms,
depending on the details of the matrix. The study of efficient algorithms
to find eigenvectors and eigenvalues has been an active area of research
for at least 200 years.

1.11.3 One final fact about the eigensystem: V diagonalizes A

When a complete set of eigenvectors exists, the matrix of the eigenvectors
can be used to transform A into a diagonal matrix, by similarity transform.
The general form for the similarity transform is:

   Â = T A T^-1 ,     T must be invertible                        (31)

THEOREM: the similarity transform preserves eigenvalues, that is

   eig (Â) = eig (A)

Proof: starting with Eqn (26),

   det (A - λ I) = 0

Left multiplying by T and right multiplying by T^-1 :

   det ( T (A - λ I) T^-1 ) = det (T) det (A - λ I) (1 / det (T)) = 0       (32)

where the right hand equality arises because det (A - λ I) = 0 and det (T)
is finite. From (32) it follows that

   det ( T (A - λ I) T^-1 ) = det ( T A T^-1 - λ T I T^-1 ) = det ( Â - λ I ) = 0

Therefore any value λ that is an eigenvalue of A is an eigenvalue of Â.
                                                                  QED
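A quick MATLAB check that a similarity transform preserves eigenvalues (the
particular T below is an arbitrary invertible matrix chosen here for
illustration):

   >> A = [2.0 -0.5; 1.0 0.5];
   >> T = [1 2; 3 4];                   %% any invertible T will do
   >> Ahat = T*A/T;                     %% Ahat = T A T^-1
   >> sort(eig(Ahat)), sort(eig(A))     %% both give 1.0 and 1.5 (to machine precision)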
THEOREM: when a square matrix A has a complete set of independent
eigenvectors

   V = [ v1 , v2 , ... , vn ] ,

the array V of eigenvectors provides a similarity transform

   U = V^-1 A V                                                   (33)

where U is a diagonal matrix with the eigenvalues on the main diagonal.

Proof: since the vectors vi are eigenvectors,

   A vi = λi vi     (and, of course, λi vi = A vi)

so

   A V = A [ v1 , v2 , ... , vn ] = [ λ1 v1 , λ2 v2 , ... , λn vn ]

The matrix on the right can be expressed:

   [ λ1 v1 , λ2 v2 , ... , λn vn ] = [ v1 , v2 , ... , vn ] diag (λ1 , ... , λn) = V U

And so

   A V = V U

Left multiplying both sides by V^-1 gives:

   V^-1 A V = V^-1 V U = U

Equation (33) is a similarity transform with T = V^-1 and Â = U.     QED

V diagonalizes A (continued)

A numerical example:

        1 2 3
   A =  4 5 6
        7 8 9

   [V, U] = eig (A)

   V = [ v1 v2 v3 ] =  -0.2320   -0.7858    0.4082
                       -0.5253   -0.0868   -0.8165
                       -0.8187    0.6123    0.4082

        λ1  0  0     16.1168     0         0
   U =   0 λ2  0  =    0        -1.1168    0
         0  0 λ3       0         0         0.0000

We have:

               16.1168     0         0
   V^-1 A V =    0        -1.1168    0      =  U
                 0         0         0.0000

and, of course,

              1 2 3
   V U V^-1 = 4 5 6   =  A
              7 8 9
2 Two equations of interest

1. Solving for unknown constants (an algebraic equation):

      y = A b ;      y, A : known ;    solve for b                (34)

2. Solving a differential equation:

      ẋ(t) = A x(t) + B u(t) ;    A, B, x(t = 0) : known ;   solve for x(t)

See Strang for a nice treatment of the two problems of linear algebra.

2.1 Algebraic Equations, y = A b

Think of n equations in p unknowns, for example an experiment with a
process:

   y(k) = b1 v(k) + b2 v^2(k)                                     (35)

where v(k) is an independent variable and y(k) is the measurement.

The objective is to determine the elements of the b vector from Eqn (35)
with v(k) known and y(k) measured.

Introducing some notation, we can write:

   y(k) = φ^T(k) b = [ v(k)  v^2(k) ] [ b1 ; b2 ]                 (36)

where (in this example)

   φ(k) = [ v(k) ; v^2(k) ]                                       (37)

For example, if b = [ 2 3 ]^T and v(1) = 2, v(2) = 3, ... , v(n) = -1, then

   y(1) = φ^T(1) b = [  2 4 ] [ 2 ; 3 ] =  2*2 + 4*3 = 16
   y(2) = φ^T(2) b = [  3 9 ] [ 2 ; 3 ] =  3*2 + 9*3 = 33
   ..
   y(n) = φ^T(n) b = [ -1 1 ] [ 2 ; 3 ] = -1*2 + 1*3 =  1
With several measurements (equations) we get a matrix-vector equation (the
bars emphasize that φ^T(k) is a row vector):

   y = [ y(1) ; ... ; y(n) ] = [ φ^T(1) ; ... ; φ^T(n) ] b = A b      (38)

Vocabulary:

   y ∈ R^n is the data vector.

   b ∈ R^p is the parameter vector,  b = [ b1 b2 ]^T.

   v(k) ∈ R^1 is an independent variable that determines the regressor
   vector.

   φ(k) ∈ R^p is the regressor vector,  φ(k) = [ v(k)  v^2(k) ]^T.

   A ∈ R^(nxp) is the regressor matrix.

   The columns of A, which correspond to the elements of φ(), are called
   the basis vectors or basis functions of model A.

Notation:

   ystar, bstar    : true values
   ybar            : measured values
   yhat, bhat      : estimated values
   ytilde, btilde  : misadjustment,  ytilde = yhat - y

   Generally, the true values are unknown.

   yhat = A bhat

2.1.1 Case 1: Where we have n = p independent equations, the exactly
constrained case

In this case A ∈ R^(nxn) is a square matrix. When A is full rank an inverse
matrix exists, so that

   A^-1 A = A A^-1 = I

The solution to Eqn (34) is given as:

   y = A b                       (original equation)
   A^-1 y = A^-1 A b             (left multiply by A^-1)          (39)
   A^-1 y = I b = b

which gives the solution

   bhat = A^-1 y                                                  (40)

Example: with φ^T(k) = [ v(k)  v^2(k) ] based on Eqn (36), suppose we have
data for k = 1, 2, with v(1) = 1.5, v(2) = 3 and y(1) = 12.75, y(2) = 43.5.
Then

   y =  y(1)   =  12.75  ,     A =  φ^T(1)   =  1.5  2.25
        y(2)      43.5              φ^T(2)      3.0  9.0

and so

   bhat = A^-1 y =  2.5
                    4.0
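The Case 1 example can be reproduced with MATLAB's left-division operator
(as in the conditioning examples of Section 1.10):

   >> A = [1.5 2.25; 3.0 9.0];
   >> y = [12.75; 43.5];
   >> bhat = A \ y              %% equivalent to inv(A)*y for square, full-rank A
   bhat = 2.5000
          4.0000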
2.1.2 Case 2: Where n > p, we have more equations than unknowns, the
over-constrained case (Left pseudo-inverse case):

This is a common case. For example, a model has 3 parameters, and we run an
experiment and collect 20 data. We have 20 equations in 3 unknowns.

In this case there is generally no value of bhat which exactly satisfies
y = A bhat.

Defining ytilde = y - A bhat, the left pseudo-inverse gives the solution
minimizing ||ytilde||_2.

To derive the left pseudo-inverse, start with:

   y = A bhat

1. Left multiply each side by A^T :

      A^T y = A^T A bhat

   Note that A^T A is a p x p matrix, for example a 3 x 3 matrix when there
   are 3 unknowns.

2. When A^T A ∈ R^(pxp) is full rank, left multiplying each side by
   (A^T A)^-1 gives:

      (A^T A)^-1 A^T y = (A^T A)^-1 A^T A bhat = I bhat           (41)

3. Which gives

      bhat = (A^T A)^-1 A^T y = A^# y                             (42)

   where A^# = (A^T A)^-1 A^T is called the left pseudo-inverse of A.

Consider the dimensions of A and A^# :

   y = A bhat :     [ n x 1 ] = [ n x p ] [ p x 1 ] ,   with n > p

   Student exercise: what are the dimensions of A^# ?
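A minimal MATLAB sketch of Case 2 using the regressor model of Eqn (35)
(the data values below are made up for illustration): with more
measurements than parameters, Eqn (42) and MATLAB's backslash give the same
least-squares estimate.

   >> v = [0.5; 1.0; 1.5; 2.0; 2.5];           %% 5 measurements, 2 unknowns
   >> A = [v, v.^2];                           %% regressor matrix, rows are phi^T(k)
   >> y = 2*v + 3*v.^2 + 0.01*randn(5,1);      %% data from b = [2; 3] plus a little noise
   >> bhat  = (A'*A) \ (A'*y)                  %% left pseudo-inverse solution, Eqn (42)
   >> bhat2 = A \ y                            %% backslash computes the same LS estimate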
2.1.3 Case 3: Where n < p, we have fewer equations than unknowns, the
under-constrained case (Right pseudo-inverse case):

In the under-constrained case, A is a wide matrix (more columns than rows),
so our equation has the shape:

   [ n x 1 ] = [ n x p , n < p ] [ p x 1 ]

In this case there are generally many values of bhat which exactly satisfy
y = A bhat ; the right pseudo-inverse gives the solution with minimum
||bhat||_2.

The right pseudo-inverse is given by:

   A+_R = A^T (A A^T)^-1                                          (43)

When A A^T ∈ R^(nxn) is full rank, then:

   A A+_R = A A^T (A A^T)^-1 = I

In this case, one solution for bhat is given by:

   bhat = A+_R y = A^T (A A^T)^-1 y                               (44)

Plugging Eqn (44) into y = A bhat gives:

   y = A A^T (A A^T)^-1 y = I y

demonstrating that bhat = A+_R y is a solution.

Student exercise: what are the dimensions of A+_R ?

Remarks

1. For each of cases 1, 2, and 3 we had the requirement that A, A^T A or
   A A^T must be full rank. When these matrices are not full rank, a more
   general method based on the Singular Value Decomposition (SVD) is
   needed.

2. With the SVD and the four fundamental spaces of matrix A we will be able
   to find the set of all vectors bhat which solve

      y = A bhat

   We will see the SVD and the four fundamental spaces of a matrix in Bay,
   Chapter 4.
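A minimal MATLAB sketch of Case 3 (the numbers are chosen here for
illustration): one equation in three unknowns, solved with the right
pseudo-inverse of Eqns (43)-(44); pinv() gives the same minimum-norm
answer.

   >> A = [1 2 3];  y = 6;             %% one equation, three unknowns
   >> bhat = A' * ((A*A') \ y)         %% right pseudo-inverse solution, Eqn (44)
   bhat = 0.4286
          0.8571
          1.2857
   >> A * bhat                         %% = 6, the equation is satisfied exactly
   >> norm(bhat - pinv(A)*y)           %% ~ 0: same minimum-norm solution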
2.2 Differential Equations, ẋ(t) = A x(t) + B u(t)

The state-variable model is a differential equation with the form:

   ẋ(t) = A x(t) + B u(t)                                         (45)

where

   x(t) ∈ R^n is the state vector,
   A ∈ R^(nxn) is the system matrix,
   B ∈ R^(nxm) is the input matrix, and
   u(t) ∈ R^m is the input vector.

The linear state variable model is a very general form for modeling dynamic
systems in control theory, economics and operations research, biology and
other fields. The state-variable model is the topic of the second half of
the course.

Here we only note that solving the differential equation

   ẋ(t) = A x(t) + B u(t)

is fundamentally different from solving the algebraic equation

   y = A b

and involves the eigensystem of the system matrix A, using a similarity
transform and a transformation to modal coordinates to determine the
solution.

Just as solving an algebraic equation,

   3 x = 7

is very different from solving a scalar differential equation,

   3 ẋ(t) = 2 x(t) + 7 u(t)

solving a matrix differential equation requires tools very different from
those for solving a matrix algebraic equation.

The differences between the solutions of algebraic and differential
equations are summarized in Table 1.

   Equation Type                  Algebraic               Differential
   -----------------------------  ----------------------  -------------------
   Nature of solution             A value                 A function of time
   Main tools                     Gaussian elimination,   Eigenvalues,
                                  matrix inverse, SVD     eigenvectors
   Singular matrix                Problems; solution      O.K.
                                  requires SVD
   Complete set of eigenvectors   Not important           Important
   Rectangular matrix             OK                      Impossible

   Table 1: Differences in the approach to algebraic and differential
   equations. A singular matrix is rank deficient.
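To preview the role of the eigensystem, here is a minimal MATLAB sketch (my
own illustration, for the homogeneous case u(t) = 0 and a diagonalizable
A): the solution x(t) = V e^(U t) V^-1 x(0) computed through modal
coordinates matches expm(A t) x(0).

   >> A  = [2.0 -0.5; 1.0 0.5];         %% example matrix from Section 1.11
   >> x0 = [1; -1];  t = 0.7;
   >> [V,U] = eig(A);
   >> x_modal = V * diag(exp(diag(U)*t)) / V * x0    %% V e^(Ut) V^-1 x0
   >> x_expm  = expm(A*t) * x0                       %% same result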
3 Summary

Basics of linear algebra have been reviewed, and the two problems of linear
algebra have been discussed.

Algebraic equations

   y = A b                                                        (46)

take us into vector subspaces, projection and the Singular Value
Decomposition (SVD).

Differential equations

   ẋ(t) = A x(t) + B u(t)                                         (47)

take us into the eigensystem and coordinate transformations.
EE/ME 701: Advanced Linear Systems

Part 2: Vectors and Vector Spaces

Contents

1 Introduction
  1.1  Definitions: three ways to multiply vectors
  1.2  Definitions: Properties of Vectors
       1.2.1  Parallel
       1.2.2  Perpendicular (or Orthogonal)
       1.2.3  Norm (magnitude)
  1.3  Direction Cosine
  1.4  Parallel, Perpendicular and the Zero Vector

2 Vector Spaces
  2.1  Properties the scalars in Eqn (12) must have
  2.2  Properties a Vector Space must have

3 Properties of ensembles of vectors
  3.1  Defining a vector space
       3.1.1  A vector space defined by a spanning set of vectors
       3.1.2  A vector space defined by a set of basis vectors
  3.2  Dimension of a Vector Space

4 Basis of a Vector Space
  4.1  Representation of vector x on basis V
       4.1.1  Introducing notation to help keep track of vectors and bases
       4.1.2  Sn: the standard basis
       4.1.3  Representations on vector spaces other than Rn
  4.2  A Span of a Vector Space
  4.3  Change of basis
       4.3.1  Notation for a change of basis
       4.3.2  Chaining changes of basis
       4.3.3  Change of basis example
  4.4  Change of basis viewed geometrically (this section is connected to Bay, section 2.2.4)
       4.4.1  Example, change of basis viewed geometrically
       4.4.2  Numerical example based on representing "from" basis vectors on the "to" basis vectors
       4.4.3  Summary

5 Vector Subspace (following Bay 2.4)
  5.1  Example proper vector subspace
       5.1.1  Observations on subspaces of Euclidean spaces
  5.2  What about other dimensions of A ?
       5.2.1  The set-theoretic meaning of "almost all"
       5.2.2  A vector y orthogonal to a proper subspace B

6 Projection Theorem
  6.1  Projection Theorem
  6.2  Projection of a vector onto a proper subspace
       6.2.1  First projection example, projection onto a 1-D subspace
  6.3  Normalization of the basis vectors
  6.4  Projection Matrices
       6.4.1  Bay example 2.10, projecting f onto a 2-D subspace
       6.4.2  Projection matrix for the orthogonal complement
       6.4.3  Projection with normalized basis vectors

7 Gram-Schmidt ortho-normalization
  7.1  Process of Gram-Schmidt Ortho-normalization
       7.1.1  Example Gram-Schmidt Ortho-normalization
       7.1.2  Tolerance value for Gram-Schmidt algorithm
  7.2  Projection matrix with GS Ortho-normalization
  7.3  Projection Coefficients
  7.4  Projection onto the orthogonal complement
  7.5  Projection and fitting parameters to experimental data

8 Additional Topics
  8.1  The Four Fundamental Spaces of a Matrix
       8.1.1  Numerical Examples of the four fundamental spaces
       8.1.2  Computing bases for the four fundamental spaces
       8.1.3  Bases for the Four Fundamental Spaces, Numerical Example
       8.1.4  The Four Fundamental Spaces of a Matrix, revisited
       8.1.5  Questions that can be answered with the four fundamental spaces
       8.1.6  Two ways to determine the four fundamental spaces
  8.2  Rank and degeneracy

9 Summary and Review
  9.1  Important properties of Inner Products
  9.2  Important properties of Norms
  9.3  Study tip
1 Introduction
A vector: an n-tuple of numbers or functions. Examples:

   A 3-vector:  x = [ x1 ; x2 ; x3 ]

   An n-vector:  z = [ z1 ; z2 ; ... ; zn ]

   A 4-vector:  x(k) = [ sin (2πk/4) ; cos (2πk/4) ; sin (4πk/4) ; cos (4πk/4) ]

   A 2-vector:  u(t) = [ 2.0 sin (t) ; 2.0 ]

Euclidean Vector: a Euclidean n-space corresponds to our intuitive notion
of 2-D or 3-D space:

   Axes are orthogonal.
   Each vector corresponds to a point.
   Intuitive, because we live in Euclidean 3-space, R^3.
   Length (L2-norm) corresponds to our every-day notion of length.

1.1 Definitions: three ways to multiply vectors

Three types of vector products are:

   Inner product (also scalar or dot product):     gives a scalar value,
   Outer product (also tensor or matrix product):  gives a matrix, and
   Vector product (also cross product):            gives a vector result.

Inner (scalar) product (also dot product): a scalar measure of the
interaction of two vectors of the same dimension,

   <v, w> = v^T w = w^T v = Σ_{i=1}^n vi wi ,     v , w ∈ R^n     (1)

If v is complex, then v^T is the complex-conjugate transpose (note: in
Matlab the ' transpose operator forms the complex conjugate).

The inner product of vectors gives a scalar:

   <v, w> = v^T w = (a scalar)

Outer (matrix) Product: the outer product gives a matrix,

   v >< w = v w^T = (a matrix)

Example:

        1          2                                  2 3 1
   v =  2 ,   w =  3 ,    <v, w> = 11 ,     v >< w =  4 6 2
        3          1                                  6 9 3

Example of using the Outer Product: each contribution to the covariance of
the parameter estimates is given by:

   Σ²_btilde = btilde btilde^T
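The inner and outer products of the example vectors map directly onto
MATLAB expressions:

   >> v = [1; 2; 3];  w = [2; 3; 1];
   >> v'*w                  %% inner product <v,w> = 11
   >> v*w'                  %% outer product, the 3x3 matrix above
   >> dot(v,w)              %% same inner product via the built-in function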
Cross (vector) product:

   v × w = z                                                      (2)

The cross product produces a vector perpendicular to each of v and w. The
magnitude of the vector is given by:

   ||z|| = ||v|| ||w|| sin (θ)                                    (3)

where θ is the angle between v and w.

The cross-product is only defined for vectors in 3-space. It is used, for
example, in electro-magnetics, 3D geometry and image metrology.

Cross product and Outer product are not much used in EE/ME 701.
The Inner (or dot) product will be used all the time.

1.2 Definitions: Properties of Vectors

1.2.1 Parallel

Two vectors v and w are parallel if they can be written

   w = a v                                                        (4)

where a is a scalar.

Example:

        1          3
   v =  2 ,   w =  6 ,     w = 3 v ,    so    v ∥ w
        3          9

Co-linear is another term for parallel vectors, as in "Vectors v and w are
co-linear."

1.2.2 Perpendicular (or Orthogonal)

If

   <v, w> = 0                                                     (5)

then v and w are said to be perpendicular (or, equivalently, orthogonal).

Example:

        1           1
   v =  2 ,   w =  -5 ,
        3           3

   v^T w = [ 1 2 3 ] [ 1 ; -5 ; 3 ] = 1 - 10 + 9 = 0 ,   so   v ⊥ w      (6)
1.2.3 Norm (magnitude)

The norm of a vector x is a non-negative real number, ||x|| ≥ 0. A norm is
a measure of size (of a vector or matrix).

The L_p norms are most familiar.

   L: Henri Lebesgue (1875-1941), who set calculus (integration and
   differentiation) on a rigorous foundation based on set theory.

The L2 norm, ||v||_L2 (or ||v||_2, or simply ||v||), is the familiar notion
of distance:

   ||v||_2 = ( Σ_i vi^2 )^(1/2)                                   (7)

The general L_p norm, ||v||_Lp (or simply ||v||_p), is:

   ||v||_p = ( Σ_i |vi|^p )^(1/p) = ( |v1|^p + |v2|^p + ... + |vn|^p )^(1/p)    (8)

The L2 norm is the default; throughout the literature, ||v|| refers to
||v||_2 unless expressly defined to be a different norm (or any norm).

The L2 norm is related to the dot product of a vector with itself:

   ||v||_2 = <v, v>^(1/2) = ( v^T v )^(1/2) = ( Σ_i vi^2 )^(1/2)  (9)

In a general framework, vectors may be made with elements which are not
numbers, but so long as the dot product is defined, the 2-norm is still
given by Eqn (9). The L2-norm is said to be induced by the dot product: it
is called the induced norm.

1.3 Direction Cosine

The inner product indicates how closely two vectors are related.

If two vectors are parallel:

   |<v, w>| = ||v|| ||w||    or    |<v, w>| / ( ||v|| ||w|| ) = 1.0

   (This is the correct test for ∥.)

If two vectors are perpendicular:

   <v, w> = 0    or    <v, w> / ( ||v|| ||w|| ) = 0

   (This is the correct test for ⊥.)

For vectors forming angles in between ∥ and ⊥, define θ:

   <v, w> / ( ||v|| ||w|| ) = cos (θ)                             (10)

A handy corollary is:

   θ = cos^-1 ( <v, w> / ( ||v|| ||w|| ) )                        (11)

   Figure 1: Direction cosine (the angle θ between vectors v and w).
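Eqn (11) in MATLAB, using the perpendicular and parallel examples of
Section 1.2:

   >> v = [1; 2; 3];  w = [1; -5; 3];
   >> theta = acos( (v'*w) / (norm(v)*norm(w)) )   %% = pi/2: v and w are perpendicular
   >> u = [3; 6; 9];
   >> (v'*u) / (norm(v)*norm(u))                   %% = 1.0 (to roundoff): v and u are parallel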
1.4 Parallel, Perpendicular and the Zero Vector

Looking at Eqn (4),

   w = a v                                                        (4, repeated)

if w = 0 (the zero vector), then

   [ 0 ; 0 ; ... ; 0 ] = w = 0 v

with a = 0. And likewise, if w = 0, then clearly

   <v, w> = 0                                                     (5, repeated)

So in a formal sense, we can say that every vector is both parallel and
perpendicular to the zero vector.

2 Vector Spaces

When we talk about vector spaces we talk about sets of vectors. Closure is
the first important property of sets.

R^n is the most basic vector space. It is the set of all vectors with n
elements.

Closure: a set is said to be closed under an operation if the operation is
guaranteed to return an element of the set.

Examples:

   The integers are closed under addition, subtraction and multiplication.
   The integers are not closed under division.

   The positive integers are closed under addition and multiplication.
   The positive integers are not closed under subtraction or division.

Vector spaces are collections of vectors which are closed under scaled
addition (like superposition). For a vector space S,

   S : {a set of vectors}

   if   x1 , x2 ∈ S    then    x3 = a1 x1 + a2 x2 ∈ S             (12)

The central topic of Bay, chapter 2, is describing the ai, xi and S in the
most general possible way. Let's start with a specific example, S = R^3,
the space in which we live.
Defining a vector x:

        x1
   x =  x2  ,     x1 , x2 , x3 ∈ R                                (13)

Then S = {x} = R^3 : the vector space S is the set of all vectors x
comprised of 3 real numbers.

Examples:

    2        1
    1   ,    0    ∈ R^3
    1        3

Verifying that a1 x1 + a2 x2 ∈ S with an example (choosing the ai to be
real numbers):

         2           1        5
   2.0   1   + 1.0   0   =    2    ∈ R^3                          (14)
         1           3        5

2.1 Properties the scalars in Eqn (12) must have

Eqn (12) comprises scalars and vectors. The scalars must be elements of a
field. For a set with operations to form a field, it must have:

1. An additive identity 0:

      a + 0 = a                                                   (15)

2. A multiplicative identity 1:

      a · 1 = a                                                   (16)

3. Every element must have a negative:

      if a ∈ F , then -a ∈ F ,    a + (-a) = 0                    (17)

4. Operations of addition, multiplication and division must be defined,
   and the set must be closed under these operations.

Examples:

   The integers do not form a field (not closed under division).
   The rational numbers do form a field (+, -, ×, ÷ all work).

The field we will use most often is the real numbers, a ∈ R.
2.2 Properties a Vector Space must have

These are the fundamental properties of vector spaces and operations on
vectors. Other important properties, quantities and calculations are built
out of these basics.

A vector space is a set of vectors. It must have at least these properties:

1. S must have at least 2 elements.

2. The linear vector space S must contain a zero element, 0, such that:

      x + 0 = x ,    ∀ x ∈ S                                      (18)

   (∀ : for all)

3. Vector space S must contain the additive inverse of each element:

      if x ∈ S , then ∃ y ∈ S s.t. x + y = 0                      (19)

   (∃ : there exists)

The vector space (a set of vectors) and its operations must also have the
properties:

1. The vector space S must be closed under addition:

      if x + y = v , then v ∈ S                                   (20)

2. Commutative property for addition:

      x + y = y + x                                               (21)

3. Associative property for addition:

      (x + y) + z = x + (y + z)                                   (22)

4. Closure under scalar multiplication: for all x ∈ S and for all a ∈ F,
   the vector y = a x must be an element of S:

      ∀ x ∈ S ,  ∀ a ∈ F ,   y = a x ∈ S                          (23)

5. Associativity of scalar multiplication: for a, b ∈ F and x ∈ S,

      a (b x) = b (a x)                                           (24)

6. Distributivity of scalar multiplication over vector addition:

      (a + b) x = a x + b x                                       (25)
      a (x + y) = a x + a y                                       (26)

The fundamental properties of fields are such standard components of
ordinary algebra that we forget to think about them.

Any Vector Space (that is, any set of vectors which satisfies the
fundamental properties above) will have the derived properties of vector
spaces which we will develop, such as orthogonality, projection, length,
etc.

For 99% of what we will do in EE/ME 701, we will work with vectors in R^n
over the field of the real numbers, which is to say the familiar vectors
and scalars, such as in Eqn (14).
3 Properties of ensembles of vectors

In many cases it is interesting to consider a finite set of vectors, such
as:

   v1 , v2 , ... , vp ∈ S                                         (27)

Example:

         1      0      1
   V =   2  ,   0  ,   1     ⊂ R^3
         0      1      1

We shall shortly see that a finite set of vectors can define a Vector
Space.

Notation for a finite set of vectors: let V connote a set with p vectors:

   V = { v1 , v2 , ... , vp } ,   or equivalently ,   V = [ v1 v2 ... vp ]      (28)

New vectors can be formed as linear combinations of the vectors v1 ... vp :

   w = a1 v1 + a2 v2 + ... + ap vp

Recall that a set of vectors is linearly dependent if one or more of the
vectors can be expressed as a linear combination of the others, such as:

   vp = b1 v1 + b2 v2 + ... + b_(p-1) v_(p-1)

The expression for linear dependence can also be written:

   vp = [ v1 v2 ... v_(p-1) ] [ b1 ; b2 ; ... ; b_(p-1) ] = V b   (29)

And recall that a set of vectors is linearly independent if there is no set
of values {bi} which satisfies the conditions for Linear Dependence. In
other words,

   Σ_{i=1}^p bi vi = 0

implies that the {bi} are all zero.
3.1 Defining a vector space

In many cases, vector spaces contain an infinite number of vectors, so it
is not possible to define a vector space by writing down all its elements.
However, we can define a vector space in terms of a finite set of vectors.

Most often, we define a vector space as the set of all vectors which can be
created by linear combinations of specified vectors. For example:

                        2          1
   S1 =   x : x = a1    1   + a2   0   ,   a1 , a2 ∈ R            (30)
                        1          2

Vector space S1 is the set of all vectors x such that

   x = a1 [ 2 ; 1 ; 1 ] + a2 [ 1 ; 0 ; 2 ]

for any values of a1 and a2, which are real numbers.

Read Eqn (30) as saying:

   "S1"  =  Vector space S1 ;   "{ }"  =  is the set ;   "x"  =  of vectors x ;
   ":"  =  such that ;   "x = ..."  =  x satisfies the conditions given.

3.1.1 A vector space defined by a spanning set of vectors

Given an arbitrary set of p n-dimensional vectors V = { v1 , v2 , ... , vp },
vi ∈ R^n, the set of vectors defines a vector space

   S = { x : x = a1 v1 + a2 v2 + ... + ap vp }                    (31)

where S is a vector space; S is the set of all vectors formed as linear
combinations of the spanning vectors v1 , v2 , ... , vp.

We can say that S is the space spanned by v1 , v2 , ... , vp, and that the
vectors v1 , v2 , ... , vp span vector space S.

Any set of vectors, at least one of which is non-zero, spans a vector
space.

3.1.2 A vector space defined by a set of basis vectors

A span is a very general concept, perhaps a bit too general. The concept of
a basis for a vector space is more restrictive. It works like a span: we
write that the set of basis vectors {vi} defines a vector space according
to Eqn (31), but for a set {vi} to be a set of basis vectors they must
additionally be linearly independent. When the vectors vi are independent
and we write

   x = a1 v1 + a2 v2 + ... + ar vr                                (32)

there is a unique solution {a1 , a2 , ... , ar} for every vector x ∈ S.
Because the basis vectors vi are linearly independent, there is exactly one
solution for Eqn (32).

A set of basis vectors is the minimum set of vectors that spans a space.

For a given vector space S, the choice of basis vectors is not unique.

Any basis set for S is also a span for S. (Is any span of S also a basis
of S ?)
3.2 Dimension of a Vector Space

Dimension: the dimension of a linear vector space is the largest number of
linearly independent vectors that can be taken from that space.

Examples:

   The dimension of the vector space R^n is n.

   The dimension of the vector space given by Eqn (33) is 2:

                           2          1
      S1 =   x : x = a1    1   + a2   0   ,  a1 , a2 ∈ R ,  xi ∈ R^3      (33)
                           1          2

   The dimension of S1 is the number of linearly independent vectors that
   can be selected from S1.

   Figure 2: S1, a 2-dimensional vector subspace embedded in R^3, with the
   two spanning vectors of Eqn (33) drawn in the subspace.

Definition of vector space S1 (continued):

   A vector subspace is itself a vector space (if we call S1 a subspace, we
   are emphasizing that it is embedded in a larger vector space).

   Example: the universe is R^3 ; S1 is 2-D (Eqn (33) repeated).

   Set S1 is a vector space. Thus, if v and w are elements of S1, then

      z = b1 v + b2 w ,    z ∈ S1 ,    b1 , b2 ∈ R

   Consider the properties of a vector space one by one; you will see that
   they all hold. (Student Exercise)

   Note that even though v, w, z ∈ R^3, it is still true that dim (S1) = 2.
EE/ME 701: Advanced Linear Systems
Section 4.0.0
4 Basis of a Vector Space

Working with a particular vector space V of vectors v, x, y ∈ Rn, if we select
a set of vectors BV = {v1, v2, ..., vr} from V such that

1. We have r vectors, where r is the dimension of vector space V, and

2. The vectors are linearly independent

then we have created a basis for vector space V.

Basis: A set of linearly independent vectors, BV, in vector space V is a basis for
V iff every vector in V can be written as a unique linear combination of
vectors from this set.

A more formal definition is given. Written mathematically,
BV = {v1, v2, ..., vr} is a basis for V iff

    ∀ x ∈ V ,  ∃ exactly one a = {a1, ..., ar}  s.t.  x = a1 v1 + ... + ar vr      (34)

Example basis:

    BV = { [2; -1; 1] , [-1; 0; 2] }

BV is a basis for a vector space V, which has dimension = 2.
Basis BV is illustrated by the two vectors in figure 2.

A basis for V is not unique; any set of r independent vectors in V is a basis for V.

4.1 Representation of vector x on basis V

VECTOR REPRESENTATION THEOREM: Given a set of vectors BV =
{vi} = {v1, v2, ..., vr} that are basis vectors for r-dimensional vector space V
embedded in Rn, then any vector sx ∈ V can be represented as a unique linear
combination of basis vectors

    sx = a1 v1 + a2 v2 + ... + ar vr ,    x, v1 .. vr ∈ Rn                         (35)

Proof:

Existence: Since V is r-dimensional and the set of vectors {x, v1, v2, ..., vr} contains
r + 1 vectors, the set must be linearly dependent. Since the set is linearly
dependent there exists a set of scalars ai (called the basis coefficients) such
that -x + a1 v1 + a2 v2 + ... + ar vr = 0, which gives

    x = a1 v1 + a2 v2 + ... + ar vr                                                (36)

Uniqueness: Suppose the ai giving vector x were not unique. This means there is a second
set of basis coefficients {bi} distinct from the values {ai} such that

    x = b1 v1 + b2 v2 + ... + br vr                                                (37)

Subtracting Eqns (36) and (37) gives

    0 = (a1 - b1) v1 + (a2 - b2) v2 + ... + (ar - br) vr                           (38)

But since the vi are linearly independent, Eqn (38) is possible only if each
of the (ai - bi) = 0, which is to say that {ai} = {bi}, and so the basis
coefficients representing vector x on {vi} are unique.
Eqn (36) gives us a way to represent a vector x on the basis vectors.

Example:

    x = [1; -2; 8] ,    V = {v1, v2} = { [2; -1; 1] , [-1; 0; 2] }

Inspection shows that

    x = 2 v1 + 3 v2 = [v1, v2] [2; 3]

Definition: The vector a = [2; 3] is the representation of the vector x on basis V.

Notice that x, v1, v2 ∈ R3 while a ∈ R2.

Vectors x, v1, v2 have the dimension of the universe in which vector
subspace V is embedded. The representation of a vector on a subspace always
has the dimension of the subspace (which may be less than the dimension of
the universe).

Define V to be the vector subspace spanned by V. The dimension of vector
subspace V is 2.

4.1.1 Introducing notation to help keep track of vectors and bases

Sometimes we are interested in multiple bases. In some cases a vector x
will have a representation on each basis. It helps to have some notation to
keep track of things.

Given a vector space S1 of vectors sx ∈ Rn with a basis V = [v1, v2] and also
with a different basis W = [w1, w2], then a vector sx ∈ S1 can be represented
3 ways:

    sx ,  Vx ,  Wx                                                     (39)

where the left superscript indicates the basis on which the vector is represented.

The vector sx is represented on the standard or Euclidean basis. This is
the ordinary representation of the vector (more on this later). Vx is the
representation of the vector x on basis V.

Example: given the bases V and W for vector subspace S1, and given the
vector sx below, find the representation for sx on each of bases V and W.

    sx = [1; -2; 8] ,   V = [v1, v2] = [2 -1; -1 0; 1 2] ,   W = [w1, w2] = [1 -7; -1 2; 3 4]

Solution: since the columns of V are one basis for S1, we know there exist
basis coefficients a1 and a2 to represent sx on V. The relationships can be
written:

    sx = a1 v1 + a2 v2 = [v1, v2] [a1; a2] = V Vx ,   so   Vx = [a1; a2]
When r < n, we can solve for the basis coefficients with the left pseudo-inverse.
Recalling that it is given that sx ∈ S1, then, since sx = V Vx,

    Vx = (V^T V)^-1 V^T sx                                             (40)

The calculation of Eqn (40) always gives a well defined result. The only
calculation that might not give a valid result is (V^T V)^-1. We are assured
the solution exists by the fact that the columns of V (the basis vectors) are
linearly independent.

Example:

>> V
V =
     2    -1
    -1     0
     1     2
>> x
x =
     1
    -2
     8

Confirm that x is represented by Vx = [2; 3]

>> Vx = inv(V'*V) * V' * x
Vx =
     2
     3
>> V * Vx        %% should give sx
ans =
     1
    -2
     8

Notice that x ∈ R3 while Vx ∈ R2. The vector has the dimension of the
universe, while the representation of the vector has the dimension of the
subspace. Vx = [2; 3] is the representation of vector x ∈ R3 on basis V.

We can also represent vector x on basis W:

    x = b1 w1 + b2 w2 = [w1, w2] [b1; b2] = W Wx ,   so   Wx = (W^T W)^-1 W^T x

Solving, we find

    Wx = [2.4; 0.2] ,   which gives

    [1; -2; 8] = 2.4 [1; -1; 3] + 0.2 [-7; 2; 4]

4.1.2 Sn: the standard basis

A basis that we have been using all along is the standard basis in Rn. For R3
the basis vectors are:

    S3 = [ 1 0 0 ; 0 1 0 ; 0 0 1 ] = [ s1 s2 s3 ]                      (41)

So we might write x = S3 sx. In words, every vector x is represented on the
standard basis by representation sx, with sx = x, and Sn = I (the n x n identity
matrix) is the set of standard basis vectors.
Thus we have found three ways to represent vector x:

1. On the standard basis:   x = S3 sx

2. On the V basis:   x = V Vx = a1 v1 + a2 v2

3. And on the W basis:   x = W Wx = b1 w1 + b2 w2

Basis Facts:

A representation is always on a specific basis, such as: Vx is the
representation of vector x on basis V. You can't talk about the representation
of a vector without specifying the basis.

The dimension of a representation: Notice that x, vi, wi ∈ R3, whereas
Vx, Wx ∈ R2. This is because, for this example, the universe has dimension
n = 3, while the vector subspace has dimension r = 2.

Each vector in a universe has the dimension of the universe, but a
representation of a vector has the dimension of the basis on which it is
represented.

The representation of a given vector on a given basis is unique.

4.1.3 Representations on vector spaces other than Rn

We are chiefly concerned with vectors of numbers, but there are other types
of vectors and vector spaces, and our notions of basis and representation
carry over.

Consider, for example, the set of polynomials of s of degree up to 2, such as
p(s) = 3 + 7 s + s^2. It turns out that this set is a vector space. We can define
the vector space P by

    P = { p(s) : p(s) = a1 + a2 s + a3 s^2 }

If we choose the set of basis functions

    P = { p1(s) = 1 ,  p2(s) = s ,  p3(s) = s^2 }

then the representation of a polynomial p(s) is simply

    p(s) = P Px    with    Px = [a1; a2; a3]

Even when the vectors of the vector space are not vectors of numbers, a
representation of a vector will be (because the representation is always an
r-dimensional vector of basis coefficients).

There is a one-to-one correspondence between a vector p ∈ P and its
representation Px ∈ Rr on basis P. The representation is the vector of
basis coefficients. Equivalently, the vector and its representation are isomorphic.

Useful tools, such as dot product and norm, can be applied to Px.
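A small Matlab sketch of this idea (the variable names are illustrative, not from the notes): the polynomial p(s) = 3 + 7 s + s^2 is represented on the basis P = {1, s, s^2} by the coefficient vector Px, and ordinary vector tools apply to Px:

Px = [3; 7; 1];           % representation of p(s) = 3 + 7*s + s^2 on P = {1, s, s^2}
norm(Px)                  % the norm of the representation
polyval(flipud(Px), 2)    % evaluate p(2) = 3 + 14 + 4 = 21 (polyval wants descending powers)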
4.2 A Span of a Vector Space

Just as a set of basis vectors defines a vector space, a set of spanning vectors
v1, v2, ..., vp defines a vector space:

    Vector Space V = { x : x = a1 v1 + a2 v2 + ... + ap vp }          (42)

The vector space spanned by vectors v1, v2, ..., vp is the set of all vectors
given as linear combinations of the spanning vectors.

A vector space V is spanned by a set of vectors {vi} if every x ∈ V can be written
as a linear combination of the vi's.

We can say that the vi's span Vector Space V. Write V = span {vi}
(V is the vector space spanned by {vi}).

All bases are spans, but not all spans are bases. The difference: a span is any
set of vectors, for example

    V = span { [1; 2] , [0; 5] , [3; 1] , [1; 1] }                     (43)

defines vector space V.

Compare Eqn (42) with Eqn (43); both define vector space V.

The example shows that a span may have redundant vectors. Equivalently, the
vectors of the spanning set may be linearly dependent.

Shortly, we will learn how to form a basis from any span.

4.3 Change of basis

It may be interesting to transform a vector from its representation on one
basis to its representation on another basis. A change of basis is important,
for example, for exploring some properties of state variable models.

Example:

    V Vx = [2 -1; -1 0; 1 2] [4; 5] = sx = [3; -4; 14]

Find Wx, the representation of sx on basis W, with

    W = [1 -7; -1 2; 3 4]

We actually know how to solve this problem, since we know how to find Wx
from sx using W (see Eqn (40) in section 4.1.1). However, it will be handy to
introduce a more general notation that expresses from what basis and to what
basis a vector is being transformed.

The solution is: since

    Wx = (W^T W)^-1 W^T sx = (W^T W)^-1 W^T V Vx                       (44)

Eqn (44) gives the transformation matrix from basis V to basis W,

    WVT = (W^T W)^-1 W^T V = [ 0.60  0.40 ; -0.20  0.20 ]

so

    Wx = WVT Vx = [ 0.60  0.40 ; -0.20  0.20 ] [4; 5] = [4.4; 0.2]
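The calculation can be checked numerically; a minimal Matlab sketch using the bases of this example (the signs of the entries are those recovered from the worked numbers above):

V   = [ 2 -1; -1 0; 1 2 ];      % basis V
W   = [ 1 -7; -1 2; 3 4 ];      % basis W
Vx  = [4; 5];                   % representation of sx on V
sx  = V*Vx;                     % sx = [3; -4; 14]
WVT = inv(W'*W)*W'*V;           % transformation from basis V to basis W, Eqn (44)
Wx  = WVT*Vx                    % [4.4; 0.2], the representation of sx on W
W*Wx                            % reproduces sx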
EE/ME 701: Advanced Linear Systems
Section 4.3.1
EE/ME 701: Advanced Linear Systems
Note: when V is a square matrix
4.3.1 Notation for a change of basis
To change from one representation to another requires a transformation
matrix
W
V
x= W
VT x
where
Wx
Section 4.3.3
V
sT
= V 1
(46)
is the representation on basis W , V x is the representation on V .
The symbol
4.3.2 Chaining changes of basis
W
VT
We can chain together changes of basis. For example, given
indicates that transformation matrix T converts a vector from representation
on basis V (left subscript) to basis W (left superscript). Using the left suband super-scripts leaves the right positions open, for example
W
V T1 ,
the transformation from basis V to the standard basis can be written
s
V
VT x
s
VT
with
1
x = V TV
V T sx ,
so
V
sT
=V
1
= V TV
VT
Part 2: Vectors and Vector Spaces
(47)
G
sT
(48)
it follows that
F
=G
FT s T
Find transformation matrix to convert vector sx to its representation on basis
F , and find the transformation matrix to convert this to basis G
(45)
which gives:
V
F s
x= G
FT s T x
4.3.3 Change of basis example
For the transformation from s to V, we've seen that the basis coefficients
are given by:
V
F
sT
Solution, since
Transformation from basis V to the standard basis: Since, by definition (see
Eqn (39))
s
x = V Vx
x=
find G
sT
W
V T2
might be the transformations from V to W at t1 and t2
G
FT
10
s
x= 4 ,
11
2 2
F = 1 2 ,
1 3
2 4
G = 4 1
1 4
x = Vs T sx
EE/ME 701: Advanced Linear Systems
Section 4.3.3
EE/ME 701: Advanced Linear Systems
Section 4.4.0
4.4 Change of basis viewed geometrically (this section is
Solution: find the two transformation matrices
connected to Bay, section 2.2.4)
F
sT
and then evaluate
Fx = FT sx
s
and
and
G
FT
In section 4.3 we saw vectors represented on the standard basis and two
particular bases,
s
x , Fx , and Gx
Gx = GT Fx.
F
with
%% Find the two transformation matrices
>> FsT = inv(F*F)*F
FsT =
0.3117
-0.3506
0.0260
0.0260
0.2208
0.1688
>> GFT = inv(G*G)*G*F
GFT =
0.3333
-0.3333
0.3333
0.6667
F = 1 2 ,
1 3
G = 4 1
1 4
(49)
And we found the coordinate transformation matrix to transform between
bases
(see Eqn (44))
G
FT
1
GT G
GT F
(50)
which was derived by solving for the basis coefficients.
Now, evaluating Fx and Gx
>> sx =[10; 4; 11]
sx =
10
4
11
>> Fx = FsT * sx
Fx =
2
3
>> Gx = GFT * Fx
Gx =
-0.3333
2.6667
Bay presents a second way to derive the coordinate transformation in section
2.2.4, based on representing the basis vectors of the first basis in the second
basis.
Since F and G are sets of basis vectors for the same vector space, we can
represent each basis vector in the other basis.
Double check that both Fx and Gx give the original sx
>> F*Fx
ans =
10
4
11
>> G*Gx
ans = 10.0000
4.0000
11.0000
EE/ME 701: Advanced Linear Systems
Section 4.4.1
Writing the basis vectors, explicitly showing them to be represented on the
standard basis (these are just the basis vectors shown in Eqn 49,
h
iT
)
for example s f1 = 2 1 1
F=
sf
sf
G=
sg
sg
(51)
We can form G
F T by representing the F basis vectors on the G basis:
h
i
F = G G f1 G f2
EE/ME 701: Advanced Linear Systems
Section 4.4.2
4.4.2 Numerical example based on representing from basis vectors on the
to basis vectors
>> F = [ 2 2
-1 2
1 3 ]
>> G = [ 2 4
-4 1
-1 4 ]
Now
h
x = F Fx = G
Gf
Gf
i
x=G
h
Gf
Gf
x = G Gx (52)
Using the uniqueness of the representation of the vector sx,
i
h
G
F
F
Gf
Gf
x=
x =G
1
2
FT x
So
G
FT
Gf
Gf
(53)
(54)
The transformation from one coordinate frame to another is given as the
representation of the from basis vectors on the to basis vectors.
>> sf1 = F(:,1)
>> sf2 = F(:,2)
sf1 = 2
sf2 =
2
-1
2
1
3
%% Find the representation of the F basis vectors on G
>> Gf1 = inv(G*G) * G * sf1
Gf1 =
0.3333
0.3333
>> Gf2 = inv(G*G) * G * sf2
Gf2 = -0.3333
0.6667
4.4.1 Example, change of basis viewed geometrically
In robotics, rotations are used to change the expression of a vector from one
coordinate frame to another (that is, a change of basis). The rotation matrix
from B to A is given by the axes of B expressed in A
    A_B R = [ A_XB   A_YB   A_ZB ]                                     (55)

where the columns are the axes of frame B expressed in frame A coordinates.
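An illustrative Matlab sketch (the frames and the 30-degree angle are made up for this example): the columns of A_B R are B's axes expressed in A, and the matrix changes the representation of a point from frame B to frame A.

th  = 30;                                   % example: frame B rotated about the z axis
AXB = [ cosd(th); sind(th); 0];             % B's x axis expressed in A
AYB = [-sind(th); cosd(th); 0];             % B's y axis expressed in A
AZB = [ 0; 0; 1];                           % B's z axis expressed in A
ABR = [AXB, AYB, AZB];                      % rotation matrix from B to A, Eqn (55)
Bp  = [1; 2; 3];                            % a point expressed in frame B
Ap  = ABR * Bp                              % the same point expressed in frame A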
4.4.3 Summary
%% Transformation from F to G
>> GFT = [Gf1 Gf2]
GFT =
0.3333
-0.3333
0.3333
0.6667
Looking at the way G
F T is given,
G
FT
%% Representation of x on F
>> Fx = [ 2
3 ]
Fx =
2
3
Gf
Gf
A coordinate transformation is given by representing the from basis
vectors on the to basis.
Note that the equations are, of course, the same. In section 4.3 we wrote
G
FT
%% Representation of x on G
>> Gx = GFT * Fx
Gx =
-0.3333
2.6667
1
= GT G
GT F
In section 4.4 we wrote
G
FT
=
=
%% Finding sx from each of the representation of F and on G
>> sx = F * Fx
sx = 10
4
11
Gf
Gf
1 T s
G f1 ,
GT G
1
h
GT s f1
GT G
sf
1 T s i
GT G
G f2
1
GT F
= GT G
By representing the basis vectors of F in G , we can write the transformation
from F to G .
>> sx = G * Gx
sx = 10.0000
4.0000
11.0000
5 Vector Subspace (following Bay 2.4)
Recall the terminology from set theory: subset and proper subset.
From the set of colors (the universe) {blue, green, red},
Vector subspace: A vector space of dimension r, comprising vectors of n-dimensional elements, where r ≤ n.
{blue, green, red} is a subset but not a proper subset.
A vector subspace is a type of vector space, it must demonstrate all of
the properties of a vector space.
{blue, green, red} ⊆ Colors
{blue, red}, is both a subset and a proper subset.
Proper vector subspace: A vector space of dimension r , comprising vectors of
n elements, where r < n.
{blue, red} ⊂ Colors
A vector subspace is defined by a basis or a span, for example
1
2
S1 = x : x = a1 1 + a2 0 ,
2
1
Then B =
1 ,
1 ,
B = 1 ,
B =
a1 , a2 R
A vector space is a subset of the universe in which it is embedded. For
example
(56)
S2
is a basis for S1 .
S2 R3
1 is also a basis for S1 .
1
1
0 , 1 is one possible span for S1 .
2
3
0
1
0
x : x = a1 0 + a2 1 + a3 0 ,
=
0
1
0
Exercise:
A proper vector subspace is a proper subset of the universe in which it is
embedded. For example
Make up
Another
span
and basis
for S1 .
a1 , a2 , a3 R
S1
2
1
x : x = a1 1 + a2 0 ,
=
1
2
S1 R3
a1 , a2 R
5.1 Example proper vector subspace
An example 2-D subspace embedded in R3:

    S1 = { x : x = a1 [2; -1; 1] + a2 [-1; 0; 2] ,  a1, a2 ∈ R }       (57)

The set given by Eqn (57) is illustrated in figure 3.

Terminology

Given S and B are vector spaces, if B ⊂ S (if B is a proper subset of S),
then we can say that vector space B is embedded in vector space S.
Such as: Vector space S1 is embedded in R3.
Proper vector subspace is used to emphasize that
r = dim (B) < dim (S) .
Often we omit the words proper and vector, such as saying:
Subspace B is embedded in S, which is isomorphic to R3 .
Sub can also be omitted:
Vector space B is embedded in space S.
If a vector space is 2 dimensional, then dim (B) = 2 and we say that B is
2-D, such as: B is a 2-D subspace in R7.

If dim (B) = r: B is an r-D subspace in Rn.

Figure 3: 2D Vector subspace embedded in R3.
Recall that vector space S1
must include the zero element.
must be closed under scalar multiplication.
must be closed under vector addition.
5.2 What about other dimensions of A ?
5.1.1 Observations on subspaces of Euclidean spaces
Proper subspaces of R3 have dimension 1 or 2.
When B A, and thus r = dim (B) < dim (A), there must be vectors in A
not lying in B.
B, dim (B) = 0 contains only one vector: the zero vector.
Does the zero vector, 0 Rn, by itself satisfy the conditions to be a
vector space ?
B, dim (B) = 1: the vectors in B lie on a line passing through the origin.
B, dim (B) = 2: the vectors in B lie in a plane which includes the origin.
Proper subspaces of Rn have dimension 1 ... (n 1).
A 2-D vector subspace can be referred to as a plane.
Student
Exercise
Actually, when dim (B) < dim (A), almost all vectors in A are
outside B.
Consider a class room with the origin in the center of the top of the
desk in front. Almost all vectors in R3 (the class room) do not lie in the
surface of the desktop (a 2-D subspace). The surface of the desktop
has zero volume, and so 0% of the volume of the room.
5.2.1 The set-theoretic meaning of almost all
An r-D vector subspace, with r < n, can be referred to as a hyperplane or
hypersurface.
Almost all y A have property w has a precise meaning in set theory:
The elements of set A which do not have property w have measure
0.
The set of states of the system forms a 4-D hyperplane in R12
A vector spaces of dim (S) > 3 can be referred to as hyperspace, as in
The points lie on the surface of a 6-dimensional hyper-cube lying in
an 8-dimensional hyper-space.
Measure theory is a whole subject unto itself (remember Lebesgue). Think of
it this way:
if you are choosing y randomly from A, the chance of getting one with
property w is 100% (almost all y have property w).
There are elements of A which do not have property w, but the chance of
choosing one is infinitesimal. In set theory we say
The set of y in A without property w has zero measure.
or, equivalently
Almost all y in A have property w.
5.2.2 A vector y orthogonal to a proper subspace B

When proper subspace B is embedded in A, a vector y ∈ A but outside
subspace B can have two relationships to B:

1. Vector y can partially overlap subspace B; formally:

    ∃ x ∈ B  s.t.  ⟨x, y⟩ ≠ 0                                          (58)

2. Vector y can be orthogonal to subspace B, which is to say that y does
not overlap B at all. Formally:

    if  ⟨x, y⟩ = 0  ∀ x ∈ B  then  y ⊥ B                               (59)

Or, equivalently,

    y ⊥ B  ⟺  ∀ x ∈ B ,  ⟨x, y⟩ = 0                                    (60)

Student exercise: show that if

    y ⊥ bi ,   ∀ bi ∈ {bi} a basis for B                               (61)

then

    y ⊥ B                                                              (62)

6 Projection Theorem

Before introducing the projection theorem, we need to introduce a
generalized notion of orthogonality.

Generalized Orthogonality: three flavors of orthogonality

For this discussion (as elsewhere) b, u, v, w, x, y and z are vectors in Rn;
bold capitals, such as B, U, S and S1, refer to subspaces.

Case 1: A vector is orthogonal to a vector. This is our familiar case, based
on the inner product:

    v ⊥ w  ⟺  ⟨v, w⟩ = 0                                               (63)

Case 2: A vector is orthogonal to a subspace. This implies that the vector is
orthogonal to each vector in the subspace:

    v ⊥ B  ⟺  ⟨v, b⟩ = 0  ∀ b ∈ B                                      (64)

Case 3: A subspace is orthogonal to a subspace. This implies that each vector
in the first subspace is orthogonal to each vector in the second subspace:

    B ⊥ U  ⟺  ⟨b, u⟩ = 0  ∀ b ∈ B ,  ∀ u ∈ U                           (65)
6.1 Projection Theorem
Now let's define a projection:
Projection: A projection of a vector onto a subspace is a vector. It is the
component of the original vector lying in the subspace. The remainder
of the original vector must be orthogonal to the subspace.
Looking at figure 4, u is the component of x lying in subspace U. This
means that the other part of x (that part not in subspace U, given by
w = x u) is orthogonal to U, or w U.
Observation: the subspace can have dimension 1, that is, we can project
a vector onto a vector.
PROJECTION THEOREM: Given vector spaces U and S with U a
proper subspace of S, then for all vectors x ∈ S there exists a unique
vector u ∈ U such that w = x − u and w ⊥ U.

Alternatively, in formal notation, given U ⊂ S,

    ∀ x ∈ S  ∃ u ∈ U  s.t.  w = x − u ,  and  w ⊥ U                   (66)
The projection theorem is illustrated in figure 4.
The projection theorem tells us that given any vector and subspace (not
necessarily proper !) the vector can be broken down into 2 parts:
1. A part lying in the subspace, and
2. A part that is orthogonal to the subspace.
Notation: Introduce the notation PU x for the projection of x onto U. When u is
the projection of x onto U, we may write:
u = PU x
Consider the part which does not lie in subspace U.
Define w = x u , we may say that
Figure 4: Orthogonal projection of vector x onto subspace U (from Bay).
vector w is orthogonal to subspace U.
The set of all possible ws forms a vector subspace !
Orthogonal Complement: if U is a subspace of S, the set {w : w S , w U}
is the orthogonal complement of U in S , written: U .
6.2 Projection of a vector onto a proper subspace

In section 6.1 the projection theorem is laid out. The question arises: given
S, U and x, how can u and w be determined? This is the problem of projecting
a vector onto a subspace.

6.2.1 First projection example, projection onto a 1-D subspace

Consider vectors in R4, with a 1-D subspace U = span {u1},

    f = [4; 0; 2; -1] ,    u1 = [1; 2; 2; 1]                           (67)

Suppose we want to find the projection of f onto U. Whatever is left over
must be orthogonal to U, so:

    PU f = a1 u1 ,    w = f − PU f                                     (68)

    ⟨w, u1⟩ = 0                                                        (69)

Because the inner product operation is linear, Eqn (69) can be broken into two
parts. Keeping in mind that w = (f − PU f) and for the 1-D case PU f = a1 u1,

    ⟨w, u1⟩ = ⟨(f − PU f), u1⟩ = ⟨f, u1⟩ − ⟨PU f, u1⟩ = ⟨f, u1⟩ − a1 ⟨u1, u1⟩ = 0    (70)

Starting with ⟨f, u1⟩ − a1 ⟨u1, u1⟩ = 0, we can solve for a1:

    a1 = ⟨f, u1⟩ / ⟨u1, u1⟩ = ⟨f, u1⟩ / ||u1||^2                        (71)

Example: Projection onto a 1-D subspace (continued)

The inner product gives us the projection of one vector onto another.
Projection onto a vector is illustrated in figure 5.

Notice that the magnitude of the projected vector (PU f) is independent of
the length of the basis vector (u1).

Figure 5: Projection of f onto u1 (panel a) and onto a similar, shorter u1' (panel b).
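A minimal Matlab sketch of Eqns (68)-(71), using the f and u1 of Eqn (67) (the sign of the last entry of f is taken from the worked numbers that follow):

f  = [4; 0; 2; -1];
u1 = [1; 2; 2; 1];
a1  = (f'*u1) / (u1'*u1)      % projection coefficient, Eqn (71): 7/10 = 0.7
PUf = a1*u1                   % [0.7; 1.4; 1.4; 0.7], the projection of f onto U
w   = f - PUf;
w'*u1                         % 0: the remainder is orthogonal to u1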
Putting numbers to the example:
EE/ME 701: Advanced Linear Systems
Eqn (71) is:
Using f and u1 as given:
i
h
4 0 2 1
PU f =
2
PU f =
0.1
0.2
4 0 2 1
0.2
0.1
PU f =
2
0.1
0.2
0.2
0.1
0.7
2 1.4
2
= 7 =
10
2 1.4
2
1
0.7
1
||u1 ||2
u1
(72)
1. The direction, given by u1 .
The projection of f must lie in the subspace spanned by u1 .
2. The magnitude, given by the scaled inner product of f and u1 . Call this
magnitude the projection coefficient.
6.3 Normalization of the basis vectors
If the length of u1 is one, then Eqn (72) reduces to:
PU f = hf, u 1 i u 1
hf, u1 i
which has 2 parts:
Using f and u1 , a shorter version of u1 :
Section 6.3.0
0.1
0.7
0.2 1.4
0.2
0.7
=
0.1
0.2 1.4
0.2
0.1
0.1
0.7
0.1
(73)
and the projection coefficient is simply the inner product. This property is
sufficiently handy that we need to call it something:
Normal basis vector: a basis vector with length 1 is said to be normal,
and is sometimes written with a hat : u 1 .
Normalization: the process of making a vector normal.
Matlab example:
>> u1 = [ 1 2 2 1 ];
>> u1hat = u1 / norm(u1);
Of course, the projection of f onto u1 is the same as onto u1 .
6.4 Projection Matrices
Consider example 1 one more time, with u1 normalized:

    u1_hat = u1 / ||u1|| = [0.316; 0.632; 0.632; 0.316] ,    ||u1_hat|| = 1      (74)

And so

    PU f = ( ⟨f, u1_hat⟩ / ||u1_hat||^2 ) u1_hat = ⟨f, u1_hat⟩ u1_hat            (75)

One more interesting fact about projecting f onto U: since ⟨f, u1_hat⟩ is a scalar,
Eqn (73) can be written:

    PU f = ⟨f, u1_hat⟩ u1_hat = u1_hat ⟨u1_hat, f⟩ = u1_hat (u1_hat^T f) = (u1_hat u1_hat^T) f      (76)

M = u1_hat u1_hat^T is thus a term that multiplies a vector to give the projection of
the vector onto a subspace:
In the example, this gives
0.316
0.316
0.316
0.70
0.632 0.632
0.632 1.40
= (2.214)
=
PU f =
4
0
2
1
0.632 0.632
0.632 1.40
0.316
0.316
0.316
0.70
Also, defining
0.316
0.1 0.2 0.2 0.1
i
0.632 h
0.2 0.4 0.4 0.2
0.316 0.632 0.632 0.316 =
M=
0.632
0.2 0.4 0.4 0.2
0.316
0.1 0.2 0.2 0.1
Now
0.1 0.2 0.2 0.1
0.70
0.2 0.4 0.4 0.2 0 1.40
Mf =
0.2 0.4 0.4 0.2 2 1.40
0.1 0.2 0.2 0.1
1
0.70
W = U
notice that
Now for any vector f, the projection is given as: PU f = M f .
3.3
1.4
W
w = f PU f =
0.6
1.7
M is pretty handy also, and so lets call it a projection matrix or
projection operator.
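In Matlab the projection matrix of Eqn (76) is one line (a minimal sketch with this example's u1 and f):

u1  = [1; 2; 2; 1];
u1h = u1 / norm(u1);          % normalized basis vector
M   = u1h * u1h';             % projection matrix onto U = span{u1}, Eqn (76)
f   = [4; 0; 2; -1];
PUf = M * f                   % [0.7; 1.4; 1.4; 0.7]
w   = (eye(4) - M) * f        % the part of f orthogonal to U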
Student exercise, what is the dimension of W ?
Student
exercise
Notice that if we write
6.4.1 Bay example 2.10, projecting f onto a 2-D subspace
then:
0
2
0
, u1 = , u2 =
f=
2
2
1
1
1
1
(77)
and also:
Solution:
and so:
Since PU f U , PU f can be written:
PU f = 1 u1 +2 u2
1 , 2 R
u1 , u2 R4 ,
U = span {u1 , u2 }
(78)
Writing f = 1 u1 + 1 u1 + w, and using Eqn (78) and the linearity of
inner product
hf , u1 i = h1 u1 + 2 u2 + w , u1 i = 1 hu1 , u1 i + 2 hu2 , u1 i + 0
1
2
Part 2: Vectors and Vector Spaces
hu1 , u1 i hu2 , u1 i
hu1 , u2 i hu2 , u2 i
hf , u1 i
hf , u2 i
(Revised: Sep 10, 2012)
hu1 , u1 i hu2 , u1 i
hu1 , u2 i hu2 , u2 i
hf , u1 i
hf , u2 i
hu1 , u1 i hu2 , u1 i
hu1 , u2 i hu2 , u2 i
= UT U
hu1 , fi
hu2 , fi
(82)
= UT f
hf , u1 i
hf , u2 i
(83)
(84)
    [a1; a2] = (U^T U)^-1 U^T f                                        (85)
The projection coefficients are given by Eqn (85).
Equation (85) for projecting f onto basis vectors U is related to what
equation that we have seen before ?
Finally, the projection is give by:
(80)
Which solves to give:
(79)
hf , u2 i = h1 u1 + 2 u2 + w , u2 i = 1 hu1 , u2 i + 2 hu2 , u2 i + 0
Eqn (79) may be written:

    [ ⟨u1,u1⟩  ⟨u2,u1⟩ ; ⟨u1,u2⟩  ⟨u2,u2⟩ ] [a1; a2] = [ ⟨f,u1⟩ ; ⟨f,u2⟩ ]      (81)

    with  U = [ u1  u2 ]
U = u1 u2
| |
Consider again these vectors in R4, and project f onto the
subspace U = span {u1 , u2 }
Section 6.4.1
    PU f = a1 u1 + a2 u2 = U [a1; a2] = U (U^T U)^-1 U^T f             (86)
And in projection matrix form:

    PU f = M f    with    M = U (U^T U)^-1 U^T                         (87)
Student
Exercise
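A Matlab sketch of Eqns (85)-(87) for this example (the vector values are those reconstructed from the worked numbers below):

U = [ 1 0
      2 0
      2 1
      1 1 ];                    % basis vectors u1, u2 as columns
f = [4; 0; 2; -1];
alpha = inv(U'*U) * U' * f      % projection coefficients, Eqn (85): [1; -1]
PUf   = U * alpha               % [1; 2; 1; 0], Eqn (86)
M     = U * inv(U'*U) * U';     % projection matrix, Eqn (87)
M * f                           % same projection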
Running the numbers:
1 0
1 0
Given a vector subspace U with an array of basis vectors U
1
2
2
1
2
0
10
3
=
UT U =
0 0 1 1 2 1
3 2
1 1
hf , u1 i
1
2
2
1
0
7
=
= UT f =
hf , u2 i
0 0 1 1 2
1
or
Using the projection PU f we can find the orthogonal complement, U
1
2
10 3
3 2
7
1
1
1
w = f Pf = (I M) f
where M is the projection matrix. We can say that
w = M f
where M = I M is the projection matrix for projecting a vector onto the
orthogonal complement, and w U .
Using the data of the example,
(88)
0.091 0.091
0.182 0.182
,
0.545 0.455
0.455 0.545
Since
1
2
PU f = 1.0 u1 1.0 u2 =
1
0
1
M = U UT U
UT
0.182 0.364
0.364 0.727
M =
0.091 0.182
0.091 0.182
w = f PU f =
Section 6.4.2
6.4.2 Projection matrix for the orthogonal complement
2 0
U =
2 1
1 1
EE/ME 701: Advanced Linear Systems
1.0
2.0
PU f = M f =
1.0
0.0
0.818
0.364 0.091
0.091
0.364
0.273
0.182
0.182
M = I M =
0.091 0.182 0.455 0.455
0.091 0.182 0.455 0.455
and   w = M_perp f = [ 3  -2  1  -1 ]^T
It would be nice to get rid of the matrix inversion step in projecting a vector
onto a subspace
b=
bT U
U
0.671
0.671
For many reasons, not least because the matrix inverse can be badly
conditioned.
Starting with
hf , u 1 i
0.316 0.632 0.632 0.316 0
2.214
T
=
b f =
=U
0
0
0.707 0.707 2
0.707
hf , u 2 i
1
1
0.671
3.162
2.214
=
=
2
0.671
1
0.707
1.414
1.0
2.0
PU f = 3.16 u 1 1.414 u 2 =
1.0
0.0
Of course, the projection matrix does not change by rescaling
vectors:
0.182 0.364 0.091 0.091
1
0.364 0.727 0.182 0.182
b
bT =
b U
bT U
U
M =U
0.091 0.182 0.545 0.455
0.091 0.182 0.455 0.545
7 Gram-Schmidt ortho-normalization
6.4.3 Projection with normalized basis vectors
With normalized vectors:
0.316
0
0.632
0
b =
U
0.632 0.707
0.316 0.707
EE/ME 701: Advanced Linear Systems
(Revised: Sep 10, 2012)
the basis
(90)
UT U = I
so that
Since
1
M = U UT U
UT = U I 1 UT = U UT
hu1 , u1 i hu2 , u1 i
UT U = hu1 , u2 i hu2 , u2 i
..
..
...
.
.
(91)
(92)
Computing a projection without a matrix inverse requires that
hu1 , u1 i = hu2 , u2 i = = hur , ur i = 1
hu1 , u2 i = hu2 , u1 i = = ui , u j , i 6= j = 0
If
1
UT
M = U UT U
In other words, the basis vectors must be normal and orthogonal.
(89)
Page 61
Orthonormal: Basis vectors which are both orthogonal and normal are
called orthonormal.
7.1 Process of Gram-Schmidt Ortho-normalization
Gram-Schmidt Ortho-normalization is a process that starts with any span of
a vector subspace, and produces an orthonormal basis, V.

In broad terms, the process to make orthogonal basis vectors works this way
(a code sketch follows the example's first pass below):

1. Start with zero basis vectors
2. Choose any vector u in the subspace (choosing, for example, a vector from any spanning set)
3. Subtract from u the projection of u onto the existing basis vectors.
4. If a sufficient vector remains after the subtraction, normalize the remaining vector and add it to the set of basis vectors.
5. Repeat steps 2-4 until the basis set is complete. The basis set is complete when r = n, or when all of the spanning vectors have been used.

7.1.1 Example Gram-Schmidt Ortho-normalization

Starting with the spanning set, let subspace S be spanned by vectors A = [u1, u2, u3],

    S = span { [1; 2; 2; 1] , [0; 0; 1; 1] , [1; 2; 3; 2] }

Find a set of ortho-normal basis vectors for S.

1. Start with zero basis vectors, V = {}.

2. Take any vector from the subspace:

    u1 = [1; 2; 2; 1]

3. Subtract off the projection onto the existing basis vectors (there are none yet, so u1' = u1).

4. If a sufficient vector remains after the subtraction (if ||u1'|| > tol), normalize
the remaining bit and add it to the set of basis vectors:

    v1 = u1' / ||u1'|| = [0.316; 0.632; 0.632; 0.316] ,    V = {v1}

5. Repeat.
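Before continuing with the remaining vectors of the example, here is a minimal Matlab sketch of how a function implementing steps 1-5 could look. The name GramSchmidt() matches the function used later in these notes, but this listing and its tolerance choice are an assumption, not the course's actual implementation:

function V = GramSchmidt(A)
% GRAMSCHMIDT  Orthonormal basis V for the space spanned by the columns of A.
[n, p] = size(A);
tol = max(n, p) * max(sqrt(sum(A.^2, 1))) * eps;   % tolerance scaled to the problem
V = [];                                            % step 1: start with zero basis vectors
for i = 1:p
    u = A(:, i);                                   % step 2: take a vector from the spanning set
    if ~isempty(V)
        u = u - V*(V'*u);                          % step 3: subtract the projection onto V
    end
    if norm(u) > tol                               % step 4: keep it only if enough remains
        V = [V, u/norm(u)];                        %         normalize and append
    end
end                                                % step 5: repeat until the spanning set is used
end

For example, GramSchmidt([1 0 1; 2 0 2; 2 1 3; 1 1 2]) returns two orthonormal columns, matching the two basis vectors found in this example.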
2. Take any vector from the subspace:
0
0
u2 =
,
1
1
V=
0.632
0.632
0.316
0.1 0.2 0.2 0.1
0.3
u3
0.286
0.572
u2
,
v2 = =
u
0.381
2
0.667
V = {v1 , v2} =
(93)
0.286
0.632 0.572
,
0.632 0.381
0.316
0.667
0.316
0.182
0.364
0.091 0.091
0.364 0.727 0.182 0.182
u3
= u3 V V u3 = u3
0.091 0.182 0.545 0.455
0.091 0.182 0.455 0.545
In general, vector u2' is not parallel to u2.
If ||u2'|| > tol, u2' cannot be perpendicular to u2 either.
4. If a sufficient vector remains after the subtraction (if u2 > tol), normalize
the remaining bit and add it to the set of basis vectors.
Note that u2 is the projection of u2 onto the orthogonal complement of V.
0.286
0.632 0.572
,
0.632 0.381
0.316
0.667
0.316
3. Subtract off the projection onto the existing basis vectors.
V=
1
2
u3 =
,
3
2
0.316
0.6
0.2 0.4 0.4 0.2
u2 =
u2 = u2 V V u2 = u2
0.4
0.2 0.4 0.4 0.2
0.7
0.1 0.2 0.2 0.1
Section 7.1.1
2. Take any vector from the subspace:
3. Subtract off the projection onto the existing basis vectors.
EE/ME 701: Advanced Linear Systems
= 10
0.444
0.888
0.444
0.444
15
4. If a sufficient vector remains, normalize the remaining bit and add it to the set
of basis vectors.
u3 < tol, u3 lies in V = span (v1, v2)
No vector added to V
5. All out of vectors in the original spanning set. Done.
5. Repeat
In step 4 of the Gram-Schmidt algorithm, a vector ui is accepted to be a basis
vector if sufficient magnitude remains after projection onto the orthogonal
complement of V:

    if  ||ui'|| > tol ,   include  ui' / ||ui'||

This begs the question: what should be the value of tol?

One possible answer is a value that scales with the number and size of
the vectors and the machine precision,

    tol = max (n, p) * max_i ||ui|| * eps                              (94)

where

    A = [u1 u2 ... up] ∈ Rn×p is the initial set of p spanning vectors,
    max_i ||ui|| is the maximum of the norms of the columns ui ∈ Rn,
    eps is the machine precision.

tol given by Eqn (94) reflects the fact that round-off errors in step 4 depend
on the dimensions of A and the magnitude of the vectors that make up the
calculation.

7.2 Projection matrix with GS Ortho-normalization

Using

    V = [ 0.316 -0.286 ; 0.632 -0.572 ; 0.632 0.381 ; 0.316 0.667 ]

the projection matrix onto the subspace given by

    S = span { [1; 2; 2; 1] , [0; 0; 1; 1] , [1; 2; 3; 2] }

is given by:

    M = V V^T = [ 0.182  0.364  0.091 -0.091
                  0.364  0.727  0.182 -0.182
                  0.091  0.182  0.545  0.455
                 -0.091 -0.182  0.455  0.545 ]                         (95)

And (compare with Eqn (88))

    PU f = M f = [1.0; 2.0; 1.0; 0.0]    for  f = [4; 0; 2; -1]        (96)
Importantly, nowhere in the whole process is a matrix inversion required !
7.3 Projection Coefficients

Note that when the set of basis vectors V is orthonormal, then:

    PU f = a1 v1 + a2 v2 = V [a1; a2] = V (V^T f) = (V V^T) f          (97)

and so the projection coefficients to project f onto V are given simply by:

    [a1; a2] = V^T f                                                   (98)

Compare with Eqn (81) (repeated here):

    [ ⟨u1,u1⟩  ⟨u2,u1⟩ ; ⟨u1,u2⟩  ⟨u2,u2⟩ ] [a1; a2] = [ ⟨f,u1⟩ ; ⟨f,u2⟩ ]

Eqn (98) requires no matrix inverse operation.

We are accustomed to thinking inv(X'*X) can be computed in Eqn (81), and
this is true for reasonably conditioned matrices that aren't too big. But for
singular, poorly conditioned or even just large matrices (50x50 or larger)
inv(X'*X) may not exist, or the computation may lead to large errors.

Gram-Schmidt is one of the most scalable and robust algorithms in linear algebra.

7.4 Projection onto the orthogonal complement

At each iteration of the GS Ortho-normalization, in step 3, we are
subtracting the projection onto the existing basis vectors from the candidate
vector. This is the same as taking the projection onto the orthogonal
complement of the existing basis vectors. For example:

    x2' = x2 − V V^T x2 = (I − V V^T) x2

is the projection of x2 onto V_perp.

Given a basis set V, the projection matrix to project onto the orthogonal
complement is given by:

    M_perp = I − V V^T ,    x2' = M_perp x2                            (99)

Eqn (98) and Eqn (99) require no matrix inverse operation.

Eqn (99) is not too surprising; it is saying: the portion of x2 not lying in V
is the total, minus the bit that is lying in V.

Eqn (99) can be quite handy.
7.5 Projection and fitting parameters to experimental data
8 Additional Topics
Think back to y = A b and modeling experimental data
8.1 The Four Fundamental Spaces of a Matrix
When we write a model:
Consider matrix A and the operation y = A b :
y (k) = T (k) b
and then have
y = A b
b
with
b Rp
y Rn , b
Where p is the number of parameters, and n is the number of data points.
Where y is measured data and y is estimated from the model.
Then
1
b
AT y ,
b = AT A
b ,
y = A b
= y y
The model set (also called the reachable set) is set of outputs, y , given
by the model for any possible tuning of parameters b
b.
For a linear model, the model set forms a linear vector space:
The columns of A are the spanning set for the r-dimensional model
set in the n-dimensional output space.
b are the projection coefficients of the data y onto the
The parameters b
basis vectors of the model set (namely the columns of A).
The G-S algorithm applied to the columns of A gives a basis for the
model set.
The projection theorem tells us that with b
b given by Eqn (100):
(Revised: Sep 10, 2012)
0
1
1
y = Ab
(101)
2. Output space: y Rn
It turns out that each of these spaces is further divided in 2, so there are a total of
four fundamental spaces of a matrix. For an n p matrix:
Input Space:
1. Null Space: Set of vectors b R p such that A b = 0
2. Row Space:
Orthogonal complement of the Null Space.
The row space is spanned by the columns of AT
A basis for the row space is given by Vrow = GS AT
Output Space:
3. Column space:
b Rp
A basis for the column space is given by Vcol = GS (A)
4. Left null space:
Said another way, there is no signal power remaining in ε which can
possibly be modeled by ŷ(k) = φ^T(k) b.
Part 2: Vectors and Vector Spaces
2
A=
The set of y given by the equation y = A b
Then lies in the orthogonal complement of the model set
(write A ).
Two vector spaces associated with multiplying a vector by matrix A are:
1. Input space: b R p
(100)
Page 71
The orthogonal complement of the column space
The set of all y such that y A b , b R p
(Figure 6 appears here; see the caption below.)

The equation illustrated in figure 6 is:

    y = A b = A br + A bn = y + 0

Consider for the Row space:

Inputs come from a part from the Row space and a part from the Null space.

Outputs lie in the Column space. The Left-Null space is unreachable by y = A b.

Any component from the null space adds to the length ||b||2, but contributes
nothing to the output. So the minimum b vector giving y lies entirely in the
row space.

Figure 6: Pictorial representation of the four fundamental spaces of a matrix.
Row space and Null space lie in the input space (Rp, the space of b vectors),
while Column space and Left-null space lie in the output space (Rn, the space
of y vectors). (Adapted from Strang.)

8.1.1 Numerical Examples of the four fundamental spaces

Input Space = Row Space + Null Space

Null Space: The set of input vectors that give no output

    0 = A bn = [1 0 1; 2 0 2; 2 1 3; 1 1 2] [1; 1; -1] ,   bn = [1; 1; -1] ∈ null (A)      (102)

Row Space: Vectors with a component from the row space give a nonzero output

    [3; 6; 9; 6] = A [1; 1; 2] ,   [1; 1; 2] contains a component from row (A)             (103)
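These statements are easy to check numerically; a minimal Matlab sketch with the matrix A used in the numerical examples below:

A  = [ 1 0 1
       2 0 2
       2 1 3
       1 1 2 ];
bn = [1; 1; -1];           % a null-space vector: A*bn = 0
br = [1; 1; 2];            % a row-space vector (orthogonal to bn)
A * bn                     % zero output: bn contributes nothing
A * (br + bn)              % [3; 6; 9; 6], the same output as A*br
norm(br + bn) > norm(br)   % true: the null-space component only adds length to b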
Row and Null Space example
    [3; 6; 9; 6] = A [2; 2; 1] = A [1; 1; 2]

    b1 = [2; 2; 1] = br + bn = [1; 1; 2] + [1; 1; -1]

Vector b1 has a component from the row space and from the null space.

The contribution b1r = [1; 1; 2] determines the output; the contribution
b1n = [1; 1; -1] makes no contribution to the output.

If A b = y ≠ 0, and b has the minimum ||b||2 such that A b = y, then b ∈ row (A).

Column Space example

    [3; 6; 9; 6] = A [1; 1; 2] ,   [3; 6; 9; 6] ∈ col (A)              (104)

The Column Space is the set of output vectors possible from A:

    col A = { y : y = A b }

The columns of A span the column space.

The dimension of the column space is equal to the rank of A:

    dim col A = rank A

Left-Null Space

The Left-Null Space is the set of vectors in the output space that cannot be
reached by y = A b for any value of b. It is the orthogonal complement of col A.
Left-Null Space example:
Row Space:
1
lnull A
yln =
h
iT
Since 1 1 1 1 lies in the Left Null space of A , then the choice
for b which minimizes
is
1
1
1
1
Section 8.1.2
8.1.2 Computing bases for the four fundamental spaces
The Left-Null Space is the
orthogonal complement of the
column space. For example,
e
y =
EE/ME 701: Advanced Linear Systems
(105)
b= 0
0
where
(107)
R is a set of basis vectors spanning the Row Space of A, and
GS AT indicates applying the Gram-Schmidt algorithm to AT
y=
(106)
, gives b Vr , y Rn
with the suitable choice of y .
0
than b = 0
0
(Revised: Sep 10, 2012)
Since the Row Space is spanned by the rows of A, every vector in the
row space is given by:
b = AT y
What Eqns (106) and (105) are saying is that there is no choice for b
which gets any closer to
Part 2: Vectors and Vector Spaces
R = GS AT
Note: Recall that the GS algorithm takes any set of vectors, and returns an
ortho-normal basis on the space spanned by the vectors.
A b
The Row Space is spanned by the rows of A, therefore an ortho-normal
basis on the row space is given by:
Column Space:
The Column Space is spanned by the columns of A
C = GS (A)
(108)
where C is a set of basis vectors spanning the column space of A.
Since the Column Space is spanned by the columns of A, every vector y
in the column space is given by:
y =Ab
with the suitable choice of b .
Null Space:
8.1.3 Bases for the Four Fundamental Spaces, Numerical Example
Note that
Using the GS algorithm, we can determine bases for the four fund. spaces
Mr = R RT
(109)
is a projection matrix, projecting any vector b onto the Row Space.
Since the null space is the orthogonal complement of the row space, the
projection matrix onto the null space is given by:
Mn = I − Mr = I − R R^T
(110)
Since the columns of any projection matrix span the space onto which
the matrix projects, a basis set for the null space is given by:
null (A) = GS (Mn) = GS ( I − R R^T )
(111)
>> A = [ 1 0 1; 2 0 2 ; 2 1 3; 1 1 2]
A =
1
0
1
2
0
2
2
1
3
1
1
2
n = 4
p = 3
r = 2
>> Col = GramSchmidt(A)
Col =
0.3162
-0.2860
0.6325
-0.5721
0.6325
0.3814
0.3162
0.6674
dim Col =
dim Row =
rank A = 2
Left-Null Space
The Left-Null Space is the orthogonal complement of the Column Space.
The projection matrix onto the column space is given as
Mc = CCT
(112)
and so the projection matrix onto the left-null is given as:
Mln = I − Mc
(113)
And finally, a set of basis vectors for the left-null space is given by:
lnull (A) = GS (Mln) = GS ( I − C C^T )      (114)
>> Row = GramSchmidt(A')
Row =
0.7071
-0.4082
0
0.8165
0.7071
0.4082
>> Null = GramSchmidt( eye(3) - Row*Row' )
Null =
0.5774
0.5774
-0.5774
dim Null =
p - dim Row = 1
>> LNull = GramSchmidt( eye(4) - Col*Col' )
LNull =
0.9045
-0.0000
-0.4020
0.3333
-0.1005
-0.6667
0.1005
0.6667
dim LNull =
n - dim Col = 2
8.1.4 The Four Fundamental Spaces of a Matrix, revisited
Figure 7: Pictorial representation of the four fundamental spaces of a matrix
(repeated).
Looking back to figure 6,
The row space and null space are orthogonal complements
The input space, Rp for an n × p matrix, is the union of the row and
null spaces:
p = dim row (A) + dim null (A)
(115)
The column space and left-null space are orthogonal complements
The output space, Rn for an n × p matrix, is the union of the column
and left-null spaces:
n = dim col (A) + dim lnull (A)
(116)
Additionally, the dimensions of the row and column spaces must be equal
and are equal to the Rank of A
    dim col (A) = dim row (A) = rank (A)                               (117)
8.1.5 Questions that can be answered with the four fundamental spaces
Given y = A b
EE/ME 701: Advanced Linear Systems
Section 8.2.0
8.2 Rank and degeneracy
Rank: In matrix theory the rank of a matrix is defined as the size of the largest
sub-array that gives a non-zero determinant.
What is the set of all possible b̂ which give a specific ŷ?
This is satisfactory as a formal, mathematical definition
When y is specified, is there an exact solution for b ?
But the determinant is an unsatisfactory numerical calculation, because it is
numerically very sensitive, and can't handle non-square matrices.
If there is no exact solution, so that ε = y − A b̂ ≠ 0,
what is the set of all possible ε?
Using the Gram-Schmidt algorithm, we can find a set of basis vectors for
the column space (or row space) of A, to determine the rank of A.
What is the dimension and a basis for this set
Is there any non-zero value b̃ such that A b̃ = 0?
b=0
In Matlab, Rank is obtained by the singular value decomposition, and
counting the number of singular values larger than a tolerance value. The
help message of rank() is instructive
What is the set of all possible b̃?
What is the dimension and a basis for this set
Given y, what is the smallest ||y − A b̂||?
8.1.6 Two ways to determine the four fundamental spaces
1. With Gram-Schmidt ortho-normalization
Ortho-normalize the columns of A to get the column space, the left-null
space is the orthogonal complement.
>> help rank
RANK
Matrix rank.
RANK(A) provides an estimate of the number of linearly
independent rows or columns of a matrix A.
RANK(A,tol) is the number of singular values of A
that are larger than tol.
RANK(A) uses the default tol = max(size(A)) * norm(A) * eps.
(emphasis added)
Ortho normalize the rows of A to get the row space, the null space is the
orthogonal complement.
Range: The range of any function is the set of all possible outputs of that
function.
2. With the singular value decomposition (SVD)
The range space of matrix A is another name for the column space of A.
It is the range of function y = A b. It is the vector space spanned by the
columns of A.
Gives additional insight.
The nullity of a matrix is the dimension of its null space, denoted by q (A):

    q (A) = p − r (A) ,   A ∈ Rn×p                                     (118)

where r (A) is the rank of A.

Degeneracy

If rank (A) = min (n, p) we say the matrix is full rank. (It has the greatest
possible rank.)

If rank (A) < min (n, p) we say

The matrix is rank deficient

The matrix has lost rank (if something happened that made it rank
deficient, such as a robot has reached a singular pose)

The matrix is degenerate.

THEOREM: The rank of a matrix product.
Given A ∈ Rn×m and B ∈ Rm×p, and form C = A B ∈ Rn×p; then the following
properties hold:

    rank (C) + q (C) = p                                               (119)

    rank (C) ≤ min ( rank (A) , rank (B) )                             (120)

    q (C) ≤ q (A) + q (B)                                              (121)

The rank and dimension of the Null Space of C are determined by how the
column space of B falls on the row space of A:

    rank (C) = dim intersection ( col (B) , row (A) )     (Student thought problem)

9 Summary and Review

Part 2 offers many definitions and concepts. However, as is often the case
with mathematical domains, there are only a few essential ideas:

A vector space is a set of vectors, and in general will not include all
vectors of the universe.

Simple operations, such as y = A b, lead naturally to vector spaces, and
our understanding of the solution can be in terms of vector spaces.

A vector space is defined by a span or basis.

The inner product is a measure of the degree of overlap (parallelism) of
two vectors.

The norm is a measure of the length of a vector.

Vectors and a vector space can be parallel, orthogonal, or somewhere in
between (include a component of each).

The projection operation determines the components of a vector lying in
or orthogonal to a vector space.

Gram-Schmidt ortho-normalization produces a basis that is handy for
determining projections.

Naturally, there are a variety of details fleshing out each of these essential ideas.
9.1 Important properties of Inner Products
An inner product is an operation on two vectors producing a scalar result.
The following properties hold for inner products:

1. Commutativity: ⟨x, y⟩ = ⟨y, x⟩* (the star indicates the complex conjugate).

2. Distributivity (linearity): ⟨x, y1 + y2⟩ = ⟨x, y1⟩ + ⟨x, y2⟩

3. Can induce a norm: ⟨x, x⟩ ≥ 0 ∀ x, and ⟨x, x⟩ = 0 iff x = 0.

It follows that:

1. Scalar multiplication, right term: ⟨x, α y⟩ = α ⟨x, y⟩

2. Scalar multiplication, left term: ⟨α x, y⟩ = α* ⟨x, y⟩

Additionally: ⟨x, y⟩ may be zero, positive or negative.

9.2 Important properties of Norms

A norm is a measure (in a general sense) of the size of something. A norm is
a function of a single vector that produces a real scalar result, ||x|| ∈ R.

A norm must have the following properties:

1. Positive definiteness: ||x|| ≥ 0, and ||x|| = 0 if and only if x = 0.

2. Scalar Multiplication: ||α x|| = |α| ||x||

3. Triangle Inequality: ||x + y|| ≤ ||x|| + ||y||
The length of the sum of two vectors cannot be greater than the sum of the
individual lengths of the vectors.

4. Cauchy-Schwarz Inequality: |⟨x, y⟩| ≤ ||x||2 ||y||2
Two vectors cannot be more parallel than fully parallel.

Technically, a vector space can be a vector space without having any norm
defined. To be a vector space requires only a set of elements and the 8 properties
described in section 2.2. But for the familiar vector spaces of Rn we have seen
several norms, ||x||1, ||x||2, ||x||∞, etc. A vector space with a norm is
specifically called a normed vector space.
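In Matlab the familiar norms on Rn are available directly (a small sketch; the vector is arbitrary):

x = [3; -4; 12];
norm(x, 1)       % 1-norm:  |3| + |-4| + |12| = 19
norm(x, 2)       % 2-norm:  sqrt(9 + 16 + 144) = 13
norm(x, Inf)     % infinity-norm: max |xi| = 12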
9.3 Study tip:
Embedded vector subspace
Proper vector subspace
Learn the definitions !
Hyperplane, Hypersurface
Terms appear in italic bold where they are introduced and defined.
Almost all elements of A have property p
Working together and talking about the subject matter will help toward this goal.
Flash cards and drill may also be useful.
Orthogonality of subspaces
Representation of a vector
Transformation from one representation to another
Vocabulary :
Projection
Vector
Projection operator, Projection matrix, Projection coefficients
Euclidean vector
Non-orthogonal projection
Inner product (dot product)
Normalization of a vector, Normalized vector
Outer product (matrix product)
Ortho-normal vectors
Cross product (vector product)
Gram-Schmidt orthogonalization
Orthogonality
Model set, reachable set
Norm, Induced norm
Four fundamental spaces
Parallel, Co-linear, Orthogonal, Direction cosine
Row, Column, Null, Left-null Spaces
Definition Vector space
Range, Range space
Scalar, Vector, Closure, Linear combination of vectors
Rank, degeneracy
Additive identity, Multiplicative identity
Full rank, Rank deficient, degenerate
Span, Spanning set
Basis vectors
Dimension of a vector space
Standard basis
Vector universe
Linear Operators on Vector Spaces

Contents

1 Linear Operator . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
2 Rotation and Reflection Matrices . . . . . . . . . . . . . . . . . . 4
2.1 Example Rotation and Reflection Matrices . . . . . . . . . . . . . 5
2.2 Three theorems and a corollary defining rotation matrices . . . . .
2.3 Summary of mathematical properties of rotation matrices . . . . . . 14
2.4 Multi-axis rotations comprise rotations about each axis . . . . . . 15
2.4.1 Rotation matrix in terms of the from-frame axes expressed in to-frame coordinates . . . 18
2.4.2 Example: Photogrammetry, measurement from images . . . . . . . . 20
3 Linear Operators in Different Bases, or A Change of Basis (Bay section 3.1.3) . . . 21
3.1 Linear Operators and a change of Bases . . . . . . . . . . . . . . 22
4 Change of Basis as a Tool for Analysis . . . . . . . . . . . . . . . 25
4.1 Example: non-orthogonal projection onto a plane . . . . . . . . . . 28
4.2 Looking at the Fourier Transform as a change of basis . . . . . . . 37
4.2.1 The Fourier transform as a change of basis . . . . . . . . . . . 42
4.2.2 Using the Fourier transform . . . . . . . . . . . . . . . . . . 43
4.3 Additional examples using change of basis . . . . . . . . . . . . . 44
4.3.1 Matching input and output data to discover an operator . . . . . 44
4.3.2 Operator from data, example . . . . . . . . . . . . . . . . . . 46
4.4 Conclusions: change of basis as a tool for analysis . . . . . . . . 47
5 Operators as Spaces (Bay section 3.2) . . . . . . . . . . . . . . . . 48
5.1 Operator Norms . . . . . . . . . . . . . . . . . . . . . . . . . . 49
5.1.1 Operator norm properties . . . . . . . . . . . . . . . . . . . . 49
5.2 Determining the value of Operator norms . . . . . . . . . . . . . . 50
5.2.1 The L1 norm of an operator . . . . . . . . . . . . . . . . . . . 50
5.2.2 The L2-norm of an operator . . . . . . . . . . . . . . . . . . . 52
5.2.3 The L-infinity norm of an operator . . . . . . . . . . . . . . . 53
5.2.4 The Frobenius norm . . . . . . . . . . . . . . . . . . . . . . . 54
5.3 Boundedness of an operator . . . . . . . . . . . . . . . . . . . . 55
5.4 Operator Norms, conclusions . . . . . . . . . . . . . . . . . . . . 55
5.5 Adjoint Operators . . . . . . . . . . . . . . . . . . . . . . . . . 56
6 Bay Section 3.3 . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
7 Forming the intersection of two vector spaces . . . . . . . . . . . . 58
7.1 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
8 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
1 Linear Operator
An operator is a generalization of the notion of a function. Operators are
functions of numerical arguments, and also functions of functions.
See: http://mathworld.wolfram.com/Operator.html

We'll be focusing on functions of numerical arguments, so an operator is
essentially a synonym for a function.

Linear Operator: An operator A from vector space X to vector space Y, denoted
A : X → Y, is linear if it verifies superposition:

    A (α1 x1 + α2 x2) = α1 A x1 + α2 A x2                              (1)

For vectors x ∈ Rm and y ∈ Rn, Eqn (1) corresponds to ordinary matrix-vector
multiplication with A ∈ Rn×m.

Bay discusses linear operators using examples 3.1 - 3.13.

Example: The projection operator is a linear operator
(mapping a vector from Rn to a subspace of dimension r ≤ n).

2 Rotation and Reflection Matrices

Rotation and Reflection matrices are good examples of linear operators.
They are used extensively and will play an important role in the Singular
Value Decomposition.

Rotation Matrix: A rotation matrix R is an n × n matrix with the properties:

1. The length of vectors is preserved:

    ||R x|| = ||x|| ,   ∀ x ∈ Rn ,  R ∈ Rn×n a rotation matrix         (2)

2. Angles between vectors are preserved:

    ⟨x1, x2⟩ / (||x1|| ||x2||) = ⟨R x1, R x2⟩ / (||R x1|| ||R x2||) ,   ∀ x1, x2 ∈ Rn      (3)

(recall the direction cosine). Since the denominators are the same, Eqn (3) implies

    ⟨x1, x2⟩ = ⟨R x1, R x2⟩                                             (4)

3. Handedness: Rotation matrices preserve handedness; equivalently, for
rotation matrix R, det (R) = +1.

Reflection Matrix:

1. Reflection matrices preserve length

2. Reflection matrices preserve angles

3. Handedness: Reflection matrices reverse handedness; for reflection
matrix Q, det (Q) = -1.
Part 3: Linear Operators
(Revised: Sep 10, 2012)
Page 3
Part 3: Linear Operators
(Revised: Sep 10, 2012)
Page 4
2.1 Example Rotation and Reflection Matrices

2-D, Rotation Matrix:

    R = [ cos(θ)  -sin(θ)
          sin(θ)   cos(θ) ]                                                 (5)

(Determinant will equal +1)

2-D, Rotation Matrix with Reflection:

    Q = [ 1   0 ] [ cos(θ)  -sin(θ) ]  =  [  cos(θ)  -sin(θ)
        [ 0  -1 ] [ sin(θ)   cos(θ) ]       -sin(θ)  -cos(θ) ]              (6)

(Determinant will equal -1)

Setup rotation matrix R1 with θ = -40°, det(R1) = +1

%% Setup a -40 deg rotation matrix
>> theta = -40
>> Ct = cosd(theta);  St = sind(theta);
>> Rot1 = [ Ct, -St; St, Ct ]
Rot1 =
    0.7660    0.6428
   -0.6428    0.7660
>> det(Rot1)
ans = 1.0000

Setup rotation matrix R2 with θ = +40°, det(R2) = +1

%% Setup a +40 deg rotation matrix
>> theta2 = 40
>> Ct2 = cosd(theta2)
>> St2 = sind(theta2)
>> Rot2 = [ Ct2, -St2; St2, Ct2 ]
Rot2 =
    0.7660   -0.6428
    0.6428    0.7660
>> det(Rot2)
ans =
    1.0000

Add a reflection to R2, det(Q) = -1

%% A reflection operator can be an identity matrix
%% with an odd number of -1 elements.
>> Rot2PlusReflection = [ -1  0
                           0  1 ] * Rot2
Rot2PlusReflection =
   -0.7660    0.6428
    0.6428    0.7660
>> det(Rot2PlusReflection)
ans =
   -1.0000
Process points with rotation and reflection matrices.

The rotation and reflection matrices are linear operators; they map points in R^2 onto points in R^2.

>> P1 = [ 0.5000    0.5000    0.7000
          0.7500    0.2500    0.2500 ]

>> P2 = Rot1 * P1
P2 =
    0.8651    0.5437    0.6969
    0.2531   -0.1299   -0.2584

>> P3 = Rot2 * P1
P3 =
   -0.0991    0.2223    0.3755
    0.8959    0.5129    0.6415

>> P3b = Rot2PlusReflection * P1
P3b =
    0.0991   -0.2223   -0.3755
    0.8959    0.5129    0.6415

[Figure: "Rotation and Rotation with Reflection" -- a 2-D plot showing the Original points, the points Rotated by 40 deg, and the points Rotated by 40 deg and Reflected over the Y axis (X values inverted).]

Figure 1: Example of rotation and reflection.

2.2 Three theorems and a corollary defining rotation matrices

THEOREM 3.1: A rotation matrix must have the property that R^T = R^(-1).

Proof: Because lengths must be preserved, the angles condition can be rewritten:

    <x1, x2> = x1^T x2 = x1^T R^T R x2 = <R x1, R x2>                       (7)

For Eqn (7) to be true for all x1, x2 ∈ R^n, R^T R must be the identity matrix, ergo R^T = R^(-1).
QED

Example: Rotation matrices are linear operators

For a rotation of θ [degrees] in R^2, the rotation matrix is:

    R_θ = [ C  -S
            S   C ]                                                         (8)

where C = cos(θ) and S = sin(θ).

As an example, consider θ = 20°; then

    R = [ 0.94  -0.34
          0.34   0.94 ]

Verify that R given by Eqn (8) is an ortho-normal matrix for any value of θ.   [Student Exercise]
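A quick numerical spot-check of the exercise (this complements, but does not replace, the general algebraic argument; the angle below is an arbitrary choice):

%% Check R'*R = I and length preservation for one value of theta
>> theta = 20;
>> R = [ cosd(theta), -sind(theta); sind(theta), cosd(theta) ];
>> R'*R                            % 2x2 identity, to rounding error
>> norm(R*[3; 4]) - norm([3; 4])   % ~0: lengths are preserved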
THEOREM 3.2: For any square ortho-normal matrix A:

 i)   The transpose of the square ortho-normal matrix is the matrix inverse, A^T = A^(-1).
 ii)  Square ortho-normal matrices preserve length: ||A x|| = ||x|| for all x ∈ R^n.
 iii) Square ortho-normal matrices preserve angles:

          <x1, x2> / (||x1|| ||x2||) = <R x1, R x2> / (||R x1|| ||R x2||)

 iv)  The determinant of A is either det(A) = 1 or det(A) = -1.

Proof:

i) To show that A^T A = I, consider the matrix A = [ v1  v2  v3 ]; then

       A^T A = [ <v1,v1>  <v1,v2>  <v1,v3>
                 <v2,v1>  <v2,v2>  <v2,v3>
                 <v3,v1>  <v3,v2>  <v3,v3> ]                                (9)

   If the vi are orthogonal, then

       A^T A = [ <v1,v1>     0        0
                    0     <v2,v2>     0
                    0        0     <v3,v3> ]                                (10)

   If the vi are normal, then

       A^T A = [ 1  0  0
                 0  1  0
                 0  0  1 ]                                                  (11)

   and A^T = A^(-1). The argument of this example extends directly to A ∈ R^(n x n).

ii) Let x2 = A x1; then ||x2|| = sqrt(<x2, x2>) = sqrt(x1^T A^T A x1) = sqrt(x1^T x1) = ||x1||,
    where the A^T A is eliminated in the 3rd step because A^T A = I.

iii) <R x1, R x2> = x1^T R^T R x2 = x1^T x2 = <x1, x2>.
     By property i), R^T R = I. By property ii), ||R xi|| = ||xi||, so

         <x1, x2> / (||x1|| ||x2||) = <R x1, R x2> / (||R x1|| ||R x2||)

iv) Two properties of the determinant are:

    1. For square matrices B, Q and R with B = Q R, det(B) = det(Q) det(R),
    2. For any square matrix B, det(B) = det(B^T).

    Since A^T A = I, it follows that det(A^T) det(A) = det(I) = 1. But since det(A^T) = det(A), it follows that det(A)^2 = 1; thus, given that A is a square ortho-normal matrix, det(A) = +1 or -1.
QED
THEOREM 3.3: Any square ortho-normal matrix A is a rotation matrix if det(A) = +1 and a reflection matrix if det(A) = -1, and any rotation or reflection matrix is a square ortho-normal matrix.

Proof:

i) Any square ortho-normal matrix is either a rotation or reflection matrix.

   Theorems 3.1 and 3.2 establish that a square ortho-normal matrix preserves lengths and angles; therefore it is either a rotation or reflection matrix.

ii) Any rotation or reflection matrix R is a square ortho-normal matrix; proof by contradiction.

   Assume there is a rotation or reflection matrix R which is not a square ortho-normal matrix. That would imply that either the columns of R are not orthogonal or the columns of R are not normalized. Show that each leads to a contradiction.

ii.a) If the columns are not orthogonal, show that R can not preserve angles.

   First, looking back to Eqn (9), if the columns are not orthogonal there must be at least one pair vi, vj, i ≠ j, such that <vi, vj> = aij ≠ 0. Therefore R^T R ≠ I.

   Next we need to show that since R^T R ≠ I, there exist x1, x2 s.t.

       <x1, x2> = x1^T x2  ≠  x1^T R^T R x2 = <R x1, R x2>                  (12)

   Note: In a proof, it is not enough to simply assert Eqn (12). Even though R^T R ≠ I, how do we know there are allowed choices for x1 and x2 such that x1^T x2 ≠ x1^T R^T R x2? The last step of the proof gives a prescription to construct such an x1 and x2.

   Because R^T R ≠ I, there exists at least one x2 such that R^T R x2 = x3 ≠ x2. Because x3 ≠ x2, there must be at least one element of x3 which does not equal the corresponding element of x2; call this the kth element, and choose

       x1 = [ 0  ...  1  ...  0 ]^T

   with the 1 in the kth element. Now

       <x1, x2> = x1^T x2  ≠  x1^T x3 = x1^T R^T R x2 = <R x1, R x2>        (13)

   This contradicts the hypothesis that R is a rotation or reflection matrix.

ii.b) If one or more columns are not normalized, show that R can not preserve angles.

   In this case there is no <vi, vj> = aij ≠ 0, i ≠ j; but there is <vi, vi> = aii ≠ 1. Select x2 = vi (and, for example, x1 = vi as well); then R^T R x2 = aii x2, and

       <x1, x2> = x1^T x2  ≠  x1^T aii x2 = x1^T R^T R x2 = <R x1, R x2>

   This contradicts the hypothesis that R is a rotation or reflection matrix.

Thus it is shown that, given R is a rotation or reflection matrix, the assumption that R is not ortho-normal leads to a contradiction.
QED

Thus all rotation or reflection matrices are ortho-normal matrices.
THEOREM 3.4: Any two ortho-normal coordinate frames (sets of basis vectors) A and B in R^n are related by a rotation matrix R and at most one reflection.

Proof: We can transform vectors represented in either coordinate frame to the standard frame by

    sx = A ax ,      sx = B bx

and so

    a_bT = A^(-1) B

and so

    det( a_bT ) = det( A^(-1) ) det( B ) = det(B) / det(A)

Note: with A, Q, R ∈ R^(n x n):

    det( A^(-1) ) = 1 / det(A) ,      det( Q R ) = det(Q) det(R)

Since A and B are ortho-normal bases, det(A) = ±1 and det(B) = ±1, which shows that det( a_bT ) = ±1, and so a_bT incorporates at most one reflection.

COROLLARY 3.4: The action of any two reflections in R^n is to restore the original handedness. Even in R^n !

2.3 Summary of mathematical properties of rotation matrices

- Rotations and reflections are linear operators; they map from R^n to R^n by a matrix multiplication.
- Matrices that preserve lengths and angles are either rotation or reflection matrices.
- All rotation or reflection matrices are square ortho-normal matrices, giving R^T R = I. Thus R^T = R^(-1).
- If R is a rotation matrix, R^T is a rotation matrix.
- If R1 and R2 are rotation matrices, R3 = R1 R2 is also a rotation matrix.
- All square ortho-normal matrices are either rotation or reflection matrices:
    if det(R) = +1 the matrix is a rotation matrix,
    if det(R) = -1 the matrix is a reflection matrix.
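The closure properties in this summary are easy to check numerically; a short sketch (the two angles are arbitrary choices):

%% Product and transpose of rotation matrices are again rotation matrices
>> R1 = [ cosd(30), -sind(30); sind(30), cosd(30) ];
>> R2 = [ cosd(50), -sind(50); sind(50), cosd(50) ];
>> R3 = R1 * R2;
>> R3' * R3            % identity: R3 is ortho-normal
>> det(R3)             % +1: R3 is a rotation (in fact an 80 deg rotation)
>> det(R1')            % +1: the transpose is also a rotation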
2.4 Multi-axis rotations comprise rotations about each axis.

Bay gives the example of a rotation operator.

In Robotics and elsewhere, the rotation operator is called simply a rotation matrix.

Pitch, roll and yaw rotations are seen in figure 2. These are the rotations about the three axes, and can also be referred to as:

- Rotation about the X-axis, Rx (pitch)
- Rotation about the Z-axis, Rz (roll)
- Rotation about the Y-axis, Ry (yaw)

The terms pitch, roll and yaw are assigned to different axes by different authors. Rx, Ry, Rz have the advantage of being unambiguous.

An example multi-axis rotation is seen in figure 3.

Multi-axis rotations preserve length and angles (so a Euclidean basis set, such as the X, Y, Z axes, remains orthogonal).

[Figure: four 3-D plots -- Original Coordinates; Pitch 30.00 [deg]; Roll 30.00 [deg]; Yaw 30.00 [deg] -- each showing the Xa, Ya, Za axes and the rotated Xb, Yb, Zb axes.]

Figure 2: 3D rotations, illustrating individual rotations Rx (pitch), Rz (roll), and Ry (yaw). Note right-hand rule for rotation direction.
[Figure: three 3-D views (Above view, Side view, Top view) of a general rotation with Pitch: 30.00, Roll: 20.00, Yaw: 25.00 [deg], showing the Xa, Ya, Za and Xb, Yb, Zb axes.]

Figure 3: General 3D rotation, combining rotations about all three axes.

2.4.1 Rotation matrix in terms of the from-frame axes expressed in to-frame coordinates

A rotation matrix provides a transformation from one coordinate frame to another; we can call these the "from" frame and the "to" frame.

For example, the rotation from B to A in figure 4 is given as

    A_B R = [ C  -S     =  [ 0.866  -0.500
              S   C ]        0.500   0.866 ]

Point Pa is given as

    BPa = [ 0.7         APa = A_B R BPa = [ 0.356
            0.5 ] ,                         0.783 ]

[Figure: 2-D plot showing the X and Y axes of frame A, the XB and YB axes of frame B, and the point Pa.]

Figure 4: Rotation from the B frame to the A frame, A_B R.

The rotation matrix from B to A is given by the axes of B expressed in A. Look at figure 4,

    A_B R = [ AX_B   AY_B ]

Look at the B axes in figure 4, expressed in the A coordinate frame.
3-D example

The rotation from the B frame to the A frame in figure 3 is given by:

    A_B R = [  0.85    0.49   -0.17
              -0.31    0.74    0.60
               0.42   -0.45    0.78 ]

The X-axis of the B frame, expressed in A coordinates, is:

    AX_B = [  0.85
             -0.31
              0.42 ]

In general, the rotation from a B coordinate frame to an A coordinate frame is given by:

    A_B R = [ AX_B   AY_B   AZ_B ]

2.4.2 Example: Photogrammetry, measurement from images

A typical application comes from photogrammetry, where it is often necessary to shift vectors from target to camera coordinates.

To shift coordinates from camera to target coordinates:

    tPa = t_cR cPa + tPc                                                    (14)

where tPc ∈ R^3 is an offset vector.

By convention, the rotation from the camera frame to the target frame is given by rotations about the three axes, corresponding to three angles pitch, roll and yaw: Rx(pitch, θ), Rz(roll, φ), Ry(yaw, ψ).

    t_cR = Rz(φ) Ry(ψ) Rx(θ)

with

    Rz = [ Cφ  -Sφ   0          Ry = [  Cψ   0   Sψ          Rx = [ 1    0     0
           Sφ   Cφ   0                  0    1   0                  0   Cθ   -Sθ
           0    0    1 ] ,            -Sψ    0   Cψ ] ,             0   Sθ    Cθ ]

giving

    t_cR = [ CφCψ    CφSψSθ - SφCθ    CφSψCθ + SφSθ
             SφCψ    SφSψSθ + CφCθ    SφSψCθ - CφSθ
            -Sψ      CψSθ             CψCθ           ]                      (15)

Compare Eqn (15) with Bay, Eqn (3.16). Bay uses a different ordering for the rotations.

Because matrix multiplication does not commute, Bay's 3-axis rotation matrix - while similar - is not exactly the same as t_cR.

There are at least 48 ways to put together a 3-axis rotation matrix, and they are all found somewhere in the literature.
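The composite rotation of Eqn (15) is easy to form numerically; a short sketch using the angles of figure 3 (the Rz·Ry·Rx ordering is the one used above -- other orderings give different, equally valid matrices):

%% Build a 3-axis rotation and confirm it is a rotation matrix
>> pitch = 30;  roll = 20;  yaw = 25;      % [deg], as in figure 3
>> Rx = [1 0 0; 0 cosd(pitch) -sind(pitch); 0 sind(pitch) cosd(pitch)];
>> Rz = [cosd(roll) -sind(roll) 0; sind(roll) cosd(roll) 0; 0 0 1];
>> Ry = [cosd(yaw) 0 sind(yaw); 0 1 0; -sind(yaw) 0 cosd(yaw)];
>> R  = Rz * Ry * Rx;      % composite, Eqn (15)
>> R' * R                  % identity: lengths and angles preserved
>> det(R)                  % +1: handedness preserved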
3 Linear Operators in Different Bases, or
  A Change of Basis (Bay section 3.1.3)

When we make a change of basis, we change the axes on which a vector is represented. For example, given two bases on R^2,

    U = [ u1  u2 ] = [ 1   0             V = [ v1  v2 ] = [ 0.8   -1
                       0   0.5 ] ,                          0.8    0 ]

The vector f = [ 0.5  1 ]^T can be represented

    f = [ 0.5      = 0.5 [ 1     + 2.0 [ 0
          1.0 ]          0 ]           0.5 ]

which is equivalent to writing

    f = U uf = [ u1  u2 ] [ 0.5
                            2.0 ]

or

    f = 1.25 [ 0.8    + 0.5 [ -1         that is,   f = V vf = [ v1  v2 ] [ 1.25
               0.8 ]           0 ] ,                                        0.5  ]

With a change of basis from U to V, the representation of a vector changes, but the vector itself (vector f) remains the same.

3.1 Linear Operators and a change of Bases

A linear operator is just a map from one vector onto another:

    y1 = A x1                                                               (16)

If we change the representation of the vectors, we have to make a suitable change in the linear operator.

Change the representation of the vectors:

    x2 = Bx x1 ,      y2 = By y1                                            (17)

where Bx and By are basis transformations in the input and output spaces, respectively, and must be square and invertible.

Rewriting Eqn (16) with the change of basis:

    y1 = A x1                                        (Eqn 16, repeated)
    y2 = Ā x2                                                               (18)

Relating Eqn (16) to Eqn (18) gives

    By y1 = Ā Bx x1      or      y1 = By^(-1) Ā Bx x1                       (19)

Using the uniqueness of Eqns (16) and (18), Eqn (19) implies that

    A = By^(-1) Ā Bx      or, equivalently      Ā = By A Bx^(-1)

We started with the equation

    y1 = A x1
and ended up with the equation

    y2 = Ā x2

where the input and output bases of linear operator A have changed.

Which brings up this point: implicit in any linear operator are the bases in which the input and output are expressed. We normally assume these to be the standard Euclidean bases for R^m and R^n.

Notice that when we write

    y2 = By y1 ,      x2 = Bx x1

the transformation matrices By and Bx must be square matrices, and full rank to be invertible. By ∈ R^(n x n), Bx ∈ R^(m x m).

A special case of basis transformation arises when A is a square matrix, or A : R^n -> R^n. In this case the input and output transformations can be the same,

    y2 = B y1 ,      x2 = B x1

Combining with Eqns (16)-(19) above, we can write

    y1 = A x1 = B^(-1) Ā B x1

which gives:

    A = B^(-1) Ā B                                                          (20)

    B A B^(-1) = Ā                                                          (21)

When A and B are square and B is invertible, Eqn (20) has a special name. It is called a similarity transformation.

Similarity transformations preserve the eigenvalues; in other words eig(A) = eig(Ā), where eig(A) is the vector of eigenvalues of A.

Similarity transformations are going to give us the freedom to re-write system models from one basis to another, to explore model properties such as modal response, controllability and observability.
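A quick numerical illustration of eigenvalue preservation (the matrices here are arbitrary example choices, not from Bay):

%% Similarity transformation preserves eigenvalues
>> A = [ 0  1; -2  -3 ];       % example operator
>> B = [ 1  1;  0   2 ];       % any square, invertible basis change
>> Abar = B * A * inv(B);      % Eqn (21)
>> eig(A)                      % -1 and -2
>> eig(Abar)                   % the same eigenvalues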
4 Change of Basis as a Tool for Analysis

We have seen the basic mechanics of:

- Representation of a vector on a specific basis,
- How to change the basis of vectors, and finally
- How to adapt an operator, when the vectors it operates on change basis.

These are powerful tools for several types of problems. Bay homework problems 3.4, 3.5, 3.6, 3.7, 3.9, 3.10 and 3.11 are all addressed by a change of basis.

A change of basis is the mathematical foundation for many common methods:

- Fourier transform
- Wavelet transform
- Expansion onto Laguerre and Legendre polynomials

And applications:

- Reconstructing MR images
- jpeg image compression
- Speech compression for cell-phone transmission

And for control systems, a change of basis is necessary to solve

    ẋ(t) = A x(t) + B u(t)

at all.

The basic architecture of an application of change of basis is this. Given an application:

1. The application domain has a natural basis.
   Example: Time-domain, for waveform compression.
   Call this basis s, for the standard basis.

2. The application is difficult in the original basis.
   Example: If we just throw out 9 out of 10 samples in the time domain, the reconstruction will be very bad.

3. There is an alternative basis on which the application is easy.
   Example: If the data vector (a voice signal) is represented on a set of wavelet vectors (discrete wavelet transform), the application is easy: just throw out the coefficients for basis vectors that make little difference for human perception.
4. We solve the problem or achieve the application in 3 steps:

   Step 1. Transform the data from the s basis into a basis where the application is easy; call this the F basis (it can have a different name for each application).

   Step 2. Solve the application with data in the new F basis.

   Step 3. Transform the results back to the s basis, for utilization.

       sA = sFT  FA  FsT                                                    (22)

[Figure: block diagram -- "Problem expressed in s basis (problem is generally unsolvable on the s basis)" --> transform to the alternative F basis (problem is solvable on the F basis) --> "Solve Problem" --> transform back --> "Solution expressed in s basis".]

Figure 5: Problem solving steps using a change of basis.

[Figure: the same diagram applied to Fourier's problem -- "Problem: Cannons blowing up. Heat distribution known in s basis (s: just x, y, z). No solution to the heat-conduction equation for general h0(x,y,z)." --> "Alternative, F, basis is sine and cosine functions." --> "Heat conduction can be solved for a sin(x) initial distribution. By superposition, the solution for many sine functions is the sum of the solutions for each individual sine function." --> "Solution expressed in s basis. Hot points determined, cannon redesigned."]

Figure 6: Problem solving steps using a change of basis.

4.1 Example: non-orthogonal projection onto a plane

Bay problem 3.9 raises an interesting challenge from computer graphics.

Problem 3.9 (adjusted values):

Let P be the plane in R^3 defined by 1x - 1y + 2z = 0, and let l be the vector

    l = [ 2
          1
          1 ]

Denote by A : R^3 -> R^3 the projection operator that projects vectors in R^3 onto the plane P, not orthogonally, but along vector l.

Projected points can be pictured as shadows of the original vectors onto P, where the light source is at an infinite distance in the direction l.

1. Find the matrix of operator FA which projects a vector represented in basis F = { f1, f2, f3 } onto plane P, where f1 and f2 are any non-collinear vectors in the plane P, and f3 = l.

2. Find the matrix of operator sA which projects a vector represented in the standard R^3 basis onto plane P.

Non-orthogonal projection is a standard operation in computer graphics. Rendering systems do shading and compute reflections by representing complex surfaces as many flat triangular tiles, and computing the intersection point of many rays with these tiles.
[Figure: sketch of a plane with points x1 -> y1 and x2 -> y2 projected along the shadow ray l; n is the surface normal.]

Figure 7: Illustration of projection along ray l. In computer graphics, complex surfaces can be represented as a mesh of triangular tiles.

The problem can be stated: Given a ray originating at point

    x = [ x
          y
          z ]

following the line

    l = [ 2
          1
          1 ]

where does the ray strike the plane defined by the surface normal n = [ 1; -1; 2 ] ?

Being able to cast the relation as a linear operator, of course, greatly simplifies and accelerates the calculation. A Play Station 3 can perform this calculation several billion times per second.

Definitions and background details:

The origin: For the plane to be a 2-D vector subspace, the origin must lie in the plane. In practical systems with many tiles (planes), the origin is offset to a point in each plane as it is processed, so that a projection operator can be used.

Surface Normal: The plane is specified by its surface normal, a vector that is orthogonal to the plane. This is a standard way to specify an (n-1)-dimensional hyper-surface in R^n.

It is necessary to find basis vectors for plane P. We can use the fact that all vectors in plane P are orthogonal to n.

The projection: The calculation

    y = A x

is a projection of a point (x) onto the surface (to point y). Since x and y are both 3-vectors, A is a 3x3 matrix.

Note: up to now we have considered only orthogonal projections; that is, if g = P f is the projection of f onto a subspace, and w = f - g, then w ⊥ g.

But in this case, as with many projections in computer graphics, the projection is not orthogonal. That is: the line l is not orthogonal to the plane or, equivalently, l is not parallel to n.

Bases: This problem is approached in problem 3.9 (a) on a basis F. In this note, points in the standard space (x, y, z) are labeled sx and sy. Vectors expressed on basis F are labeled Fx and Fy.
Suggested Approach: With many engineering challenges, it is good to ask the question: when I have an answer, how can I verify that it is the correct answer? Turn problem 3.9 around and ask:

    Given a point sxi in R^3 and its shadow in plane P (call this syi), how can I independently verify that syi is correct?

For this problem, the reverse problem, verification, may be easier than the analysis to determine sy, and thinking through how to verify a correct answer may help find how to determine sy.

Considering verification of the correctness of a solution:

1. Given a point sxi and a point syi in the plane, derive the calculation to verify:
   (a) that syi is in plane P, and
   (b) that syi is the shadow of sxi.

Example Data: Consider the point

    sx1 = [ 2        which projects onto      sy1 = [ -2.67
            3                                          0.67
            4 ]                                        1.67 ]

(a): to verify that point sy1 lies in plane P we need basis vectors for P. These will be any two independent vectors lying in P; that is, any two independent vectors, each orthogonal to n.

Considering that n = [ 1; -1; 2 ], one choice is:

    P = [ 2   1
          0   1
         -1   0 ]                                                           (23)

Verifying that syi lies in plane P: One way to check that sy1 lies in P is to form the projection of sy1 onto P, and verify that it equals sy1.

The projection of sy1 onto P is given by:

    ŷ = P (P^T P)^(-1) P^T syi

If ŷ = syi, then syi lies in P.

>> P = [ 2 1 ; 0 1 ; -1 0 ]
P =
     2     1
     0     1
    -1     0

>> sy1 = [-2.67; 0.67; 1.67];

>> %% Verify sy1 lies in P
>> sy1hat = P * inv(P'*P) * P' * sy1
sy1hat = -2.67
          0.67
          1.67
>> sy1hat - sy1
ans =
   1.0e-15 *
   -0.8882
         0
   -0.1665

(b): to verify that syi is the shadow of sxi, consider what it means to be the shadow:

    syi - sxi = γ l                                                         (24)

If syi - sxi = γ l for some scalar γ, then syi is the shadow of sxi.
Verifying that syi is the shadow of sxi: To show that sy1 is the shadow of sx1, show that (sy1 - sx1) is parallel to l. For the example data we find:

%% Difference Vector
>> ll = sy1 - sx1
ll = -4.6667
     -2.3333
     -2.3333

%% Projection ray l
>> l = [ 2; 1; 1 ];

%% Term-by-term ratio
>> ll ./ l
ans = -2.3333   -2.3333   -2.3333

So sy1 - sx1 = -2.3333 l.

Now considering text problem 3.9.

(Problem 3.9 part a)

1. Construct the set of basis vectors

       F = [ f1  f2  l ]

   where f1 and f2 are basis vectors in the plane, and l is the projection ray. Given P above, the set of basis vectors for F is:

       F = [ f1  f2  l ] = [ 2   1   2
                             0   1   1
                            -1   0   1 ]                                    (25)

   Many F matrices are possible, with f1 and f2 as basis vectors for P.

2. Given a vector Fx1 represented on basis F, find the operator FA that determines Fy1, that is, determines the shadow point represented on basis F.

Discussion: Using F as a set of basis vectors, then any point sxi is represented as

    sxi = a1 [ 2; 0; -1 ] + a2 [ 1; 1; 0 ] + a3 [ 2; 1; 1 ] = F Fxi ,      with      Fxi = [ a1; a2; a3 ]

Find syi on the F basis. Given

    Fyi = [ b1; b2; b3 ]

so that

    syi = b1 [ 2; 0; -1 ] + b2 [ 1; 1; 0 ] + b3 [ 2; 1; 1 ] = F Fyi

To find syi in plane P, set b3 to 0 !

This gives:

    Fyi = [ b1       [ 1  0  0
            b2    =    0  1  0    Fxi
            0  ]       0  0  0 ]
For Fyi to be the projection of Fxi along l, in the F basis vectors, we can only modify the 3rd coefficient, giving

    b1 = a1 ,      b2 = a2

Properties (a) yi lies in P, and (b) yi is the projection along l of xi, determine the projection in basis F:

    [ b1       [ 1  0  0     [ a1
      b2    =    0  1  0       a2
      0  ]       0  0  0 ]     a3 ]                                         (26)

Or

    Fyi = FA Fxi      with      FA = [ 1  0  0
                                       0  1  0
                                       0  0  0 ]                            (27)

FA is the projection operator on the F basis.

(Problem 3.9 part b)

1. Starting with FA, find the operator sA that determines sy1 corresponding to a point sx represented on the standard basis.

Discussion: We have these three relationships:

- Standard map from F coordinates to standard coordinates:

      syi = F Fyi ,      so      sFT = F                                    (28)

- The standard map from s coordinates to F coordinates (just the inverse of Eqn (28)):

      Fxi = F^(-1) sxi ,      note      FsT = F^(-1)                        (29)

- And we have the operator in F coordinates, FA.

Answer: put the three pieces together

    sy = sFT  FA  FsT  sx  =  sA sx                                         (30)

>> sA = F*[1 0 0; 0 1 0; 0 0 0] * inv(F)
sA =
    0.3333    0.6667   -1.3333
   -0.3333    1.3333   -0.6667
   -0.3333    0.3333    0.3333

Note: there is only one operator sA that projects along line l to plane P in standard coordinates. The numerical values come out to those given, whatever basis is chosen for P.
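A short check of sA against the example data used above (the same sx1, sy1, n and l as before):

%% Verify sA on the example point, and check two properties of a projection
>> sx1 = [2; 3; 4];
>> sA * sx1               % gives [-2.67; 0.67; 1.67] = sy1
>> [1 -1 2] * (sA*sx1)    % ~0: the image lies in the plane P
>> norm(sA*sA - sA)       % ~0: a projection operator is idempotent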
4.2 Looking at the Fourier Transform as a change of basis

The Fourier transform is written as an integral transform. Given f(t):

    Fourier Transform:            F(ω) = ∫ f(t) e^(-2πjωt) dt              (31)

    Inverse Fourier Transform:    f(t) = ∫ F(ω) e^(+2πjωt) dω              (32)

For the Discrete Fourier Transform (DFT), the DFT and Inverse DFT are given by summations in the place of integrals:

    DFT:            F(k) = Σ_{j=1}^{N} f(j) e^(-(2πi/N)(j-1)(k-1))         (33)

    Inverse DFT:    f(j) = (1/N) Σ_{k=1}^{N} F(k) e^(+(2πi/N)(j-1)(k-1))   (34)

where f(j) is the time-domain signal, F(k) is the frequency-domain signal, and DFT is the Discrete Fourier Transform.

The DFT gives a signal in the frequency domain, ωk = (k-1)/N [cycles/sample].

>> w1 = 1/16;  w2 = 1/32;
>> jjs = 1:128;
>> for j1 = jjs,
      f(j1) = cos(2*pi*w1*j1) ...
            + cos(2*pi*w2*j1);
   end
>> F = fft(f);
>>
>> kk = (0:127)/2;
>> figure(1),
>>    subplot(3,1,1), plot(jjs, f)
>>    subplot(3,1,2), plot(kk, real(F))
>>    subplot(3,1,3), plot(kk, imag(F))

[Figure: three panels -- f(j) versus j [sample number]; real F(k) and imag F(k) versus k [wave number].]

Figure 8: Signal and Discrete Fourier Transform.

Consider the inverse DFT

    f(j) = (1/N) Σ_{k=1}^{N} F(k) e^((2πi/N)(j-1)(k-1))          (Eqn 34, repeated)

Defining basis vectors

    vk = [ e^((2πi/N)(1-1)(k-1))         [ e^(i ωk (1-1))
           e^((2πi/N)(2-1)(k-1))     =     e^(i ωk (2-1))
              ...                             ...
           e^((2πi/N)(N-1)(k-1)) ]         e^(i ωk (N-1)) ]                 (35)

where

    ωk = 2π(k-1)/N                                                          (36)

is the frequency corresponding to component F(k).

Using the vk, the inverse DFT of Eqn (34) takes the form:

    f = (1/N) ( F(1) v1 + F(2) v2 + ... + F(N) vN )                         (37)

where f = [ f(1) f(2) ... f(N) ]^T is the time-domain signal, and F = [ F(1) F(2) ... F(N) ]^T is the frequency-domain signal.

Eqn (37) has the form of expanding f on a set of basis vectors. The 1/N term is a normalization.
Putting the basis vectors together in an array

    V = [ v1  v2  ...  vN ]                                                 (38)

we find

    f = (1/N) V F                                                           (39)

(where F is the Fourier transform of f).

An important property of the Fourier basis functions is that they are orthogonal. That is, for Eqns (35) and (38):

    V̄^T V = N I ,      or      V̄^T = N V^(-1)                               (40)

(The Fourier basis functions require the term N to be normalized.)

Looking at Eqn (33), and considering that v̄^T is the complex conjugate transpose of v, we find that the DFT is given by:

    F = V̄^T f                                                               (41)

Putting together Eqns (39) and (41) gives

    f = (1/N) V V̄^T f                                                       (42)

where V V̄^T is a projection operation, when the basis vectors are orthogonal.

Summary:

- Eqn (35) defines an orthogonal basis of N-element vectors (or functions, for the continuous-time FT).
- The DFT, Eqn (41), is a change of basis, from the standard basis of f(j) to the Fourier basis F(k).
- The IDFT is the change of basis back.
- Because the Fourier basis functions are orthogonal, no matrix inverse is required to compute the change of basis (Fourier would have had a difficult time doing a large matrix inverse in 1805!)

Let's try an example, with N = 6. Eqn (35) gives the code:

>> w0 = 2*pi/N              %% Fundamental frequency
>> for kk = 1:N,            %% Build the N basis vectors
>>    for jj = 1:N,         %% The N elements of each vector
>>       V(jj,kk) = exp(1i*w0*(jj-1)*(kk-1));
>>    end
>> end
V =
  1.00         1.00         1.00         1.00         1.00         1.00
  1.00         0.50+0.87i  -0.50+0.87i  -1.00+0.00i  -0.50-0.87i   0.50-0.87i
  1.00        -0.50+0.87i  -0.50-0.87i   1.00+0.00i  -0.50+0.87i  -0.50-0.87i
  1.00        -1.00+0.00i   1.00+0.00i  -1.00+0.00i   1.00+0.00i  -1.00+0.00i
  1.00        -0.50-0.87i  -0.50+0.87i   1.00+0.00i  -0.50-0.87i  -0.50+0.87i
  1.00         0.50-0.87i  -0.50-0.87i  -1.00+0.00i  -0.50+0.87i   0.50+0.87i
Now consider the FFT

>> f = [ 1 2 3 4 5 6 ]
f =
     1     2     3     4     5     6

%% Using the standard FFT function
>> F1 = fft(f)
F1 = 21.0000
     -3.0000 + 5.1962i
     -3.0000 + 1.7321i
     -3.0000
     -3.0000 - 1.7321i
     -3.0000 - 5.1962i

%% Using a matrix multiply, for a change of basis
>> F2 = V' * f'
F2 = 21.0000
     -3.0000 + 5.1962i
     -3.0000 + 1.7321i
     -3.0000 - 0.0000i
     -3.0000 - 1.7321i
     -3.0000 - 5.1962i

%% The inverse Fourier transform
>> f3 = (1/N) * V * F2
f3 = 1
     2
     3
     4
     5
     6

>> V' * V
ans =
     6     0     0     0     0     0
     0     6     0     0     0     0
     0     0     6     0     0     0
     0     0     0     6     0     0
     0     0     0     0     6     0
     0     0     0     0     0     6

Finally, V is a set of orthogonal basis vectors (and 1/N is required to normalize).

4.2.1 The Fourier transform as a change of basis

If the Fourier Transform is just a matrix multiplication for a change of basis, why are textbook derivations based on summations and integrals?

1. The matrix multiply form only works for sampled signals (discrete signals, or the discrete Fourier transform). Continuous signals require integrals (and can be defined in terms of inner products).

2. Matrix multiplication is conceptually simple, but computationally inefficient. For a 1024 element FFT, a 1024x1024 matrix V would be required.

Fourier actually focused on the columns of V. Fourier's insight contained 2 parts:

1. He could solve the problem of heat conduction for an initial heat distribution given by a basis vector vk.

2. The basis vectors of the Fourier transform are orthogonal!

   Ordinarily, given basis vectors V and data f, to find the basis coefficients we need to solve

       F = V^(-1) f ,      or      ( V̄^T V )^(-1) V̄^T f

   But the Fourier basis vectors are orthogonal, with a very simple normalizing factor,

       F = V̄^T f

   so no matrix inversion is required!
4.2.2 Using the Fourier transform

Suppose we had an operator FA that operates on frequency data, not on time-domain data, and we are given f, time-domain data:

    y = (1/N) V  FA  V̄^T f                                                  (43)

Where the logic is:

- Convert f to the frequency domain            (action of V̄^T)
- Apply the operator in the frequency domain   (action of FA)
- Convert the answer back to the time domain   (action of V)

Now, using Fourier's technique, we can just make a time-domain operator

    sA = (1/N) V  FA  V̄^T                                                   (44)

And so

    y = sA f                                                                (45)
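As a concrete sketch of Eqns (43)-(45): the frequency-domain operator FA below is a made-up example (an ideal filter that keeps only the DC component); V is the N = 6 Fourier basis built earlier.

%% Build a time-domain operator from a frequency-domain operator, Eqn (44)
>> N = 6;  w0 = 2*pi/N;
>> V = exp(1i * w0 * (0:N-1)' * (0:N-1));   % Fourier basis vectors, Eqn (35)
>> FA = diag([1 0 0 0 0 0]);                % keep only F(1), the DC term
>> sA = real( (1/N) * V * FA * V' );        % Eqn (44); the result is real
>> f  = [1 2 3 4 5 6]';
>> y  = sA * f                              % every element equals mean(f) = 3.5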
4.3 Additional examples using change of basis

4.3.1 Matching input and output data to discover an operator

Given the data

    { x1, x2, x3 } = { [ 1      [ 4      [ -1
                         2  ,     5  ,      2
                         3 ]      6 ]       0 ] }

    { y1, y2, y3 } = { [ 0      [ 4      [ -1
                         2  ,     1  ,      0
                        -2 ]      6 ]       1 ] }

find the operator A to solve

    y = A x

The key to finding a good basis for solving the problem is to realize that a basis F can be found so that the x data are

    { Fx1, Fx2, Fx3 } = { [ 1      [ 0      [ 0
                            0  ,     1  ,     0
                            0 ]      0 ]      1 ] }

with { Fy1, Fy2, Fy3 } the corresponding outputs on basis F. Then the operator would simply be:

    FA = [ Fy1 , Fy2 , Fy3 ]

so that

    Fy1 = FA Fx1 = FA [ 1; 0; 0 ] ,      Fy2 = FA Fx2 = FA [ 0; 1; 0 ] ,      etc.
So the trick is to find a basis F on which

    Fx1 = [ 1; 0; 0 ] ,      Fx2 = [ 0; 1; 0 ] ,      etc.

The solution is to choose as basis vectors the xi vectors themselves. Then

    F = [ x1 , x2 , x3 ]                                                    (46)

With Eqn (46)

    x1 = F Fx1 = [ x1 , x2 , x3 ] [ 1; 0; 0 ] ,      etc.

which gives

    sFT = F ,      FsT = F^(-1)                                             (47)

On the F basis:

    Fy = FA Fx      with      FA = [ Fy1 , Fy2 , Fy3 ] = FsT [ sy1 , sy2 , sy3 ]

Now, to find the operator in standard coordinates, it is the usual equation:

    sA = sFT  FA  FsT                                                       (48)

And the operator gives y's from x's in the standard coordinates:

    y = sA sx

4.3.2 Operator from data, example

Setting the X and Y data

>> X = [ 1   4  -1
         2   5   2
         3   6   0 ]

>> Y = [ 0   4  -1
         2   1   0
        -2   6   1 ]

Now setting up the change of bases

>> sFT = X;
>> FsT = inv(sFT)
FsT =
   -0.8000   -0.4000    0.8667
    0.4000    0.2000   -0.2667
   -0.2000    0.4000   -0.2000

The operator on the F basis

>> FA = FsT * Y
FA =
   -2.5333    1.6000    1.6667
    0.9333    0.2000   -0.6667
    1.2000   -1.6000         0

Now convert the operator to the standard basis

>> sA = sFT * FA * FsT
sA =
    1.8000    0.4000   -0.8667
   -1.2000   -0.6000    1.4667
    3.8000    2.4000   -3.5333

Double checking:

>> sA * X(:,1)
ans =
    0.0000
    2.0000
   -2.0000

>> sA * X(:,2)
ans =
    4.0000
    1.0000
    6.0000

>> sA * X(:,3)
ans =
   -1.0000
   -0.0000
    1.0000
4.4 Conclusions: change of basis as a tool for analysis

Some problems are much more easily solved in a special coordinate frame that is different from the natural (standard) coordinates of the problem.

In these cases the best way, and possibly the only way, to solve the problem is to transform the data into the special coordinates, solve the problem, and transform the answer back out again.

With linear transformations and operators, we can then build the operator directly in standard coordinates with Eqn (22).

Since the time of Fourier, coordinate transformation has been a standard tool, but we might call it by other names, or never explicitly compute the matrices F and F^(-1).

Coordinate transformation is the only general way to solve the equation

    ẋ(t) = A x(t) + B u(t)

5 Operators as Spaces (Bay section 3.2)

The set of linear operators from one vector space into another (or into itself) also forms a vector space.

Working with the space of all operators A : R^m -> R^n, and with

    y1 = A1 x ,      y2 = A2 x

then

    A1 x + A2 x = (A1 + A2) x

and all the required properties of a vector space are satisfied:

1. The 0 operator is included in the set
2. For every operator there is the additive inverse operator
3. Closure under addition
4. Commutativity of addition
5. Associativity of addition
6. Closure under scalar multiplication
7. Associativity of scalar multiplication
8. Distributivity of scalar multiplication over operator addition
5.1 Operator Norms

Operators also have norms. Intuitively, the size of an operator relates to how much it changes the size of a vector. Given:

    y = A x

with suitable norms ||y|| and ||x||, the operator norm ||A||op is defined by:

    ||A||op = sup_{x ≠ 0} ||y|| / ||x|| = sup_{||x||=1} ||y||               (49)

where sup, supremum, indicates the least upper bound.

The operator norm is induced by the vector norms ||y|| and ||x||.

An operator matrix does not have to be square or full rank; it can be any matrix, A ∈ R^(n x m).

5.1.1 Operator norm properties

Operator norms have these properties:

1. ||A x|| <= ||A||op ||x||                      (from the definition of the operator norm)
2. ||A1 + A2||op <= ||A1||op + ||A2||op          (triangle inequality)
3. ||A1 A2||op <= ||A1||op ||A2||op              (Cauchy-Schwarz inequality)
4. ||α A||op = |α| ||A||op                       (scalar multiplication)

5.2 Determining the value of Operator norms

Just as there are several vector norms, each induces an operator norm.

5.2.1 The L1 norm of an operator

The L1 norm of an operator answers this question: given a vector x with ||x||1 = 1, and y = A x, what is the largest possible value of ||y||1?

Or said another way, what is the maximum (L1) gain of operator A?

Example:

    y = A x = [ 1  4
                2  5
                3  6 ] x ,      what is the maximum possible ||y||1 / ||x||1 ?

For the moment let's consider x with ||x||1 = 1. (Through linearity, ||y||1 just scales with ||x||1.)

It turns out that the choice for x that gives the largest possible 1-norm for the output is to put all of ||x||1 on the element corresponding to the largest column vector of A. (Largest in the L1 sense.)

Choose:

    x = [ 0
          1 ]      then      ||x||1 = 1 ,  ||y||1 = 15

so

    ||A||1op = 15
Example:

    A = [ a1  a2 ] = [ 1  4
                       2  5
                       3  6 ] ,      y = A x ,      ||a1||1 = 6 ,  ||a2||1 = 15

so

    for x = [ 1   ,      y = [ 1; 2; 3 ] ,         ||y||1 = 6
              0 ]

    for x = [ 0.5 ,      y = [ 2.5; 3.5; 4.5 ] ,   ||y||1 = 10.5
              0.5 ]

    for x = [ 0   ,      y = [ 4; 5; 6 ] ,         ||y||1 = 15
              1 ]

Then

    ||A||1 = max_j || a_j ||1                                               (50)

The L1 norm of matrix A is the largest L1 norm of any column vector of A.

5.2.2 The L2-norm of an operator

The L2-norm of an operator gives the maximum L2 length of an output vector for a unit input vector:

    ||A||2 = sup_{x ≠ 0} ||y||2 / ||x||2 = sup_{x ≠ 0} ||A x||2 / ||x||2 = sup_{||x||2=1} ||A x||2       (51)

Bay defines the L2 norm of a matrix operator as:

    ||A||2 = max_{||x||2=1} { ȳ^T y }^(1/2) = max_{||x||2=1} { x̄^T Ā^T A x }^(1/2)                      (52)

where the overbar indicates the complex conjugate.

Eqn (52) is just a re-statement of the definition of an operator norm, with the ||y||2 expanded inside the brackets.

The L2 norm of a matrix is the largest singular value of the matrix

    ||A||2 = σ̄(A)                                                           (53)

It is determined using the singular value decomposition (a topic of Bay chapter 4). For example:

>> [U,S,V] = svd(A)
U =
   -0.429    0.806    0.408
   -0.566    0.112   -0.816
   -0.704   -0.581    0.408
S =
    9.508        0
        0    0.773
        0        0
V =
   -0.386   -0.922
   -0.922    0.386

>> x2 = [ .386; .922 ];
>> y2 = A*x2
y2 = 4.0740
     5.3820
     6.6900

>> norm(y2) / norm(x2)
ans =
    9.5080
5.2.3 The L∞-norm of an operator

The L∞-norm of an operator gives the maximum L∞ length of an output for a unit input:

    ||A||∞ = sup_{x ≠ 0} ||y||∞ / ||x||∞ = sup_{x ≠ 0} ||A x||∞ / ||x||∞ = sup_{||x||∞=1} ||A x||∞       (54)

Bay defines the L∞ norm of a matrix operator as:

    ||A||∞ = max_i Σ_{j=1}^{m} | a_ij |                                     (55)

The L∞-norm is given by the row vector of A with the greatest L1-norm.

For the example matrix:

    A = [ r1       [ 1  4
          r2    =    2  5
          r3 ]       3  6 ] ,

    r1 = [ 1  4 ] ,   ||r1||1 = 5
    r2 = [ 2  5 ] ,   ||r2||1 = 7
    r3 = [ 3  6 ] ,   ||r3||1 = 9

So ||A||∞ = 9.

To see why ||A||∞ is given by the 1-norms of the rows (rather than the ∞-norms of anything!) consider that the candidate inputs with ||x||∞ = 1 are the vectors with every element equal to ±1:

    x = [ 1    ,  [  1    ,  [ -1    ,  [ -1
          1 ]      -1 ]        1 ]       -1 ]

One of the above vectors, multiplying A, gives the maximum ∞-norm output. And that max is the greatest 1-norm of a row.

5.2.4 The Frobenius norm

As an alternative to the operator norms induced by the vector norms (the 1-, 2- and ∞-norms, above), one can define a norm directly on the elements of the matrix. The entry-wise norms are given by:

    ||A||p = ( Σ_{i=1}^{n} Σ_{j=1}^{m} | a_ij |^p )^(1/p)                   (56)

Note, care must be taken with notation, because the normal notion of the operator norm is induced, as in section 5.2.2, not given by Eqn (56).

A special case of these so-called entry-wise norms is the Frobenius norm

    ||A||F = ( Σ_{i=1}^{n} Σ_{j=1}^{m} | a_ij |^2 )^(1/2)                   (57)

The Frobenius norm has the properties of a norm (described in section 5.1).

The Frobenius norm has the additional advantage of being easy to compute. (Recall that finding ||A||2 requires solving the singular value decomposition.)

Additional relationships for the Frobenius norm are given by:

    ||A||F = [ tr( Ā^T A ) ]^(1/2)                                          (58)

    ||A||F = ( Σ_{i=1}^{min(n,m)} σi²(A) )^(1/2)                            (59)

where tr() denotes the trace of a matrix, which is the sum of the elements on the diagonal; and the σi in Eqn (59) are the singular values.
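For reference, Matlab's norm() computes each of these matrix norms directly; the values below are for the running example A = [1 4; 2 5; 3 6]:

%% The four matrix norms of the example matrix
>> A = [ 1 4; 2 5; 3 6 ];
>> norm(A, 1)          % 15     largest column 1-norm   (section 5.2.1)
>> norm(A, 2)          % 9.508  largest singular value  (section 5.2.2)
>> norm(A, Inf)        % 9      largest row 1-norm      (section 5.2.3)
>> norm(A, 'fro')      % 9.539  sqrt(trace(A'*A))       (section 5.2.4)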
See, for example:

    J-C Lo and M-L Lin, "Robust H∞ Control for Fuzzy Systems with Frobenius Norm-Bounded Uncertainties," IEEE Transactions on Fuzzy Systems 14(1):1-15.

5.3 Boundedness of an operator

An operator is said to be bounded if there exists a finite α such that

    ||A x|| <= α ||x||

which is equivalent to saying ||A|| <= α.

5.4 Operator Norms, conclusions

All 4 matrix norm definitions satisfy properties 1-4 of operator norms (section 5.1.1).

5.5 Adjoint Operators

Adjoint Operator: The adjoint of a linear operator A is denoted A* and must satisfy the relationship

    <A x, y> = <x, A* y>      for all x and y

The transpose is a special kind of operator, called an adjoint operator.

The adjoint operator is a general concept that applies to all types of vector spaces (such as vectors of polynomials).

For finite vectors on an ortho-normal basis (e.g., x ∈ R^n), a linear operator is just a matrix, A.

- The adjoint of a real-valued operator is just the matrix transpose: A* = A^T.
- The adjoint of a complex-valued operator is just the complex-conjugate transpose: A* = Ā^T.

Example:

    A = [ 1     4-j          A* = [ 1      2     3-j
          2     5                   4+j    5     6   ]
          3+j   6   ] ,

Matlab: A* is A'
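A quick numerical check of the adjoint relationship for the complex example above (x and y are arbitrary complex test vectors):

%% Verify <A x, y> = <x, A' y> numerically
>> A = [ 1, 4-1i; 2, 5; 3+1i, 6 ];
>> x = randn(2,1) + 1i*randn(2,1);
>> y = randn(3,1) + 1i*randn(3,1);
>> (A*x)' * y  -  x' * (A'*y)        % ~0: A' (ctranspose) is the adjoint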
When A = A* (e.g., A = [ 2    3+j
                         3-j  7   ] ):

- A is a symmetric matrix for the real case,
- A is a Hermitian matrix for the complex case (Hermitian = complex conjugate transpose, A' in Matlab).

We can also say that A is a self-adjoint operator.

Hermitian matrices have two important properties that we will use to develop the singular value decomposition:

- They always have a complete set of eigenvectors.
- The eigenvalues are always real (even if the matrix is complex).

6 Bay Section 3.3

A system of simultaneous linear equations can be written in the form

    y = A b

Bay addresses the various possibilities of the equation y = A b.

We will come back to this topic after covering the singular value decomposition.

7 Forming the intersection of two vector spaces

Given vector space U, with basis vectors { u1, u2, ..., u_nu }, and vector space V, with basis vectors { v1, v2, ..., v_nv }, where nu is the dimension of U and nv is the dimension of V.

Sometimes it is interesting to find the intersection of the two vector spaces, which is a vector space W given by

    W = U ∩ V                                                               (60)

When U = [ u1, u2, ..., u_nu ] and V = [ v1, v2, ..., v_nv ], for vectors lying in W it must be the case that there are representations on both U and V; that is,

    for all w ∈ W,   w = a1 u1 + a2 u2 + ... + a_nu u_nu = b1 v1 + b2 v2 + ... + b_nv v_nv      (61)

Eqn (61) can be rewritten as Eqn (62):

    [ U, -V ] [ a1
                a2
                ...
                a_nu
                b1
                b2
                ...
                b_nv ]  =  0                                                (62)

Which is to say, vectors made of the U and V basis coefficients of vectors in the intersection must lie in the null space of the matrix [ U, -V ].
The dimension of the intersection is the number of linearly independent solutions to Eqn (62), which is the dimension of the null space of [ U, -V ].

Vectors

    wi = U [ a1            or equivalently      wi = V [ b1
             ...                                         ...
             a_nu ]                                      b_nv ]             (63)

are basis vectors for W.

7.1 Example

>> U = [ 1   4
         2   5
         3   6 ]

>> V = [ 2   3
         3   4
         5   4 ]

>> Null = null([U, -V])
Null = 0.5000
       0.5000
       0.5000
       0.5000

>> w1 = U*Null(1:2)
w1 = 2.5000
     3.5000
     4.5000

Verify that w1 lies in U and V by checking the projection onto each. (Notice the projection calculation for a non-orthogonal basis set.)

>> U*inv(U'*U)*U'*w1
ans = 2.5000
      3.5000
      4.5000

>> V*inv(V'*V)*V'*w1
ans = 2.5000
      3.5000
      4.5000

8 Conclusion

- For finite vectors, linear operators are matrices.
- The rotation matrix is an example operator.
- Rotation + Reflection matrices <==> Ortho-normal matrix.
- Linear Operators have a number of characteristics:
    - They change when we make a change of basis.
    - They form a vector space.
    - The norm is well defined.
    - For any operator, there is an adjoint operator.
EE/ME 701: Advanced Linear Systems
Singular Value Decomposition

Contents

1 Introduction                                                             3
  1.1 Rotation+ Matrices                                                   4
  1.2 Scaling Matrices                                                     5
2 The Singular Value Decomposition                                         7
  2.1 Numerical Example:                                                   8
  2.2 Forms for the SVD                                                    9
      2.2.1 Demonstrating that A = Σ_{k=1}^{r} u_k σ_k v_k^T              10
3 Proof of the SVD theorem (following Will)                               12
  3.1 Solving for the σ_j and u_j, given the v_j                          12
  3.2 Example, good choices for v_j                                       13
  3.3 Example, bad choices for v_j                                        15
  3.4 Choosing the v_j correctly                                          16
  3.5 Proof by construction of the SVD                                    19
4 The generalized inverse of A, A#                                        20
  4.1 Using generalized inverse of A                                      23
      4.1.1 A numerical example using the generalized inverse             24
5 SVD Conclusion                                                          27
6 Determining the rank of A and the four fundamental spaces with the SVD  28
  6.1 Example:                                                            31
7 Exact, Homogeneous, Particular and General Solutions to a Linear
  Equation                                                                34
  7.1 The four types of the solution for y = A x                          34
  7.2 Numerical example showing the generalized inverse                   36
8 Additional Properties of the singular values                            40
9 History of the Singular Value Decomposition                             42
10 Conclusions                                                            43
1 Introduction

Gilbert Strang calls the SVD:

    "Absolutely a high point of linear algebra."

For a nice explanation of the SVD by Todd Will, see:
http://www.uwlax.edu/faculty/will/svd

An n x m matrix maps a vector from R^m to R^n:

    y = A x ,      y ∈ R^n ,  x ∈ R^m ,  A ∈ R^(n x m)                      (1)

[Figure: block diagram -- Input Space, R^m --> A x = y --> Output Space, R^n.]

Figure 1: Action of a matrix, A maps x onto y.

The relation y = A x is the general linear operator from an Input Vector Space to an Output Vector Space.

As we shall see, in all cases, the linear transformation from Input Space to Output Space is made up of 3 parts:

1. An ortho-normal transformation from input coordinates to singular coordinates
2. Scaling (in singular coordinates)
3. An ortho-normal transformation from singular to output coordinates

1.1 Rotation+ Matrices

As we have seen, rotation+ matrices preserve lengths and angles. When lengths and angles are preserved, shapes are preserved.

[Figure: two 2-D panels showing a shape in the X-Y plane before and after rotation.]

Figure 2: Example: a 45° rotation.

As we saw in the previous section, an orthonormal matrix gives either a rotation (det R = +1) or a reflection (det R = -1). Some authors use the term "rotation matrix" for actions 1 and 3, because it captures the intuition of what is going on, but rotation alone is not sufficiently complete.

All full-rank orthonormal matrices are rotation+ matrices. We will use "rotation+" to refer to the orthonormal matrices.

If we can understand y = A x in terms of rotations+ and scaling, many interesting results and practical methods will follow directly.
1.2 Scaling Matrices

Scaling matrices scale the Euclidean axes of a space. An example Scaling matrix is:

    Σ = [ 0.5   0
          0     3 ]                                                         (2)

[Figure: two panels -- the original shape, and the shape scaled by 0.5 in X and 3.0 in Y.]

Figure 3: Action of a scaling matrix.

Figure 3 was generated with:

    FF = [ A collection of 2-vectors (specifying points) ];
    Scale = [ 0.5, 0; 0 3.0 ];
    FF1 = Scale * FF;

Scaling Matrices:

1. Are diagonal
2. Have zero or positive scaling values

A negative value flips the configuration; we'll do this with rotations+.

A scaling value is also called a singular value, and can be zero:

    Σ = [ 0.5   0
          0     0 ]                                                         (3)

[Figure: two panels -- the original shape, and the shape scaled by 0.5 in X and 0.0 in Y (collapsed onto the X axis).]

Figure 4: Action of a scaling matrix with a singular value of zero for the Y axis.

A Scaling matrix does not have to be square. The matrices

    Σ = [ 0.5   0   0
          0     3   0 ]                                                     (4)

    Σ = [ 1   0
          0   2
          0   0 ]                                                           (5)

are also scaling matrices, in operators which map from R^3 -> R^2 or R^2 -> R^3, respectively.
2 The Singular Value Decomposition

The SVD THEOREM: Every matrix A can be decomposed into:

1. A rotation+ from input coordinates onto Singular Coordinates (the coordinates in which the scaling occurs).
2. Scaling in singular coordinates.
3. A rotation+ from singular coordinates onto output coordinates.

Said another way:

    A ∈ R^(n x m) ,      A = U Σ V^T                                        (6)

and so

    y = A x = U Σ V^T x                                                     (7)

where:

1. V^T is a rotation+ from input coordinates onto singular coordinates.
2. Σ is a scaling matrix.
3. U is a rotation+ from singular coordinates onto output coordinates.

Remarks:

1. The columns of U form an ortho-normal base for R^n (the output space). U is a rotation+ matrix.
2. The columns of V form an ortho-normal base for R^m (the input space). V is a rotation+ matrix.
3. The scaling matrix Σ is zero everywhere except the main diagonal.

2.1 Numerical Example:

%% First SVD example
>> A = [ 2  3 ;
         3  4 ;
         4 -2 ]
A =
     2     3
     3     4
     4    -2

>> [U, S, V] = svd(A)
U =
   -0.5661   -0.1622   -0.8082
   -0.7926   -0.1622    0.5878
   -0.2265    0.9733   -0.0367
S =
    6.2450         0
         0    4.3589
         0         0
V =
   -0.7071    0.7071
   -0.7071   -0.7071

%% A can be remade from U, S, V
>> A1 = U*S*V'
A1 = 2.0000    3.0000
     3.0000    4.0000
     4.0000   -2.0000

Uniqueness:

- The singular values are unique.
- The columns of U and V corresponding to distinct singular values are unique (up to scaling by ±1).
- The remaining columns of U and V must lie in specific vector subspaces, as described below.
2.2 Forms for the SVD

The singular value decomposition can be written

    A = U Σ V^T                                                             (8)

or

    A = Σ_{k=1}^{r} u_k σ_k v_k^T                                           (9)

where u_k are the columns of U, v_k are the columns of V, and r is the rank of A.

Eqn (8) is a decomposition into matrices, while Eqn (9) is a decomposition into individual component vectors.

2.2.1 Demonstrating that A = Σ_{k=1}^{r} u_k σ_k v_k^T

The development of the proof of the SVD and generalized inverse rests on the decomposition

    A = Σ_{k=1}^{r} u_k σ_k v_k^T                                           (10)

which is the relationship that tells us the connection between a specific vector v_k in the input space and a specific vector u_k in the output space.

A general proof of the singular value decomposition theorem follows in section 3. Here we show that,

    Given that a decomposition exists with the form

        A = U Σ V^T                                                         (11)

    where U and V are ortho-normal matrices and Σ is a scaling matrix,

    Then a decomposition according to Eqn (9) follows from Eqn (8).

With the SVD

    A = U Σ V^T = [ u1  u2  ...  un ] [ σ1   0   0        [ v1^T
                                         0  σ2   0          v2^T
                                         0   0  ... ]        ...
                                                            vm^T ]          (12)

Recall

    p = min(n, m) ,      r = rank(A)

Because of the zeros in the scaling matrix, Eqn (12) reduces to one of Eqn (13) or (14), below.

When m <= n, A is given by Eqn (13), where the columns of U for j = m+1..n multiply zero rows in Σ, and are omitted. And when n < m, then A is given by Eqn (14), where the rows of V^T for i = n+1..m multiply zero columns in Σ, and are omitted.

    A = [ u1  u2  ...  um ] [ σ1 v1^T
                              σ2 v2^T
                                ...
                              σm vm^T ] ,     when p = m < n, keep only p columns of U      (13)

    A = [ σ1 u1   σ2 u2   ...   σn un ] [ v1^T
                                          v2^T
                                           ...
                                          vn^T ] ,     when p = n < m, keep only p rows of V^T      (14)
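The rank-r sum of Eqn (9) is easy to confirm numerically; a sketch using the example matrix from section 2.1:

%% Rebuild A from the rank-r sum of outer products, Eqn (9)
>> A = [ 2 3; 3 4; 4 -2 ];
>> [U, S, V] = svd(A);
>> r = rank(A);
>> A2 = zeros(size(A));
>> for k = 1:r
      A2 = A2 + U(:,k) * S(k,k) * V(:,k)';
   end
>> norm(A - A2)          % ~0: the sum reproduces A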
Eqns (13) and (14) both reduce to Eqn (15)

    A = [ σ1 u1   σ2 u2   ...   σp up ] [ v1^T
                                          v2^T
                                           ...
                                          vp^T ]                            (15)

Going across the rows on the right, and down the columns on the left, the elements of σ1 u1 form a vector outer product with the elements of v1^T, σ2 u2 with v2^T, etc.

The mechanics of multiplying the terms of Eqn (15) give the result

    A = Σ_{k=1}^{p} σ_k u_k v_k^T                                           (16)

If r < p then the σ_k, k = {r+1, ..., p}, are zero, and those terms can be dropped from Eqn (16), giving the sought-after form:

    A = Σ_{k=1}^{r} σ_k u_k v_k^T                                           (17)

3 Proof of the SVD theorem (following Will)

Following the demonstration given by Will (see: http://www.uwlax.edu/faculty/will/svd) we first show that, given the v_j, the decomposition

    A = Σ_{k=1}^{r} u_k σ_k v_k^T = U Σ V^T                                 (18)

can be derived. Then we show how to determine the v_j satisfying a necessary condition.

3.1 Solving for the σ_j and u_j, given the v_j

Given Eqn (18) we find that

    A v_j = ( Σ_{k=1}^{r} u_k σ_k v_k^T ) v_j = u_j σ_j v_j^T v_j = u_j σ_j          (19)

where all the terms k ≠ j drop out because v_k^T v_j = 0 for k ≠ j, and 1 for k = j.

Thus, the u_j and v_j vectors and σ_j must satisfy

    A v_j = u_j σ_j                                                         (20)

Eqn (20) immediately solves for all of the σ_j and vectors u_j corresponding to σ_j > 0:

    σ_j = || A v_j ||                                                       (21)

    u_j = (1/σ_j) A v_j                                                     (22)
3.2 Example, good choices for v_j

Consider

    A = [ 1  4
          2  5
          3  6 ]

Choose:

    V = [ 0.39  -0.92
          0.92   0.39 ]

Doing the calculations of Eqns (21) and (22):

    A v1 = [ 4.08            σ1 = || [ 4.08; 5.38; 6.69 ] || = 9.51
             5.38
             6.69 ] ,
                             u1 = (1/9.51) [ 4.08         [ 0.43
                                             5.38     =     0.57
                                             6.69 ]         0.70 ]

Repeating steps (21) and (22) for v2 gives:

    Σ = [ 9.51   0
          0      0.77
          0      0    ] ,      U = [ 0.43   0.81   |
                                     0.57   0.11   ?
                                     0.70  -0.58   | ]

Then A = U Σ V^T gives:

    A = [ 0.43   0.81   |     [ 9.51   0        [  0.39   0.92       [ 1.00  4.00
          0.57   0.11   ?       0      0.77       -0.92   0.39 ]  =    2.00  5.00
          0.70  -0.58   | ]     0      0    ]                          3.00  6.00 ]      (23)

Remarks:

For U to be an ortho-normal matrix, u1 and u2 must be orthogonal and normal.

- They are automatically normalized by the calculation of Eqn (22).
- Check that they are orthogonal:

>> u1 = U(:,1)
u1 = -0.4287
     -0.5663
     -0.7039

>> u2 = U(:,2)
u2 =  0.8060
      0.1124
     -0.5812

>> u1'*u2
ans =
   1.1102e-16

Our algorithm does not tell us how to choose u_j if σ_j is zero or non-existent. But either way, it doesn't matter for Eqn (23).

Since U is an ortho-normal matrix, the missing columns must be an orthonormal basis for the orthogonal complement of the u_j, j = 1..r.
EE/ME 701: Advanced Linear Systems
Section 3.3.0
3.3 Example, bad choices for v j
EE/ME 701: Advanced Linear Systems
For U to be an ortho-normal matrix, its columns must be orthonormal. The
columns are normal automatically, by step (22), but orthogonal ?
What happens if we dont choose the correct ortho-normal matrix V ?
0.49 +
* 0.36
0.55 , 0.57 = 0.99 6= 0
0.66
0.75
For example, choose the V corresponding to a rotation of 20o :
V =
0.94 0.34
0.34
0.94
No ! if V is not correctly chosen, U does not turn out to be an ortho-normal
matrix !
(yet to be proven).
Doing calculations of Eqns (21) and (22)
2.31
2.31
1 = 3.59 = 6.48
4.87
A v1 = 3.59
4.87
2.31
0.36
3.4 Choosing the v j correctly
For finding the j and u j according to Eqns (21) and (22), the challenge is
to find v j which are orthonormal, such that u j are orthogonal.
This requires:
u1 =
3.59 = 0.55
1
4.87
0.75
For j 6= i ,
6.48 0
= 0 7.00 ,
0
0
Part 4a: Singular Value Decomposition
(24)
When j is zero or non-existent we are free to select u j in the orthogonal
complement of {ui } (the left-null space of A), so we are interested in the
case j , i > 0 .
In this case 1/ j (1/i ) 6= 0 , so Eqn (24) is verified if and only if
0.36 0.49 |
U = 0.55 0.57 ?
0.75 0.66 |
(Revised: Sep 10, 2012)
uTj ui = 0, and so
1 1
uTj ui = vTj AT
(A vi ) = 0
j i
Repeating steps (21) and (22) for v2 gives:
Section 3.4.0
vTj AT (A vi ) = vTj AT A vi = 0
Page 15
Part 4a: Singular Value Decomposition
(Revised: Sep 10, 2012)
(25)
Page 16
EE/ME 701: Advanced Linear Systems
Section 3.4.0
Lemma 1: When the vectors vi are ortho-normal and span Rm , then
vTj AT A vi = 0 j 6= i if and only if
AT A vi = i vi
(26)
EE/ME 701: Advanced Linear Systems
Section 3.4.0
In the general case of an m m matrix, eigenvalues and eigenvectors can be
a bit messy:
They can be complex.
The eigenvectors will not, in general, be orthogonal, and
in which case vTj AT A vi = i vTj vi = 0
There may not even be a complete set of eigenvectors (in which case we
say the matrix is defective)
Proof:
Necessary: Because the vi are ortho-normal, vectors v j , j 6= i span the
entire m 1 dimensional space orthogonal to vi , therefore if AT A vi must
lie in the 1-D space spanned by vi .
Sufficient: Since
v j vi .
AT
A vi = i vi , then
vTj AT
A vi =
vTj i vi
= 0, because
QED
The solutions to Eqn (26) are the eigenvalues and eigenvectors of
Q = AT A.
Square matrices have special vectors which give a scaled version of
themselves back under multiplication. For example:
1
1
2
2
Lemma 2: When Q is Hermitian (that is, when Q = Q , Q is the complex
conjugate), then
1. There is a complete set of eigenvectors, and furthermore, they are
orthogonal (ortho-normal, when we normalize them), and
2. The eigenvectors and eigenvalues are real (even if Q is complex).
Quick review of eigenvalues and eigenvectors:
However when Q is a Hermitian matrix (as Q = AT A must be) we are in
luck.
= 2
1
1
(27)
Remarks:
1. A symmetric matrix, such as Q = AT A , is a real Hermitian matrix.
2. Further discussion is deferred until our in-depth discussion the eigen-system
of a matrix.
These special vectors are called eigenvectors and the scale factors are
called the eigenvalues.
For an m m matrix, there can be at most m eigenvectors.
Eigenvectors can be scaled (considering multiplying left and right of Eqn
(26) by a scale factor). We usually normalize eigenvectors.
Part 4a: Singular Value Decomposition
(Revised: Sep 10, 2012)
Page 17
Part 4a: Singular Value Decomposition
(Revised: Sep 10, 2012)
Page 18
EE/ME 701: Advanced Linear Systems
Section 3.5.0
EE/ME 701: Advanced Linear Systems
4 The generalized inverse of A, A#
3.5 Proof by construction of the SVD
With the above lemmas taken together, for any matrix A
Given the equation
y = Ax
AT A is Hermitian (symmetric if A is real)
Lemma 2 shows that AT A has a complete set of real, orthogonal
eigenvectors.
Lemma 1 shows that these eigenvectors satisfy the condition to be the
input basis vectors the SVD, the v j in
r
A=
Section 4.0.0
With A and y given. We are looking for a particular solution that solves
y = A x p
where
y is the projection of y onto the column space of A
uk k vTk
= U V
(28)
x p lies in the row space of A
k=1
Once the v j are known, determine the j u j corresponding to j 6= 0
according to
j = A v j
uj
1
=
Avj
j
(29)
(30)
The remaining u j are determined as basis vectors for the orthogonal
complement of the u j corresponding to j 6= 0.
Following these steps, the singular value decomposition of A is constructed.
QED
Part 4a: Singular Value Decomposition
(Revised: Sep 10, 2012)
Page 19
Whenever we have an ortho-normal basis for the column space of A
i
h
Uc = u1 u2 ur
(31)
We may find y by projection:
y = Uc UcT y =
uk uTk y
(32)
k=1
Writing Eqn (32) out graphically shows
uT
| |
|
| |
|
uT
2
y=
y = u1 u2 ur
u1 u2 ur
..
| |
|
| |
|
uTr
a
1
| |
|
a2
= u1 u2 ur
. = a1 u1 + a2 u2 + + ar ur
..
| |
|
ar
Part 4a: Singular Value Decomposition
uT1 y
uT2 y
...
uTr y
(Revised: Sep 10, 2012)
(33)
Page 20
EE/ME 701: Advanced Linear Systems
Section 4.0.0
Eqn (33) shows that the projection of y onto the column space of A can be
written
y = a1 u1 + a2 u2 + + ar ur
(34)
EE/ME 701: Advanced Linear Systems
Section 4.0.0
When we replace x in Eqn (38) with the expanded x of (39), we get
ak = k vTk x = k vTk (b1 v1 + b2 v2 + + br vr )
(40)
with
ak = uTk y
(35)
So far there is nothing surprising about Eqns (33-35). They say that we
can write the output of
y = A x p
(36)
as a linear combination of basis vectors on the column space of A.
But the basis vectors vk are ortho-normal, so
ak = k vTk (b1 v1 + b2 v2 + + br vr ) = k vTk vk bk = k bk
Putting the pieces together
We can write Eqns (33-35) whenever we have Uc , an ortho-normal basis
for the column space of A .
Using Eqns (35) and (41) gives bk , the basis coefficients of x p
Without the SVD, the analysis would stop here, there is no way to
discover contributions of x to give the basis coefficients ak .
With the SVD there is a way to discover the contributions of x to each term !
bk = (1/k ) ak = (1/k ) uk T y
uk k vTk
y = Ax =
k=1
xp =
x
(42)
Plugging the bk into (39) to find x gives
We know each term in the output is given by:
!
(41)
k=1
k=1
vk bk = vk (1/k ) uk
y=
vk (1/k ) uk
k1
y = A# y
(37)
= a1 u1 + a2 u2 + + ar ur
(43)
Where the term in parentheses gives the generalized inverse.
with
ak = k vTk x
(38)
A# =
vk (1/k) uk T
(44)
k=1
Any vector x in the row space of A can be written
Or in Matlab:
x = b1 v1 + b2 v2 + + br vr
(39)
>> Asharp = pinv(A)
Part 4a: Singular Value Decomposition
(Revised: Sep 10, 2012)
Page 21
Part 4a: Singular Value Decomposition
(Revised: Sep 10, 2012)
Page 22
EE/ME 701: Advanced Linear Systems
Section 4.1.0
4.1 Using generalized inverse of A
EE/ME 701: Advanced Linear Systems
Section 4.1.1
4.1.1 A numerical example using the generalized inverse
Consider the data
Given the relationship
y = Ax
(45)
With data vector y, find
A# =
vk (1/k) uk T
(46)
A = [ 1
2
3
4
5
6
7
8
6
8
10
12 ]
ybar =
6
5
5
3
k=1
With y = A x, find x p
And the solution
x p = A#
(47)
lies in the row space of A, and gives
y = A x p
(48)
Before proceeding, lets find the correct y , the projection of y onto col (A)
>> W = GramSchmidt(A)
W = 0.1826
0.8165
0.3651
0.4082
0.5477
0.0000
0.7303
-0.4082
>> yhat = W*W*ybar
yhat =
6.1000
5.2000
4.3000
3.4000
With y the projection of y onto the column space of A .
Trying the left pseudo-inverse:
In Eqn (46) there is no matrix inverse step.
>> xp1 = inv(A*A) * A * ybar
Warning: Matrix is close to singular or badly scaled.
Results may be inaccurate. RCOND = 8.606380e-18.
xp1 = -2.8750
0.6875
-1.0000
Eqn (46) always works, for every shape or rank of matrix.
Eqn (46) gives a new degree of freedom: we can choose r !
(More on this at the end of the notes on SVD)
Double checking that Ax p1 gives y
A mile off from y !
>> A*xp1,
ans = -5.4375
-9.6250
-13.8125
-18.0000
(The warning is to be taken seriously !)
Part 4a: Singular Value Decomposition
(Revised: Sep 10, 2012)
Page 23
Part 4a: Singular Value Decomposition
(Revised: Sep 10, 2012)
Page 24
EE/ME 701: Advanced Linear Systems
Section 4.1.1
Now with the generalized inverse (also called the pseudo inverse)
>> [U,S,V] = svd(A)
U = -0.3341
0.7671
-0.4359
0.3316
-0.5378
-0.1039
-0.6396
-0.5393
S =
V =
-0.4001
0.2546
0.6910
-0.5455
23.3718
0
0
0
0
1.3257
0
0
0
0
0.0000
0
-0.2301
-0.5634
-0.7935
-0.7834
0.5910
-0.1924
-0.5774
-0.5774
0.5774
Section 4.1.1
or
>> xp = Asharp * ybar
xp =
-2.3500
2.0500
-0.3000
-0.3741
0.7970
-0.4717
0.0488
>> xp=pinv(A)*ybar
xp = -2.3500
2.0500
-0.3000
Double checking that A x p1 gives y
Previously we saw three cases for the equation
y = Ax ,
A Rnm
Case
Size of A
Exactly constrained
n=m
Matrix Inverse
Tool
Over constrained
n>m
Left Pseudo-Inverse
Under constrained
n<m
Right Pseudo-Inverse
(49)
Must Exist
Table 1: Three cases for the solution of Y = A x
A1
1
ATA
1
A AT
Each of these cases requires that a matrix be invertible, or other methods
are required. With the generalized inverse
There are 2 non-zero s , using Eqn (46)
>> Asharp = V(:,1)*(1/S(1,1))*U(:,1) + V(:,2)*(1/S(2,2))*U(:,2)
Asharp = -0.4500
-0.1917
0.0667
0.3250
0.3500
0.1583
-0.0333
-0.2250
-0.1000
-0.0333
0.0333
0.1000
Eqn (46) is implemented by the pinv() command in Matlab
>> Asharp = pinv(A)
Asharp = -0.4500
0.3500
-0.1000
EE/ME 701: Advanced Linear Systems
xp =
k=1
k=1
vk k = vk (1/k) uTk y = A#x p
with
A# =
(50)
vk (1/k ) uTk
k=1
Equation (50), is valid in all cases where A 6= 0.
-0.1917
0.1583
-0.0333
0.0667
-0.0333
0.0333
0.3250
-0.2250
0.1000
There is no matrix inversion.
At most the basis vectors corresponding to k > tol are retained (we have
the freedom to choose r < rank (A)).
If A 6= 0, at least one k is guaranteed to be greater than zero.
Part 4a: Singular Value Decomposition
(Revised: Sep 10, 2012)
Page 25
Part 4a: Singular Value Decomposition
(Revised: Sep 10, 2012)
Page 26
EE/ME 701: Advanced Linear Systems
Section 5.0.0
5 SVD Conclusion
:
U:
Section 6.0.0
6 Determining the rank of A and the four funda-
The singular value decomposition gives us an expansion for any real matrix
A of the form
A = U V T
V T:
EE/ME 701: Advanced Linear Systems
rotation+
from input
i coordinates to singular coordinates, where
h
V = v1 vm Rmm
mental spaces with the SVD
The rank of a matrix is the number of singular values which are greater than
a tolerance.
Recall, singular values are always positive or zero.
Scaling in singular coordinates, where Rnm (same as A)
In any numerical calculations we have round-off error, j may be greater
than zero due to round-off error.
rotation+ from
singular icoordinates to output coordinates,
h
where U = u1 un Rnn
So we set a tolerance for the minimum output of Y = A b which will be
considered non-zero.
Because the columns of U and V are orthonormal, we have a second
expansion for A:
r
A=
uk k vTk
Matlabs default value is given by:
tol = max(size(A)) * norm(A) * eps
>> eps = 2.2204e-16 (for 64 bit floating point calculation)
k=1
We have seen that if the vi are the eigenvectors of AT A then the k and uk
are straight-forward to compute.
Since Q = AT A is a Hermitian matrix, the eigenvectors exist and are
orthogonal.
Note: if n < m it may be numerically more convenient to compute the
eigenvectors of Q = A AT , which is a smaller matrix.
Student exercise: how would the SVD be determined from the eigenvec
tors of Q ?
The expansion of A is given as:
..
.
..
.
..
.
..
.
..
.
..
.
U V T = u1 ur ur+1 un
..
..
..
...
...
...
.
.
.
1
..
.
r
vT1
..
.
0
vT
vT
0
r+1
..
..
.
.
vTm
0
where 1 ... r are the singular values greater than the tolerance, and the
rest of the diagonal elements of are effectively zero.
As we are about to see, bases for the four fundamental spaces of a matrix
are given directly from the singular value decomposition.
Part 4a: Singular Value Decomposition
(Revised: Sep 10, 2012)
Page 27
Part 4a: Singular Value Decomposition
(Revised: Sep 10, 2012)
Page 28
EE/ME 701: Advanced Linear Systems
Section 6.0.0
U V T partitions according to the non-zero singular values:
.. .. ..
. . .
u1 ur
.. .. ..
. . .
.. ..
..
. .
.
ur+1 un
..
.. ..
.
. .
0
...
r
vT1
..
vr
vTr+1
..
T
vm
Section 6.0.0
A program to give the four fundamental spaces
EE/ME 701: Advanced Linear Systems
0
...
function [Row, Col, Null, Lnull] = spaces(M)
%
function [Row, Col, Null, Lnull] = spaces(M)
%
%
Return the four fundamental spaces
%
[n,m] = size(M);
[u,s,v] = svd(M);
r = rank(M);
%% select spaces
Row=v(:,1:r);
Null = v(:,(r+1):m);
Col=u(:,1:r);
Lnull = u(:,(r+1):n);
where the columns of Uand V provide bases for the Column, Left-Null,
Row and Null spaces:
..
.
..
.
..
.
Uc = u1 ur ,
.. .. ..
. . .
..
.
..
.
..
.
Uln = ur+1 un ,
..
.. ..
.
. .
Part 4a: Singular Value Decomposition
..
.
..
.
..
.
..
.
Vr = v1 vr
.. .. ..
. . .
..
.
..
.
Vn = vr+1 vm
..
..
..
.
.
.
(Revised: Sep 10, 2012)
Page 29
Part 4a: Singular Value Decomposition
(Revised: Sep 10, 2012)
Page 30
EE/ME 701: Advanced Linear Systems
Section 6.1.0
6.1 Example:
EE/ME 701: Advanced Linear Systems
Section 6.1.0
We need to take a look at the third singular value to see if it is approximately
zero, or just quite small.
Consider
2
A=
0
0
1
1
>> ss = diag(S); ss(3)
ans =
4.2999e-16
(51)
3 is effectively zero (zero to within the round off error), so A is rank 2.
Now form the 4 fundamental spaces.
The four fundamental spaces of matrix A can be found through the SVD:
>> r = sum(ss > norm(A)*max(n,m)*eps)
r = 2
>> [U, S, V] = svd(A)
U =
-0.2543
0.3423
-0.5086
0.6846
-0.6947
-0.2506
-0.4405
-0.5928
S =
0.8987
-0.3615
-0.1757
0.1757
5.3718
0
0
0
0
1.0694
0
0
0
0
0.0000
0
V = -0.5774
-0.2113
-0.7887
0.5774
-0.7887
-0.2113
0.5774
0.5774
-0.5774
-0.1030
0.3769
-0.6509
0.6509
>> [n, m] = size(A)
n =
4 ,
m =
3
Since the rank is 2, the a basis for the row space is given by the first 2
columns of V
>> Vr
Vr =
The null space is spanned by the remaining columns of V
>> Vn
Vn =
Part 4a: Singular Value Decomposition
= V(:, 1:r)
-0.5774
0.5774
-0.2113
-0.7887
-0.7887
-0.2113
(Revised: Sep 10, 2012)
Page 31
= V(:, (r+1):NRows)
0.5774
0.5774
-0.5774
Part 4a: Singular Value Decomposition
(Revised: Sep 10, 2012)
Page 32
EE/ME 701: Advanced Linear Systems
Section 6.1.0
Since the rank is 2, a basis for the column space is given by the first 2
columns of U
>> Uc
= U(:, 1:r)
Uc = -0.2543
0.3423
-0.5086
0.6846
-0.6947
-0.2506
-0.4405
-0.5928
EE/ME 701: Advanced Linear Systems
Section 7.1.0
7 Exact, Homogeneous, Particular and General
Solutions to a Linear Equation
7.1 The four types of the solution for y = A x
There are four elements in the solution for
y = Ax
The left null space is spanned by the remaining columns of U
(52)
that we need to consider. The terminology is similar to that for differential
equations:
>> Un = U(:, (r+1):NCols)
Un =
0.8987
-0.1030
-0.3615
0.3769
-0.1757
-0.6509
0.1757
0.6509
Exact Solution: An exact solution is one that exactly satisfies Eqn (52). Recall
that if x is over-constrained (more ys than xs), then we can get a solution
which is not an exact solution.
Homogeneous Solution: A homogeneous solution is a solution to:
A xh = 0
There is the trivial homogeneous solution, x = 0. To have more interesting
homogeneous solutions, matrix A must nhave a non-trivial onull space.
Suppose the null space is given by a basis n1 , n2 , , nn :
xh = n1 n1 + n2 n2 + + nn nn
A xh = 0
(53)
If there is any non-trivial homogeneous solution, there will be infinitely
many of them.
Part 4a: Singular Value Decomposition
(Revised: Sep 10, 2012)
Page 33
Part 4a: Singular Value Decomposition
(Revised: Sep 10, 2012)
Page 34
EE/ME 701: Advanced Linear Systems
Section 7.1.0
Particular Solution: A particular solution is one that solves Eqn (52), either
exactly or in a least squared error sense. x p is the particular solution, and
y = A x p. The residual,
e
(54)
y = y y
may be zero (an exact solution) or may be minimal.
Each of the terms relating to the particular solution comes from a
fundamental space:
y lies in the column space.
x p lies in the row space.
Recall the several different norms, minimizing the residual for each
different norm can give a different meaning to x p. One can assume that
is minimized in the least-squares sense, unless a different norm is given.
When working with the 2-norm (the usual case),
EE/ME 701: Advanced Linear Systems
Section 7.2.0
7.2 Numerical example showing the generalized inverse
Continuing with the example in section 6.1,
>> Uc = U(:, 1:r)
Uc = [ -0.2543
-0.5086
-0.6947
-0.4405
>> Vr = V(:, 1:r)
Vr = [ -0.5774
-0.2113
-0.7887
'
Keep in mind:
0.3423
0.6846
-0.2506
-0.5928 ]
A = Uc
Even though A R43 , r is
a 2x2 matrix, and Uc and Vr
each have 2 columns.
0.5774
-0.7887
-0.2113 ]
>> SigmaR = Sigma(1:r, 1:r)
SigmaR = [
5.3718
0
0
1.0694
r VrT
2 is the rank of matrix A .
&
y is given as the projection of y onto the column space.
Choose an example y , project y onto the column space.
e
y lies in the left-null space.
>> Y = [ 1 2 3 4 ]
x p is given as
xp = A y
(55)
General Solution: The general solution is a parametrized solution, it is the set of
all solutions which give y :
xg = x p + xh
y = A xg
(56)
>> Yhat = Uc * Uc * Y
Yhat = [ 0.8182
1.6364
3.9091
3.0909 ]
If the null space and any particular solution is known, then all of the general
solutions are given by Eqn (56).
Part 4a: Singular Value Decomposition
(Revised: Sep 10, 2012)
Page 35
Part 4a: Singular Value Decomposition
(Revised: Sep 10, 2012)
Page 36
EE/ME 701: Advanced Linear Systems
Section 7.2.0
EE/ME 701: Advanced Linear Systems
Section 7.2.0
Lets take a look at the residual e
y = y y
Now compute x p
>> Xp = Vr* inv(SigmaR) * Uc * Y
Xp = [ -0.2121
1.2424
1.0303 ]
ytilde =
0.1818
0.3636
-0.9091
0.9091
Double check that A x p gives y
>> A * Xp
ans = [ 0.8182
1.6364
3.9091
3.0909
e
y should lie in the left-null space. The basis for the left null space is Un
>> Un = U(:, (r+1):NCols)
Un = [ 0.8987
-0.1030
-0.3615
0.3769
-0.1757
-0.6509
0.1757
0.6509
The example y = A x looks like an over-constrained problem (4 data, 3
unknowns), try matrix pseudo-inverse:
>> XX = inv(A*A) * A * Y
Warning: Matrix is close to singular or badly scaled.
Results may be inaccurate. RCOND = 5.139921e-18.
XX = [ -6.2500
1.5000
4.2500 ]
Checking that e
y lies in Un :
>> ytilde2 =
0.1818
0.3636
-0.9091
0.9091
Un * Un * ytilde
Check result (the warning from inv() is to be taken seriously!)
A * XX
---------2.0000
-4.0000
1.7500
3.7500
A * Xp
-------0.8182
1.6364
3.9091
3.0909
Part 4a: Singular Value Decomposition
Yp
-----0.8182
1.6364
3.9091
3.0909
Y
----1
2
3
4
(Revised: Sep 10, 2012)
Page 37
Part 4a: Singular Value Decomposition
(Revised: Sep 10, 2012)
Page 38
EE/ME 701: Advanced Linear Systems
Section 7.2.0
EE/ME 701: Advanced Linear Systems
Section 8.0.0
8 Additional Properties of the singular values
The general solution to y = A x is given as:
Define is the largest singular value of A and is the smallest singular
value of A. Also, order the singular values so that 1 = , ... , p = .
xg = x p + xh
In some cases the null space will be empty, so xg = x p . But in this case
dim null (A) = 1
.577
xh = a1 vm = a1 .577
.577
||A||2 =
By writing y = rk=1 uk k vTk x , it is clear that the largest ||y||2 is given
when x corresponds to the largest singular value of A. Choosing x = v1
gives y = 1 u1 and ||y||2 = 1 .
The condition number of A is given as
So xg is given as:
cond (A) =
0.2121
.577
xg = 1.2424 + a1 .577
1.0303
.577
Consider that when we form A# we use 1/k , if there is a k which is very
small, there will be terms in A# which are very large. With
x = A# y
0.2121
.577
xg = 1.2424 + 0.42 .577
1.0303
.577
>> Xg = Xp - 0.42*Xh
Xg = [ -0.455
1.000
1.273 ]
(57)
The condition number is a measure of how much round off errors can be
expanded by multiplying a matrix by A.
Coefficient a1 can be used to satisfy one constraint, such as make
x (2) = 1.0 . Which gives a1 = .2424/.577 = 0.42
if there are noise components in y, when they hit the large terms in A#
they will give large errors.
Example: A problem with cond (A) = 10 is said to be well conditioned,
if cond (A) = 10000 the problem is said to be poorly conditioned.
>> A * Xg = Yhat
= 0.8182
1.6364
3.9091
3.0909
If = 0, the condition number is unbounded.
A method to handle poorly conditioned problems:
q
use A# = k=1vk (1/k ) uTk with some q < r
That is, throw out the smallest singular values.
Part 4a: Singular Value Decomposition
(Revised: Sep 10, 2012)
Page 39
Part 4a: Singular Value Decomposition
(Revised: Sep 10, 2012)
Page 40
EE/ME 701: Advanced Linear Systems
Section 8.0.0
The absolute value of the determinant of any matrix is the product of its
singular values
p
|det (A)| = k
(58)
k=1
Corollary: if any k = 0, then det (A) = 0.
Recall that when A Rnn
n
det (A) = k
k=1
So for square matrices,
n
p
k = k
k=1 k=1
EE/ME 701: Advanced Linear Systems
Section 10.0.0
9 History of the Singular Value Decomposition
The singular value decomposition was originally developed by differential
geometers, who wished to determine whether a real bilinear form could be made
equal to another by independent orthogonal transformations of the two spaces
it acts on. Eugenio Beltrami and Camille Jordan discovered independently,
in 1873 and 1874 respectively, that the singular values of the bilinear forms,
represented as a matrix, form a complete set of invariants for bilinear forms
under orthogonal substitutions. James Joseph Sylvester also arrived at the singular
value decomposition for real square matrices in 1889, apparently independent
of both Beltrami and Jordan. Sylvester called the singular values the canonical
multipliers of the matrix A. The fourth mathematician to discover the singular
value decomposition independently is Autonne in 1915, who arrived at it via
the polar decomposition. The first proof of the singular value decomposition
for rectangular and complex matrices seems to be by Eckart and Young in 1936;
they saw it as a generalization of the principal axis transformation for Hermitian
matrices.
In 1907, Erhard Schmidt defined an analog of singular values for integral operators
(which are compact, under some weak technical assumptions); it seems he was
unaware of the parallel work on singular values of finite matrices. This theory was
k
further developed by mile Picard in 1910, who is the first to call the numbers sv
singular values (or rather, valeurs singulires).
Practical methods for computing the SVD date back to Kogbetliantz in 1954, 1955
and Hestenes in 1958 resembling closely the Jacobi eigenvalue algorithm, which
uses plane rotations or Givens rotations. However, these were replaced by the
method of Gene Golub and William Kahan published in 1965 (Golub & Kahan
1965), which uses Householder transformations or reflections. In 1970, Golub
and Christian Reinsch published a variant of the Golub/Kahan algorithm that is
still the one most-used today.
[Wikipedia]
Part 4a: Singular Value Decomposition
(Revised: Sep 10, 2012)
Page 41
Part 4a: Singular Value Decomposition
(Revised: Sep 10, 2012)
Page 42
EE/ME 701: Advanced Linear Systems
Section 10.0.0
10 Conclusions
The Singular Value Decomposition gives us a geometric picture of matrixvector multiplication, comprised of
A rotation+
Scaling
A rotation+
Using the SVD we can find basis vectors for the four fundamental spaces.
And the basis sets are ortho-normal
And the basis vectors of the row and column spaces are linked by the
singular values
Numerically, computation of the SVD is robust because computation of the
eigenvectors of a symmetric matrix is robust.
The SVD can be used to compute:
rank (A)
||A||2
cond (A)
Ortho-normal basis vectors for the four fundamental spaces
The solution to y = A x for the general case, without matrix inversion.
Part 4a: Singular Value Decomposition
(Revised: Sep 10, 2012)
Page 43
EE/ME 701: Advanced Linear Systems
EE/ME 701: Advanced Linear Systems
Eigenvalues, Eigenvectors and the Jordan
Form
3.2.4
3.3
The Jordan form, a second example . . . . . . . . . . . .
27
One more twist, freedom to choose the regular eigenvector . . . .
33
3.3.1
Contents
3.4
1 Introduction
1.1
Review of basic facts about eigenvectors and eigenvalues . . . . .
1.1.1
Repeated eigenvalues . . . . . . . . . . . . . . . . . . . .
Analysis of the structure of the eigensystem of a matrix . . . . . .
2 Properties of the Eigensystem
Bay section 4.1, A-Invariant Subspaces . . . . . . . . . . . . . . .
2.2
Finding eigenvalues and eigenvectors . . . . . . . . . . . . . . .
2.3
Interpreting complex eigenvalues / eigenvectors . . . . . . . . . .
10
2.3.1
Example: 3D Rotation . . . . . . . . . . . . . . . . . . .
11
The eigen-system of symmetric (Hermitian) matrices . . . . . . .
13
3 The Jordan-form
3.1
3.2
41
Why Matlab does not have a numeric Jordan command . .
42
4 Conclusions
43
5 Review questions and skills
45
16
How the Jordan form relates to the real world . . . . . . . . . . .
16
3.1.1
An example calling for the Jordan form . . . . . . . . . .
17
Constructing the Jordan Form . . . . . . . . . . . . . . . . . . .
20
3.2.1
Regular and Generalized Eigenvectors
. . . . . . . . . .
20
3.2.2
First Jordan Form Example . . . . . . . . . . . . . . . . .
22
3.2.3
More on Jordan blocks . . . . . . . . . . . . . . . . . . .
26
Part 4b: Eigenvalues and Eigenvectors
3.4.1
. . . . . . . . . . . . . . . . . . .
2.1
2.4
Summary of the Jordan Form
35
Looking at eigenvalues and eigenvectors in relation to the
null space of ( I A) . . . . . . . . . . . . . . . . . . .
1.1.2
1.2
Example where regular E-vecs do not lie in the column
space of (A k I) . . . . . . . . . . . . . . . . . . . . .
(Revised: Sep 10, 2012)
Page 1
Part 4b: Eigenvalues and Eigenvectors
(Revised: Sep 10, 2012)
Page 2
EE/ME 701: Advanced Linear Systems
Section 1.1.0
1 Introduction
EE/ME 701: Advanced Linear Systems
Section 1.1.2
1.1.1 Looking at eigenvalues and eigenvectors in relation to the null space
of ( I A)
We have seen the basic case of eigenvalues and eigenvectors
Starting from
A vk = k vk .
(k I A) vk = 0
(1)
In this chapter we will elaborate the relevant concepts to handle every case.
(4)
The eigen-values are values of k such that (k I A) has a non-trivial null
space.
The eigenvectors are the basis vectors of the null space !
1.1 Review of basic facts about eigenvectors and eigenvalues
Since
Only square matrices have eigensystems
det (k I A) = 0
Every n n matrix has n eigenvalues, 1 ... n
We know the null space is at least 1 dimensional.
The eigenvector satisfies the relationship A vk = k v , which leads to the
eigenvector being the solution to
(A k I) vk = 0
(2)
Theorem: for each distinct eigenvalue, there is at least one independent
eigenvector.
Proof: The proof follows directly from Eqns (4) and (5).
or, said another way, the eigenvector is a vector in the null space of the
matrix (k I A) .
1.1.2 Repeated eigenvalues
Notes:
Example:
1. Any vector in the null space of (k I A) is an eigenvector. If the null
space is 2 dimensional, then any vec. in this 2D subspace is an E-vec.
2. Since the determinant of any matrix with a non-empty null space is
zero, we have:
(3)
det (k I A) = 0 , k = 1..n
which gives the characteristic equation of matrix A.
Part 4b: Eigenvalues and Eigenvectors
(5)
(Revised: Sep 10, 2012)
Page 3
Matrices can have repeated eigenvalues.
>> A = [ 2 1;
0 2]
>> [V,U] = eig(A)
V =
1.0000
-1.0000
0
0.0000
Part 4b: Eigenvalues and Eigenvectors
A =
2
0
U = 2
0
1
2
0
2
(Revised: Sep 10, 2012)
Page 4
EE/ME 701: Advanced Linear Systems
Section 1.1.2
EE/ME 701: Advanced Linear Systems
Section 1.2.0
1.2 Analysis of the structure of the eigensystem of a matrix
When there are repeated eigenvalues:
1. We are assured to have at least 1 independent eigenvector.
Analysis of the eigensystem of a matrix proceeds by completing table 1.
2. There may fewer independent eigenvectors than eigenvalues
1. Group the into k sets of repeated eigenvalues (one set for each unique
)
Definitions:
The algebraic multiplicity of an eigenvalue is the number of times the
eigenvalue is repeated.
The number of k in the kth set is called the algebraic multiplicity,
and is given by mk
The geometric multiplicity is the number of independent eigenvectors
corresponding to the eigenvalue. (dim null (A I))
Since (k I A) always has a non-trivial null space, every k set of
eigenvalues has at least one eigenvector vk ,
Consider the example above:
>> A = [ 2 1;
0 2]
A =
2
0
2. Determine the number of independent eigenvectors corresponding to k
by evaluating
q (k I A) = dim null (k I A) .
1
2
This is called the geometric multiplicity, and is given by gk .
>> [V,U] = eig(A)
V =
1.0000
-1.0000
0
0.0000
U = 2.0000
0
If mk 2, it is possible that there are fewer independent eigenvectors
than eigenvalues.
0
2.0000
1 q (k I A) mk
Eigenvalue: 2.0
Algebraic multiplicity: 2
(6)
3. If mk > gk for any k, the Jordan form and generalized eigenvectors are
required.
Geometric Multiplicity: 1
Number of missing eigenvectors: 1
k k mk gk
Recall the eigen-decomposition of a matrix:
1
2
..
.
A = V U V 1
The eigen-decomposition only exists if V is invertible. That is if there is a
complete set of independent eigenvectors.
Part 4b: Eigenvalues and Eigenvectors
# Needed Generalized Evecs
mk gk
(Revised: Sep 10, 2012)
Page 5
Table 1: Analysis of the structure of the eigensystem of A.
Part 4b: Eigenvalues and Eigenvectors
(Revised: Sep 10, 2012)
Page 6
EE/ME 701: Advanced Linear Systems
Section 1.2.0
EE/ME 701: Advanced Linear Systems
Section 2.1.0
2 Properties of the Eigensystem
Example,
>> A = [ 2 3 4 ; 0 2 1 ; 0 0 2 ]
A =
2
3
4
0
2
1
0
0
2
First well cover properties of the eigenvalues and regular eigenvectors.
2.1 Bay section 4.1, A-Invariant Subspaces
Recall: for triangular and diagonal matrices, the eigenvalues are the
diagonal elements
>> [V, U ] = eig(A)
V =
1.0000
-1.0000
0
0.0000
0
0
U =
2
0
0
0
2
0
1.0000
-0.0000
0.0000
>> RoundByRatCommand(V)
ans = 1
-1
1
0
0
0
0
0
0
0
0
2
(7)
D = E
Table 2: Analysis of the structure of the eigensystem of A.
Part 4b: Eigenvalues and Eigenvectors
(8)
In an isotropic dielectric medium, the electric field follows the relation
# Needed Generalized Evecs
mk gk
( I A) v = 0
Example (Bay Example 4.1): Electric Fields
The Jordan form will be required.
Av = v
Eqn (8) shows that the Eigenvectors lie in the null space of ( I A)
dim (1 I A) = 1, there is one eigenvector.
The eigenvectors of A are basis vectors for A-invariant subspaces.
v Av = 0
We see that there is one eigenvalue that is triply repeated.
When the operator A is understood from context, X1 is sometimes said
to be simply invariant.
Eqn (7) gives:
The analysis of the structure of the eigensystem of matrix A is seen in
table 2.
k k mk gk
Let X1 be a subspace of linear vector space X. This subspace is A-invariant
if for every vector z X1 , y = A z X1 .
(Revised: Sep 10, 2012)
where E is the electric field vector, D is the electric flux density (also called the
displacement vector) and is the dielectric constant.
Page 7
Part 4b: Eigenvalues and Eigenvectors
(Revised: Sep 10, 2012)
Page 8
EE/ME 701: Advanced Linear Systems
Section 2.2.0
D
1 11 12 13
D2 = 21 22 23
D3
31 32 33
Section 2.3.0
2.3 Interpreting complex eigenvalues / eigenvectors
Some materials, however, are anisotropic, governed by
EE/ME 701: Advanced Linear Systems
Real eigenvalues correspond to scaling the eigenvector
E
1
E2
E3
Complex eigenvalues lead to complex eigenvectors, and correspond to
rotations
Complex eigenvalues come in complex conjugate pairs
Find the directions, if any, in which the E-field and flux density are collinear.
Examples: Consider the basic rotation matrix:
Solution:
For the E-field and flux density to be collinear they must satisfy D = E,
R=
Which is to say, the anisotropic directions are the eigenvectors of the
dielectric tensor.
C S
S
Eigenvalues:
2.2 Finding eigenvalues and eigenvectors
For ( I A) v = 0 to have a solution, ( I A) must have a non-trivial null
space. This is equivalent to saying
det ( I A) = 0
(9)
( I R) =
which gives the characteristic equation
det (I R) = 2 2C +C2 + S2 = 2 2C + 1 = 0
which solves to give
We have seen that Eqn (9) gives an nth order polynomial in
This is more important for understanding than as a solution method
Actual eigenvalue / eigenvector algorithms are beyond the scope of
EE/ME 701.
Use:
q
4C2 4
2
2C
q
2
4 S2
= C j S
The eigenvalues of R are a complex conjugate pair for any value of
6= 0o , 180o !
i = 1.0
>> [V, U] = eig(A)
Part 4b: Eigenvalues and Eigenvectors
2C
(Revised: Sep 10, 2012)
Page 9
Part 4b: Eigenvalues and Eigenvectors
(Revised: Sep 10, 2012)
Page 10
EE/ME 701: Advanced Linear Systems
Section 2.3.1
EE/ME 701: Advanced Linear Systems
Section 2.3.1
The matrix R of Eqn (11) also describes a rotation of 600 about the axis v
2.3.1 Example: 3D Rotation
Weve seen the rotation matrix
c
tR
CC +C S CS S S S CC S
= SC
S SS +CC C SS CS
S
CS
CC
0.577
v3 = 0.577
0.577
(10)
w
y
Rw
x
z
For example, the rotation matrix
Figure 1: Illustration of the action of rotation matrix R on a vector w to give
vector R w.
2/3 1/3 2/3
R = 2/3 2/3 1/3
1/3 2/3 2/3
(11)
The mathematical manifestation of the absence of any other R-invariant
subspace is that the other two eigenvalues are complex.
Corresponds to
= 135o ,
= 135o ,
= 19.47o
When we form solutions of the form
This matrix (like all 3D rotation matrices with general angles) has one real
eigenvalue and a complex conjugate pair
>> [V, U] = eig(R)
V = -0.5774
0.2887 + 0.5000i
0.2887 - 0.5000i
U =
Every vector w v will be rotated by R. So the only R-invariant subspace
lies along the axis of rotation.
0.5000 + 0.8660i
0
0
Part 4b: Eigenvalues and Eigenvectors
-0.5774
0.2887 - 0.5000i
0.2887 + 0.5000i
0.5774
0.5774
0.5774
0
0.5000 - 0.8660i
0
0
0
1.0000
(Revised: Sep 10, 2012)
x (t) = V eU t V 1 x (0)
(12)
the complex eigenvalue and eigenvector pair combines to give solutions that
can be written
x (t) = a1et cos (t + ) w1 + a2et sin (t + ) w2
(13)
where the complex is written = j and the complex eigenvectors
form the real eigenvectors:
1
(real part)
(14)
w1 = (v1 + v2)
2
1
(v1 v2)
2
Part 4b: Eigenvalues and Eigenvectors
w2 = j
Page 11
(imaginary part)
(Revised: Sep 10, 2012)
(15)
Page 12
EE/ME 701: Advanced Linear Systems
Section 2.4.0
2.4 The eigen-system of symmetric (Hermitian) matrices
The eigensystem of a Hermitian matrix Q , (symmetric matrix, if real) have
special, and very helpful properties.
EE/ME 701: Advanced Linear Systems
Property 3: The eigenvectors of a Hermitian matrix, if they correspond to distinct
eigenvalues, must be orthogonal.
Proof: Starting with the given information, eigenvalue values 1 6= 2, and
corresponding eigenvectors v1 and v2
Notation: use A to be the complex-conjugate transpose (equivalent to A in
Matlab).
Property 1: If A = A, then for all complex vectors x, x A x is real.
Proof: Define y = (x A x). Applying the transpose to the product,
Section 2.4.0
Q v1 = 1 v1
(16)
Q v2 = 2 v2
(17)
Forming the complex-conjugate transpose of Eqn (16)
y = (x A x) = x A x = x A x
v1 Q = 1 v1 = 1 v1
Since A = A , y = y . For a number to equal its complex conjugate, it
must be real.
(18)
where we can drop the complex conjugate, because we know 1 is real.
Now multiplying on the right by v2 gives
Property 2: The eigenvalues of a Hermitian matrix must be real.
Proof: Suppose the is an eigenvalue of Q , with v a corresponding
eigenvector, then
Qv = v
v1 Q v2 = 1 v1 v2 = 1 v1 v2
(19)
But the multiplication also gives:
Now multiply on the left and right by v,
v1 (Q v2) = v1 2 v2 = 2 v1 v2
(20)
1 v1v2 = 2 v1 v2
(21)
v Q v = v v = v v = ||v||
So we find
By property 1,
v Q v
must be real, and ||v|| must be real, there for
=
If 1 6= 2, Eqn (21) is only possible if v1v2 = 0, which is to say that v1 and
v2 are orthogonal.
v Q v
||v||2
must be real.
Part 4b: Eigenvalues and Eigenvectors
(Revised: Sep 10, 2012)
Page 13
Part 4b: Eigenvalues and Eigenvectors
(Revised: Sep 10, 2012)
Page 14
EE/ME 701: Advanced Linear Systems
Section 2.4.0
Property 4: A Hermitian matrix has a complete set of orthogonal eigenvectors.
EE/ME 701: Advanced Linear Systems
Section 3.1.0
3 The Jordan-form
Proof: Any square matrix Q has a Schur decomposition
3.1 How the Jordan form relates to the real world
T = V 1 QV
(22)
The solution to the equation
where V is an orthonormal matrix and T is an upper-triangular matrix.
Since T is upper-triangular, T will be lower-triangular. However, V is
ortho-normal and Q is Hermitian, so
V = V 1 ,
So
Q = Q , and
V 1 = (V ) = V
For T to be both upper-triangular and lower-triangular, it must be
diagonal. Let U = T be a diagonal matrix.
Multiplying Eqn (22) on the left by V and on the right by V 1gives
=VUV
(23)
Which is precisely the form of the eigen-decomposition of Q, where
Diagonal matrix U holds the eigenvalues of Q, and
Remarks:
Since T = T , the diagonal elements (the eigenvalues) must be real
(see property 2).
When Q is real (symmetric), V will be real.
(Revised: Sep 10, 2012)
(25)
U =
0
2
...
and
(26)
eU t =
e1 t
0
e2 t
But what if the set of vk is incomplete, so there is no V 1 !
...
(27)
Such matrices can arise with repeated eigenvalues and do not have a
complete set of eigenvectors. Such matrices are called defective.
If we generated A matrices randomly, defective matrices would be quite
rare.
But repeated poles (eigenvalues) are relatively common in the analysis
of dynamic systems, and sometimes even a design goal.
Orthogonal matrix V holds the eigenvectors of Q.
Part 4b: Eigenvalues and Eigenvectors
x (t) = eAt x (0)
has terms of the form
eAt = V eU t V 1
Since T = T , T must be both upper-triangular and lower-triangular.
Q=V TV
(24)
When there is a complete set of eigenvectors,
T = V Q V 1 = V 1 QV = T
x (t) = A x (t) + B u (t)
Page 15
When A has repeated eigenvalues and missing eigenvectors (gk < mk ),
analysis of eAt requires converting matrix A to the Jordan form.
When we have a complete set of independent eigenvectors, eAt is given by
Eqn (26).
Part 4b: Eigenvalues and Eigenvectors
(Revised: Sep 10, 2012)
Page 16
EE/ME 701: Advanced Linear Systems
Section 3.1.1
EE/ME 701: Advanced Linear Systems
Section 3.1.1
Consider eJ t for a matrix of the form
3.1.1 An example calling for the Jordan form
J=
With scalar differential equations, we know that equations with repeated
roots give solutions of the form
1
0
(31)
The expression for eJ t is
y (t) = c1 e1 t + c2 t e1 t .
For example,
y (t) + 6 y (t) + 9y (t) = 0
(28)
has the characteristic equation s2 + 6 s + 9 = 0 which as the roots
s = {3, 3} . The solution to Eqn (28) is:
y (t) = c1 e3t + c2 t e3t .
(29)
J2 t 2
Jk t k
eJ t = I + J t +
+ +
+
(32)
2!
k!
2
3
1 0
1
2 2
3 32
+ t
+ t
+
+ t
=
1! 0
2 ! 0 2
3 ! 0 3
0 1
t k k k k1
+
(33)
+
k! 0
k
The (1,1) and (2,2) elements give the series
But Eqn (27) for eAt has no terms of the form t e3t . And yet Eqn (28) is
simply represented in state space with:
x (t) =
y (t)
y (t)
d y (t) 6 9 y (t)
x (t) =
=
dt y (t)
1 0
y (t)
(30)
And the solution to Eqn (30) is (as always for x (t) = A x (t))
1 k k
t = et
k
!
k=1
1+
The (2,1) element of eJ t is 0. Now consider the (1,2) element, which is
given as
0+
1
t
+
k k1 t k =
1 ! k=2 k !
x (t) = eAt x (0) .
t 1+
k=2
So how can eAt have a term of the form t e3t ?
(Revised: Sep 10, 2012)
1
k1t k1
(k 1) !
= t 1+
k=1
1 kk
t
k!
= t et
So if J has the form of Eqn (31), then
Jt
Part 4b: Eigenvalues and Eigenvectors
(34)
Page 17
e = exp
1
0
Part 4b: Eigenvalues and Eigenvectors
t =
et t et
0
et
(Revised: Sep 10, 2012)
(35)
Page 18
EE/ME 701: Advanced Linear Systems
Section 3.1.1
et t et
1 0
exp 0 1 t = 0 et
0
0
0 0
1 2 t
2t e
t et
et
(36)
1 0 0
(37)
3.2.1 Regular and Generalized Eigenvectors
The regular eigenvectors are what we have considered all along, they satisfy
the relationship
For our specific example, A decomposes according to
A = M J M 1
A=
M=
And
J is a block-diagonal matrix, it is composed of mk mk blocks along the
main diagonal. Eqns (31), (36) and (37) give examples of 2x2, 3x3 and
4x4 blocks.
Each block on the diagonal of J is called a Jordan block.
gives terms of the form t 3 et , etc.
6 9
(39)
The columns of M are regular eigenvectors and generalized eigenvectors
0 1 0
t
exp
0 0 1
0 0 0
Matrix A has been transformed to the Jordan form (sometimes called the
Jordan canonical form) when
J = M 1 A M
which gives terms of the form t 2 et ,
with
Section 3.2.1
3.2 Constructing the Jordan Form
By the argument of Eqns (31)-(35) above,
EE/ME 701: Advanced Linear Systems
0.949
0.316
eAt = M eJ t M 1 = M
(38)
0.0316
0.0949
, J=
e3t t e3t
0
e3t
3
0
1
3
A v = k v
or
(A k I) v = 0
(40)
From Eqn (40), a set of independent regular eigenvectors is given as the null
space of (A k I).
M 1
So the solution x (t) = eAt x (0) will have terms of the form e3t and t e3t ,
as need !
Part 4b: Eigenvalues and Eigenvectors
(Revised: Sep 10, 2012)
Page 19
Part 4b: Eigenvalues and Eigenvectors
(Revised: Sep 10, 2012)
Page 20
EE/ME 701: Advanced Linear Systems
Section 3.2.1
The generalized eigenvectors form chains starting with a regular
eigenvector. The generalized eigenvectors satisfy the relationship
l+1
l
A Vk,l+1
j = k Vk, j +Vk, j
(41)
l
(A k I) Vk,l+1
j = Vk, j
(42)
Or, rearranging
is the the next generalized eigenvector in a chain
(see Bay Eqn (4.14)).
In this notation,
(43)
is the first element of a chain; it is a regular eigenvector, and it is the jth
regular eigenvector of the kth distinct eigenvalue.
= 1 designates that Vk,1 j
The l
be a regular eigenvector.
Section 3.2.2
3.2.2 First Jordan Form Example
Consider eAt , with A given as:
>> A = [ 3 3 3 ; -3 3 -3 ; 3 0 6 ]
A =
3
3
3
-3
3
-3
3
0
6
First look at the eigenvalues
Where Vk,l+1
j
Vk,1 j
EE/ME 701: Advanced Linear Systems
is the first eigenvector in a chain, so it must
Eqn (42) is an example of a recursive relationship, it is an equation that is
applied repeatedly to get all elements of the chain.
The method presented here to determine the Jordan form is the bottom-up
method presented in Bay, section 4.4.3.
>> U = eig(A)
U = 3.0000
6.0000
3.0000
%% This command rounds-off values to nearby rational numbers
%% which may be integers
>> U = RoundByRatCommand(U)
U =
3
6
3
A has a repeated eigenvalue, we can make a table analyzing the structure of
the eigensystem of A
k k mk
1
2
3
6
2
1
gk
# Needed Gen. Evecs
mk gk
1 or 2
1
0 or 1
0
Table 3: Analysis of the structure of the eigensystem of A.
Table 3 shows that A has two distinct eigenvalues, and we dont yet know if
1 has 1 or 2 independent eigenvectors.
Part 4b: Eigenvalues and Eigenvectors
(Revised: Sep 10, 2012)
Page 21
Part 4b: Eigenvalues and Eigenvectors
(Revised: Sep 10, 2012)
Page 22
EE/ME 701: Advanced Linear Systems
Section 3.2.2
Evaluate the geometric multiplicity of 1
EE/ME 701: Advanced Linear Systems
In the example, we need 1 generalized eigenvector for the k = 1 eigenvalue.
>> lambda1=3; lambda2=6; I = eye(3);
>> v1 = null(A-lambda1*I); v1 = v1/v1(1)
v1 = 1
%% Eigenvector, scaled so the
1
%% first element is an integer
-1
In this case for k = 1 we have only one regular eigenvector, so it must serve
as the first element, or anchor, of the chain of generalized eigenvectors
regular eigenvector:
generalized eigenvector:
The geometric multiplicity is the dimension of the null space in which the
eigenvectors lie. For 1, g1 = 1. Putting this information into the table
k k mk gk
1
2
3
6
2
1
# Needed Generalized Evecs
mk gk
1
1
1
0
Table 4: Analysis of the structure of the eigensystem of A.
The total number of eigenvectors (regular+generalized) needed for k is mk .
The number of regular eigenvectors is gk . The regular eigenvectors get the
notation:
1
1
1
Vk,1
, Vk,2
, ..., Vk,g
k
where j = 1 ... gk .
The number of needed generalized eigenvectors, corresponding to k , is
mk gk .
Part 4b: Eigenvalues and Eigenvectors
Section 3.2.2
(Revised: Sep 10, 2012)
Page 23
1
V1,1
solves
2
V1,1
solves
1
(A 1 I)V1,1
= 0
(44)
2
1
(A 1 I)V1,1
= V1,1
In Matlab
>> lambda1=1; lambda2=2; I = eye(3);
%% Find the first regular eigenvector,
>> V111 = null(A-lambda1*I); V111=V111/V111(1)
V111 = 1.0000
1.0000
-1.0000
%% Find the generalized eigenvector by solving Eqn (44)
>> V112 = pinv(A-lambda1*I)*V111
V112 = -1.0000
1.0000
0.0000
%% Find the regular eigenvector for lambda2
>> V211 = null(A-lambda2*I); V211=V211/V211(2)
V211 =
0
1.0000
-1.0000
Part 4b: Eigenvalues and Eigenvectors
(Revised: Sep 10, 2012)
Page 24
EE/ME 701: Advanced Linear Systems
Section 3.2.2
Put the eigenvectors (regular and generalized) together in the M matrix.
The regular and generalized eigenvectors of a chain must be put in order.
vectors Vk,l j
For each k and j, put the
going to the end of the chain.
into M, starting with l = 1, and
For the example,
>> M = [V111, V112, V211]
M =
1.00
-1.00
0
1.00
1.00
1.00
-1.00
0.00
-1.00
..
.
..
.
..
.
..
.
3.2.3 More on Jordan blocks
A matrix in Jordan canonical form has a block diagonal structure, with
Eigenvalues on the main diagonal
>> J = inv(M) * A * M
>> J = RoundByRatCommand(J)
J =
3
1
0
0
3
0
0
0
6
0
,
A 2x2 block has 1 one, a 3x3 block has 2 ones, etc.
One Jordon block corresponds to each regular eigenvector
If the regular eigenvector has no generalized eigenvectors, then it creates a
1x1 block.
i
h
1 V2 V1
M = V1,1
1,1
2,1
3 1
0 3
J=
0 0
If the regular eigenvector anchors a chain with one generalized eigenvector,
then it creates a 2x2 block, etc.
Each Jordan block corresponds to:
1x1 block:
a regular eigenvector
n n block, n 2: a chain anchored by a regular eigenvector, with n 1
generalized eigenvectors
eAt = M eJ t M 1
(45)
For a system governed by x (t) = A x (t), and considering the J matrix, the
output of the system will have solutions of the form
y (t) = c1 e3t + c2 t e3t + c3 e6t
(46)
Using the Vk,l j notation, if we look at the structure of the M matrix, we can
determine the layout of Jordan blocks. For example,
h
i
1 V1 V2 V1 V2
M = V1,1
1,2
1,2
2,1
2,1
Then the blocks of J are arranged:
1x1
J=
where the first two terms correspond to the first Jordan block, and the last
term to the second Jordan block.
Part 4b: Eigenvalues and Eigenvectors
Section 3.2.3
Ones on the upper diagonal within each block.
Put in the chain corresponding to for each regular eigenvector ( j) for
each distinct eigenvalue (k). Chains may have a length of 1.
J has 2 Jordon blocks
EE/ME 701: Advanced Linear Systems
(Revised: Sep 10, 2012)
Page 25
2x2
2x2
Part 4b: Eigenvalues and Eigenvectors
(Revised: Sep 10, 2012)
Page 26
EE/ME 701: Advanced Linear Systems
Section 3.2.4
3
1
0
0
0
0
-1
1
0
0
0
0
1
-1
2
0
0
0
1
-1
0
2
0
0
0
0
1
-1
1
1
0
0
1
-1
1
1
>> U = eig(A)
U = 2.0000
2.0000
2.0000
2.0000
2.0000
0
>> lambda1=0; lambda2=2; I = eye(6);
>> [Row, Col, Null, LNull] = spaces(A-lambda2*I);
>> g2=rank(Null); g2 = 2
(A 2 I) has a 2-dimensional null space, so there are 2 independent regular
eigenvectors.
=
0
0.7071
0
0.7071
-0.7071
0.0000
0.7071
0.0000
0.0000
0
0.0000
0
For convenience, scale the eigenvectors to get integer values
>> V211 = Null(:,1)/Null(3,1)
V211 =
0
0
1
-1
0
0
Part 4b: Eigenvalues and Eigenvectors
>> %% Check that the eigenvectors are indeed eigenvectors
>> %% These norms come out to zero, very small would be sufficient
>> NDiff1 = norm( A*V211 - lambda2*V211 )
NDiff1 =
0
>> NDiff2 = norm( A*V221 - lambda2*V221 )
NDiff2 =
0
Note: All vectors from null (A 2I) are eigenvectors. For example,
1 = 0 , 2 = 2 is repeated 5 times.
Null
Section 3.2.4
Check that the eigenvectors are actually eigenvectors for 2 = 2
3.2.4 The Jordan form, a second example
A =
EE/ME 701: Advanced Linear Systems
>> V221 = Null(:,2)/Null(1,2)
V221 = 1
1
0
0
0
0
(Revised: Sep 10, 2012) Page 27
>> x = 0.3*V211 + 0.4*V221
x = 0.4
>> NDiffx = norm( A*x - lambda2*x )
0.4
NDiffx = 0
0.3
-0.3
0.0
0.0
Make a table of the structure of the eigensystem.
k k mk gk
1
2
0
2
1
5
1
2
# Needed Gen. Evecs
mk gk
0
3
Table 5: Structure of the eigensystem of A.
We need 3 generalized eigenvectors to have a complete set.
These three will be in chains anchored by one or both of the regular
eigenvectors of 2 .
Part 4b: Eigenvalues and Eigenvectors
(Revised: Sep 10, 2012)
Page 28
EE/ME 701: Advanced Linear Systems
Section 3.2.4
EE/ME 701: Advanced Linear Systems
Section 3.2.4
2
Find V2,1
The equation giving generalized eigenvectors is
l
(A k I) Vk,l+1
j = Vk, j
(47)
This is simply the relation
Bx = y
(48)
with B = (A k I) , y = Vk,l j and x = Vk,l+1
j .
We know some things about the solutions of Eqn (47)
1. For an exact solution to exist Vk,l j must lie in the column space of
(A k I)
2. If we find a solution Vk,l+1
j , it is not unique. We can add any component
from the null space of (A k I) , and it will still be a solution.
>> V212 = pinv(A-lambda2*I)*V211
V212 =
0
0
0.00
-0.00
0.50
0.50
%% Check that V212 is a generalized eigenvector
>> NDiffV212 = norm( (A-lambda2*I)*V212 - V211 )
NDiffV212 = 2.7581e-16
2 is a generalized eigenvector.
Yes, V2,1
Considering again the example problem,
Check that V211 and V221 lie in the column space of (A 2 I) by
checking that the projection of each onto the column space is equal to
the original vector
2 is in the column space of (A I)
Test to see if V2,1
2
>> norm( Col*Col*V212-V212 )
ans = 0.7071
>> [Row, Col, Null, LNull] = spaces(A-lambda2*I);
3
No, so there is no V2,1
>> NIsInColumnSpaceV211 = norm( Col*Col*V211-V211 )
NIsInColumnSpaceV211 = 1.1102e-16
%% V211 is in the column space of (A-lambda2*I)
>> NIsInColumnSpaceV2 = norm( Col*Col*V221-V221 )
NIsInColumnSpaceV2 = 1.1430e-15
%% V221 is in the column space of (A-lambda2*I)
Both vectors lie in the column space of (A 2 I) , so each will have at
least one generalized eigenvector.
Part 4b: Eigenvalues and Eigenvectors
(Revised: Sep 10, 2012)
Page 29
2
Now evaluate V2,2
>> V222 = pinv(A-lambda2*I)*V221
V222 = 0.50
>>
%% Check that V222 is a gen. eigenvector
-0.50
>>
NDiffV222
= norm( (A-lambda2*I)*V222 - V221 )
-0.00
NDiffV222
=
6.2804e-16
-0.00
0.00
2 is a generalized eigenvector
Yes, V2,2
0.00
Part 4b: Eigenvalues and Eigenvectors
(Revised: Sep 10, 2012)
Page 30
EE/ME 701: Advanced Linear Systems
Section 3.2.4
2 is in the column space of (A I)
Now check to see that V2,2
2
3 . This will be the third generalized eigenvector
Yes, so there is a V2,2
V223 =
0.00
-0.00
0.25
0.25
-0.00
-0.00
0
0
0
1
2
0
0
0
0
0
1
2
J has 3 Jordan blocks
J =
First we need the regular eigenvalue corresponding to 1
>> V111 = null(A-lambda1*I);
>> V111 = V111/ V111(5)
V111 =
0
0
0
0
1
-1
Put in the chains of E-vecs, starting each chain with its regular E-vec
Part 4b: Eigenvalues and Eigenvectors
>> J = inv(M)*A*M;
>> J = RoundByRatCommand(J)
J =
0
0
0
0
0
2
1
0
0
0
2
0
0
0
0
2
0
0
0
0
0
0
0
0
Interpreting the result
Now build the M matrix
>> M = [ [V111] [V211 V212]
>> M = RoundByRatCommand(M)
M =
0
0
0
0
0
0
0
1.00
0
0 -1.00
0
1.00
0
0.50
-1.00
0
0.50
Section 3.2.4
Now find J
>> NIsInColumnSpaceV222 = norm( Col*Col*V222-V222 )
NIsInColumnSpaceV222 = 4.2999e-16
>> V223 = pinv(A-lambda2*I)*V222,
EE/ME 701: Advanced Linear Systems
[V221 V222 V223] ]
1.00
1.00
0
0
0
0
0.50
-0.50
0
0
0
0
(Revised: Sep 10, 2012)
Correspondingly, M has 3 chains of eigenvectors
M
0
0
0.25
0.25
0
0
0
| 0
0 |
0
0
0
----------------0
| 2
1 |
0
0
0
0
| 0
2 |
0
0
0
----------------------------0
0
0 |
2
1
0
0
0
0 |
0
2
1
0
0
0 |
0
0
2
0
0
0
0
1.00
-1.00
|
0
|
0
| 1.00
| -1.00
|
0
|
0
0
0
0
0
0.50
0.50
|
|
|
|
|
|
1.00
1.00
0
0
0
0
0.50
-0.50
0
0
0
0
0
0
0.25
0.25
0
0
The first eigenvector in each chain is a regular eigenvector.
Page 31
Part 4b: Eigenvalues and Eigenvectors
(Revised: Sep 10, 2012)
Page 32
EE/ME 701: Advanced Linear Systems
Section 3.3.0
3.3 One more twist, freedom to choose the regular eigenvector
Fact: If a matrix A has repeated eigenvalues, with gk > 1 independent eigenvector,
the gk eigenvectors form a vector subspace, any vector from which is an
eigenvector.
When mk 3, it is possible that gk 2, and we need to find a generalized
eigenvector.
EE/ME 701: Advanced Linear Systems
Section 3.3.0
When gk 2, we have the freedom to choose the anchor for the chain of
generalized eigenvectors
1 , V 1 , ...
Not just from a list, Vk,1
k,2
But as any vector from the null space of (A k I) .
1 , V 1 , neither one of
It may be that we have valid eigenvectors , Vk,1
k,2
which lies in the column space of (A k I) !
By forming the intersection of the null and column spaces of (A k I), we
can find the needed regular eigenvector.
In this case,
dim null (A k I) = gk 2
and any vector from the 2-dimensional (or larger) null space of (A k I) is
an eigenvector.
W = col (A k I) null(A k I) ,
Vk,1 j W
(50)
Consider the generating equation for the generalized eigenvector
(A k I) Vk,2 j = Vk,1 j
(49)
The anchor Vk,1 j must also lie in the column space of (A k I)
A regular eigenvector that anchors a chain of generalized eigenvectors must
lie in 2 spaces at once:
The null space of (A k I) . . . . . . . . . . . . . . . To be a regular e-vec of A.
The column space of (A k I) . . . To generate a generalized e-vec of A.
Part 4b: Eigenvalues and Eigenvectors
(Revised: Sep 10, 2012)
Page 33
Part 4b: Eigenvalues and Eigenvectors
(Revised: Sep 10, 2012)
Page 34
EE/ME 701: Advanced Linear Systems
Section 3.3.1
3.3.1 Example where regular E-vecs do not lie in the column space of
(A k I)
Consider the matrix
2 1
0 4
A=
0 1
0 1
>> NIsInColumnSpaceV1 =
NIsInColumnSpaceV1 =
>> NIsInColumnSpaceV2 =
NIsInColumnSpaceV2 =
>> RoundByRatCommand ( eig(A) )
ans = 2
2
2
2
-0.7118
-0.2468
-0.4650
-0.4650
norm( Col*Col*v1-v1 )
0.6325
norm( Col*Col*v2-v2 )
0.6325
But what about the possibility that there exists another eigenvector which
lies in the null space and column space of (A k I):
Null =
0
0.8165
-0.4082
-0.4082
1
0
0
0
So the structure of the eigensystem is given in table 6
k k mk gk
Null(3,1) )
No, neither eigenvector lies in the column space of (A k I)
>> I = eye(4) lambda1=2,
>> [Row, Col, Null, Lnull] = spaces(A - lambda1*I)
0.3055
0.7342
-0.4287
-0.4287
A first choice for eigenvectors are the two basis vectors of the null space of
(A k I)
Determine if the eigenvectors like in the column space of (A k I)
Analyzing the structure of its eigensystem
Col =
Section 3.3.1
>> v1 = RoundByRatCommand( Null(:,1) /
>> v2 = RoundByRatCommand( Null(:,2) )
v1 =
0
v2 = 1
-2
0
1
0
1
0
2 2
0 0
EE/ME 701: Advanced Linear Systems
x1 = a1 v1 + a2 v2
First, consideration of the possibilities
The universe is R4 , or 4D, with dim col (A k I) = 2, and
dim null (A k I) = 2. So there are 3 possibilities:
1. Two 2D spaces can fit in a 4D universe and not intersect, so it is
possible
col (A k I) null (A k I) = 0
# Needed Gen. Evecs
mk gk
2
2. It is possible that the intersection is 1 D
Table 6: Analysis of the structure of the eigensystem of A.
3. It is possible that the intersection is 2 D
Part 4b: Eigenvalues and Eigenvectors
(Revised: Sep 10, 2012)
Page 35
Part 4b: Eigenvalues and Eigenvectors
(Revised: Sep 10, 2012)
Page 36
EE/ME 701: Advanced Linear Systems
Section 3.3.1
Previously, we have seen how to form the intersection of two subspaces,
Given sets of basis vectors U = [u1 , u2 , ..., unu ] and V = [v1 , v2 , ..., vnv ],
vectors in the intersection
W = UV
That is
Now find w1 , a vector in both the column and null spaces of (A k I)
>> w1 = Col*CoeffVec(1:2)
>> w1 = RoundByRatCommand( w1/w1(1)),
w1 =
1
2
-1
-1
Verify that w1 is an eigenvector
wi = a1 u1 + a2 u2 + + anu unu = b1 v1 + b2 v2 + + bnv vnv
Section 3.3.1
(51)
Are solutions to
for some non-zero values
EE/ME 701: Advanced Linear Systems
a1 anu , b1
(52)
i
a1 anu , b1 anv .
i
anv must solve
a
1
[U, V ] ... = 0
bnv
(A k I) has only one regular eigenvector that can anchor a chain. So the
chain must have length 3 (2 generalized E-vecs).
2 as solution to
Compute a candidate for the first generalized eigenvector, V1,1
2 = V1
(A k I) V1,1
1,1
(53)
h
i
The coefficient vector must lie in the null space of Col, Null , where
[Col] and [Null] are sets of basis vectors on the column and null spaces.
>> CoeffVec = null([Col, -Null])
CoeffVec =
0.7033
-0.0736
0.6547
0.2673
h
i
Since the null space of Col, Null is one dimensional, the intersection
of the column and null spaces is 1D
Part 4b: Eigenvalues and Eigenvectors
(Revised: Sep 10, 2012)
>> NIsEigenvectorW1 = norm( A*w1 - lambda1 * w1)
NIsEigenvectorW1 =
0
Page 37
>> V111 = w1,
>> v3 = pinv(A - lambda1*I) * V111,
v3 =
0
0.3333
0.3333
0.3333
3 , V 2 must be in the
To find the remaining generalized eigenvector, V1,1,
1,1
column space of (A k I)
>> NIsInColumnSpaceV112 = norm( Col*Col*v3 - v3)
NIsInColumnSpaceV112 = 0.4216
It is not !
2 ,
v3 is a candidate generalized eigenvector, but we can not use it for V1,1
3 .
because it does not lead to V1,1
Part 4b: Eigenvalues and Eigenvectors
(Revised: Sep 10, 2012)
Page 38
Going back to the generating Eqn, V^2_{1,1} must solve

    (A − λ1 I) V^2_{1,1} = V^1_{1,1}                                               (54)

v3 is a particular solution to Eqn (54), but it is not the only solution. Any vector

    V^2_{1,1} = v3 + b1 n1 + b2 n2                                                 (55)

is a solution to (54), where n1 and n2 are basis vectors for the null space of
(A − λ1 I).

To find a value for V^2_{1,1} that is in the column space of (A − λ1 I), we need a
solution to

    V^2_{1,1} = v3 + b1 n1 + b2 n2 = a1 c1 + a2 c2                                 (56)

Or

    [ c1  c2  -n1  -n2 ] [ a1; a2; b1; b2 ] = v3                                   (57)

1. Find the coefficient vector

>> CoeffVec2 = pinv( [Col, -Null] ) * v3
CoeffVec2 = -0.0880
            -0.8406
            -0.2333
             0.5714

2. Determine the candidate vector

>> v3b = v3 + Null * CoeffVec2(3:4),
v3b = 0.5714
      0.1429
      0.4286
      0.4286

3. Check to be sure the new candidate is in the column space of (A − λ1 I)

>> NIsInColumnSpaceV112b = norm( Col*Col'*v3b - v3b )
NIsInColumnSpaceV112b = 6.0044e-16

Yes !

Set V^2_{1,1} and compute V^3_{1,1}

>> V112 = v3b
V112 = 0.5714
       0.1429
       0.4286
       0.4286

>> V113 = pinv(A - lambda1*I) * V112
V113 =  0
        0.1905
        0.6905
       -0.3095

Build the M matrix; V^1_{1,2} is any independent regular eigenvector. Compute J.

>> M = [ V111, V112, V113, V121 ]
M =  1.0000   0.5714   0        1.0000
     2.0000   0.1429   0.1905   0
    -1.0000   0.4286   0.6905   0
    -1.0000   0.4286  -0.3095   0

>> J = RoundByRatCommand( inv(M) * A * M )
J = 2  1  0  0
    0  2  1  0
    0  0  2  0
    0  0  0  2

J has a 3x3 block, and a 1x1 block.
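The chain relations can also be checked numerically. Below is a minimal sketch of
such a check (not part of the original construction); it assumes A, lambda1, M and J
from above are still in the Matlab workspace.

% Sanity checks after building M (assumes A, lambda1, M, J are in the workspace)
NSim    = norm( A*M - M*J )                              % ~0 if J = inv(M)*A*M
NChain2 = norm( (A - lambda1*eye(4))*M(:,2) - M(:,1) )   % (A - lambda1 I) V112 = V111
NChain3 = norm( (A - lambda1*eye(4))*M(:,3) - M(:,2) )   % (A - lambda1 I) V113 = V112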
3.4 Summary of the Jordan Form

Square matrices always have n eigenvalues, λi

The regular eigenvectors are given as the null space of (A − λi I)

For a repeated eigenvalue λk:

  The algebraic multiplicity, mk, is the number of times λk is repeated

  The geometric multiplicity, gk, is the dimension of null(A − λk I)

When eigenvalues are repeated, we may not have enough independent regular
eigenvectors (gk < mk), in which case the Jordan form is required.

The Jordan form corresponds to scalar differential equations with repeated
roots and solutions of the form

    y(t) = a1 e^(λ1 t) + a2 t e^(λ1 t) + a3 t^2 e^(λ1 t) + ...

For repeated eigenvalues, regular eigenvectors give rise to chains of
generalized eigenvectors. The generalized eigenvectors are solutions to

    (A − λk I) v^{l+1}_{k,j} = v^l_{k,j}                                      (58)

where Eqn (58) is repeated as needed to obtain mk eigenvectors.

Examples have shown several characteristics of eigensystems with repeated roots.

3.4.1 Why Matlab does not have a numeric Jordan command

Strikingly, Matlab has no numerical routine to find the generalized
eigenvectors or Jordan form (standard Matlab has no jordan() routine !)

This is because the Jordan form calculation is numerically very sensitive:
a small perturbation in A produces a large change in the chains of eigenvectors.

This sensitivity is true of the differential equations themselves,

    y''(t) + 6 y'(t) + 9.00001 y(t) = 0

has two distinct roots !

Consider the stages where a decision must be made:

  When there are two eigenvalues with λa ≈ λb, are they repeated or distinct ?

  What is the dimension of null(A − λ I) ?

  Does v^l_{k,j} lie in the column space of (A − λ I), or does it not ?

  Is v^{l+1}_{k,j} independent of the existing eigenvectors ?

There is no known numerical routine to find the Jordan form that is
sufficiently numerically robust to be included in Matlab.

The Matlab symbolic algebra package does have a jordan() routine. It runs
symbolically on rational numbers to operate without round-off error, for
example on a matrix with rational entries such as

    A = [ 21/107   52/12    ·
          119/120  1/1      11/12
          3/2      8/5      13/14 ]
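A minimal sketch of the symbolic route is given below. It requires the Symbolic
Math Toolbox, and the matrix shown is a hypothetical stand-in (the rational-entry
example above did not survive intact).

% Hypothetical example of the symbolic jordan() routine
A = sym([2 1 0; 0 2 0; 0 0 3]);     % exact (integer/rational) entries
[V, J] = jordan(A)                  % V: modal matrix, J: Jordan form, no round-off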
4 Conclusions

To solve

    x'(t) = A x(t) + B u(t)

we are interested to make a change of basis from physical (or other) coordinates
to modal coordinates.

This involves the eigenvalues and eigenvectors. The eigenvectors solve the equation

    (A − λi I) vi = 0

So the eigenvectors lie in null(A − λi I)

Eigenvalues may be complex

  Complex eigenvalues correspond to solutions with terms

      x(t) = a1 e^(σ t) cos(ω t + φ) w1 + a2 e^(σ t) sin(ω t + φ) w2

  The complex terms correspond to the action of a rotation in state space,
  in the subspace spanned by the complex eigenvectors.

The eigenvectors corresponding to an eigenvalue (or a complex conjugate pair of
eigenvalues) define an A-invariant subspace. Vectors in this subspace stay in
this subspace.

The modal matrix is the transformation from modal to physical coordinates.

  With a complete set of eigenvectors, M = V.

  If we lack a complete set of regular eigenvectors, M includes generalized
  eigenvectors, and

      J = M^(-1) A M

  gives the system matrix in Jordan form.

The solution in Jordan form comprises Jordan blocks

  Each Jordan block corresponds to a chain of generalized eigenvectors.

  The generalized eigenvectors solve the equation

      (A − λk I) v^{l+1}_{k,j} = v^l_{k,j}

  for the l-th generalized eigenvector, of the j-th Jordan block, of the k-th
  distinct eigenvalue

  v^1_{k,j} is a regular eigenvector, corresponding to λk

  For v^{l+1}_{k,j} to exist, v^l_{k,j} must lie in the column space of (A − λk I)

The Jordan form leads to solutions of the differential equation in the form
t e^(λt), t^2 e^(λt), etc. For example:

    J = [ λ1  1   ·   ·   ·
          0   λ1  ·   ·   ·
          ·   ·   λ2  1   0
          ·   ·   0   λ2  1
          ·   ·   0   0   λ2 ]                                                (59)

    e^(Jt) = [ e^(λ1 t)  t e^(λ1 t)   ·          ·            ·
               0         e^(λ1 t)    ·          ·            ·
               ·         ·           e^(λ2 t)   t e^(λ2 t)   (1/2) t^2 e^(λ2 t)
               ·         ·           0          e^(λ2 t)     t e^(λ2 t)
               ·         ·           0          0            e^(λ2 t) ]
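A quick way to see the t e^(λt) and t^2 e^(λt) terms is to exponentiate a single
Jordan block numerically; this is a minimal sketch with hypothetical values, not
part of the original notes.

% expm of one 3x3 Jordan block shows the structure of e^(Jt) directly
lam = -2;  t = 0.5;
Jblk = [lam 1 0; 0 lam 1; 0 0 lam];
E       = expm(Jblk*t)
Eclosed = exp(lam*t) * [1 t t^2/2; 0 1 t; 0 0 1]   % closed form, matches E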
5 Review questions and skills
1. In what fundamental space do the regular eigenvectors lie ?

2. Given the eigenvalues of a matrix, analyze the structure of the eigensystem

   (a) Determine the number of required generalized eigenvectors

3. Indicate the generating equation for the generalized eigenvectors.

4. Indicate in what fundamental space the vectors of the generating equations
   must lie.

5. When gk ≥ 2, and no regular eigenvector lies in the column space of
   (A − λk I), what steps can be taken ?

6. When additional generalized eigenvectors are needed, and v, a candidate
   generalized eigenvector, does not lie in the column space of (A − λk I), what
   steps can be taken ?
EE/ME 701: Advanced Linear Systems

Review-By-Example of some of the Basic Concepts and Methods of
Control System Analysis and Design

Contents

1 Differential Eqns, Transfer Functions & Modeling
  1.1 Example 1, Golden Nugget Airlines
  1.2 Block diagram
  1.3 Laplace Transforms and Transfer Functions
      1.3.1 Laplace transform
      1.3.2 Some basic Laplace transform Pairs
      1.3.3 Transfer Functions are Rational Polynomials
      1.3.4 A transfer function has poles and zeros . . . . . . . . . . . 10
      1.3.5 Properties of transfer functions . . . . . . . . . . . . . . . 12
      1.3.6 The Impulse Response of a System . . . . . . . . . . . . . . . 15

2 Closing the loop, feedback control . . . . . . . . . . . . . . . . . . . 16
  2.1 Analyzing a closed-loop system . . . . . . . . . . . . . . . . . . . 20
  2.2 Common controller structures: . . . . . . . . . . . . . . . . . . . 21
  2.3 Analyzing other loops . . . . . . . . . . . . . . . . . . . . . . . 22

3 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

4 Working with the pole-zero constellation . . . . . . . . . . . . . . . . 26
  4.1 Basics of pole-zero maps, 1st order, p1 = -σ . . . . . . . . . . . . 26
      4.1.1 Real Pole location and response characteristics . . . . . . . 27
  4.2 Basis of pole-zero maps, 2nd order, p1,2 = -σ ± jω . . . . . . . . . 28
      4.2.1 Notation for a complex pole pair . . . . . . . . . . . . . . . 29
      4.2.2 Complex pole location and response characteristics . . . . . . 30
      4.2.3 The damping factor: ζ . . . . . . . . . . . . . . . . . . . . 33
  4.3 Determining approximate performance measures from a dominant
      second-order mode . . . . . . . . . . . . . . . . . . . . . . . . . 35
      4.3.1 Rise time from pole locations, 10%-90% . . . . . . . . . . . . 36
      4.3.2 Peak time from pole locations . . . . . . . . . . . . . . . . 36
      4.3.3 Settling time from pole locations . . . . . . . . . . . . . . 36

5 Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
  5.1 Design methods . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
  5.2 Root Locus Design . . . . . . . . . . . . . . . . . . . . . . . . . 38

6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

7 Glossary of Acronyms . . . . . . . . . . . . . . . . . . . . . . . . . . 40
1 Differential Eqns, Transfer Functions & Modeling

1.1 Example 1, Golden Nugget Airlines

Dynamic systems are governed by differential equations
(or difference equations, if they are discrete-time)

Example (adapted from Franklin et al. 4th ed., problem 5.41, figure 5.79)

Figure 1: Golden Nugget Airlines Aircraft. Me(t) is the elevator moment (the
control input), Mp(t) is the moment due to passenger movements (a disturbance
input), and θ(t) is the aircraft pitch angle, Ω(t) = dθ(t)/dt.

For example, the aircraft dynamics give:

    d²Ω(t)/dt² + 4 dΩ(t)/dt + 5 Ω(t) = 1 dMt(t)/dt + 3 Mt(t),   Mt(t) = Me(t) + Mp(t)   (1)

    Ω(t) = dθ(t)/dt                                                                     (2)

    dMe(t)/dt + 10 Me(t) = 7 v(t)                                                       (3)

Where Mt(t) is the total applied moment.

Eqn (1) has to do with the velocity of the aircraft response to Mt(t).

Eqn (2) expresses that the pitch-rate is the derivative of the pitch angle.

And Eqn (3) describes the response of the elevator to an input command from the
auto-pilot.

Main Fact: The Differential Equations come from the physics of the system.

1.2 Block diagram

A block diagram is a graphical representation of modeling equations and their
interconnection.

Eqns (1)-(3) can be laid out graphically, as in figure 2.

Figure 2: Block diagram of the aircraft dynamics and signals. (Elevator servo:
7/(s + 10); Aircraft Dynamics: (s + 3)/(s² + 4 s + 5); Integrator: 1/s; signals
v(t), Me(t), Mp(t), Mt(t), Ω(t), θ(t).)

Signal Symbol   Signal Name / Description                          Units
v(t)            Voltage applied to the elevator drive              [volts]
Me(t)           Moment resulting from the elevator surface         [N m]
Mp(t)           Moment resulting from movement of the passengers   [N m]
Ω(t)            Pitch Rate (angular velocity)                      [rad/sec]
θ(t)            Aircraft pitch angle                               [radians]

Table 1: List of signals for the aircraft block diagram.

When analyzing a problem from basic principles, we would also have a list of
parameters.

Parameter Symbol   Signal Name / Description          Value   Units
b0                 Motion parameter of the elevator           [N-m/volt]
...                ...                                        ...

Table 2: List of parameters for the aircraft block diagram.
1.3 Laplace Transforms and Transfer Functions

To introduce the Transfer Function (TF), we need to review the Laplace transform.

1.3.1 Laplace transform

The Laplace transform (LT) maps a signal (a function of time) to a function of
the Laplace variable s:

    F(s) = L{ f(t) }

The Inverse Laplace transform maps F(s) back to f(t):

    f(t) = L^(-1){ F(s) }

Figure 3: Solve a differential equation in the time domain, or solve an
algebraic equation in the s domain. ("Time Domain": the space of all functions
of time; "Frequency Domain": the space of all functions of s.)

Why pay attention to Laplace transforms ?  Answers:

1. Differential equations in the time domain correspond to algebraic equations
   in the s domain.

2. The LT makes possible the transfer function.

3. Time domain: find y(t) for one u(t).

4. Frequency domain: find Guy(s) for all U(s).

1.3.2 Some basic Laplace transform Pairs

Time Domain Signal                                                  F(s)
Scaled Impulse:         f(t) = b δ(t)                               F(s) = b
Unit step:              f(t) = 1+(t) = 1.0, t ≥ 0                   F(s) = 1/s
Unit ramp:              f(t) = t, t ≥ 0                             F(s) = 1/s²
Higher Powers of t:     f(t) = t^n, t ≥ 0                           F(s) = n!/s^(n+1)
Decaying Exponential:   f(t) = b e^(-σt)                            F(s) = b/(s + σ)
Sinusoid:               f(t) = sin(ωt)                              F(s) = ω/(s² + ω²)
Sinusoidal oscillation: f(t) = Bc cos(ωt) + Bs sin(ωt)              F(s) = (Bc s + Bs ω)/(s² + ω²)
Oscillatory Exp. Decay: f(t) = e^(-σt)(Bc cos(ωt) + Bs sin(ωt))     F(s) = (Bc s + (Bc σ + Bs ω))/(s² + 2σs + σ² + ω²)

Table 3: Basic Laplace-transform pairs. (The accompanying figure sketches each
signal: impulse, unit step, unit ramp, decaying exponential, sinusoid, and
oscillatory exponential decay.)
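The pairs in Table 3 can be checked with the symbolic toolbox; the sketch below
(with hypothetical parameter values) verifies the oscillatory-exponential-decay
pair. It requires the Symbolic Math Toolbox.

% Check one of the transform pairs in Table 3 (hypothetical parameter values)
syms t s
sigma = 2;  w = 5;  Bc = 1;  Bs = 3;
f = exp(-sigma*t) * ( Bc*cos(w*t) + Bs*sin(w*t) );
F = simplify( laplace(f, t, s) )
% compare with (Bc*s + (Bc*sigma + Bs*w)) / (s^2 + 2*sigma*s + sigma^2 + w^2)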
Two theorems of the Laplace transform permit us to build transfer functions:

1. The Laplace transform obeys superposition and scaling

       Given: z(t) = x(t) + y(t)    then    Z(s) = X(s) + Y(s)
       Given: z(t) = 2.0 x(t)       then    Z(s) = 2.0 X(s)

2. The Laplace transform of the derivative of a signal x(t) is s times the LT
   of x(t):

       Given: X(s) = L{x(t)},       then    L{ dx(t)/dt } = s X(s)

Putting these rules together lets us find the transfer function of a system from
its governing differential equation.

Consider a system that takes in the signal u(t) and gives y(t), governed by the
Diff Eq:

    a2 d²y(t)/dt² + a1 dy(t)/dt + a0 y(t) = b1 du(t)/dt + b0 u(t)             (4)

(Notice the standard form: output (unknown) on the left, input on the right.)

Whatever signals y(t) and u(t) are, they have Laplace transforms. Eqn (4) gives:

    L{ a2 d²y(t)/dt² + a1 dy(t)/dt + a0 y(t) } = L{ b1 du(t)/dt + b0 u(t) }   (5)

    a2 s² Y(s) + a1 s Y(s) + a0 Y(s) = b1 s U(s) + b0 U(s)                    (6)

    (a2 s² + a1 s + a0) Y(s) = (b1 s + b0) U(s)                               (7)

Eqns (5)-(7) tell us something about the ratio of the LT of the output to the LT
of the input:

    (a2 s² + a1 s + a0) Y(s) = (b1 s + b0) U(s)                               (7, repeated)

    Output LT / Input LT  =  Y(s)/U(s)  =  (b1 s + b0) / (a2 s² + a1 s + a0)  =  Gp(s)    (8)

A Transfer Function (TF) is the ratio of the output and input LTs. Given

    Gp(s) = (b1 s + b0) / (a2 s² + a1 s + a0),        Y(s) = Gp(s) U(s)       (9)

Where Gp(s) is the TF of the plant.

The transfer function is like the gain of the system: it is the ratio of the
output LT to the input LT.

Important fact: the TF depends only on the parameters of the system
(coefficients a2..a0 and b0..b1 in the example), and not on the actual values of
U(s) or Y(s) (or u(t) and y(t)).

Basic and Intermediate Control System Theory present transfer function-based
design. By engineering the characteristics of the TF, we engineer the system to
achieve performance goals.

Figure 4: Block diagram of typical closed-loop control (reference r(t), error
e(t), controller Kc Gc(s), plant input u(t), plant Gp(s), output y(t), sensor
dynamics Hy(s), sensed output ys(t)).
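As a minimal sketch of Eqns (8)-(9) (the coefficient values here are
hypothetical), the TF can be entered in Matlab directly from the coefficients of
the differential equation:

a2 = 1;  a1 = 4;  a0 = 5;          % denominator coefficients of Eqn (4)
b1 = 1;  b0 = 3;                   % numerator coefficients of Eqn (4)
Gp = tf([b1 b0], [a2 a1 a0])       % Gp(s) = (b1 s + b0)/(a2 s^2 + a1 s + a0)
step(Gp)                           % y(t) when u(t) is a unit step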
1.3.3 Transfer Functions are Rational Polynomials

Figure 5: Golden Nugget Airlines Aircraft. Ω(t) [radians/second] is the
pitch-rate of the aircraft, and Mt(t) is the moment (torque) applied by the
elevator surface.

Consider the differential equation of the aircraft pitch rate:

    d²Ω(t)/dt² + 4 dΩ(t)/dt + 5 Ω(t) = 1 dMt(t)/dt + 3 Mt(t)                  (10)

From (10) we get the TF: take the LT of both sides, and rearrange:

    (s² + 4 s + 5) Ω(s) = (s + 3) Mt(s)

    Ω(s)/Mt(s) = (s + 3)/(s² + 4 s + 5)                                       (11)

Note:

  We can write down the TF directly from the coefficients of the differential
  equation.

  We can write down the differential equation directly from the coefficients of
  the TF.

Transfer functions, such as Eqn (11), are ratios of two polynomials:

    Ω(s)/Mt(s) = (s + 3)/(s² + 4 s + 5)  :  numerator polynomial / denominator polynomial   (12)

We call a TF such as (11) a rational polynomial.

1.3.4 A transfer function has poles and zeros

A TF has a numerator and denominator polynomial, for example

    Gp(s) = N(s)/D(s) = (2 s² + 8 s + 6)/(s³ + 2 s² + 4 s + 0)

The roots of the numerator are called the zeros of the TF, and the roots of the
denominator are called the poles of the TF. For example:

>> num = [2 8 6]
num =  2     8     6
>> den = [1 2 4 0]
den =  1     2     4     0
>> zeros = roots(num)
zeros = -3
        -1
>> poles = roots(den)
poles =  0
        -1.0000 + 1.7321i
        -1.0000 - 1.7321i

We can also use Matlab's system tool to find the poles and zeros

%% Build the system object
>> Gps = tf(num, den)
Transfer function:
 2 s^2 + 8 s + 6
-----------------
s^3 + 2 s^2 + 4 s

>> zero(Gps),
ans = -3
      -1
>> pole(Gps),
ans =  0
      -1.0000 + 1.7321i
      -1.0000 - 1.7321i
Interpreting the poles (p1, p2, ..., pn) and zeros (z1, ..., zm)

We can use the poles and zeros to write the TF in factored form:

    G(s) = (2 s² + 8 s + 6)/(s³ + 2 s² + 4 s)
         = b2 (s − z1)(s − z2) / ((s − p1)(s − p2)(s − p3))
         = 2 (s + 3)(s + 1) / ((s − 0)(s + 1 − 1.732 j)(s + 1 + 1.732 j))     (13)

With a complex pole pair we can do two things,

1. Use a shorthand

       G(s) = 2 (s + 3)(s + 1) / ( s (s + 1 − 1.732 j)(∗) )

   (Because poles always come in complex conjugate pairs.)

2. Write the complex pole pair as a quadratic, rather than 1st order terms

       G(s) = 2 (s + 3)(s + 1) / ( s (s² + 2 s + 4) )                         (14)

The zeros are values of s at which the transfer function goes to zero

The poles are values of s at which the transfer function goes to infinity

We can plot a pole-zero map:

>> Gps = tf([2 8 6], [1 2 4 0])
>> pzmap(Gps)

Figure 6: Pole-Zero constellation of the aircraft transfer function.

1.3.5 Properties of transfer functions

Just as a differential equation can be scaled by multiplying the left and right
sides by a constant, a TF can be scaled by multiplying the numerator and
denominator by a constant.

Monic: A TF is said to be monic if an = 1. We can always scale a TF to be monic.
If G1(s) is scaled to be monic, then

    G1(s) = b0' / (s + a1')

with b0' = b0/an and a1' = a1/an.

Rational Polynomial Form: A TF is in rational polynomial form when the numerator
and denominator are each polynomials. For example

    G(s) = (2 s² + 8 s + 6)/(s³ + 2 s² + 4 s + 0)

An example of a TF not in rational polynomial form is:

    G3(s) = [ 2 (s + 3)/(s) ] / [ (s² + 2 s + 4)/(s + 1) ]                    (15)

By clearing the fractions within the fraction, G3(s) can be expressed in
rational polynomial form

    G3(s) = [ 2 (s + 3)/(s) ] / [ (s² + 2 s + 4)/(s + 1) ] · [ (s)(s + 1) / ((s)(s + 1)) ]
          = 2 (s + 3)(s + 1) / ( (s² + 2 s + 4)(s) )
          = (2 s² + 8 s + 6) / (s³ + 2 s² + 4 s)

Note the middle form above, which can be called factored form.
General Form for a rational polynomial transfer function: The general form for a
rational polynomial transfer function is

    G(s) = (bm s^m + b_{m-1} s^(m-1) + ... + b1 s + b0) / (an s^n + a_{n-1} s^(n-1) + ... + a1 s + a0)   (16)

    m = number of zeros,  n = number of poles

Proper and Strictly Proper Transfer Functions:

  A TF with m ≤ n is said to be proper.

  When m < n the TF is said to be strictly proper.

  Example of a TF that is not proper:

      G4(s) = (2 s² + 5 s + 4)/(s + 1)        note: m = 2,  n = 1

  Such a TF can always be factored by long division:

      G4(s) = (2 s² + 5 s + 4)/(s + 1) = 2 s (s + 1)/(s + 1) + (3 s + 4)/(s + 1)
            = 2 s + (3 s + 4)/(s + 1) = (2 s + 3) + 1/(s + 1)

  A non-proper TF such as G4(s) has a problem: as s = jω, ω → ∞, the gain goes
  to infinity !

  Since physical systems never have infinite gain at infinite frequency,
  physical systems must have proper transfer functions.

1.3.5 Properties of Transfer functions (continued)

System Type: A property of transfer functions that comes up often is the system
type. The type of a system is the number of poles at the origin.

So, for example, the aircraft transfer function from elevator input to pitch
rate gives a type 0 system:

    Ω(s)/Me(s) = (s + 3)/(s² + 4 s + 5)           poles: s = -2 ± 1 j,      type: 0

But the TF from the elevator to the pitch angle gives a type I system:

    θ(s)/Me(s) = (s + 3)/( s (s² + 4 s + 5) )     poles: s = 0, -2 ± 1 j,   type: I

If we put a PID controller in the loop, which adds a pole at the origin, the
system will be type II.
1.3.6 The Impulse Response of a System

When we have a system such as in figure 7, the Laplace transform of the output
(a signal) is given by

    Y(s) = Gp(s) U(s)                                                         (17)

A unit impulse input is a very short pulse with an area under the curve of the
pulse of 1.0.

Since the Laplace transform of a unit impulse is 1, the Laplace transform of the
impulse response is the transfer function

    Yimp(s) = Gp(s) Uimp(s) = Gp(s) · 1 = Gp(s)

Figure 7: The transfer function is the Laplace transform of the impulse response
(u(t): impulse, U(s) = 1; y(t): impulse response, Y(s) = Gp(s)).

The connection between the impulse response and TF can be used to determine the TF

  Apply an impulse and measure the output, y(t). Take the LT and use Gp(s) = Y(s).

The connection between the impulse response and TF helps to understand the
mathematical connection between an LT and a TF.

2 Closing the loop, feedback control

Figure 8: A plant with TF Gp(s) in open and closed loop. Closed loop requires a
sensor.

Feedback is the basic magic of controls. A feedback controller can

  Make an unstable system stable . . . . . . . . . . . . Helicopter autopilot
  Make a stable system unstable . . . . . . . . . . . . Early fly-ball governors
  Make a slow system fast . . . . . . . . . . . . . . . Motor drive, industrial automation
  Make a fast system slow . . . . . . . . . . . . . . . F16 controls, approach / landing mode
  Make an inaccurate system accurate . . . . . . . . . machine tools

The magic comes because closing the loop changes the TF

  Open loop:    Y(s)/U(s) = Gp(s)

  Closed loop:  Y(s) = Gp(s) (R(s) − Y(s)) = Gp(s) R(s) − Gp(s) Y(s)

                Y(s) (1 + Gp(s)) = Gp(s) R(s)

                Y(s)/R(s) = Gp(s) / (1 + Gp(s))                               (18)

                Y(s)/R(s) = Forward Path Gain / (1 + Loop Gain) = Try(s)      (19)
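Eqns (18)-(19) can be reproduced with Matlab's tf() and feedback(); the sketch
below uses the aircraft pitch-rate plant of Eqn (11) as Gp(s).

Gp  = tf([1 3], [1 4 5]);           % Gp(s) = (s+3)/(s^2 + 4 s + 5)
Try = feedback(Gp, 1)               % closed loop Gp/(1 + Gp), unity feedback
step(Gp, Try), legend('Open loop', 'Closed loop')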
Example, wrapping a simple feedback loop around the aircraft dynamics

Figure 9: Simple feedback of aircraft pitch angle.

Figure 10: Block with r(t) as input and θ(t) as output.

    θ(s)/R(s) = Gp(s)/(1 + Gp(s))
              = [ (s + 3)/( s (s² + 4 s + 5) ) ] / [ 1 + (s + 3)/( s (s² + 4 s + 5) ) ]   (20)

Eqn (20) is not in rational polynomial form, so

    θ(s)/R(s) = [ (s + 3)/( s (s² + 4 s + 5) ) ] / [ 1 + (s + 3)/( s (s² + 4 s + 5) ) ]
                 · [ s (s² + 4 s + 5) / ( s (s² + 4 s + 5) ) ]
              = (s + 3) / ( s (s² + 4 s + 5) + (s + 3) )                                  (21)

The closed-loop TF is still not quite in Rat Poly form; here is the final step:

    θ(s)/R(s) = (s + 3) / (s³ + 4 s² + 6 s + 3)                                           (22)

Analyzing the response

Gps = tf([1 3], [1 4 5 0])
Try = tf([1 3], [1 4 6 3])
figure(1), clf
[Y_open, Top]   = step(Gps, 6);
[Y_closed, Tcl] = step(Try, 6);
plot(Top, Y_open, Tcl, Y_closed)
xlabel('t [seconds]');
ylabel('\Omega, pitch-rate')
title('Open- and closed-loop')
text(3, 1.6, 'Open-loop', 'rotation', 45)
text(4, 0.8, 'Closed-loop')
SetLabels(14)
print('-deps2c', 'OpenClosedResponse1')

Figure 11: Open and Closed loop response of the aircraft; the two responses have
very different characteristics.

  The open-loop system is type I

  The closed-loop system is type 0

  The response completely changes !
EE/ME 701: Advanced Linear Systems
Section 2.0.0
e(t)
Kc
Mt(t)
Section 2.1.0
2.1 Analyzing a closed-loop system
Introduce a proportional controller gain
r(t)
EE/ME 701: Advanced Linear Systems
s+3
s2 + 4 s + 5
A typical, basic loop (such as velocity PI control of a motor drive) has 3
components:
(t)
1. Plant (thing being controlled)
Figure 12: Feedback for aircraft pitch control, with P-type gain Kc .
Look at Kc = 1.0, 3.0, 10, 0
Kc = 1; Try1 = tf(Kc*[1 3], [1 4 5
Kc = 3; Try2 = tf(Kc*[1 3], [1 4 5
Kc = 10; Try3 = tf(Kc*[1 3], [1 4 5
figure(1), clf
...
plot(Top, Y_open, Tcl, Y_closed1,
...
2. Controller or compensator (usually a computer, a often PLC for motor
drives)
0]+Kc*[0 0 1 3])
0]+Kc*[0 0 1 3])
0]+Kc*[0 0 1 3])
3. A sensor
r(t)
e(t)
Controller
KcGc(s) =
Nc(s)
Kc D (s)
c
u(t)
Plant
Gp(s) =
y(t)
Np(s)
Dp(s)
Tcl, Y_closed2, Tcl, Y_closed3)
Sensor Dynamics
Ny(s)
Hy(s) = D (s)
y
ys(t)
Open and closedloop response, aircraft
3.5
Figure 14: Basic loop with a plant, compensator and sensor.
lo
op
The TF is given as
pe
2
O
System gets less
stable as Kc
increases
2.5
, pitchrate
System gets much
faster as Kc
increases
Try (s) =
Kc =10
1.5
Kc =3
Kc Gc G p
Y (s)
Forward Path Gain
=
=
R (s)
1 + Loop Gain
1 + Kc Gc G p Hy
1
Closedloop, Kc =1
Often, for the controls engineer the plant, G p (s), is set (e.g., the designer of
a cruise-control does not get to change the engine size).
0.5
0
0
3
t [seconds]
Figure 13: Open and Closed loop response of the aircraft, with Kc = 1.0,
Kc = 3.0, and Kc =10.0 .
Part 1A: Controls Review-By-Example
(Revised: Sep 08, 2012)
Page 19
As a controls engineer, we get to pick Gc (s) and maybe influence Hy (s)
(e.g., by convincing the project manager to spring $$$ for a better sensor).
Part 1A: Controls Review-By-Example
(Revised: Sep 08, 2012)
Page 20
EE/ME 701: Advanced Linear Systems
Section 2.2.0
2.2 Common controller structures:
PI
PID
Proportional-Derivative
Prop.-Integral
Prop.-Int.-Deriv.
Gc (s) =
k p s +ki
s
Section 2.3.0
2.3 Analyzing other loops
PD
Gc (s) = kd s + k p
EE/ME 701: Advanced Linear Systems
Gc (s) =
kd s2 +k p s +ki
s
d(t)
Lead-Lag
Disturbance Filter
Nd
Gd(s) = D
d
(s+z ) (s+z )
Gc = Kc (s+p11 ) (s+p22 )
r(t)
Input Shaping
Hr(s)
Common Applications
PID: Many many places, Astrom has estimated that 80% of controllers are
PID (good, speed accuracy, stability)
Lead-Lag: Used where a pole at the origin is unacceptable, can be as good
as PID (notice, 5 parameters rather than 3)
e(t)
Controller
N
KcGc(s) = Kc Dc
c
uc(t)
Plant
up(t)
Gp(s) =
y(t)
Np
Dp
Sensor Dynamics
Ny
Hy(s) = D
y
ys(t)
PI: Velocity control of motor drives, temperature control (good speed and
accuracy, acceptable stability)
PD: Position control where high accuracy is not required (good speed and
stability, so-so accuracy)
Figure 15: Basic loop with a disturbance input, d (t) , and sensor noise, Vs (t)
added.
In some cases we may want to consider additional inputs and outputs.
Many systems have a disturbance signal that acts on the plant, think of
wind gusts and a helicopter autopilot.
All systems have sensor noise.
Any signal in a system can be considered an output. For example, if we
want to consider the controller effort, uc (t), arising due to the reference
input
Tru (s) =
Hr (s) Kc Gc (s)
Uc (s) Forward Path Gain
=
=
R (s)
1 + Loop Gain
1 + Kc Gc (s) G p (s) Hy (s)
(23)
If we wanted to consider the error arising with a disturbance, we would have
Tde (s) =
Part 1A: Controls Review-By-Example
(Revised: Sep 08, 2012)
Page 21
Gd (s) G p (s) Hy (s) (1)
E (s)
=
D (s) 1 + Kc Gc (s) G p (s) Hy (s)
Part 1A: Controls Review-By-Example
(Revised: Sep 08, 2012)
Vs(t)
(24)
Page 22
EE/ME 701: Advanced Linear Systems
Section 2.3.0
2.3 Analyzing other loops (continued)
As a final example, lets consider the output arising with sensor noise
Tvy (s) =
Y (s) Hy (s) (1) Kc Gc (s) G p (s)
=
Vs (s)
1 + Kc Gc (s) G p (s) Hy (s)
(25)
The example transfer functions, Eqns (23), (24) and (25) show some
interesting properties. The TFs are repeated here (omitting the many (s)s)
Try (s) =
Hr Kc Gc G p
1 + Kc Gc G p Hy
Tru (s) =
Hr Kc Gc
1 + Kc Gc G p Hy
Ted (s) =
Gd G p Hy (1)
1 + Kc Gc G p Hy
Tvy (s) =
Hy (1) Kc Gc G p
1 + Kc Gc G p Hy
EE/ME 701: Advanced Linear Systems
Section 2.3.0
The denominators are all the same

  The poles are the same for any input/output signal pair

  The stability and damping (both determined by the poles) are the same for any
  signal pair

The numerators are different

  The zeros are in general different for each input/output signal pair

  Since the numerators help determine whether a signal is small or large,
  signals may have very different amplitudes and phase angles

If we consider what happens as Kc → ∞, we can see what happens for very high
gain. For this, assume that Hr(s) = 1.0 and Gd(s) = 1.0, since these two terms
merely pre-filter inputs.

When Kc → ∞,  1 + Kc Gc Gp Hy ≈ Kc Gc Gp Hy, so

    Try(s) → Kc Gc Gp / (Kc Gc Gp Hy) = 1/Hy

    Tru(s) → Kc Gc / (Kc Gc Gp Hy) = 1/(Gp Hy)

    Tde(s) → Gd Gp Hy (−1) / (Kc Gc Gp Hy) = −Gd / (Kc Gc)

    Tvy(s) → Hy (−1) Kc Gc Gp / (Kc Gc Gp Hy) = −1

Try(s) → 1/Hy(s) shows that the TF of the plant can be compensated; it
disappears from the closed-loop TF as Kc → ∞.

Try(s) → 1/Hy(s) also shows that the TF of the sensor can not be compensated.
If the characteristics of Hy(s) are bad (e.g., a cheap sensor) there is nothing
feedback control can do about it !

Tde(s) → −1/(Kc Gc) shows that disturbances can be compensated: as Kc → ∞,
errors due to disturbances go to zero ;)

Tru(s) → 1/(Gp Hy) shows that Uc(s) does not go up with Kc, and also, if the
plant has a small gain (Gp(s) is small for some s = jω) then a large control
effort will be required for a given input.

Tvy(s) → −1 shows that there is no compensation for sensor noise. If there is
sensor noise, it is going to show up in the output !
Summary:
Feedback control can solve problems arising with characteristics of the
plant, G p (s), and disturbances, d (t).
Feedback control can not solve problems with the sensor, Hy (s), or
sensor noise, vs (t) .
Part 1A: Controls Review-By-Example
(Revised: Sep 08, 2012)
Page 23
Part 1A: Controls Review-By-Example
(Revised: Sep 08, 2012)
Page 24
EE/ME 701: Advanced Linear Systems
Section 3.0.0
EE/ME 701: Advanced Linear Systems
3 Analysis
Section 4.1.0
4 Working with the pole-zero constellation
4.1 Basics of pole-zero maps, 1st order, p1 =
Analysis is determining system performance based on a model.
You have seen that:
The performance criteria any controls engineer should be aware of are seen
in table 4.
Each real pole is associated with a mode of the system response.
The criteria fall into 3 groups
A real pole gives the terms of Eqn (27), as seen in figure 16.
In EE/ME 574 we will be using all 18 of the criteria listed
Y (s) =
Many performance measures are introduced in first semester controls (we
will review them here). Those with (*) are introduced in 2nd semester
controls.
As seen in table 4, some performance measures are evaluated in the time
domain, and others in the frequency domain.
Amplitude
1
0
(26)
Impulse response
PoleZero Map
()
y (t) = C1 et
2
Imaginary Axis [sec1]
Speed
Degree of Stability Accuracy
Time
Rise Time
Stable / Unstable
Steady-State Error
Domain
Settling Time
Overshoot
ISE ()
IAE ()
Peak Time
Damping Factor
Frequency Pole Locations
Pole Locations
Disturbance Rejection
Noise Rejection ()
Domain
Bandwidth ()
Phase Margin ()
()
()
(Or S-plane) Cross-over Freq.
Gain Margin
Tracking Error ()
Table 4: A range of controller specifications used in industry.
Note: () marks items developed in 2nd semester controls.
3
C1
=
s+2 s+
Faster
h(t) =
t
t/
2 t
e
=e =e
C1 / e
Splane
2
3
2
1
0
1
Real Axis [sec ]
0
0
1
2
Time (secs)
Figure 16: First order pole and impulse response.
Depending on the performance measure, one of four methods of evaluation
is used:
1. Evaluated directly from the transfer function (e.g., steady state error)
2. Evaluated by looking at the system response to a test signal (e.g., trise )
3. Evaluated from the pole-zero constellation (e.g., stability, settling time)
4. Evaluated from the bode plot (e.g., phase margin)
Part 1A: Controls Review-By-Example
(Revised: Sep 08, 2012)
Page 25
A real pole has these characteristics:
y (t) et/ where [sec] = 1/ is the time constant.
Further to the left indicates a faster response (smaller ).
The pole-zero constellation does not show either the KDC or the Krlg of
the TF or the amplitude of the impulse response.
Part 1A: Controls Review-By-Example
(Revised: Sep 08, 2012)
Page 26
EE/ME 701: Advanced Linear Systems
Section 4.1.1
Example 1: Shifting poles to the left accelerates transient decay
And we have seen for second order:
Two example first-order responses are shown.
X Case:
Case:
= 16 [sec1]
Each complex pole pair gives a mode of the system response.
= 1/4 [sec]
A complex pole pair gives the terms of Eqn (27), as seen in figure 19.
= 1/16 [sec]
Using the Laplace transform pair
F (s) =
Bc s + (Bc + Bs )
f (t) = et (Bc cos (t) + Bs sin (t))
s2 + s s + 2n
one finds
Y (s)
Faster Decay
5
15
10
5
0
Real Axis [sec1]
y (t)
Figure 17: A change in pole location changes the decay rate and damping.
2 s + 14
b1 s + b0
=
s2 + 2 s + (2 + 2) s2 + 3 s + 18.25
PoleZero Map
0.5
0.5
2
t
0.5
Time (secs)
0
0
0.5
Time (secs)
6 4 2
0
Real Axis [sec1]
t/
=e
1.5t
=e
0
1
5
8
0
0
Impulse response
3
Splane
System step response
1.5
= 0.06 [sec]
Amplitude
Amplitude
Imaginary Axis [sec ]
X System step response
1.5
= 0.25 [sec]
(27)
A et cos (t + ) = 3.4 e1.5t cos (4t 53.97o) (28)
Amplitude
Imaginary Axis [sec1]
PZ map showing two systems
5
20
Section 4.2.0
4.2 Basis of pole-zero maps, 2nd order, p1,2 = j
4.1.1 Real Pole location and response characteristics
4 [sec1]
EE/ME 701: Advanced Linear Systems
2
0
3.40 cos(4t 54.0o)
2
4
Time (secs)
Figure 19: Second order pole and impulse response.
Figure 18: A change in changes the time constant.
Part 1A: Controls Review-By-Example
(Revised: Sep 08, 2012)
Page 27
Part 1A: Controls Review-By-Example
(Revised: Sep 08, 2012)
Page 28
EE/ME 701: Advanced Linear Systems
Section 4.2.1
4.2.1 Notation for a complex pole pair
*
p1, p1
=1.5 j 4
p1
X Case: 1
Case: 4
[sec1]
PO
4 [rad/sec] 4.12 [rad/sec] 0.25
[Dim less]
44%
4 [rad/sec] 5.66 [rad/sec] 0.71
[Dim less]
4%
PZ map with sgrid (indicates and n)
PZ map showing two systems
1
0
Real Axis [sec1]
Description
Given by
Units
Decay Rate
Damped Nat. Freq.
Natural Freq.
Pole Angle
Damping Factor
p1 = + j
p1 = + j
2n = 2 + 2
= atan2 (, )
= /n
[sec1]
[rad/sec]
[rad/sec]
[deg]
Dimless, []
4
2
Faster Decay
0
2
4
6
8
4 2
0
2
1
Real Axis [sec ]
0.7
2
0
Amplitude
Polar
2
4
0.9
0.7 0.5 0.3
6 4 2
0
2
1
Real Axis [sec ]
6
8
System step response
X System step response
1.5
p1 = + j
p1 = n (90 + )
= n
= sin ()
p
= 1 2 n
= / n
p
n 1 2
H (s) =
H
(s)
=
s2 + 2 n s + 2n
(s + )2 + 2
Table 6: The terms of table 5 relate to rectangular or polar form for the poles.
0.5 0.3
4 0.9
Figure 21: A change in changes the decay rate and damping.
Table 5: Factors derived from the location of a complex pole.
(Note: Franklin et al. often use , d and .)
Rectangular
6
Imaginary Axis [sec1]
Splane
p1
= 0.24
46 % Overshoot
0.5
0
0
1.5
Amplitude
Figure 20: Complex pole pair with n and defined.
or
or d
n
or
[sec1]
6
3
Term
Example 1: Shifting poles to the left accelerates transient decay
Imaginary Axis [sec1]
Imaginary Axis [sec ]
6
4
Section 4.2.2
4.2.2 Complex pole location and response characteristics
A complex pole pair can be expressed in polar or rectangular coordinates:
PZ Map,
EE/ME 701: Advanced Linear Systems
2
4
Time (secs)
= 0.71
4 % Overshoot
0.5
0
0
2
4
Time (secs)
Figure 22: Step response: a change in , is unchanged.
Part 1A: Controls Review-By-Example
(Revised: Sep 08, 2012)
Page 29
Part 1A: Controls Review-By-Example
(Revised: Sep 08, 2012)
Page 30
Section 4.2.2
Example 2: Shifting poles out vertically increases oscillation frequency
X Case: 4 [sec1]
4 [rad/sec]
4%
Case: 4 [sec1] 16 [rad/sec] 16.49 [rad/sec] 0.24 [Dim less] 44%
Faster
Oscillation
0
10
20
0.5 0.3
0.9
20
12
10
0.9
20
20 10
0
10
Real Axis [sec1]
30
0.5
0
0
1
Time (secs)
4%
16 [rad/sec] 22.63 [rad/sec] 0.71
[Dim less]
4%
5.66 [rad/sec]
10
0
10
20
0.7 0.50.3
20
0.9
10
0
28 20 12
= 0.24
46 % Overshoot
0.5
0.9
20
0.7 0.50.3
30 20 10
0
10
1
Real Axis [sec ]
System step response
1.5
= 0.71
4 % Overshoot
0.5
0
0
1
Time (secs)
= 0.71
4 % Overshoot
0.5
= 0.25 [sec]
1
Time (secs)
10
X System step response
0
0
0.71
4 [rad/sec]
20
1.5
Amplitude
1.5
= 0.71
4 % Overshoot
PO
Figure 25: A radial change in pole location changes the decay rate and oscillation
frequency, but not the damping.
System step response
Amplitude
Amplitude
1.5
[Dim less]
30 20 10
0
10
1
Real Axis [sec ]
0.7 0.5 0.3
20 10
0
10
Real Axis [sec1]
Figure 23: A change in changes the oscillation frequency and damping.
X System step response
[sec1]
PZ map showing two systems PZ map with sgrid (indicates and n)
10
28
[sec1]
Case: 16
Imaginary Axis [sec ]
0.7
20
Imaginary Axis [sec1]
Imaginary Axis [sec1]
20
30
X Case:
PZ map with sgrid (indicates and n)
PZ map showing two systems
10
PO
0.71 [Dim less]
5.66 [rad/sec]
Example 3: Shifting poles out radially rescales time
Section 4.2.2
Imaginary Axis [sec ]
EE/ME 701: Advanced Linear Systems
Amplitude
EE/ME 701: Advanced Linear Systems
= 0.06 [sec]
2
0
0
0.2
0.4
Time (secs)
Figure 26: Maintaining , time is rescaled.
Figure 24: Step response: a change in .
Note: The S-plane has units of [sec1 ].
Part 1A: Controls Review-By-Example
(Revised: Sep 08, 2012)
Page 31
Part 1A: Controls Review-By-Example
(Revised: Sep 08, 2012)
Page 32
4.2.3 The damping factor: ζ

Plugging ζ = σ/ωn back into the 2nd order form gives:

    H(s) = b0 / (s² + a1 s + a0)  =  b0 / (s² + 2 ζ ωn s + ωn²)               (29)

giving:

    ζ = a1 / (2 ωn) = a1 / (2 √a0)                                            (30)

ζ is defined by Eqn (30) for either real poles (ζ ≥ 1.0), or a complex pole pair
(0.0 < ζ < 1.0).

A full list of what we can learn from ζ is seen in table 7.

ζ value          System characteristic
ζ < 0            Unstable
ζ = 0            Marginally Stable (poles on imaginary axis)
0 < ζ < 1        Stable Complex Pole Pair
0.5 ≤ ζ < 1.0    Typical value for a well designed controller
ζ = 1            Critical Damping (repeated real poles)
ζ > 1            Over Damped (two separate real poles)

Table 7: Ranges of damping factor.

Figure 28: Damping factor and stability. (S-plane: stable region ζ > 0, unstable
region ζ < 0, marginally stable region ζ = 0.)

As illustrated in figure 24, above, on the range 0.0 < ζ < 1.0, the damping
factor relates to percent overshoot. For a system with two poles and no zeros,
the percent overshoot is given by Eqn (31) and plotted in figure 27

    P.O. = 100 e^( −π ζ / √(1 − ζ²) )                                         (31)

Figure 27: Percent overshoot versus damping factor. Exact for a system with two
complex poles and no zeros, and approximate for other systems.
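Eqn (31) is easy to plot directly; a minimal sketch:

zeta = 0.05:0.01:0.95;
PO   = 100 * exp( -pi*zeta ./ sqrt(1 - zeta.^2) );   % Eqn (31)
plot(zeta, PO), xlabel('Damping factor \zeta [.]'), ylabel('Percent Overshoot')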
4.3 Determining approximate performance measures from a dominant second-order mode

Returning to performance measures, in section 3 we've seen that these are
defined from the step response:

    Rise time        Settling time        Peak time        Overshoot

Figure 29: Quantity definitions in a CL step response (Tr: 10%-90% rise time;
tp and Mp: peak time and peak overshoot; Ts: 98% settling time; yss: the
steady-state value).

There are no equations that give these measures exactly for any system other
than one with 2 poles and no zeros. But for this special case

    T(s) = b0 / (s² + 2 ζ ωn s + ωn²)

we can derive the relations below.

4.3.1 Rise time from pole locations, 10%-90%

    1st order:    tr ≈ 2.2 τ = 2.2/σ

    2nd order:    tr ≈ 1.8/ωn                                                 (32)

4.3.2 Peak time from pole locations

    tp = π/ωd

4.3.3 Settling time from pole locations

Settling time is sometimes defined as the time to approach within 4% or 2% or
even 1% of the steady-state value. These give slightly different definitions of
Ts. The one we will use (corresponding approximately to 2%, since e^(−σ ts) ≈ 0.02
when σ ts ≈ 4) is:

    ts ≈ 4/σ                                                                  (33)
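The approximations above can be compared against values measured from a step
response; the sketch below uses a hypothetical pole location and Matlab's
stepinfo().

sigma = 1.5;  wd = 4;                        % poles at -1.5 +/- 4j (hypothetical)
wn = sqrt(sigma^2 + wd^2);
T  = tf(wn^2, [1 2*sigma wn^2]);             % two poles, no zeros, unity DC gain
approx   = [1.8/wn,  pi/wd,  4/sigma]        % tr, tp, ts from Eqns (32)-(33)
S        = stepinfo(T);
measured = [S.RiseTime, S.PeakTime, S.SettlingTime]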
EE/ME 701: Advanced Linear Systems
Section 5.0.0
5 Design
EE/ME 701: Advanced Linear Systems
Section 5.2.0
5.1 Design methods
The major design methods are:
In some sense, Design is the opposite of Analysis
Root locus
Analysis
Speed, Stability: determine by determining pole locations
Accuracy: increase the system type, check the SSE
Completed
Controller
Design
Performance
Specifications
Frequency response
()
Speed: Bandwidth and Cross-over frequency directly from bode plot
Stability: Phase margin, Gain margin directly from bode plot
Accuracy: tracking accy., disturbance rejection directly from bode plot
Design
State Space methods
Figure 30: Design = Analysis^(-1)
In Analysis, we use mathematical methods to determine performance
specifications from a completed controller design (all structure and
parameters specified).
Design using state-space design methods, check speed, stability and
accuracy from the step response
5.2 Root Locus Design
Devised by Walter R. Evans, 1948 (1920 - 1999)
In Design, we use whatever method works !
Mathematical
W.R. Evans, Control system synthesis by
root locus method, Trans. AIEE, vol. 69,
pp. 66-69, 1950.
Gut feeling
Trial and error
(Amer. Institute of Elec. Engs became
IEEE in 1963)
Calling a colleague with experience
to determine a controller structure and parameters to meet performance
goals.
Part 1A: Controls Review-By-Example
()
(Revised: Sep 08, 2012)
Page 37
Evans was teaching a course in controls,
and a student (now anonymous) asked a
question about an approximation.
Part 1A: Controls Review-By-Example
(Revised: Sep 08, 2012)
Page 38
EE/ME 701: Advanced Linear Systems
Section 6.0.0
6 Summary
EE/ME 701: Advanced Linear Systems
Section 7.0.0
7 Glossary of Acronyms
Dynamic systems are governed by differential equations
LT: Laplace Transform
Every input or output signal of the system has a unique Laplace transform
TF: Transfer Function
For linear systems,
FPG: Forward path gain
The ratio of Laplace transforms, however, does not depend on the signals.
LG: Loop Gain, (also Gl (s))
The ratio depends only on properties of the system. We call it the transfer
function. For example:
RHP: Right-half of the S plane (unstable region)
LHP: Left-half of the S plane (stable region)
    Guy(s) = (b1 s + b0) / (s² + a1 s + a0)                                   (34)
PO: Percent Overshoot
Transfer function gives us many results
The pole locations tells us the stability and damping ratio
We can get approximate values for rise time, settling time and other
performance measures
Control system analysis is the process of determining performance from a
system model and controller design
Control system
* Steady-state error
* Step response
LF: Low frequency (as in LF gain). Frequencies below the cross-over frequency.

HF: High frequency (as in HF gain). Frequencies above the cross-over frequency.
CL: Closed loop
OL: Open loop
We have various tools for evaluating performance, including
* Pole locations
SSE: Steady-state error
* Bode plot
P, PD, PI, PID: Proportional-Integral-Derivative control, basic and very common
controller structures.
Design = Analysis^(-1)
it is the process of determining a controller design given a system model
and performance goals.
The root locus method is one method for controller design.
Part 1A: Controls Review-By-Example
(Revised: Sep 08, 2012)
Page 39
Part 1A: Controls Review-By-Example
(Revised: Sep 08, 2012)
Page 40
EE/ME 701: Advanced Linear Systems
EE/ME 701: Advanced Linear Systems
Building Linear System Models
Contents
1 System modeling,
classical vs. state-variable modeling
1.1
1.2
1.3
1.4
State variable modeling: . . . . . . . . . . . . . . . . . . . . . .
1.2.1
Signals that are not states or inputs . . . . . . . . . . . . .
1.2.2
Writing the n first-order differential equations in standard
form . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Standard form for the linear, time-invariant state-variable
model: . . . . . . . . . . . . . . . . . . . . . . . . . . .
10
Why consider linear state-variable models ?
(when algebraic form is just fine) . . . . . . . . . . . . . . . . .
11
A quick history of state variable modeling . . . . . . . . . . . . .
12
2 Formal properties of systems
21
3.6
Lumped Parameter system . . . . . . . . . . . . . . . . . . . . .
22
3.7
Continuous-time versus sampled (discrete-time) signals and systems 23
3.8
Discrete-time (sampled) signals and systems . . . . . . . . . . . .
24
3.9
Continuous signals and systems, continuity in the mathematical
sense . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
25
3.10 Quantized signal . . . . . . . . . . . . . . . . . . . . . . . . . .
26
3.11 Deterministic Signals and systems . . . . . . . . . . . . . . . . .
27
3.12 A Note on units . . . . . . . . . . . . . . . . . . . . . . . . . . .
27
4 State, revisited
14
4.1.2
4.1.3
For direct application of Superposition and Scaling, state
must be zero . . . . . . . . . . . . . . . . . . . . . . . .
30
Definition of a Linear System considering non-zero state
(e.g., DeCarlo definition 1.8) . . . . . . . . . . . . . . . .
31
Time-invariant system, considering non-zero state
32
. . . .
6 Steps of building a state-variable model (for a linear system)
35
18
7 Examples with state-variable models
37
2.2
Simple example system
3.1
Admissible signal . . . . . . . . . . . . . . . . . . . . . . . . . .
18
3.2
Linear System (without internal variables) . . . . . . . . . . . . .
19
3.3
Time-invariant system (without internal variables) . . . . . . . . .
20
3.4
Causal (non-anticipative) system . . . . . . . . . . . . . . . . . .
21
(Revised: Sep 10, 2012)
4.1.1
30
16
15
3 Classification of Systems and Signals
Modified definition of linearity, considering state . . . . . . . . .
33
A system Maps inputs to outputs . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . .
4.1
28
5 Standard Notation for State Model
2.1
Part 5: Models of Linear Systems
Realizable system . . . . . . . . . . . . . . . . . . . . . . . . . .
Modeling with a single, higher-order differential equation
(sometimes called classical modeling) . . . . . . . . . . . . . . .
1.2.3
3.5
Page 1
7.1
Example 1, an electrical circuit
. . . . . . . . . . . . . . . . . .
37
Building and exercising the circuit model . . . . . . . . .
43
Example 2, a mechanical system . . . . . . . . . . . . . . . . . .
45
7.2.1
53
7.1.1
7.2
Building and exercising the suspension model
Part 5: Models of Linear Systems
. . . . . .
(Revised: Sep 10, 2012)
Page 2
EE/ME 701: Advanced Linear Systems
7.2.2
7.3
7.4
EE/ME 701: Advanced Linear Systems
Conclusions, Quarter Suspension Example . . . . . . . .
59
Example 3, state-variable model from a differential equation . . .
60
7.3.1
Steps of deriving the state-variable model . . . . . . . . .
61
7.3.2
Building and exercising a differential equation model
. .
64
State-variable model and simulation diagram . . . . . . . . . . .
66
8 Some basic operations with state-variable models
8.1
73
Deriving the transfer function from the state-variable model . . . .
73
8.1.1
Interpreting the transfer function
. . . . . . . . . . . . .
74
8.1.2
DC gain of a state-variable model . . . . . . . . . . . . .
77
1 System modeling, classical vs. state-variable modeling

To introduce state modeling, let's first look at an example. Consider the RLC
circuit of figure 1. The parameters and signals of the circuit are:

Parameters   Units                    Signals                          Units
R            [volts/amp]              Vs(t), VR(t), VL(t), VC(t)       [volts]
L            [volts/(amp/sec)]        is(t), iR(t), iL(t), iC(t)       [amps]
C            [amps/(volt/sec)]

Table 1: List of parameters and signals.

Figure 1: RLC Circuit with voltages and currents marked.
1.1 Modeling with a single, higher-order differential equation
(sometimes called classical modeling)

The constituent relations are:

    VR(t) = R iR(t)                                                           (1)

    VL(t) = L diL(t)/dt,     iC(t) = C dVC(t)/dt                              (2)

The continuity constraints are (Kirchhoff's laws for electrical systems):

    VR(t) + VL(t) − Vs(t) = 0,     VC(t) − VL(t) = 0                          (3)

    is(t) − iR(t) = 0,             iR(t) − iL(t) − iC(t) = 0                  (4)

Two approaches to modeling this type of dynamic system

1. Classical modeling, with a single high-order differential equation

2. State-variable modeling

Develop the nth order differential equation; write:

    iR(t) − iL(t) − iC(t) = 0

    (1/R)(Vs(t) − VC(t)) − iL(t) − C V'C(t) = 0

    V'C(t) + (1/RC) VC(t) + (1/C) iL(t) = (1/RC) Vs(t)                        (5)

Where known quantities are on the right and unknowns on the left.

In Equation (5) we have unknown signals VC(t) and iL(t) on the left; we must
eliminate one of them, to have one equation in one unknown.

Take the derivative of Eqn (5) to produce i'L(t), and use the inductor
constituent relation and VC(t) = VL(t) to give:

    V''C(t) + (1/RC) V'C(t) + (1/C) i'L(t) = (1/RC) V's(t)

    V''C(t) + (1/RC) V'C(t) + (1/LC) VC(t) = (1/RC) V's(t)                    (6)

Limitations of classical modeling:

  Complexity goes up quickly in the number of variables: 3rd order takes 4-5X as
  much algebra as 2nd order, 4th order 10-20X more algebra.

  The significance of the initial conditions is unclear: VC(t0), V'C(t0), ...;
  how do we get V'C(t0) ?
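Eqn (6) can be exercised directly as a transfer function from Vs(t) to VC(t);
this is a minimal sketch with hypothetical component values.

R = 1;  L = 0.5;  C = 0.1;                   % hypothetical component values
num = [1/(R*C)  0];                          % (1/RC) s        (right side of (6))
den = [1  1/(R*C)  1/(L*C)];                 % s^2 + s/(RC) + 1/(LC)
Gvc = tf(num, den);
step(Gvc)                                    % VC(t) for a unit step in Vs(t)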
1.2 State variable modeling:

Develop n first order differential equations in terms of states and inputs:

  Determine the state variables of the system; these are the variables that make
  up the initial condition for the system (formal definition and more examples
  coming later).

  For an nth order system, there are n states

  For the RLC circuit, there are two states; we can choose them to be VC(t) and
  iL(t), giving the state vector:

      x(t) = [ VC(t) ;  iL(t) ]

We are working toward the state equation:

    x'(t) = A x(t) + B u(t)                                                   (7)

Starting with differential equations that come directly from the model:

    V'C(t) = (1/C) iC(t)                                                      (8)

    i'L(t) = (1/L) VL(t)                                                      (9)

  Note that the derivative is written on the left, and all signals that
  determine the derivative are on the right.

  All signals on the right must be states, { VC(t), iL(t) }, or inputs, { Vs(t) }.

Notice that in Eqns (8) and (9), the right hand terms include iC(t) and VL(t),
neither of which is a state or an input.

Use the basic modeling equations to re-write the differential equations with
only states and inputs on the right hand side:

    V'C(t) = (1/C) iC(t) = (1/C)(iR(t) − iL(t))
           = (1/C) [ (1/R)(Vs(t) − VC(t)) − iL(t) ]                           (10)

    i'L(t) = (1/L) VL(t) = (1/L) VC(t)                                        (11)

1.2.1 Signals that are not states or inputs

Systems generally have many signals. Table 2 shows 8 signals for this simple
circuit. Only 2 are states, and 1 is an input, so 5 are just signals within the
system.

A Deep Property of state variable modeling: all of the signals in the system
have a unique expression in terms of the states and inputs.

  If no expression exists, the model does not have enough states, and

  If there are multiple possible expressions, the model has redundant states
  that should be removed from the state vector.

Examples of using Eqns (1) - (4) to write other signals in terms of the states,
VC(t) and iL(t), and the input Vs(t):

    VL(t) = VC(t)

    VR(t) = Vs(t) − VL(t) = Vs(t) − VC(t)

    iR(t) = VR(t)/R = (1/R)(Vs(t) − VC(t))
1.2.2 Writing the n first-order differential equations in standard form

The model equations are written with:

  Each state derivative on the left

  The right hand side written with states and inputs as the only signals

    V'C(t) = −(1/RC) VC(t) − (1/C) iL(t) + (1/RC) VS(t)                       (12)

    i'L(t) = (1/L) VC(t)                                                      (13)

Put the model equations in matrix / vector form:

    x'(t) = A x(t) + B u(t)
    y(t)  = C x(t) + D u(t)                                                   (14)

Putting Eqns (12) and (13) in the form of (14)

    [ V'C(t) ]   [ −1/(RC)   −1/C ] [ VC(t) ]   [ 1/(RC) ]
    [ i'L(t) ] = [  1/L       0   ] [ iL(t) ] + [   0    ] Vs(t)              (15)

    y(t) = [ 1  0 ] [ VC(t) ; iL(t) ] + [ 0 ] Vs(t)

1.2.3 Standard form for the linear, time-invariant state-variable model:

Figure 2: Block diagram showing the elements of a state-variable model,
including input u(t), output y(t), state x(t), and model matrices A, B, C, D.

Standard form for state-variable model equations:

    State Equation:     x'(t) = A x(t) + B u(t)                               (16)

    Output Equation:    y(t)  = C x(t) + D u(t)                               (17)

    Initial Condition:  x(t0)

where the state vector is:

    x(t) = [ VC(t) ;  iL(t) ]

Signals                                   Parameters
x(t)    State Vector    [volts; amps]     A    System Matrix
u(t)    Input Vector    [volts]           B    Input Matrix
y(t)    Output Vector   [volts]           C    Output Matrix
                                          D    Feed-forward Matrix

Table 2: List of parameters and signals of the State-Variable Model. Each of the
seven elements has a name, and each has physical units, which depend on the
details of the system.

  States are signals (i.e., functions of time)

  States have units (like all physical signals)
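Eqns (15)-(17) map directly onto Matlab's ss() object; a minimal sketch with
hypothetical component values:

R = 1;  L = 0.5;  C = 0.1;                   % hypothetical component values
A = [ -1/(R*C)  -1/C ;
       1/L       0   ];
B = [ 1/(R*C) ;  0 ];
Cm = [1 0];                                  % output y(t) = VC(t)
D  = 0;
sys = ss(A, B, Cm, D);
step(sys)                                    % VC(t) for a unit step in Vs(t)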
EE/ME 701: Advanced Linear Systems
Section 1.3.0
1.3 Why consider linear state-variable models ? (when algebraic form is just fine)

Many types of systems are modeled with sets of linear equations
- Plants and controllers for control
- Multi-variable statistical analysis (e.g., data analysis)
- Business and economic models
- Signal processing
- And more

In fact, the only models which can be worked out in a general way are linear.
- Many truly nonlinear results are known, but they are applicable only in specific cases.
- Nonlinear systems are often analyzed by linearizing them about an operating point,
  to apply the methods of linear analysis.

Any time there are two or more variables, vectors and matrices become powerful tools
for analysis. Powerful mathematical results for linear systems are applicable to
problems from all domains. !!!! Geometric interpretation !!!!
- Vector spaces bring us notions of size, orthogonality, independence, sufficiency,
  degrees of freedom (DOF), which apply equally well across all domains.

1.4 A quick history of state variable modeling

Prior to the development of state-variable approaches, analytic work focused on
algebraic equations for modeling (controls, design, economics, statistics, etc.).
- Algebraic approaches tended to give specific results without general theory.
  For example, it was not known until publication of Mason's rule in 1953 when
  control system equations could be simplified.
- Earlier algebraic approaches do not extend well to multi-variable systems
  (such as multi-input, multi-output (MIMO) systems).

Many properties of the system are made clear only by the state-variable approach:
- Controllability and Observability
- Coordinate transformations and modal coordinates
- Optimal control (and Kalman filtering)

Interest in state variable modeling for analysis and design of systems accelerated
with the early work of Kalman, Bellman and others.
- Rudolf E. Kalman, "On the General Theory of Control Systems," Proc. 1st Inter.
  Conf. on Automatic Control, IFAC: Moscow, pp. 481-93, 1960.
- Richard E. Bellman, Dynamic Programming, Princeton University Press, 1957.
- Rudolf E. Kalman, "Mathematical Description of Linear Systems," SIAM J. Control,
  vol. 1, pp. 152-192, 1963.
- Vasile M. Popov, "On a New Problem of Stability for Control Systems," Automatic
  Remote Control, 24(1):1-23, 1963.

The computational tools to work with vectors and matrices were being introduced at
about the same time
- Wilkinson, James Hardy, Rounding Errors in Algebraic Processes, Englewood Cliffs:
  Prentice-Hall, 1963.
- George E. Forsythe and Cleve B. Moler, Computer Solution of Linear Algebraic
  Systems, Englewood Cliffs: Prentice-Hall, 1967.
  (Cleve Moler went on to co-found Mathworks (Matlab), and is Mathworks' chief
  scientist today.)

2 Formal properties of systems

Definition: A system is something which maps inputs to outputs.

Figure 3: A System maps inputs to outputs. (Aircraft example. Inputs: Aileron,
Rudder, Elevator, Throttle. Outputs: Pitch, Roll, Yaw, Velocity.)

Analysis: What can we say about how an aircraft responds ?
Design:   How do we engineer the aircraft so that it responds well ?
Control:  How do we pick u(t) ?

Starting Point
- What is a system ?
- What are the properties of a system ?
2.1 A system maps inputs to outputs

Figure 4: Spaces of input and output signals. (The system N{ } maps the space of
all possible input signals u(t) to the space of all possible output signals y(t).)

    y(t) = N[u(t)]                                                                (18)

Examples (describe the components and signals):

    System                        Components                           Signals
    Car Cruise Control            Car, Engine, Throttle                v(t), RPM(t), ...
    Rocket Guidance               Rocket, engine, steering hydraulics  v(t), ...
    Temperature in Oil Refining
    Economy
    Biosphere

2.2 Simple example system

Figure 5: Simple example system with a single state. This system is an RC circuit.

Input Signal, unit step function applied at t = 0:

    u(t) = 1+(t) = { 0,  t < 0 ;  1,  t >= 0 }                                    (19)

Look at two cases:

    Case 1: VC(t = t0) = 0.0 [volts]
    Case 2: VC(t = t0) = 3.0 [volts]

Solving the Diff. Eqn. gives:

    y1(t) = 1 - e^(-t/(R C)),       t > 0
    y2(t) = 1 + 2 e^(-t/(R C)),     t > 0
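The two responses above can be reproduced numerically. This is a minimal sketch,
not from the notes: the time constant R C = 1 sec is an assumed value, and lsim's
fourth argument supplies the initial state.

>> RC = 1;                            %% assumed time constant [sec]
>> sys = ss(-1/RC, 1/RC, 1, 0);       %% single-state RC circuit model
>> t = (0:0.01:1.5)';
>> u = ones(size(t));                 %% unit step input 1+(t)
>> y1 = lsim(sys, u, t, 0);           %% Case 1: VC(t0) = 0 volts
>> y2 = lsim(sys, u, t, 3);           %% Case 2: VC(t0) = 3 volts
>> plot(t, y1, t, y2), xlabel('Time [sec]'), ylabel('y(t)')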
Figure 6: Responses y1(t) and y2(t) of the RC circuit.

So the system response, y(t), depends on two things:
- The input u(t)
- Something inside the system at t = t0, in this case VC(t = t0)

Definition: The state of the system is information about conditions inside the
system at t = t0 that is sufficient and necessary to uniquely determine the
system output.

Write:
    y(t) = N[u(t), x(t0)]                                                         (20)

Some facts about the system state:
- It is equal to the number of independent storage elements in the system
  (such as independent capacitors, inductors, masses and springs)
- The selection of state variables is not unique (more on this and state-variable
  transformations later)
- The number of state variables is equal to the order of the differential equation
  required to describe the system.

3 Classification of Systems and Signals

Signals and systems are characterized by several properties.
These have formal mathematical definitions (See Bay section 1.1).

Table 3: Characteristics of Signals and Systems. Each characteristic applies to a
signal, a system, or both:
    Admissible, Linear, Time Invariant, Causal (Non-anticipative), Realizable,
    Lumped, Continuous, Discrete, Quantized, Deterministic

3.1 Admissible signal

A signal is admissible if it has mathematical properties such that it has a Laplace
transform (all physical signals are admissible). Signal u(t) is admissible if:

i)   it is piecewise continuous (discontinuous at most at a finite number of
     locations in any finite interval)
ii)  there exists t0 such that u(t) = 0 for all t <= t0
iii) u(t) is exponentially bounded (there exists an exponential signal which goes
     to infinity faster than u(t))
iv)  u(t) is a vector of the correct dimension (e.g., for aircraft inputs [aileron,
     rudder, elevator, throttle], u(t) is in R^4).
3.2 Linear System (without internal variables)

System: a system is linear if and only if (iff) it obeys the superposition rule
(which incorporates scaling and homogeneity).

Given:
    y1 = N[u1],    y2 = N[u2]
and
    u3 = c1 u1 + c2 u2
Then:
    y3 = c1 y1 + c2 y2
Or
    N[c1 u1 + c2 u2] = c1 N[u1] + c2 N[u2]

Related:
    Scaling:      N[c1 u1] = c1 y1
    Homogeneity:  N[0] = 0

3.3 Time-invariant system (without internal variables)

A system is time invariant if a time-shifted input gives the time-shifted output.

Given:
    y = N[u]
then
    N[u(t - T)] = y(t - T)

Examples:
- Time Invariant: a boiler. Fuel in gives heat out (independent of clock time t0).
- Not Time Invariant (time varying):
  Example: the economy. Define u(t) = Loans for buying corn seed;
  u(t) = 1 in April is not the same as u(t) = 1 in August.
  Example: flight dynamics
      x'(t) = A(t) x(t) + B(t) u(t)
      y(t)  = C(t) x(t) + D(t) u(t)
  A(t), B(t), C(t) and D(t) depend on altitude and Mach number.
3.4 Causal (non-anticipative) system

For a causal system, the output at time t0, y(t0), is completely determined by the
inputs for t <= t0.
That is: u(t1), t1 > t0, does not influence y(t0).

Causal Example:
    y(t) = 2 u(t - 1)
Non-causal (anticipative) Example:
    y(t) = 2 u(t + 1)

3.5 Realizable system

A realizable system is a physically possible system, one which in principle could
be built. To be realizable, a system must be:
1. Causal
2. y(t) is real (not complex) if u(t) is real

3.6 Lumped Parameter system

A lumped parameter system is characterized by ordinary differential equations
(the coefficients of the differential equation are the lumped parameters).
Distributed systems are characterized by partial differential equations.

A good example of a Lumped Parameter system is an RLC circuit.

Figure 7: An RLC circuit. (Source Vs(t); currents iR, iL, iC; capacitor voltage VC(t).)

When the frequency is not too high, the circuit of figure 7 is characterized by:

    vc''(t) + (1/(R C)) vc'(t) + (1/(L C)) vc(t) = (1/(R C)) vs'(t)               (21)

When the frequency is high enough, wave phenomena become important:
- At some high frequency, lumped parameter models break down
- Partial differential equations and wave propagation are required
- A distributed system model is required.

We will deal exclusively with lumped parameter systems.
All continuous-time systems have a cross-over frequency where distributed phenomena
(often wave phenomena) become important.
3.7 Continuous-time versus sampled (discrete-time) signals and systems

Continuous (time) system: governed by differential equations, such as:

    x'(t) = -2 x(t) + 3 u(t),        y(t) = x(t)                                  (22)

Signals: A continuous signal is defined for all values of time, e.g.,

    u(t) = 0.5 + 0.2 cos(pi t/3 + pi/6) + 0.2 cos(2 pi t/3) + 0.2 cos(2 pi t + pi)

A sampled (discrete) signal is defined only at particular instants, e.g.,

    t(k) = [0.3  0.6  0.9  1.2  1.5  1.8  2.1  2.4]
    u(k) = [0.805  0.297  0.235  0.400  0.128  0.0937  0.5247  0.528]

Figure 8: A continuous and sampled signal.

3.8 Discrete-time (sampled) signals and systems

Signal: A discrete (sampled) signal u(k) is defined (sampled) only at specific
sample instants, t = tk (see figure 8).

System: A discrete (sampled) system is governed by a difference equation, such as:

    x_k = -2 x_{k-1} - 1 x_{k-2} + 3 u_k,        y_k = x_k                        (23)

Usage: generally we say sampled signal and discrete system, though all combinations
are sometimes seen.
- Any computer-based data acquisition results in sampled signals; any
  computer-based signal processor is a discrete system
- Some mathematical results are more straightforward or intuitive (or possible)
  for one time type or the other.
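A difference equation like Eqn (23) is exercised by direct iteration. This is a
minimal sketch, not from the notes: the step input, the zero initial conditions,
and the number of samples N are illustrative assumptions, and the coefficients
follow Eqn (23) as reconstructed above.

N = 20;
u = ones(1, N);                       %% assumed unit-step input u_k
x = zeros(1, N);                      %% assumed initial conditions x_1 = x_2 = 0
for k = 3:N
    x(k) = -2*x(k-1) - 1*x(k-2) + 3*u(k);   %% difference equation (23)
end
y = x;                                %% output equation y_k = x_k
stem(0:N-1, y), xlabel('sample k')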
3.9 Continuous signals and systems, continuity in the mathematical sense

Signal: A signal is continuous if the limit of a sequence of values is the value of
the limit:

    lim_{t -> t1} u(t) = u(t1)                                                    (24)

Said another way: if t -> t1, then u(t) -> u(t1).

System: A system is continuous if, when a sequence of input signals u_i converges
to u*, then the corresponding sequence of outputs converges to the output signal of
the limiting input. That is, if

    lim_{i -> inf} u_i = u*                                                       (25)
then
    N[u*] = lim_{i -> inf} N[u_i]                                                 (26)

3.10 Quantized signal

A signal can be quantized: the signal takes only certain possible values, for
example

    u(t) in {0, 1/4, 1/2, 3/4, 1, 5/4, ...}                                       (27)

Discrete signals are often quantized, but continuous signals can also be.
Example signals with the possible combinations of the characteristics continuous
and discrete and/or quantized are seen in figure 9.

Figure 9: Example signals: a) Continuous-time signal (with a discontinuity),
b) Continuous-time, quantized signal, c) Discrete signal, d) Discrete, quantized
signal.
3.11 Deterministic Signals and systems

Signal: Deterministic signals have no random component. Examples:
- Deterministic signal: u(t) = 1+(t), the ideal voltage on an RC circuit, with
  u(t), VC(t0) given.
- Non-deterministic signal: wind gusts acting on a helicopter.

System: A deterministic system does not introduce random components into the
output signals.
- Deterministic: RLC circuit
- Non-deterministic: economy, biological systems, quantum mechanical systems

3.12 A Note on units

All physical signals have units.
Systems with physical inputs and outputs have units.
The units of the system are [Output] / [Input].

4 State, revisited

We saw above that internal variables partially determine the output of a system.
Look at two cases:

    Case 1: vc(t = t0) = 0 [volts],    y1(t) = 1 - e^(-t/(R C)),     t > 0
    Case 2: vc(t = t0) = 3 [volts],    y2(t) = 1 + 2 e^(-t/(R C)),   t > 0

Figure 10: Simple example system. (RC circuit with input u(t) and output
y(t) = Vc(t); the plot shows the two responses y1(t) and y2(t).)

These internal variables are called states.

Definition: The state of a system at a time t0 is the minimum set of internal
variables which is sufficient to uniquely specify the system outputs given the
input signal over [t0, inf).
Examples of states

    Elementary dynamics:  position and velocity of particles
    Circuits:             voltages across capacitors, currents through inductors
    Fluid system:         rate of flow, levels in tanks
    Economy:              balances in accounts, levels of material in inventories,
                          position of material in transit

Some facts about states:
- The number of states is equal to the order of the differential (or difference)
  equation of the model
- States are often associated with energy-storage elements
- More generally, states are associated with storage of something, where the
  stored amount changes with time, giving:

      d (Amount Stored) / dt = f(x, u, t)

- The definition of the states of a system is not unique; consider analyzing a
  circuit for a voltage, or for a current.
- Keep in mind that the state variables must be independent. Consider the circuit
  of figure 11.

Figure 11: Circuit with inductors in series. This system has only one state, iL1(t).

4.1 Modified definition of linearity, considering state

4.1.1 For direct application of Superposition and Scaling, state must be zero

For superposition and scaling to apply to a system in the simple way, the internal
states must be zero. Consider:

    y(t) = N[u(t), x(t0)] = u(t) + x(t0)                                          (28)

so with u3 = u1 + u2

    y3(t) = u1(t) + u2(t) + x(t0)                                                 (29)

but simple application of superposition would require:

    y3(t) = (u1(t) + x(t0)) + (u2(t) + x(t0)) = u1(t) + u2(t) + 2 x(t0)           (30)

So x(t0) = 0 is required for Eqns (29) and (30) to be consistent.
(0 is the Null or zero vector; it is a vector of zeros.)

Definition: zero state response:

    y(t) = N[u(t), 0] = u(t) + 0                                                  (31)
4.1.2 Definition of a Linear System considering non-zero state (e.g., DeCarlo
definition 1.8)

Let N[u, x(t0)] be the response of system N[ ] to the input signal u( ) defined
over [t0, inf), with initial state x(t0).

Then system N[ ] is linear if and only if, for any two admissible input signals
u1 and u2, for any initial state x(t0) in R^n, and for any scalar k,

    k ( N[u1, x(t0)] - N[u2, x(t0)] ) = N[ k (u1 - u2), 0(t0) ]                   (32)

where 0( ) is the zero vector.

For linear systems, the response can be factored into the response due to the
initial state and the response due to the input. (Student Exercise)

4.1.3 Time-invariant system, considering non-zero state

The definition of a time-invariant system is a bit more complex when state is
considered, because we must account for the time-shifted state.

Definition: A system is time-invariant if, for all t >= t0 there exists x1 in R^n
such that

    NT[ N[u, x(t0)] ] = N[ NT[u], x1(t0 + T) ]                                    (33)

the time-shifted output = output from the time-shifted input, where NT[ ] is the
time delay system:

    NT(u(t)) = u(t - T).                                                          (34)

Interpretation:
Eqn (33) says that there exists a possibly different initial condition x1(t0 + T)
such that the delayed output of the original system with IC x(t0) is identical to
the output of the system with a delayed input and the new IC x1(t0 + T).

Study question:
For a linear time-invariant system, what is the relationship between x(t0)
and x1(t0 + T) ?
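The linearity test of Eqn (32) can be checked numerically with lsim. This is a
minimal sketch, not from the notes: the system matrices, the two inputs, the scalar
k and the initial state are all illustrative assumptions.

>> A = [0 1; -2 -3];  B = [0; 1];  C = [1 0];  D = 0;
>> sys = ss(A, B, C, D);
>> t  = (0:0.01:5)';
>> u1 = sin(t);  u2 = ones(size(t));                    %% two admissible inputs
>> x0 = [1; -1];  k = 2.5;                              %% nonzero x(t0), scalar k
>> lhs = k*( lsim(sys, u1, t, x0) - lsim(sys, u2, t, x0) );   %% k(N[u1,x0] - N[u2,x0])
>> rhs = lsim(sys, k*(u1 - u2), t, [0; 0]);                    %% N[k(u1-u2), 0]
>> max(abs(lhs - rhs))                                  %% ~0 (round-off) for a linear system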
5 Standard Notation for State Model

The notation for a state model depends on its properties: whether it is linear or
nonlinear, time invariant or time varying, continuous or discrete, etc.

Most general continuous case, nonlinear and time varying:

    State equation:    x'(t) = f( x(t), u(t), t )                                 (35)
    Output equation:   y(t)  = g( x(t), u(t), t )                                 (36)

For the nonlinear, time invariant system, time is no longer an argument of f( )
and g( ):

    State equation:    x'(t) = f( x(t), u(t) )                                    (37)
    Output equation:   y(t)  = g( x(t), u(t) )                                    (38)

The linear, time-invariant, continuous-time system we have seen (and will be the
one we most commonly use):

    State equation:    x'(t) = A x(t) + B u(t)                                    (39)
    Output equation:   y(t)  = C x(t) + D u(t)                                    (40)

Figure 12: Configuration of the signals (vectors) and parameters (matrices) of a
state variable model. (n: number of states, m: number of inputs, p: number of
outputs; x(t) is n x 1, u(t) is m x 1, y(t) is p x 1; A is n x n, B is n x m,
C is p x n, D is p x m.)

For the linear, time-varying, continuous-time system we add (t) as an argument to
the model matrices:

    x'(t) = A(t) x(t) + B(t) u(t)                                                 (41)
    y(t)  = C(t) x(t) + D(t) u(t)                                                 (42)

The discrete-time system does not have derivatives; the system equation gives x(k)
at the next sample instant:

    x(k+1) = A x(k) + B u(k)                                                      (43)
    y(k)   = C x(k) + D u(k)                                                      (44)

All models have x(t0) as the IC. Also, see DeCarlo example, Eqn (1.24).

And if the discrete-time system is time-varying, A, B, C, D become functions of
sample:

    x(k+1) = A(k) x(k) + B(k) u(k)                                                (45)
    y(k)   = C(k) x(k) + D(k) u(k)                                                (46)
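The continuous form (39)-(40) and the discrete form (43)-(44) are related by
sampling. A minimal sketch, not from the notes: the matrices and the sample period
Ts are illustrative assumptions, and c2d performs the zero-order-hold conversion.

>> A = [0 1; -2 -3];  B = [0; 1];  C = [1 0];  D = 0;
>> Ts = 0.1;                             %% assumed sample period [sec]
>> sysc = ss(A, B, C, D);                %% continuous-time model, Eqns (39)-(40)
>> sysd = c2d(sysc, Ts, 'zoh');          %% discrete-time model, Eqns (43)-(44)
>> [Ad, Bd, Cd, Dd] = ssdata(sysd);      %% x(k+1) = Ad x(k) + Bd u(k), y(k) = Cd x(k) + Dd u(k)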
6 Steps of building a state-variable model (for a linear system)

1. Write the relevant relations for the system
   (a) Define symbols for the signals and parameters
   (b) Write the equations
       i.  Constituent relations (for elements)
       ii. Continuity constraints (for how elements are linked into a system)
   (c) Record the units, verify that units balance in the equations
       (the equations express laws of physics, so the units must balance)

2. Identify the differential equations
   (a) Determine the system order
       - The system order will almost always be the sum of the orders of the
         contributing differential equations
       - Rarely, differential equations may be inter-dependent in a way that
         reduces the order
   (b) Select the state variables
       - n state variables for an nth order system
       - State variables must be independent
       - The choice is not unique
       - Often the storage variables are a good choice (often called physical
         coordinates)
       - With experience, it is usually pretty straightforward to determine the
         states.

3. Write the differential equations in state-variable form
   (a) Higher order differential equations are written as a chain of first-order
       equations
   (b) Put derivatives on the left-hand side; these must be the state derivatives
   (c) All signals on the right-hand side must be expressed in terms of the states
       and inputs
   (d) Put the model in state-variable form

4. Write the equation of the output signal (or signals) using the states and inputs

5. Check units throughout, to verify correctness.

Essential things to keep in mind:

1. Always distinguish between signals and parameters
   - Signals are functions of time and change when the input (signals) change.
   - Parameters are generally constant (or slowly varying) and are properties of
     the system.
   - Both have physical units

2. Pay special attention to the states
   - Systems have many signals; only n signals are states
   - States correspond to the initial condition needed to determine the output of
     the system
7 Examples with state-variable models

7.1 Example 1, an electrical circuit

Figure 13: Electrical circuit. (Current source Is(t) with voltage VIs(t); R1 and
L1 in parallel between nodes v1(t) and v2(t); C1 and R2 in parallel at node v2(t).
Signals labeled: IR1(t), VR1(t), IL1(t), VL1(t), Ic1(t), Vc1(t), IR2(t), VR2(t).)

1. Write the relevant relations for the system

(a) Define symbols for the signals and parameters

Table 4: Signals and parameters of the electrical circuit.

    Signals                              Parameters
    Is(t)   Sup. current   [Amps]        R1, R2  Resistance    [volts/amp]
    Vs(t)   Sup. voltage   [volts]       C1      Capacitance   [amp-sec/volt]
    VR1(t)  R1 voltage     [volts]       L1      Inductance    [volt-sec/amp]
    VR2(t)  R2 voltage     [volts]
    Vc1(t)  C1 voltage     [volts]
    VL1(t)  L1 voltage     [volts]
    IR1(t)  R1 current     [Amps]
    IR2(t)  R2 current     [Amps]
    Ic1(t)  C1 current     [Amps]
    IL1(t)  L1 current     [Amps]

(b) Write the equations

i. Constituent relations

    VR1(t) = R1 IR1(t),        Ic1(t) = C1 d/dt Vc1(t)
    VR2(t) = R2 IR2(t),        VL1(t) = L1 d/dt IL1(t)

ii. Continuity Constraints

    Kirchhoff's voltage law (sum of the voltages around a loop, + if you enter at
    the + terminal):
        -VIs(t) + VL1(t) + VC1(t) = 0
        Vc1(t) = VR2(t)
        VL1(t) = VR1(t)

    Kirchhoff's current law (sum of the currents entering a node):
        IR1(t) + IL1(t) - Ic1(t) - IR2(t) = 0
        Is(t) - IR1(t) - IL1(t) = 0
        Is(t) - Ic1(t) - IR2(t) = 0

(c) Record the units, verify that units balance in the equations

    Units recorded in table 4. Check:
        Ic1(t) = C1 d/dt Vc1(t):   [amps]  = [amp-sec/volt] [volt/second]
        VL1(t) = L1 d/dt IL1(t):   [volts] = [volt-sec/amp] [amp/second]

2. Identify the differential equations

    Ic1(t) = C1 d/dt Vc1(t)
    VL1(t) = L1 d/dt IL1(t)
(a) Determine the system order

    1 Cap + 1 Inductor:  2nd Order

(b) Select the state variables

    For this RLC circuit there is a clear choice:

        x(t) = [ Vc1(t) ; IL1(t) ]                                                (47)

    Other possible choices:

        x2(t) = [ VR1(t) ; IR2(t) ],     x3(t) = [ VIs(t) ; VR2(t) ]

    Later, we will see how to convert the state model into a state model with any
    of these state vectors. If we wanted a model based on x3(t), it is probably
    easiest to derive the model based on the physical coordinates, Eqn (47), and
    then make a change of basis to transform the model to x3(t).

    Illegal choices:

        x4(t) = [ VR1(t) ; IR1(t) ],     x5(t) = [ Vc1(t) ; VR2(t) ]

    These selections for the states are not independent.

        x6(t) = [ Vc1(t) ; Is(t) ]

    And this one is not allowed because Is(t) is an input.

3. Write the differential equations in state-model form

(a) Higher order differential eqns are written as a chain of first-order eqns.

(b) Put derivatives on the left-hand side; these must be the state derivatives

        d/dt Vc1(t) = (1/C1) Ic1(t)                                               (48)
        d/dt IL1(t) = (1/L1) VL1(t)                                               (49)

    This step shows why Vc1(t) and IL1(t) are natural choices for the states.

(c) All signals on the right-hand side must be expressed in terms of the states
    and inputs

    In Eqn (48) we need to express Ic1(t) in terms of {Vc1(t), IL1(t), Is(t)}.
    This involves using the constituent and continuity equations that describe
    the system.

    From Is(t) - Ic1(t) - IR2(t) = 0          =>   Ic1(t) = Is(t) - IR2(t)
    From Vc1(t) = VR2(t), IR2 = VR2/R2        =>   Ic1(t) = Is(t) - Vc1(t)/R2     (50)

    This is the needed form. For VL1(t):

        VL1(t) = VR1(t) = R1 IR1(t) = R1 (Is(t) - IL1(t))                         (51)

    Using Eqns (50) and (51) in Eqns (48) and (49):

        d/dt Vc1(t) = (1/C1)(Is(t) - Vc1(t)/R2)
                    = -(1/(C1 R2)) Vc1(t) + 0 IL1(t) + (1/C1) Is(t)
        d/dt IL1(t) = (1/L1) R1 (Is(t) - IL1(t))
                    =  0 Vc1(t) - (R1/L1) IL1(t) + (R1/L1) Is(t)
(d) Put the model in state-variable form

    x'(t) = A x(t) + B u(t)

    x'(t) = d/dt [ Vc1(t) ] = [ -1/(C1 R2)     0    ] [ Vc1(t) ]  +  [ 1/C1  ] Is(t)   (52)
                 [ IL1(t) ]   [     0       -R1/L1  ] [ IL1(t) ]     [ R1/L1 ]

    This is the state equation, with

        x(t) = [ Vc1(t) ; IL1(t) ],   u(t) = [ Is(t) ],
        A = [ -1/(C1 R2)  0 ; 0  -R1/L1 ],   B = [ 1/C1 ; R1/L1 ]

    Notice, the state and input are made up of signals, and the system and input
    matrices are made up of parameters.

4. Write the equation of the output signal (or signals) using the states and inputs

    Suppose the output were VR1(t). In Eqn (51) we have already derived that
    VR1(t) = R1 (Is(t) - IL1(t)), so

        y(t) = [ 0   -R1 ] [ Vc1(t) ; IL1(t) ]  +  [ R1 ] Is(t)

        y(t) = C x(t) + D u(t)

    This is the output equation, with

        y(t) = [ VR1(t) ],   C = [ 0   -R1 ],   D = [ R1 ]

    Notice, the output is made up of a signal, and the output and feed-forward
    matrices are made up of parameters.

5. Check units throughout, to verify correctness.

    x(t): [volts ; amps],   x'(t): [volts/sec ; amps/sec]
    The entries of A and B carry units such as [1/sec] and [volt/(amp-sec)], so
    that each row of A x(t) + B u(t) has the units of the corresponding state
    derivative.
    Units check in the state equation.

    Units in the output equation:
        [volts] = [ --  volts/amp ] [volts ; amps] + [volts/amp][amps]
    Units check in the output equation.

    Alternative output: suppose we wanted to determine Vs(t), for example to
    calculate the supply power Ps(t) = Vs(t) Is(t). The output will be Vs(t):

        y(t) = [ Vs(t) ]

    We need to find a way to express Vs(t) in terms of the states and inputs,
    {Vc1(t), IL1(t), Is(t)}. Going back to the original equations,

        Vs(t) = VR1(t) + Vc1(t) = R1 (Is(t) - IL1(t)) + Vc1(t)                    (53)

    This gives Vs(t) in terms of the states and inputs !
For y(t) = [ Vs(t) ]:

    y(t) = C x(t) + D u(t)                                                        (54)
with
    C = [ 1   -R1 ],   D = [ R1 ]                                                 (55)

A deep property of state models: just as with Eqns (54) and (55), we can find a row
for the C and D matrices to give any signal in the system.

7.1.1 Building and exercising the circuit model

To build the model, build the A, B, C and D matrices:

>> %% Set up the parameter values
>> R1 = 100       %% Ohms
>> R2 = 200       %% Ohms
>> L1 = 0.1       %% Henries
>> C1 = 10e-6     %% Farads

%% Build the state equation
>> A = [ -1/(C1*R2), 0; 0, -(R1/L1) ]
A =  -500      0
        0  -1000
>> B = [ 1/C1; R1/L1]
B = 1.0e+05 *
    1.0000
    0.0100

%% Build the output Eqn
>> C = [ 0, -R1]
C =     0  -100
>> D = [ R1 ]
D =   100

Build the Matlab state-variable model object

>> SSmodel = ss(A, B, C, D)
a =        x1     x2
    x1   -500      0
    x2      0  -1000
b =        u1
    x1  1e+05
    x2   1000
c =        x1     x2
    y1      0   -100
d =        u1
    y1    100
Continuous-time model.

Now we can look at poles and zeros

>> Poles = pole(SSmodel)
Poles = -1000
         -500
>> Zeros = zero(SSmodel)
Zeros = -500

Looking at the step and frequency response

>> figure(1), clf; step(SSmodel); print('-deps2c', 'CircuitAStepResponse')
>> figure(2), clf; bode(SSmodel); print('-deps2c', 'CircuitAFreqResponse')

Figure 14: Circuit step and frequency response.
7.2 Example 2, a mechanical system

A Quarter Suspension is shown in figure 15. The vehicle is moving across a road,
with the surface profile given by r(t). It is desired to model the system and
determine the response zv(t) to the road profile.

Figure 15: Quarter vehicle suspension. (Vehicle mass m2 at height zv(t) rides on
suspension stiffness ks and damping bs above wheel mass m1 at height zw(t); tire
stiffness kw contacts the road surface r(t); heights measured from an inertial
reference.)

Table 5: Parameters of the mechanical quarter suspension.

    m1   wheel mass            [kg]
    m2   1/4 vehicle mass      [kg]
    kw   tire stiffness        [Newtons/meter]
    ks   Suspension stiffness  [Newtons/meter]
    bs   Suspension damping    [Newtons/(meter/sec)]

Table 6: Signals of the mechanical quarter suspension.

    zv(t)   Height of vehicle       [m]
    zw(t)   Height for tire center  [m]
    r(t)    Height of road surface  [m]

1. Write the relevant relations for the system

(a) Define symbols for the signals and parameters
    The parameters are listed in table 5. Signals are listed in table 6.

(b) Write the equations

    i. Constituent relations (for elements)
    The constituent relations for mechanical systems are:

        Newton's 2nd law:   F(t) = m x''(t)                                       (56)
        Hooke's law:        F(t) = k ( x1(t) - x2(t) )                            (57)
        Damper Eqn:         F(t) = b ( x1'(t) - x2'(t) )                          (58)

    ii. Continuity constraints (for how elements are linked into a system)
        Sum of the forces acting on any free body is zero.
        Sum of the velocities around any loop is zero.

(c) Record the units, verify that units balance in the equations
    See tables 5 and 6.

2. Identify the differential equations
    The differential Eqns come from Eqns (56) and (58).
(a) Determine the system order

    We get two 2nd order differential equations:

        m1 zw''(t) = Forces on wheel
        m2 zv''(t) = Forces on 1/4 vehicle

    The system will be 4th order. From the free-body diagrams:

        m1 zw''(t) = kw (r(t) - zw(t)) - ks (zw(t) - zv(t)) - bs (zw'(t) - zv'(t))   (59)
        m2 zv''(t) = +ks (zw(t) - zv(t)) + bs (zw'(t) - zv'(t))                      (60)

(b) Select the state variables

    The natural physical coordinates are position and velocity of each mass:

        x(t) = [ zw(t) ; zw'(t) ; zv(t) ; zv'(t) ]    [m ; m/sec ; m ; m/sec]        (61)

    We need 4 state variables; we have two 2nd order differential equations.
    (The system order will almost always be the sum of the orders of the
    contributing differential equations.)

3. Write the differential equations in state-variable form

    Model equations (repeated):

        m1 zw''(t) = kw (r(t) - zw(t)) - ks (zw(t) - zv(t)) - bs (zw'(t) - zv'(t))
        m2 zv''(t) = +ks (zw(t) - zv(t)) + bs (zw'(t) - zv'(t))

    Let's look at one of the 2nd order terms first.

        x'(t) = d/dt [ zw(t) ; zw'(t) ; zv(t) ; zv'(t) ]
              = [ zw'(t) ; zw''(t) ; zv'(t) ; zv''(t) ]                              (62)

    For the zw''(t) row we must write:

        zw''(t) = [ __  __  __  __ ] [ zw ; zw' ; zv ; zv' ] + [ __ ] r(t)           (63)

    Examining Eqn (59):

        zw''(t) = [ -(kw+ks)/m1   -bs/m1   +ks/m1   +bs/m1 ] [ zw ; zw' ; zv ; zv' ]
                  + [ kw/m1 ] r(t)                                                   (64)

    Likewise, from Eqn (60):

        zv''(t) = [ +ks/m2   +bs/m2   -ks/m2   -bs/m2 ] [ zw ; zw' ; zv ; zv' ]
                  + [ 0 ] r(t)                                                       (65)
Eqns (64) and (65) give two of the rows of the A matrix.

Looking back at x'(t),

    x'(t) = d/dt [ zw ; zw' ; zv ; zv' ] = [ zw' ; zw'' ; zv' ; zv'' ]

For the first row we must write:

    zw'(t) = [ __  __  __  __ ] [ zw ; zw' ; zv ; zv' ] + [ __ ] r(t)

zw'(t) is an element of the state vector; we just hook it up !

    zw'(t) = [ 0  1  0  0 ] [ zw ; zw' ; zv ; zv' ] + [ 0 ] r(t)

Likewise

    zv'(t) = [ 0  0  0  1 ] [ zw ; zw' ; zv ; zv' ] + [ 0 ] r(t)

Putting the pieces together

    x'(t) = A x(t) + B u(t)

    x'(t) = [      0            1         0         0     ] [ zw  ]   [   0   ]
            [ -(kw+ks)/m1    -bs/m1    +ks/m1    +bs/m1    ] [ zw' ] + [ kw/m1 ] r(t)   (66)
            [      0            0         0         1     ] [ zv  ]   [   0   ]
            [  +ks/m2        +bs/m2    -ks/m2    -bs/m2    ] [ zv' ]   [   0   ]

4. Write the equation of the output signal (or signals) using the states and inputs

    If the output of interest is the vehicle response:

        y1(t) = zv(t) = [ 0  0  1  0 ] [ zw ; zw' ; zv ; zv' ] + [ 0 ] r(t)             (67)

    Suppose the desired output is the force in the suspension spring. Since

        Fs(t) = ks ( zw(t) - zv(t) )

    the output equation is given as:

        y2(t) = Fs(t) = [ ks  0  -ks  0 ] [ zw ; zw' ; zv ; zv' ] + [ 0 ] r(t)          (68)
Suppose the desired output is the road force on the tire. Since

    Fw(t) = kw ( r(t) - zw(t) )

the output equation is given as:

    y3(t) = Fw(t) = [ -kw  0  0  0 ] [ zw ; zw' ; zv ; zv' ] + [ kw ] r(t)              (69)

Now suppose the desired output is all three:

    y(t) = C x(t) + D u(t)

    y(t) = [ zv(t) ]   [   0   0    1   0 ] [ zw  ]   [  0 ]
           [ Fs(t) ] = [  ks   0  -ks   0 ] [ zw' ] + [  0 ] r(t)                       (70)
           [ Fw(t) ]   [ -kw   0    0   0 ] [ zv  ]   [ kw ]
                                            [ zv' ]

5. Check units throughout, to verify correctness.

    Noting that 1.0 [Newton] = 1.0 [kg-m/sec^2]:
        kw/m1 : [kg-m/sec^2][1/m][1/kg]     = [sec^-2]
        bs/m2 : [kg-m/sec^2][1/(m/s)][1/kg] = [sec^-1]

    Thus each row of the state equation balances: the left-hand side
    [m/sec ; m/sec^2 ; m/sec ; m/sec^2] matches the units produced by A x(t) and
    B r(t) on the right.

    Units check !
7.2.1 Building and exercising the suspension model

Build the model

>> kw = 10000     %% N/m
>> ks = 2500      %% N/m
>> bs = 10000     %% N/m/s
>> m1 = 25        %% kg
>> m2 = 250       %% kg

>> A = [ 0            1       0       0
        -(kw+ks)/m1  -bs/m1   ks/m1   bs/m1
         0            0       0       1
         ks/m2        bs/m2  -ks/m2  -bs/m2 ]
>> B = [ 0; kw/m1; 0 ; 0]
>> C1 = [ 0, 0, 1, 0];   D1 = [0]
>> SSmodel2a = ss(A, B, C1, D1)
a =        x1    x2    x3    x4
    x1      0     1     0     0
    x2   -500  -400   100   400
    x3      0     0     0     1
    x4     10    40   -10   -40
b =        u1
    x1      0
    x2    400
    x3      0
    x4      0
c =        x1    x2    x3    x4
    y1      0     0     1     0
d =        u1
    y1      0
Continuous-time model.

Examine the poles and zeros and response

>> Poles = pole(SSmodel2a)
Poles = -438.92              %% Fast mode
          -0.41 + 6.00i      %% Lightly damped mode
          -0.41 - 6.00i
          -0.25              %% Slow mode
>> Zeros = zero(SSmodel2a)
Zeros = -0.2500

Figure 16: Suspension step and frequency response.

The shock absorber is not doing its job !

Examine the damping of the modes

>> [Wn, rho] = damp(SSmodel2a)
Wn =    0.2516
        6.0186
        6.0186
      438.9211
rho =   1.0000
        0.0687
        0.0687
        1.0000

Mode with damping of 0.07 is too lightly damped.
Modify the parameters

%% 2nd parameter set
>> kw = 10000     %% N/m
>> ks = 1000      %% N/m     %% Softer spring
>> bs = 1000      %% N/m/s   %% Softer shock absorber
>> m1 = 25        %% kg
>> m2 = 250       %% kg

>> A = [ 0            1       0       0
        -(kw+ks)/m1  -bs/m1   ks/m1   bs/m1
         0            0       0       1
         ks/m2        bs/m2  -ks/m2  -bs/m2 ]
>> B = [ 0; kw/m1; 0 ; 0]
>> C1 = [ 0, 0, 1, 0];  D1 = [0]
>> SSmodel2b = ss(A, B, C1, D1)
a =        x1    x2    x3    x4
    x1      0     1     0     0
    x2   -440   -40    40    40
    x3      0     0     0     1
    x4      4     4    -4    -4
b =        u1
    x1      0
    x2    400
    x3      0
    x4      0
c =        x1    x2    x3    x4
    y1      0     0     1     0
d =        u1
    y1      0
Continuous-time model.

Examine the poles, zeros and damping

>> Poles = pole(SSmodel2b)
Poles = -31.4478                 %% Fast Mode
         -5.4731 + 1.3136i       %% Oscillatory mode, now alpha > omega
         -5.4731 - 1.3136i
         -1.6059                 %% Slow mode
>> Zeros = zero(SSmodel2b)
Zeros = -1.0000

The slow mode is faster, the fast mode is much slower, and the oscillatory mode is
better damped.

Compute the damping factors

>> [Wn, Z] = damp(SSmodel2b)
Wn =  1.6059
      5.6286
      5.6286
     31.4478
Z =   1.0000
      0.9724
      0.9724
      1.0000

Damping factor of 0.97 is much better.

Examine the step and frequency response

Figure 17: Suspension step and frequency response.
Let's consider the Transfer Function

>> TFmodel2b = tf(SSmodel2b)
Transfer function:
              1600 s + 1600
    --------------------------------------
    s^4 + 44 s^3 + 444 s^2 + 1600 s + 1600

Building the state-variable model and computing the transfer function may be the
easiest way to get the TF of a complex system.

Let's consider multiple outputs, from Eqn (70). The C and D matrices change.

Output Eqn: y(t) = C x(t) + D u(t)

>> C2c = [ 0    0    1   0
           ks   0   -ks  0
          -kw   0    0   0 ];
>> D2c = [ 0; 0; kw]
>> SSmodel2c = ss(A, B, C2c, D2c)
a =  (unchanged, as above)
b =  (unchanged, as above)
c =         x1     x2     x3    x4
    y1       0      0      1     0
    y2    1000      0  -1000     0
    y3  -1e+04      0      0     0
d =         u1
    y1       0
    y2       0
    y3   1e+04
Continuous-time model.

The model now has 3 outputs.

Figure 18: Block diagram of quarter suspension system reflecting 3 outputs.
(Input r(t); outputs zv(t), Fs(t), Fw(t); n: states, m: inputs, p: outputs.)

Examine the transfer function and step response

>> TFmodel2c = tf(SSmodel2c)
Transfer function from input to output...
                  1600 s + 1600
    #1: --------------------------------------
        s^4 + 44 s^3 + 444 s^2 + 1600 s + 1600

          4e05 s^2 - 3.207e-10 s + 2.754e-09
    #2: --------------------------------------
        s^4 + 44 s^3 + 444 s^2 + 1600 s + 1600

        10000 s^4 + 4.4e05 s^3 + 4.4e05 s^2 + 6.487e-10 s + 1.913e-10
    #3: -------------------------------------------------------------
        s^4 + 44 s^3 + 444 s^2 + 1600 s + 1600

Each input/output pair (in this case 1x3) gives a TF.
- The zeros and gain are different between the TFs
- The poles are the same

Note: the coefficients of 10^-9 to 10^-10 in TFs 2 and 3 are due to round-off
error. These TFs have a double zero at the origin.
>> figure(1), clf;
>> step(SSmodel2c);
>> print('-deps2c', 'Mechanism2StepResponsec')

Figure 19: Step response shown with 3 outputs.

7.2.2 Conclusions, Quarter Suspension Example

It is straightforward to
- test various parameter configurations
- obtain a transfer function

A state-variable model naturally represents systems with multiple outputs.

7.3 Example 3, state-variable model from a differential equation

Given a differential equation, such as

    a3 y'''(t) + a2 y''(t) + a1 y'(t) + a0 y(t) = b2 u''(t) + b1 u'(t) + b0 u(t)    (71)

we can immediately write down the TF model

    T(s) = ( b2 s^2 + b1 s + b0 ) / ( a3 s^3 + a2 s^2 + a1 s + a0 )                 (72)

How do we construct a state-variable model ?

A. In Matlab

>> num = [b2, b1, b0]; den = [a3, a2, a1, a0]
>> TFmodel = tf(num, den);
>> SSmodel = ss(TFmodel)

B. Put together the state-variable model in one of the canonical forms.

There are 4 canonical forms. The canonical forms have a strong relationship to
properties called controllability and observability, and we will see all four
(cf. Bay chapter 8). Here we consider Controllable Canonical Form
(cf. Bay section 1.1.3).

First, the Diff Eq must be monic. That means that the a_n coefficient is 1.0.
Follow the steps to put together the model. Divide through Eqn (71) (or 72) by a3:

    1.0 y'''(t) + a2 y''(t) + a1 y'(t) + a0 y(t) = b2 u''(t) + b1 u'(t) + b0 u(t)   (73)

with a2 = a2/a3, a1 = a1/a3, ..., b2 = b2/a3, ..., b0 = b0/a3 (the normalized
coefficients).
7.3.1 Steps of deriving the state-variable model

1. Write the relevant relations for the system

(a) Define symbols for the signals and parameters.

Table 7: Signals and Parameters for the Differential Equation.

    Signals                       Parameters
    y(t)  Output  [y]             a2 Coefficient [1/sec]     b2 Coefficient [y/(u sec)]
    u(t)  Input   [u]             a1 Coefficient [1/sec^2]   b1 Coefficient [y/(u sec^2)]
                                  a0 Coefficient [1/sec^3]   b0 Coefficient [y/(u sec^3)]

    Units result from dividing through by a3, which has units of sec^3.

(b) Write the equations

To model a differential equation in Controllable Canonical form, break the TF into
two components, with the denominator first.

Figure 20: Transfer function as two components. (a) Original TF:
u(t) -> B(s)/A(s) -> y(t). (b) TF in two components:
u(t) -> 1/A(s) -> z(t) -> B(s) -> y(t).

The breakdown of figure 20 gives two equations:

    1.0 z'''(t) + a2 z''(t) + a1 z'(t) + a0 z(t) = 1.0 u(t)                       (74)
    y(t) = b2 z''(t) + b1 z'(t) + b0 z(t)                                         (75)

(c) Record the units, verify that units balance in the equations
    See table 7.

2. Identify the differential equations

(a) Determine the system order:  3rd order

(b) Select the state variables
    Considering Eqn (74), select the state variables to be the variable z(t) and
    its derivatives up to the (n-1) derivative:

        x(t) = [ x1(t) ; x2(t) ; x3(t) ] = [ z(t) ; z'(t) ; z''(t) ]              (76)

    The state vector is said to be in phase-variable form.

3. Write the differential equations in state-variable form

    The state-derivative vector includes the highest derivative of z(t).
    The derivative of each of the phase variables is simply its successor.
    The derivative of the last phase variable is given by Eqn (74), re-written as
    Eqn (77) (cf. Bay Eqn (1.25)):

        1.0 z'''(t) = -a2 z''(t) - a1 z'(t) - a0 z(t) + 1.0 u(t)                  (77)

        x'(t) = [ z'(t)   ]   [  0    1    0  ] [ z(t)   ]   [ 0 ]
                [ z''(t)  ] = [  0    0    1  ] [ z'(t)  ] + [ 0 ] u(t)           (78)
                [ z'''(t) ]   [ -a0  -a1  -a2 ] [ z''(t) ]   [ 1 ]
4. Write the equation of the output signal (or signals) using the states and inputs

    y(t) = b2 z''(t) + b1 z'(t) + b0 z(t)                                   (75, repeated)

    y(t) = [ b0   b1   b2 ] [ z(t) ; z'(t) ; z''(t) ] + [ 0 ] u(t)                (79)

(Note: Bay's development includes a b3 term. Eqn (1.25) shows how the b3 term is
incorporated.)

5. Check units throughout, to verify correctness.

    From Eqn (74), z(t) has units of [u], so the phase-variable state vector has
    units

        x(t) : [ u ; u sec^-1 ; u sec^-2 ]

    Looking at table 7, the entries of A, B, C and D carry the units needed so
    that each row of x'(t) = A x(t) + B u(t) has the units of the corresponding
    state derivative, and y(t) = C x(t) + D u(t) has units of [y].
    Units check.

7.3.2 Building and exercising a differential equation model

Build the model

>> a2 = 5; a1 = 7; a0 = 8;
>> b2 = -2; b1 = 3; b0 = 4;
>> A = [  0    1    0 ;
          0    0    1 ;
        -a0, -a1, -a2];
>> B = [ 0; 0; 1];
>> C = [ b0, b1, b2];   D = 0;
>> SSmodel3a = ss(A, B, C, D)
a =        x1    x2    x3
    x1      0     1     0
    x2      0     0     1
    x3     -8    -7    -5
b =        u1
    x1      0
    x2      0
    x3      1
c =        x1    x2    x3
    y1      4     3    -2
d =        u1
    y1      0

>> Poles = pole(SSmodel3a)         >> Zeros = zero(SSmodel3a)
Poles = -0.6547 + 1.3187i          Zeros =  2.3508
        -0.6547 - 1.3187i                  -0.8508
        -3.6906

Looking at the pole-zero map (constellation)

>> figure(1), pzmap(SSmodel3a);

Figure 21: PZ map of the state-variable model.

Look at the step and frequency response

>> figure(2), clf; step(SSmodel3a);
>> figure(3), clf; bode(SSmodel3a);

Figure 22: Circuit step and frequency response.

7.4 State-variable model and simulation diagram

Analog computers include integrators, gain blocks and summing junctions.

Figure 23: Simulation diagram for system in controllable canonical form.
(Controllable Canonical Form, Notation follows Bay: a chain of integrators
x3' -> x3 -> x2 -> x1, feedback gains -a2, -a1, -a0 with a3 = 1, and feed-forward
gains b3, b2, b1, b0 forming y(t).)
Integrators, gain blocks and summing junctions are built with Op-Amps.

Integrators:

Figure 24: An Op-Amp integrator. (Input Vin(t) through R1 to the inverting node
V-(t); feedback capacitor C1 from the output Vo(t) to V-(t); V+(t) wired to ground.)

Op Amp gain is very high (10^5 .. 10^7), and the circuit is configured with
negative feedback, so

    V-(t) ~ V+(t)

And in the circuit of figure 24, V+(t) is wired to ground. So node V- becomes a
virtual ground:

    V-(t) ~ 0

Op Amp input currents are very small (10^-9 .. 10^-12 amps), so

    IR1(t) + Ic1(t) = 0                                                           (80)

Eqn (80) gives

    (1/R1) ( Vin(t) - V-(t) ) + C1 d/dt ( Vo(t) - V-(t) ) = 0                     (81)

With the properties of the virtual ground, Eqn (81) becomes

    (1/R1) Vin(t) + C1 d/dt Vo(t) = 0 ,    so    d/dt Vo(t) = -(1/(R1 C1)) Vin(t)

Giving

    Vo(t) = -(1/(R1 C1)) Integral_{t0}^{t} Vin(tau) d tau                         (82)

Gain Blocks:

Figure 25: An Op-Amp gain block. (Input Vin(t) through Ra to the inverting node;
feedback resistor Rf; output Vo(t).)

An Op-Amp gain block is shown in figure 25. Using the principles of virtual ground
and low input current,

    IRa(t) + IRf(t) = 0
    V0(t) = -(Rf/Ra) Vin(t)                                                       (83)

Summing junctions:

Figure 26: An Op-Amp summing junction. (Inputs Va(t) through Ra and Vb(t) through
Rb into the inverting node; feedback resistor Rf; output Vo(t).)

The Op-Amp virtual ground configuration sums the currents at the V- node, giving

    V0(t) = -(Rf/Ra) Va(t) - (Rf/Rb) Vb(t) + ...                                  (84)
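The integrator of Eqn (82) is easy to exercise numerically. A minimal sketch, not
from the notes: the component values and the 1-volt step input are illustrative
assumptions; the transfer function -1/(R1 C1 s) is the Laplace form of Eqn (82).

>> R1 = 100e3;  C1 = 1e-6;                 %% assumed 100 kOhm, 1 uF, so R1*C1 = 0.1 sec
>> Gint = tf(-1/(R1*C1), [1 0]);           %% Vo(s)/Vin(s) = -1/(R1 C1 s)
>> t = (0:0.001:0.5)';
>> Vin = ones(size(t));                    %% 1 volt step input
>> Vo = lsim(Gint, Vin, t);                %% ramps to -5 volts at t = 0.5 sec
>> plot(t, Vo), xlabel('Time [sec]'), ylabel('Vo(t) [volts]')

Note the sign: the simulated output ramps negative, consistent with the inversion
discussed next.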
A note on the inversions. Recall

    Vo(t) = -(1/(R1 C1)) Integral_{t0}^{t} Vin(tau) d tau               (82, repeated)

In each of Eqns (82), (83) and (84) the output voltage is inverted relative to the
input voltage. This is an inherent property of the virtual ground configuration.

In analog computers, either
1. Introduce - signs as needed, and invert signals as needed with a gain block,
   g = -1, or
2. Include a second Op-Amp in each element (Integrator, Gain block, Summing
   junction), so that the block is non-inverting.

Returning to the simulation diagram, we can write down the state model directly
from the simulation diagram (and vice-versa).

Figure 27: Simulation diagram for system in controllable canonical form.
(Controllable Canonical Form, Notation follows Bay)

The output of each integrator is a state.

1. Write the relevant relations for the system

(a) Define symbols for the signals and parameters

Table 8: Signals and parameters of the simulation diagram.

    Signals                              Parameters
    u(t)                    Input        b0..b3  Numerator Coefficients
    y(t)                    Output       a0..a2  Denominator Coefficients
    x1(t), x2(t), x3(t)     States
Figure 28: Simulation diagram for system in controllable canonical form.
(Controllable Canonical Form, Notation follows Bay)

(b) Write the equations

Examining the block diagram we may write:

    x1'(t) = x2(t)                                                                (85)
    x2'(t) = x3(t)                                                                (86)
    x3'(t) = -a2 x3(t) - a1 x2(t) - a0 x1(t) + u(t)                               (87)
    y(t)   = b3 ( -a2 x3(t) - a1 x2(t) - a0 x1(t) + u(t) )
             + b2 x3(t) + b1 x2(t) + b0 x1(t)                                     (88)

2. Identify the differential equations

(a) Determine the system order:  3rd order

(b) Select the state variables
    The physical coordinates of the system are the integrator outputs. These are
    voltages we can observe on an oscilloscope.

        x(t) = [ x1(t) ; x2(t) ; x3(t) ]

Figure 29: Simulation diagram for system in controllable canonical form.
(Controllable Canonical Form, Notation follows Bay)

3. Write the differential equations in state-variable form

    Directly transcribe from (or to !) the block diagram:

        x'(t) = [ x1'(t) ]   [  0    1    0  ] [ x1(t) ]   [ 0 ]
                [ x2'(t) ] = [  0    0    1  ] [ x2(t) ] + [ 0 ] u(t)             (89)
                [ x3'(t) ]   [ -a0  -a1  -a2 ] [ x3(t) ]   [ 1 ]

4. Write the equation of the output signal (or signals) using the states and
   inputs (cf. Bay Eqn (1.25)):

        y3(t) = b3 ( u(t) - a0 x1(t) - a1 x2(t) - a2 x3(t) )                      (90)

        y(t) = [ (b0 - b3 a0)   (b1 - b3 a1)   (b2 - b3 a2) ] [ x1(t) ; x2(t) ; x3(t) ]
               + [ b3 ] u(t)                                                      (91)

5. Check units throughout, to verify correctness.
    See example 3.
8 Some basic operations with state-variable models

8.1 Deriving the transfer function from the state-variable model

It is straightforward to derive the transfer function from a state-variable model.

Starting with the state equation

    x'(t) = A x(t) + B u(t)
    y(t)  = C x(t) + D u(t)

Considering the Laplace transform of the state equation:

    s X(s) = A X(s) + B U(s)                                                      (92)
    Y(s)   = C X(s) + D U(s)                                                      (93)

Eqn (92) leads to:

    (s I - A) X(s) = B U(s)
or
    X(s) = (s I - A)^-1 B U(s)                                                    (94)

With (94), Eqn (93) leads to:

    Y(s) = C (s I - A)^-1 B U(s) + D U(s)                                         (95)

If m = 1 (one input) and p = 1 (one output), then Eqn (95) gives the transfer
function:

    Y(s)/U(s) = C (s I - A)^-1 B + D                                              (96)

8.1.1 Interpreting the transfer function

    Y(s)/U(s) = C (s I - A)^-1 B + D                                    (96, repeated)

Eqn (96) can be solved symbolically by Cramer's rule, to give the symbolic transfer
function. Recall from basic linear algebra that Cramer's rule gives the matrix
inverse as:

    U^-1 = (1 / det U) adj U                                                      (97)

where U is an n x n matrix, and adj U is the adjugate of matrix U. Defining

    V_{i,j} = (-1)^{i+j} M_{i,j}                                                  (98)
then
    adj U = V^T                                                                   (99)

M_{i,j} is the i,j-th minor of matrix U and is the determinant of the matrix formed
by removing the ith row and jth column from U.

Examples:

    U2 = [ a  b ]            adj U2 = [  d  -b ]
         [ c  d ]                     [ -c   a ]

    U3 = [ a  b  c ]         adj U3 = [ +det[e f; h i]   -det[b c; h i]   ... ]
         [ d  e  f ]                  [       ...              ...        ... ]
         [ g  h  i ]
Using Cramer's rule, we can symbolically solve Eqn (96). Since

    (s I - A)^-1 = adj(s I - A) / det(s I - A)                                    (100)

it follows that

    Y(s)/U(s) = ( 1 / det(s I - A) ) [ C adj(s I - A) B ] + D                     (101)

Example

    A = [ -2   3 ]      B = [ 2 ]      C = [ 5   6 ]      D = [0]
        [  0  -5 ]          [ 3 ]

Then

    det(s I - A) = det [ (s+2)   -3   ]  = (s+2)(s+5) - 0 = s^2 + 7 s + 10
                       [   0    (s+5) ]

    adj(s I - A) = adj [ (s+2)   -3   ]  = [ (s+5)    3   ]
                       [   0    (s+5) ]    [   0    (s+2) ]

So the TF is given by:

    Y(s)/U(s) = ( 1/(s^2 + 7 s + 10) ) [ 5  6 ] [ (s+5)    3   ] [ 2 ] + [0]      (102)
                                                [   0    (s+2) ] [ 3 ]

Multiplying out

    Y(s)/U(s) = ( 1/(s^2 + 7 s + 10) ) [ 5  6 ] [ 2 s + 19 ]  =  (28 s + 131) / (s^2 + 7 s + 10)   (103)
                                                [ 3 s + 6  ]

Today, we wouldn't want to apply Cramer's rule by hand for any case larger than
3x3. Under the hood, this is how Matlab finds the TF from a state-variable model.

>> A = [ -2 3 ; 0 -5],  B = [ 2 ; 3 ],  C = [ 5  6 ],  D = 0
>> SSmodel = ss(A, B, C, D)
>> tf(SSmodel)
Transfer function:
   28 s + 131
--------------
s^2 + 7 s + 10

Note:
- The denominator is given by det(s I - A), showing that the poles are the
  eigenvalues of the A matrix.
- The B, C, and D matrices play no role in determining the poles, and thus the
  stability.
- If we had additional inputs (additional columns in B) or outputs (additional
  rows in C), the term C adj(s I - A) B would give an array of numerator
  polynomials, one for each input/output pair.
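Eqns (100)-(103) can also be cross-checked symbolically. A minimal sketch, not from
the notes: it assumes the Symbolic Math Toolbox, whose adjoint() function returns
the adjugate used in Eqn (100).

>> syms s
>> A = [-2 3; 0 -5];  B = [2; 3];  C = [5 6];  D = 0;
>> Phi = s*eye(2) - A;                               %% (s I - A)
>> Ts = simplify( C*adjoint(Phi)*B/det(Phi) + D )    %% Eqn (101)
Ts = (28*s + 131)/(s^2 + 7*s + 10)                   %% matches Eqn (103)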
8.1.2 DC gain of a state-variable model

From Eqn (96) we can determine the DC Gain of a state-variable model, s -> 0:

    KDC = lim_{s -> 0} Y(s)/U(s) = lim_{s -> 0} C (s I - A)^-1 B + D
        = C (-A)^-1 B + D                                                         (104)
        = -C A^-1 B + D                                                           (105)

8.1.3 Interpreting D

A direct transmission term corresponds to a transfer function that is not strictly
proper.
- If D is not 0, then the number of zeros equals the number of poles.
- If D = 0, we call the system strictly proper.
- If D is not 0, we call the system proper, but not strictly proper.

8.2 Coordinate transformation of a State Variable Model

Given a state-variable model with state vector x(t)

    x'(t) = A x(t) + B u(t)                                                       (106)
    y(t)  = C x(t) + D u(t)                                                       (107)

and an invertible transformation matrix T giving a new state vector z(t)

    z(t) = T x(t)                                                                 (108)

we can derive a new state model based on state vector z(t). We can say that we
have transformed the system from the coordinate system of x(t) to the coordinate
system of z(t).

Derivation of the transformation is straightforward. From Eqn (108) we can solve
for x(t):

    x(t)  = T^-1 z(t)                                                             (109)
    x'(t) = T^-1 z'(t)                                                            (110)

Plugging (108) and (109) into Eqns (106) and (107) gives

    T^-1 z'(t) = A T^-1 z(t) + B u(t)                                             (111)

From (111) we can write

    z'(t) = T A T^-1 z(t) + T B u(t)                                              (112)
    y(t)  = C T^-1 z(t) + D u(t)
Eqn (112) gives the transformed state model

    z'(t) = A^ z(t) + B^ u(t)                                                     (113)
    y(t)  = C^ z(t) + D u(t)                                                      (114)
with
    A^ = T A T^-1                                                                 (115)
    B^ = T B                                                                      (116)
    C^ = C T^-1                                                                   (117)
    D  : unchanged

The transformation is illustrated by figure 30.
- The input and output are unchanged
- Only the internal representation of the system is changed

Figure 30: Block diagrams of original linear state-variable system and transformed
system. (Both map u(t) to y(t); the original uses A, B, C, D with state x(t), the
transformed system uses A^, B^, C^, D with state z(t).)

Equation (115) is a similarity transform. A similarity transform preserves
eigenvalues,

    eig(A) = eig(A^)                                                              (118)

so the poles of the transformed system are the same as the poles of the original
system.

Coordinate transformation is very powerful:
- We can convert a given model with x(t) to an equivalent model with z(t) by
  choice of any invertible matrix T
- This directly gives the set of all possible equivalent models !
- Note that the D matrix, which directly couples u(t) to y(t), is unchanged
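Eqns (113)-(118) are easy to verify numerically. A minimal sketch, not from the
notes: the system matrices and the transformation T are illustrative assumptions.

>> A = [0 1; -8 -2];  B = [0; 1];  C = [1 0];  D = 0;
>> T = [1 1; 0 2];                          %% any invertible T
>> Ahat = T*A/T;  Bhat = T*B;  Chat = C/T;  %% Eqns (115)-(117); A/T means A*inv(T)
>> sort(eig(A)), sort(eig(Ahat))            %% identical eigenvalues, Eqn (118)
>> dcgain(ss(A,B,C,D)), dcgain(ss(Ahat,Bhat,Chat,D))   %% identical input-output behavior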
8.2.1 Example coordinate transformation

Consider the example of section 7.2.

Figure 31: Quarter vehicle suspension.

Suppose we were interested in the suspension deflection

    zs(t) = zw(t) - zv(t)

If our interest was such that we wanted a state model with zs(t) and zs'(t)
directly as states, we could introduce the transformation

    z(t) = [ zs(t)  ]   [ 1  0  -1   0 ] [ zw(t)  ]
           [ zs'(t) ] = [ 0  1   0  -1 ] [ zw'(t) ]  =  T x(t)                    (119)
           [ zv(t)  ]   [ 0  0   1   0 ] [ zv(t)  ]
           [ zv'(t) ]   [ 0  0   0   1 ] [ zv'(t) ]

So transformation T is given as:

    T = [ 1  0  -1   0 ; 0  1   0  -1 ; 0  0   1   0 ; 0  0   0   1 ]             (120)

Original State Variable Model (section 7.2.1)

We derived the model

    x'(t) = [      0            1         0         0    ]         [   0   ]
            [ -(kw+ks)/m1    -bs/m1    +ks/m1    +bs/m1   ] x(t) +  [ kw/m1 ] r(t)
            [      0            0         0         1    ]         [   0   ]
            [  +ks/m2        +bs/m2    -ks/m2    -bs/m2   ]         [   0   ]

    y1(t) = zv(t) = [ 0  0  1  0 ] x(t) + [ 0 ] r(t)

with the numeric values of the 2nd parameter set:

a =        x1    x2    x3    x4
    x1      0     1     0     0
    x2   -440   -40    40    40
    x3      0     0     0     1
    x4      4     4    -4    -4
b =        u1
    x1      0
    x2    400
    x3      0
    x4      0
c =        x1    x2    x3    x4
    y1      0     0     1     0
d =        u1
    y1      0
Introduce the transformation

>> T = [ 1 0 -1 0 ; 0 1 0 -1 ; 0 0 1 0 ; 0 0 0 1]
T =  1     0    -1     0
     0     1     0    -1
     0     0     1     0
     0     0     0     1

>> Ahat = T * A * inv(T);    Bhat = T * B
>> Chat = C1 * inv(T);       Dhat = D1
>> SSmodelHat = ss(Ahat, Bhat, Chat, Dhat)
a =        x1    x2    x3    x4
    x1      0     1     0     0
    x2   -444   -44  -400     0
    x3      0     0     0     1
    x4      4     4     0     0
b =        u1
    x1      0
    x2    400
    x3      0
    x4      0
c =        x1    x2    x3    x4
    y1      0     0     1     0
d =        u1
    y1      0
Continuous-time model.

Figure 32: Suspension step and frequency response of transformed model.

Of course, the step and frequency response are unchanged. The model transformation
changes only the internal representation of the system.

9 State-variable Feedback control

Put in State Feedback control; the control signal u(t) is given as:

    u(t) = -K x(t) + Nf r(t)                                                      (121)

The control signal depends on the state vector and a reference input.
Putting in feedback control is illustrated in figures 33, 34 and 35.

Figure 33: State-variable model of the open-loop system. This is the plant before
feedback control is applied, with u(t) as input and y(t) as output (Ap, Bp, Cp, Dp).

Figure 34: State-variable model of the closed-loop system with feed-forward gain
Nf on the input and state-feedback gain -K. The closed-loop system has r(t) as
input and y(t) as output.

Figure 35: State-space model of the closed-loop system (Acl, Bcl, Ccl, Dcl), with
r(t) as input and y(t) as output.

Feedback control fundamentally transforms the system, changing the state-variable
model from figure 33 to figure 35.
9.1 Determination of a new model with state feedback

To determine the state-variable model of the system with feedback, start with the
open-loop model (figure 33):

    x'(t) = Ap x(t) + Bp u(t)                                           (16, repeated)
    y(t)  = Cp x(t) + Dp u(t)                                           (17, repeated)

Plugging the control law

    u(t) = -K x(t) + Nf r(t)                                           (121, repeated)

into the state equation, we find

    x'(t) = Ap x(t) + Bp u(t) = Ap x(t) - Bp K x(t) + Bp Nf r(t)
          = (Ap - Bp K) x(t) + Bp Nf r(t)

So we can write

    x'(t) = Acl x(t) + Bcl r(t)                                                   (122)
with
    Acl = Ap - Bp K                                                               (123)
    Bcl = Bp Nf                                                                   (124)

Plugging the control law into the output equation, we find

    y(t) = Cp x(t) + Dp u(t) = Cp x(t) - Dp K x(t) + Dp Nf r(t)
         = (Cp - Dp K) x(t) + Dp Nf r(t)

So we can write

    y(t) = Ccl x(t) + Dcl r(t)                                                    (125)
with
    Ccl = Cp - Dp K                                                               (126)
    Dcl = Dp Nf                                                                   (127)

Eqns (122)-(127) describe how we determine the state-variable model of the system
with feedback control.

State feedback control is fundamentally different from single-loop, compensated
feedback:
- Feedback is based on the state vector
- There is no compensator transfer function Gc(s)
- The control gains form a vector (in general a matrix), K in R^{m x n}, where m
  is the number of inputs.
Part 5: Models of Linear Systems
(Revised: Sep 10, 2012)
Page 85
Part 5: Models of Linear Systems
(Revised: Sep 10, 2012)
Page 86
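A minimal sketch of Eqns (122)-(127) in Matlab (assuming Ap, Bp, Cp, Dp and the gains K, Nf are already defined):

% Closed-loop model from the open-loop model and the control law (121)
Acl = Ap - Bp*K;        % Eqn (123)
Bcl = Bp*Nf;            % Eqn (124)
Ccl = Cp - Dp*K;        % Eqn (126)
Dcl = Dp*Nf;            % Eqn (127)
SYScl = ss(Acl, Bcl, Ccl, Dcl);   % closed-loop system, r(t) -> y(t)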
9.2 State-variable feedback example: Inverted pendulum

An inverted pendulum is a mechanism comprising a cart and pendulum; the pendulum is held upright by feedback control, as illustrated in figure 36.

  - The system is open-loop unstable.
  - The system models an aspect of the challenge of rocket launch.

Figure 36: Mechanical schematic of an inverted pendulum. (Cart of mass M at position z(t), driven by applied force F(t) and resisted by friction F_f = b \dot{z}(t); pendulum of mass m, length l, at angle \theta(t).)

From Lagrange's equations, the equations of motion for the inverted pendulum:

(M + m) \ddot{z}(t) + b \dot{z}(t) + m l \ddot{\theta}(t) cos(\theta(t)) - m l \dot{\theta}^2(t) sin(\theta(t)) = F(t)        (128)
(I + m l^2) \ddot{\theta}(t) + m g l sin(\theta(t)) + m l \ddot{z}(t) cos(\theta(t)) = 0                                     (129)

Parameters:
  M         Mass of cart          [kg]
  m         Mass of pendulum      [kg]
  l         Length of pendulum    [m]
  b         Friction coef.        [N/(m/s)]
  I         Inertia of pendulum   [kg m^2]
  g = 9.8   Accel. of gravity     [m/s^2]
  z(t)      Cart position         [m]
  \theta(t) Pendulum angle        [deg]
  F(t)      Applied force         [N]

Linearizing the equations about the operating point \theta_0 = 0, \dot{\theta}_0 = 0, and for simplification defining

p = I (M + m) + M m l^2

Choosing x(t) and forming the state-space model:

x(t) = [ z(t) ; \dot{z}(t) ; \theta(t) ; \dot{\theta}(t) ]

A_p = [ 0, 1, 0, 0 ;
        0, -(I + m l^2) b / p, (m^2 g l^2) / p, 0 ;
        0, 0, 0, 1 ;
        0, -(m l b) / p, m g l (M + m) / p, 0 ]

B_p = [ 0 ; (I + m l^2) / p ; 0 ; m l / p ]

C_p = [ 1 0 0 0 ; 0 0 1 0 ] ,    D_p = [ 0 ; 0 ]
9.2.1 Designing a pole placement controller

Example Data:

M = 0.5;    m = 0.2;    b = 0.1;
i = 0.006;  g = 9.8;    l = 0.3;

>> p = i*(M+m) + M*m*l^2
p =
    0.0132

>> Ap = [ 0   1                  0                0 ;
          0  -(i+m*l^2)*b/p     (m^2*g*l^2)/p     0 ;
          0   0                  0                1 ;
          0  -(m*l*b)/p          m*g*l*(M+m)/p    0 ]
Ap =
         0    1.0000         0         0
         0   -0.1818    2.6727         0
         0         0         0    1.0000
         0   -0.4545   31.1818         0

>> Bp = [0;  (i+m*l^2)/p;  0;  m*l/p]
Bp =
         0
    1.8182
         0
    4.5455

>> Cp = [ 1 0 0 0 ;
          0 0 1 0 ]
>> Dp = [ 0 ; 0 ]

>> Sdesired = [ -3 -4 -5 -6]
>> K = place(Ap, Bp, Sdesired)
K =
   -8.0816   -7.7776   36.2727    7.0310

>> Acl = Ap - Bp * K
Acl =
         0    1.0000         0         0
   14.6939   13.9592  -63.2776  -12.7837
         0         0         0    1.0000
   36.7347   34.8980 -133.6939  -31.9592

>> Nf = 1/(Cp*inv(-Acl)*Bp)
Nf =
   -8.0816         0

>> SScl = ss(Acl, Nf(1)*Bp, Cp, Dp)
a =
          x1      x2      x3      x4
   x1      0       1       0       0
   x2  14.69   13.96  -63.28  -12.78
   x3      0       0       0       1
   x4  36.73    34.9  -133.7  -31.96

b =
           u1
   x1       0
   x2  -14.69
   x3       0
   x4  -36.73

Continuous-time model.
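A quick check of the design (a minimal sketch, assuming Ap, Bp, K and Acl from the example above are in the workspace):

% Verify that the closed-loop poles are the desired poles.
eig(Acl)           % returns -3, -4, -5, -6 (to numerical precision)
eig(Ap)            % open-loop poles include one in the right half-plane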
>> step(SScl)

Figure 37: Step response of Inverted Pendulum with control. (Two output channels: cart position Out(1) settles to 1; pendulum angle Out(2) deviates transiently and returns to 0.)

9.2.2 LQR Design

With linear time-invariant systems there is a very nice optimal control result. Consider the cost function

J = \int_{t=0}^{\infty} [ x(t)^T Q x(t) + u(t)^T R u(t) ] dt        (130)

This cost function is minimized by a suitable choice of controller K.

First LQR controller example

>> Q = diag( [1 1 1 1])
Q =
     1     0     0     0
     0     1     0     0
     0     0     1     0
     0     0     0     1

>> R = 1
R = 1

>> K = lqr(Ap, Bp, Q, R)
K =
   -1.0000   -2.0408   20.3672    3.9302

>> Acl = Ap - Bp * K
Acl =
         0    1.0000         0         0
    1.8182    3.5287  -34.3585   -7.1458
         0         0         0    1.0000
    4.5455    8.8217  -61.3961  -17.8646

>> Nf = 1/(Cp*inv(-Acl)*Bp)
Nf =
   -1.0000         0
Figure 38: Step response of first LQR controller. (Cart position Out(1) and pendulum angle Out(2) versus time [seconds].)

>> Poles1 = eig(Acl)
Poles1 =
   -8.3843
   -3.7476
   -1.1020 + 0.4509i
   -1.1020 - 0.4509i

Second LQR controller example

Bryson's rules: choose elements of Q and R to be 1 / \bar{x}_i^2, where \bar{x}_i is the allowed excursion of the ith state.

Example use of Bryson's rules: to accelerate the response, place greater cost on position error

>> Q = diag( [100 1 100 1])
Q =
   100     0     0     0
     0     1     0     0
     0     0   100     0
     0     0     0     1

>> K = lqr(Ap, Bp, Q, R)
K =
  -10.0000   -8.2172   38.6503    7.2975

>> Acl = Ap - Bp * K
>> Nf = 1/(Cp*inv(-Acl)*Bp)
Nf =
  -10.0000         0

>> Poles2 = eig(Acl)
Poles2 =
   -6.9654 + 2.7222i
   -6.9654 - 2.7222i
   -2.2406 + 1.7159i
   -2.2406 - 1.7159i

Figure 39: Step response of second LQR controller. (Cart position Out(1) and pendulum angle Out(2) versus time [seconds]; faster than in figure 38.)
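A minimal sketch of Bryson's rules in code (assuming Ap, Bp from section 9.2.1; the excursion limits zmax, thmax, umax below are illustrative assumptions, not values from the notes):

% Bryson's rule: weight each state and input by 1/(allowed excursion)^2
zmax = 0.1;   thmax = 0.1;   umax = 1;
Q = diag([1/zmax^2, 1, 1/thmax^2, 1]);   % heavier cost on z and theta errors
R = 1/umax^2;
K   = lqr(Ap, Bp, Q, R);
Acl = Ap - Bp*K;
eig(Acl)     % a larger position cost pushes the closed-loop poles farther left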
10 Conclusions

We've seen:

  - Some of the basic properties of system models, and classification of signals.
  - How to build a state-variable model in 5 steps:
      1. Write the relevant relations for the system
      2. Identify the differential equations
      3. Write the differential equations in state-variable form
      4. Write the equation of the output signal (or signals) using the states and inputs
      5. Check units throughout, to verify correctness.
  - Advantages of state-variable modeling
      (Student exercise: list at least three reasons why state-variable models are advantageous relative to differential equation modeling.)
  - Construction of several example state-variable models
  - Basic operations on a state variable model, including
      - Determining the transfer function
      - Coordinate transformation
      - State feedback control.
EE/ME 701: Advanced Linear Systems
Solutions to the State Equation and Modes and Modal Coordinates

Contents

1 Modal coordinates
  1.1 Derivation of modal coordinates
      1.1.1 Choose the basis vectors to be the columns of the modal matrix
      1.1.2 Example transformation into modal coordinates
      1.1.3 Interpretation of the transformations
  1.2 Example, a double pendulum
  1.3 Transformation of the state model to and from modal coordinates (Similarity transform from A_p to A_m or back)
      1.3.1 Case of a full set of real, distinct eigenvalues (diagonalizing A_p)
      1.3.2 Example, second-order system, 2 first-order modes

2 Complex eigenvalue pairs
  2.1 Example, double pendulum revisited
  2.2 General form for combining complex conjugate parts of a 2nd order mode
  2.3 Deriving real basis vectors for a complex mode
  2.4 Example with 2nd order modes, converting basis vectors to real

3 \dot{x}(t) = A_p x(t) defines A_p-invariant spaces

4 Conclusions

1 Modal coordinates

1.1 Derivation of modal coordinates

Considering the response of a linear system:

\dot{x}(t) = A_p x(t) + B_p u(t)        (1)
y(t) = C_p x(t) + D_p u(t)              (2)

where A_p, B_p, C_p, D_p form the model in physical coordinates.

Given a state vector x(t) in R^n and a set of basis vectors {e_i} for R^n, we know that we can represent x(t) on basis {e_i} by

x(t) = \sum_{i=1}^{n} \xi_i(t) e_i        (3)

where the \xi_i(t) are the basis coefficients and

\xi(t) = [ \xi_1(t) ; ... ; \xi_n(t) ]

is the representation of x(t) on basis vectors { e_1, e_2, ..., e_n }.

Writing M = [ e_1, e_2, ..., e_n ],

x(t) = M \xi(t)    so    \xi(t) = M^{-1} x(t)        (4)
Likewise, we can represent the input signal B_p u(t) on the same basis:

\sum_{i=1}^{n} \beta_i(t) e_i = B_p u(t)        (5)

where the \beta_i(t) are the basis coefficients representing B_p u(t) on {e_i}. Writing

\sum_{i=1}^{n} \beta_i(t) e_i = M \beta(t)        (6)

then

B_p u(t) = M \beta(t)    so    \beta(t) = M^{-1} B_p u(t)

If we expand x(t) in the state equation using the representation on {e_i}, we find, from \dot{x}(t) = A_p x(t) + B_p u(t),

\sum_{i=1}^{n} \dot{\xi}_i(t) e_i = \sum_{i=1}^{n} \xi_i(t) A_p e_i + \sum_{i=1}^{n} \beta_i(t) e_i        (7)

Rearranging the state equation gives (see Bay, Eqn (6.24)), from \dot{x}(t) - A_p x(t) - B_p u(t) = 0,

\sum_{i=1}^{n} \dot{\xi}_i(t) e_i - \sum_{i=1}^{n} \xi_i(t) A_p e_i - \sum_{i=1}^{n} \beta_i(t) e_i = 0        (8)

Even though the \dot{\xi}_i(t) and \xi_i(t) terms in Eqn (8) are scalar, because of the middle term, with matrix A_p, Eqn (8) leads in general to the vector equation

\sum_{i=1}^{n} [ \dot{\xi}_i(t) I - \xi_i(t) A_p - \beta_i(t) I ] e_i = 0        (9)

which is not especially helpful.

1.1.1 Choose the basis vectors to be the columns of the modal matrix

However, when the e_i are the columns of the modal matrix of A_p (assuming for now a complete set of independent eigenvectors), then the terms

\xi_i(t) A_p e_i    become    \xi_i(t) \lambda_i e_i

And Eqn (8) becomes

\sum_{i=1}^{n} [ \dot{\xi}_i(t) e_i - \xi_i(t) \lambda_i e_i - \beta_i(t) e_i ] = \sum_{i=1}^{n} [ \dot{\xi}_i(t) - \lambda_i \xi_i(t) - \beta_i(t) ] e_i = 0        (10)

Since the e_i are independent, Eqn (10) is verified only if each term in parentheses is zero, which gives a set of simultaneous scalar equations (see Bay Eqn (6.25))

\dot{\xi}_i(t) - \lambda_i \xi_i(t) - \beta_i(t) = 0    or    \dot{\xi}_i(t) = \lambda_i \xi_i(t) + \beta_i(t)        (11)

Representing the state and input on the modal matrix of A_p, an nth order coupled differential equation becomes a set of n first-order uncoupled differential equations !

Eqn (11) we know how to solve:

\xi_i(t) = \xi_i(t_0) e^{\lambda_i (t - t_0)} + \int_{t_0}^{t} e^{\lambda_i (t - \tau)} \beta_i(\tau) d\tau        (12)

And the full solution is given from Eqn (3):

x(t) = M \xi(t)
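A minimal sketch of Eqns (4)-(12) in Matlab, for a system with distinct real eigenvalues and a step input (the numerical values follow the example of section 1.1.2 below; the step-input closed form is the constant-beta case of Eqn (12)):

% Modal solution of xdot = Ap x + Bp u, u a unit step, x(0) = [1; 0]
Ap = [-0.164 0.059; 0.059 -0.164];   Bp = [1.085; -0.031];
x0 = [1; 0];   t = linspace(0, 30, 301);
[M, U] = eig(Ap);            % modal matrix and eigenvalues
xi0  = M \ x0;               % initial condition in modal coordinates, Eqn (4)
beta = M \ Bp;               % input coupling, beta = M^-1 Bp
xi = zeros(2, numel(t));
for i = 1:2                  % each mode evolves independently, Eqn (12)
    lam = U(i,i);
    xi(i,:) = xi0(i)*exp(lam*t) + beta(i)*(exp(lam*t) - 1)/lam;
end
x = M * xi;                  % back to physical coordinates, Eqn (3)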
The basis vectors e_i that decouple the system are the columns of the modal matrix

[ e_1  e_2  ...  e_n ] = M ,    recall  A_p = M J M^{-1}        (13)

Eqn (13) is the general form. When a complete set of independent eigenvectors exists, then

M = V ,    J = U    and    A_p = V U V^{-1}        (14)

where V is the matrix of eigenvectors and U has the eigenvalues on the main diagonal.

1.1.2 Example transformation into modal coordinates

Consider the system governed by

\dot{x}(t) = [ -0.164   0.059 ;  0.059  -0.164 ] x(t) + [ 1.085 ; -0.031 ] u(t)        (15)

with initial condition

x(t_0) = [ 1 ; 0 ]

The eigensystem is:

>> [V, U] = eig(Ap)
V =
   -0.7071   -0.7071
    0.7071   -0.7071
U =
   -0.2231         0
         0   -0.1054

There is a complete set of independent eigenvectors. The modal matrix is given by:

M = -\frac{1}{0.707} V ,    M = [ 1  1 ; -1  1 ]        (16)

Eqn (16) illustrates that the basis vectors of the modal matrix can be scaled by a parameter (actually each vector can be scaled independently). By scaling M to elements of 1.0, some of the coefficients below get simpler.
The \beta_i are given from Eqn (5), which can be rewritten as

[ \beta_1(t) ; ... ; \beta_n(t) ] :    M \beta(t) = B_p u(t)

so

\beta(t) = M^{-1} B_p u(t) = [ 0.558 ; 0.527 ] u(t)        (17)

The initial conditions are given from

\xi(t_0) = M^{-1} x(t_0) = [ 0.5 ; 0.5 ]        (18)

And the two uncoupled solutions are (with t_0 = 0)

\xi_1(t) = 0.5 e^{-0.223 t} + 0.558 \int_0^t e^{-0.223 (t-\tau)} u(\tau) d\tau
\xi_2(t) = 0.5 e^{-0.105 t} + 0.527 \int_0^t e^{-0.105 (t-\tau)} u(\tau) d\tau        (19)

For a step input, which gives constant \beta_i(t), the form for the solution is:

\xi_i(t) = \xi_i(0) e^{\lambda_i t} + \beta_i \frac{1}{\lambda_i} ( e^{\lambda_i t} - 1 )        (20)

which gives

\xi_1(t) = 0.5 e^{-0.223 t} - 0.558 \frac{1}{0.223} ( e^{-0.223 t} - 1 )
\xi_2(t) = 0.5 e^{-0.105 t} - 0.527 \frac{1}{0.105} ( e^{-0.105 t} - 1 )        (21)

The transformation back to physical coordinates is given by:

x(t) = M \xi(t) = e_1 \xi_1(t) + e_2 \xi_2(t) = [ 1 ; -1 ] \xi_1(t) + [ 1 ; 1 ] \xi_2(t)        (22)

Eqn (22) can be used in two ways:

1. Eqn (22) shows that the output x(t) is the superposition of n contributions. Each contributing vector is a basis vector. They are the columns of the modal matrix. And each has a basis coefficient \xi_i(t).

2. We can add up the contributions for an individual x_i(t), such as:

x_1(t) = 0.5 e^{-0.223 t} + 2.5 ( 1 - e^{-0.223 t} ) + 0.5 e^{-0.105 t} + 5 ( 1 - e^{-0.105 t} )        (23)
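A minimal sketch checking the modal superposition (23) against direct simulation (the rounded coefficients above limit the match to a few digits):

% Compare Eqn (23) with lsim for a unit-step input, x(0) = [1; 0]
Ap = [-0.164 0.059; 0.059 -0.164];   Bp = [1.085; -0.031];
t  = linspace(0, 40, 401);   u = ones(size(t));
x_sim = lsim(ss(Ap, Bp, eye(2), zeros(2,1)), u, t, [1; 0]);
x1_modal = 0.5*exp(-0.223*t) + 2.5*(1 - exp(-0.223*t)) ...
         + 0.5*exp(-0.105*t) + 5.0*(1 - exp(-0.105*t));      % Eqn (23)
max(abs(x_sim(:,1).' - x1_modal))    % small; nonzero only from rounding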
Putting the pieces together:

1. By transformation into modal coordinates, the states are uncoupled.
2. The uncoupled equations can be solved.
3. The solution in physical coordinates is found by transformation from modal coordinates back into physical coordinates.

The order of an equation in the uncoupled equations can be:

  - First order, for a real pole.
  - Second order, for a complex pole pair. Eqn (11) works directly for complex pole pairs, but it may be more convenient to organize each pair of complex poles into a 2nd order real-valued mode (see section 2).
  - nth order for an n x n Jordan block. When there is not a complete set of eigenvectors, the Jordan form is used, and gives l coupled states for each chain of l eigenvectors (one regular and l-1 generalized).

The response of a linear system can be thought of as a collection of first- and second-order mode responses with forcing functions.

1.1.3 Interpretation of the transformations

From

\xi(t) = M^{-1} x(t) = [ r_1^T ; ... ; r_n^T ] x(t)        (24)

the jth row of M^{-1} gives the coupling of the physical states into the jth mode. We can write that

\xi_1(t) = r_1^T x(t) = <r_1, x(t)>
   ...
\xi_n(t) = r_n^T x(t) = <r_n, x(t)>        (25)

From

x(t) = M \xi(t)        (26)

each mode contributes to the physical states according to a column of M, and each physical state is determined by a combination of modes according to a row of M.

Through the M matrix, the response of the physical states of the system (voltages and velocities, say) generally involves a superposition of all of the modes of the system.
From (assuming B_p is constant)

\Gamma = M^{-1} B_p        (27)

the input is coupled to each mode according to the elements of \Gamma. Following the example of Eqn (25),

\Gamma_i = <r_i, B_p>        (28)

Notice in particular that if there is an element in \Gamma that is zero, there is no forcing function for that mode in Eqn (12):

\xi_i(t) = \xi_i(t_0) e^{\lambda_i (t - t_0)} + \int_{t_0}^{t} e^{\lambda_i (t - \tau)} \beta_i(\tau) d\tau        (Eqn (12), repeated)

Since the modes are uncoupled, if a \Gamma_i is zero, the input u(t) is not connected to that mode.

We call the system expressed in terms of \xi(t) and using A_m, B_m, C_m, D_p the system in modal coordinates, because each state variable is uniquely associated with a mode.

1.2 Example, a double pendulum

Consider 2 masses connected by a spring; this system will have two oscillatory modes:

  - a lower frequency mode in which the masses swing in phase, and
  - a higher frequency mode in which the masses swing in opposite phase.

Figure 1: Two mass system at rest. (Two pendulums, masses m1 and m2, coupled by a spring of stiffness k; the input u acts on the first pendulum.)

Figure 2: Modes of the two mass system. (In-phase and opposite-phase swinging.)

The linearized equations of motion are:

m_1 l^2 \ddot{\theta}_1 + d \dot{\theta}_1 + (k + m g l) \theta_1 = k \theta_2 + u        (29)
m_2 l^2 \ddot{\theta}_2 + d \dot{\theta}_2 + (k + m g l) \theta_2 = k \theta_1            (30)
With m = 2, l = 1, d = 3, k = 20, and g = 9.8, we find:

x(t) = [ \dot{\theta}_1(t) ; \theta_1(t) ; \dot{\theta}_2(t) ; \theta_2(t) ] ,
A_p = [ -1.50  -19.80    0.00   10.00 ;
         1.00    0.00    0.00    0.00 ;
         0.00   10.00   -1.50  -19.80 ;
         0.00    0.00    1.00    0.00 ]        (31)

The system has two 2nd order modes:

  - A faster, out-of-phase mode, excited from the initial condition (figure 3)
        x^T(0) = [ 0  1  0  -1 ]
  - A slower, in-phase mode, excited from the initial condition (figure 4)
        x^T(0) = [ 0  1  0  1 ]

We can excite the modes individually, setting up initial conditions that lie in the column space formed by the basis vectors of the mode. General initial conditions give motion that is a superposition of all modes.

Figure 3: Two masses swinging out of phase (mode 1, mode 2 not excited). (Theta 1 and Theta 2 [degrees] versus t [seconds]; the two angles are mirror images.)

Figure 4: Two masses swinging in phase (mode 2, mode 1 not excited). (Theta 1 and Theta 2 [degrees] versus t [seconds]; the two angles move together.)

Figure 5: Two masses swinging, both modes excited. (Figure 5 is the superposition of figures 4 and 3.)

Notes:

  - The unforced response (to initial conditions) is considered in figures 3-5.
  - The forced response (to u(t)) can also be understood in terms of modes.
For an n dimensional system there may be n_1 first order modes and n_2 second order modes, where n = n_1 + 2 n_2.

In modal coordinates, we can write the system dynamics as:

\dot{\xi}(t) = A_m \xi(t) + \beta(t)        (32)

where A_m is a block diagonal matrix:

  - If the system has only first order modes, A_m is diagonal (all blocks are 1x1).
  - If the system has second order modes, A_m will have 2x2 blocks, one corresponding to each 2nd order mode.
  - If the system requires the Jordan form, A_m will have a block corresponding to each Jordan block.

Modes give distinct contributions to the total output, such as

  first order:    y(t) = a_1 e^{\lambda_1 t}
  second order:   y(t) = a_2 e^{\sigma_2 t} cos(\omega_2 t + \phi_2)

The important physical property of modes is that they are uncoupled. In figure 3 mode 1 evolves without mode 2, and likewise mode 2 in figure 4. In figure 5 both modes evolve, independently.

Modal coordinates are best (and perhaps only) understood in state space.

1.3 Transformation of the state model to and from modal coordinates (Similarity transform from A_p to A_m or back)

We've seen that we can transform a state model to a new set of basis vectors with a transformation matrix. Choosing the modal matrix M as a special transformation matrix, and introducing \xi(t), the state vector in modal coordinates,

x(t) = M \xi(t) ,    \xi(t) = M^{-1} x(t)

A_m = M^{-1} A_p M        (33)
A_p = M A_m M^{-1}        (34)

Then:

\dot{\xi}(t) = M^{-1} A_p M \xi(t) + M^{-1} B_p u(t)
y(t) = C_p M \xi(t) + D_p u(t)

Or:

\dot{\xi}(t) = A_m \xi(t) + B_m u(t)
y(t) = C_m \xi(t) + D_p u(t)        (35)

Where:

A_m = M^{-1} A_p M ,    B_m = M^{-1} B_p ,    C_m = C_p M

In the general case, A_m is block diagonal:

  - 1x1 block for every real eigenvalue
  - 2x2 block for every pair of complex eigenvalues
  - l x l block for every l x l block in the Jordan form (if needed)
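A minimal sketch of the transformation (33)-(35) for the double pendulum (A_p from Eqn (31); the B_p below is inferred from Eqn (29), where u enters the first equation with coefficient 1/(m_1 l^2) = 0.5, and is an assumption for illustration):

% Modal form of the double-pendulum model
Ap = [-1.5 -19.8  0    10 ;
       1    0     0     0 ;
       0   10    -1.5 -19.8 ;
       0    0     1     0 ];
Bp = [0.5; 0; 0; 0];         % input enters the first pendulum equation (assumed)
[M, ~] = eig(Ap);            % modal matrix (complex, since the modes oscillate)
Am = M \ Ap * M;             % Eqn (33): diagonal in modal coordinates
Bm = M \ Bp;                 % Eqn (35): input coupling to each mode
% diag(Am) holds the eigenvalues -0.75 +/- 5.41i and -0.75 +/- 3.04i (section 2.1)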
1.3.1 Case of a full set of real, distinct eigenvalues (diagonalizing A_p)

When we have real and distinct eigenvalues we will have a complete set of independent eigenvectors, and

V = [ v_1  ...  v_n ] ,    U = diag( \lambda_1, ..., \lambda_n )

Then

M = V    and    A_m = M^{-1} A_p M        (36)

  - The modal matrix is the eigenvector matrix.
  - The system matrix in modal coordinates, A_m, is the diagonal matrix of eigenvalues.

1.3.2 Example, second-order system, 2 first-order modes

Given the system of section 1.1.2, recall that the solution for x(t) is of the form

x(t) = M \xi(t) = e_1 \xi_1(t) + e_2 \xi_2(t) = [ 1 ; -1 ] \xi_1(t) + [ 1 ; 1 ] \xi_2(t)

Figure 6 is a plot of the phase portrait, showing the state trajectory from initial condition

x(0) = [ 1 ; 0 ]

A phase portrait is a plot of the state as a function of time.

  - It can be in any coordinate frame.
  - It will be an n-dimensional plot, for an nth order system.

Figure 6: Response of a system with two first-order modes. (Phase plane with axes x1 and x2; the basis vectors e1 and e2 are indicated, and the trajectory x(t) runs from the initial condition toward the origin. The x's show the state at 1.0 second intervals.)
2 Complex eigenvalue pairs

Complex pole pairs correspond to 2nd order, oscillatory modes.

2.1 Example, double pendulum revisited

Considering the double pendulum example of section 1.2, the A_p matrix gives the eigenvectors and eigenvalues:

V = [ -0.70 + j0.00 ,  -0.70 - j0.00 ,  +0.68 + j0.00 ,  +0.68 - j0.00 ;
      +0.02 + j0.13 ,  +0.02 - j0.13 ,  -0.05 - j0.21 ,  -0.05 + j0.21 ;
      +0.70 + j0.00 ,  +0.70 - j0.00 ,  +0.68 + j0.00 ,  +0.68 - j0.00 ;
      -0.02 - j0.13 ,  -0.02 + j0.13 ,  -0.05 - j0.21 ,  -0.05 + j0.21 ]

A_m = diag( -0.75 + j5.41 ,  -0.75 - j5.41 ,  -0.75 + j3.04 ,  -0.75 - j3.04 )

corresponding to 2 complex modes (figures 3 and 4).

When each pole is distinct, the state-response of a system is the superposition of the individual modal responses. Putting together Eqns (25) and (24) above, the response due to initial condition and each mode is given by:

x(t) = e_1 \xi_1(t) + e_2 \xi_2(t) + e_3 \xi_3(t) + e_4 \xi_4(t)        (37)

with

\xi_i(t) e_i = e^{\lambda_i t} e_i r_i^T x(0)        (38)

When the poles are complex, Eqns (37), (38) are none-the-less valid. For the example above,

\xi_1(t) e_1 = e^{(-0.75 + j5.41) t} [ -0.70 + j0.00 ; +0.02 + j0.13 ; +0.70 + j0.00 ; -0.02 - j0.13 ]
               [ -0.36 - j0.05 ,  +0.00 - j1.98 ,  +0.36 + j0.05 ,  -0.00 + j1.98 ] x(0)

\xi_2(t) e_2 = e^{(-0.75 - j5.41) t} [ -0.70 - j0.00 ; +0.02 - j0.13 ; +0.70 - j0.00 ; -0.02 + j0.13 ]
               [ -0.36 + j0.05 ,  +0.00 + j1.98 ,  +0.36 - j0.05 ,  -0.00 - j1.98 ] x(0)

Making the example very specific, with

x(0) = [ 1 ; 2 ; -1 ; -2 ]    and    t = 0.6
Matlab code to evaluate Eqn (37):

>> [V, U] = eig(Ap);
>> Vinv = inv(V);
>> v1 = V(:,1)
v1 =
  -0.6955 + 0.0000i
   0.0175 + 0.1262i
   0.6955
  -0.0175 - 0.1262i
>> r1 = Vinv(1,:)
r1 =
  -0.36 - 0.05i    0.00 - 1.98i    0.36 + 0.05i   -0.00 + 1.98i
>> v2 = V(:,2)      %% v1*
>> r2 = Vinv(2,:)   %% r1*
>> x0 = [ 1;2;-1;-2]
>> t = 0.6
>> xi1 = exp(U(1,1)*t) * v1 * r1 * x0
xi1 =
   0.0477 - 3.5723i
  -0.6494 + 0.0813i
  -0.0477 + 3.5723i
   0.6494 - 0.0813i
>> xi2 = exp(U(2,2)*t) * v2 * r2 * x0
xi2 =
   0.0477 + 3.5723i
  -0.6494 - 0.0813i
  -0.0477 - 3.5723i
   0.6494 + 0.0813i
>> RoundByRatCommand(xi1+xi2)   %% First and second terms combine to create a real-valued 2nd order mode
ans =
   0.0954
  -1.2988
  -0.0954
   1.2988

The example above illustrates:

  - If \lambda_i is complex, e_i and r_i will in general also be.
  - In this case Eqn (37) can make a complex contribution x_i(t), but complex terms always come in complex conjugate pairs, and two equations like (37) make up the mode.
  - We can combine the two 1st order complex terms into a single 2nd order real term.
2.2 General form for combining complex conjugate parts of a 2nd order mode

Considering two terms from Eqn (37), a 2nd order mode makes a contribution to x(t) according to:

x(t) = [ (e_r + j e_i)  (e_r - j e_i) ] [ e^{(\sigma + j\omega)t}  0 ;  0  e^{(\sigma - j\omega)t} ] [ (r_r + j r_i)^T ; (r_r - j r_i)^T ] x(0)        (39)

Define a + j b as

a + j b = e^{(\sigma + j\omega) t} = e^{\sigma t} ( cos(\omega t) + j sin(\omega t) )        (40)

Or, equivalently,

a = e^{\sigma t} cos(\omega t) ,    b = e^{\sigma t} sin(\omega t)        (41)

Multiplying out Eqn (39) gives:

x(t) = [   a e_r r_r^T + j a e_r r_i^T + j b e_r r_r^T - b e_r r_i^T
         + j a e_i r_r^T - a e_i r_i^T - b e_i r_r^T - j b e_i r_i^T
         +   a e_r r_r^T - j a e_r r_i^T - j b e_r r_r^T - b e_r r_i^T
         - j a e_i r_r^T - a e_i r_i^T - b e_i r_r^T + j b e_i r_i^T ] x(0)

All the imaginary terms cancel (as they must !) and the result reduces to:

x(t) = 2 [ a e_r r_r^T - b e_r r_i^T - a e_i r_i^T - b e_i r_r^T ] x(0)

which is given by:

x(t) = 2 [ e_r  e_i ] [ a  -b ; -b  -a ] [ r_r^T ; r_i^T ] x(0)        (42)

Considering the numerical example above:

>> alpha = real( U(1,1) )
alpha =
   -0.7500
>> omega = imag( U(1,1) )
omega =
    5.4072
>> a = exp(alpha*t) * cos(omega*t)
a =
   -0.6343
>> b = exp(alpha*t) * sin(omega*t)
b =
   -0.0654
2.3 Deriving real basis vectors for a complex mode

Going back to the definition of modal coordinates, starting with

\dot{x}(t) = A_p x(t) + B_p u(t)

Convert to modal coordinates

\dot{\xi}(t) = A_m \xi(t) + B_m u(t)

with

A_m = M^{-1} A_p M ,    B_m = M^{-1} B_p        (43)

When M has a complex conjugate pair of eigenvectors

M = [ ...  e_i  e_i^*  ... ]    with    e_i = e_r + j e_j        (44)

we can introduce a modification to the modal matrix by:

\tilde{M} = M N ,    \tilde{M}^{-1} = N^{-1} M^{-1}        (45)

Choose N to be block diagonal,

N = blockdiag( ... , N_2 , ... )

where the 2 x 2 element in N that selects the real and imaginary parts of e_i is:

N_2 = \frac{1}{2} [ 1  -j ; 1  j ] ,    N_2^{-1} = [ 1  1 ; j  -j ]

Then

\tilde{M} = [ ...  e_r  e_j  ... ]

Now considering \tilde{A}_m. The complex pair of eigenvectors will have complex conjugate eigenvalues \lambda_i and \bar{\lambda}_i. Then

\tilde{A}_m = N^{-1} M^{-1} A_p M N = N^{-1} A_m N        (46)
The 2 x 2 block of \tilde{A}_m corresponding to the complex pair is

N_2^{-1} [ \lambda  0 ; 0  \bar{\lambda} ] N_2
   = \frac{1}{2} [ \lambda + \bar{\lambda} ,  -j(\lambda - \bar{\lambda}) ;  j(\lambda - \bar{\lambda}) ,  \lambda + \bar{\lambda} ]
   = [ \sigma  \omega ; -\omega  \sigma ]        (47)

where

\lambda = \sigma + j \omega        (48)

2.4 Example with 2nd order modes, converting basis vectors to real

The double pendulum has two second order modes. Using the example from section 2.1, choose

N = \frac{1}{2} [ 1  -j  0  0 ;  1  j  0  0 ;  0  0  1  -j ;  0  0  1  j ]

then

\tilde{M} = M N = [ -0.70   0.00   0.68   0.00 ;
                     0.02   0.13  -0.05  -0.21 ;
                     0.70   0.00   0.68   0.00 ;
                    -0.02  -0.13  -0.05  -0.21 ]

Note, if it is convenient, we can scale the columns of \tilde{M} as desired. For example, scaling the largest element of each column to 1.0 gives

\bar{M} = [ -1.00   0     1.00   0 ;
             0.03   1    -0.08   1 ;
             1.00   0     1.00   0 ;
            -0.03  -1    -0.08   1 ]

Finding the system matrix in modal coordinates,

\tilde{A}_m = N^{-1} M^{-1} A_p M N = N^{-1} A_m N = [ -0.75   5.41   0       0    ;
                                                        -5.41  -0.75   0       0    ;
                                                          0      0    -0.75    3.04 ;
                                                          0      0    -3.04   -0.75 ]

The first mode oscillates in a vector subspace of state space, given by

S_1 = { x : x = \alpha_1 e_1 + \alpha_2 e_2 }

where e_1 and e_2 are the first two columns of \bar{M}. The second mode oscillates in a vector subspace of state space, given by

S_2 = { x : x = \alpha_3 e_3 + \alpha_4 e_4 }

where e_3 and e_4 are the third and fourth columns of \bar{M}.
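A minimal sketch of sections 2.3-2.4 in Matlab (A_p from Eqn (31); it is assumed here that eig() returns the conjugate eigenvectors in adjacent columns, as in section 2.1):

% Real basis vectors for the complex modes of the double pendulum
Ap = [-1.5 -19.8 0 10; 1 0 0 0; 0 10 -1.5 -19.8; 0 0 1 0];
[M, Am] = eig(Ap);                 % complex modal matrix and eigenvalues
N2 = 0.5*[1 -1j; 1 1j];            % selects real and imaginary parts of a pair
N  = blkdiag(N2, N2);
Mt  = M*N;                         % real basis vectors [er1 ej1 er3 ej3]
Amt = N\(M\Ap*M)*N;                % real block-diagonal form [sigma w; -w sigma]
real(Mt), real(Amt)                % imaginary parts are ~0 (roundoff only)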
For example, the initial condition of figure 3 in physical and modal coordinates is:

x(0) = [ 0 ; 1 ; 0 ; -1 ] ,    \bar{\xi}(0) = \bar{M}^{-1} x(0) = [ 0 ; 1.0 ; 0 ; 0 ]        (49)

and the initial condition of Eqn (49) excites only the first (faster) mode.

The initial condition of figure 4 is

x(0) = [ 0 ; 1 ; 0 ; 1 ] ,    \bar{\xi}(0) = \bar{M}^{-1} x(0) = [ 0 ; 0 ; 0 ; 1.0 ]        (50)

and the initial condition of Eqn (50) excites only the second (slower) mode.

3 \dot{x}(t) = A_p x(t) defines A_p-invariant spaces

Considering the unforced response of

\dot{x}(t) = A_p x(t) + B_p u(t)

(that is, u(t) = 0), a real \lambda_i and e_i define a 1-D A_p-invariant subspace. That is, if

x(t = 0) in { x : x = a_i e_i }    then    x(t) in { x : x = a_i e_i }    for all t > 0        (51)

In the same way, complex eigenvalues define a 2-D A_p-invariant subspace, with

e_r = real(v_i)    and    e_i = imag(v_i)        (52)

If

x(0) in { x : x = a_1 e_r + a_2 e_i }    then    x(t) in { x : x = a_1 e_r + a_2 e_i }    for all t > 0        (53)

Eqn (53) implies that r_r, r_i in { x : x = a_1 e_r + a_2 e_i }, which is verified numerically for the example above.
4 Conclusions

  - The response of linear dynamic systems can be decomposed into the responses of individual modes.
  - The modes are decoupled: if the input or initial condition excites only one mode, only that mode responds.
  - Modes are either
      - First order, or
      - Second order, or
      - Correspond to a Jordan block.
  - The basis vectors of the modes (vectors e_i in Eqn (3) et seq.) are the bases of A_p-invariant subspaces in state-space, defined by the modes.
  - The modal matrix M is used to transform between modal and physical coordinates.
  - The forcing function, u(t), can also be decomposed into forcing functions for each of the modes.
EE/ME 701: Advanced Linear Systems
Phase Portraits and Lyapunov Stability

Contents

1 Introduction to Stability Theory
2 The phase plane (or phase portrait)
  2.1 Phase space in higher dimensions
  2.2 Local stability, definitions
  2.3 Local stability, additional terminology
  2.4 Determining local stability
3 Limit Cycles
4 Lyapunov stability
  4.1 Generalization of Lyapunov's energy function
  4.2 Quadratic Forms
  4.3 Lyapunov stability of a linear system
5 Summary

1 Introduction to Stability Theory

We will consider the stability of:

  - Linear time-invariant systems: easy, given by the eigenvalues
  - Linear time-varying systems: harder, some unexpected results
  - Nonlinear systems: in general quite hard

Stability of linear time-invariant systems is relatively simple:

  - Examine poles (eigenvalues of the system matrix)
  - LTI systems can show exponential decay, marginal stability, or exponential growth

Linear time-varying systems:

  - We can't look at the succession of instantaneous systems
  - Even if the eigenvalues are always in the left half-plane, the system can be unstable
  - In discrete time, a sequence of stable systems can be unstable.
  - More powerful tools needed.

For nonlinear systems the picture is even more complex:

  - Local stability
  - Properties for small inputs may not extend to large inputs
  - Stability may be input dependent
  - Stability may be initial-condition dependent
A tool is needed to understand stability properties in general cases.

For time-varying and nonlinear systems, various flavors of stability:

  - Stable (or delta-epsilon stable): If the system starts within \delta of the equilibrium point, it never goes outside of distance \epsilon from the equilibrium point.
  - Uniform stability: Stability is not dependent on the time (for time-varying systems).
  - Asymptotic stability: As t goes to infinity, the state draws ever closer to the equilibrium point.
  - Exponential stability: As t goes to infinity, the state draws ever closer to the equilibrium point at least as fast as some exponential curve.

2 The phase plane (or phase portrait)

For general systems given by:

\dot{x}(t) = f(x, u, t)
y(t) = g(x, u, t)        (1)

we need a tool for looking at the behavior of the system. One general and powerful tool is the phase plane.

Example, consider the undamped system:

\frac{d^2 y}{dt^2} + \omega^2 y = 0 ,    \dot{x}(t) = [ 0  -\omega^2 ; 1  0 ] x(t)

A time-plot shows:

(Figure: Undamped oscillator response; the state signals oscillate sinusoidally versus time [sec].)

An nth order system will have n curves in the time plot.
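A minimal sketch generating both views of the undamped oscillator (omega = 1 rad/s is an illustrative value):

% Time plot and phase trajectory of the undamped oscillator
omega = 1;
A = [0 -omega^2; 1 0];
[t, x] = ode45(@(t,x) A*x, [0 25], [1; 0]);
subplot(1,2,1); plot(t, x);             % time plot: n curves for an nth order system
xlabel('time [sec]'); ylabel('state signals');
subplot(1,2,2); plot(x(:,2), x(:,1));   % phase plane: position vs. velocity
xlabel('Position [m]'); ylabel('Velocity [m/s]');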
Phase plane (continued)

To draw the phase portrait, take away time, and make the axes of the plot the states (2-D plot for 2nd order system, 3-D plot for 3rd order system, etc.).

Figure 1: Trajectory on phase plane. (Undamped oscillator response: Velocity [m/s] versus Position [m]; the trajectory is a closed curve.)

Consider the damped oscillator.

Figure 2: Damped oscillator. (Time plot of the state signals, and the corresponding phase-plane trajectory, which spirals in toward the origin.)

Phase plane (continued)

A path is called a phase trajectory or simply a trajectory. We can consider any number of trajectories, to create a phase portrait.

Figure 3: Phase portrait, several trajectories. (Undamped oscillator response: several closed curves on the Velocity-Position plane.)

Each point on a phase plane corresponds to one specific value of state, x.

An important property of phase portraits comes from the very definition of state:

  - What is state? It is all the information needed to determine the future trajectory of the system (given inputs).
  - Corollary: the trajectory departing from a point x_a is a function only of x_a, with no dependence on how the trajectory arrived at state x_a.
Phase plane (continued)

At each point in phase-space (at each possible state) there is a direction of departure. We can plot the phase arrows along trajectories, or at any point, such as grid points.

Figure 4: Phase portrait, plotting \dot{x} vectors at x points. (a) Along trajectories. (b) At grid points. (Undamped oscillator response, vector field on the Velocity-Position plane.)

To plot 4(b), we need only be able to compute \dot{x}(x); it is not necessary to solve the differential equation.

  - A handy feature for nonlinear systems.

In 2-D we have a powerful theorem: phase trajectories can not cross.

2.1 Phase space in higher dimensions

The concept of a trajectory through phase space extends naturally to higher-order systems. There are two challenges:

  1. We can't easily plot 3rd and higher order phase portraits.
  2. The theorem that phase trajectories can not cross is less useful.

2.2 Local stability, definitions

An equilibrium point is a steady-state operating point of a system; it is a point x_e where:

\dot{x} = f(x_e, u, t) = 0        (2)

For homogeneous linear systems the origin is always an equilibrium point:

\dot{x} = A 0 = 0

If A has a null space, there will be infinitely many equilibrium points (a null space in A corresponds to a pole at the origin; consider the dynamics of a car, there are infinitely many places your car can stop).

If there is an input, the equilibrium will be shifted:

\dot{x}(t) = A x(t) + B u ,    x_e :  A x_e + B u = 0
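A minimal sketch of figure 4(b): the vector field is plotted directly from \dot{x}(x), without solving the differential equation (the damped-oscillator matrix below is an illustrative choice):

% Plot xdot at grid points with quiver
A = [-0.2 -1; 1 0];                       % x = [velocity; position]
[x1, x2] = meshgrid(-1:0.2:1, -1:0.2:1);  % grid of states
dx1 = A(1,1)*x1 + A(1,2)*x2;              % xdot evaluated at each grid point
dx2 = A(2,1)*x1 + A(2,2)*x2;
quiver(x2, x1, dx2, dx1);                 % arrows show the direction of departure
xlabel('Position [m]'); ylabel('Velocity [m/s]');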
Equilibrium points come in four flavors:

  1. Stable node
  2. Unstable node
  3. Saddle point
  4. Center

Stable node: the phase trajectories travel into x_e.
Unstable node: the phase trajectories travel away from x_e.

Figure 5: Stable and unstable equilibrium points, 1st order modes.
Figure 6: Stable and unstable equilibrium points, 2nd order modes.

Saddle Point: Trajectories move in on some paths and out on others.

Figure 7: Saddle points, system has both stable and unstable modes.

Center: marginal stability (the first example).

Figure 8: Phase portrait for marginal stability.
2.3 Local stability, additional terminology

As you would expect, a variety of terminology is applied to the characteristics of differential equations. In the controls literature, one will sometimes see an equilibrium point referred to as:

  - A fixed point
  - An attractor (if the system is stable).

If the fixed point is called an attractor, we can call the region (of state space) from which trajectories converge to the attractor the basin of attraction.

Figure 7.2 of Bay illustrates multiple attractors and basins of attraction of a pendulum. Notice stable and unstable equilibria.

Figure 9: Multiple attractors and basins of attraction (Bay figure 7.2).

2.4 Determining local stability

Consider again the most general model,

\dot{x} = f(x, u, t)        (3)

With x = x_e + \tilde{x}, \dot{x}(x) is given by the Taylor series expansion:

\dot{x}(x) = \dot{x}(x_e) + \frac{\partial f(x, u, t)}{\partial x}\Big|_{x = x_e} \tilde{x} + O(\tilde{x}^2)  \approx  0 + J \tilde{x}        (4)

The derivative of a vector is a matrix of the individual scalar derivatives. With f(x, u, t) = [ f_1()  f_2()  ...  f_n() ]^T, then:

J = \frac{\partial f(x, u, t)}{\partial x}\Big|_{x = x_e}
  = [ \partial f_1/\partial x_1 ,  \partial f_1/\partial x_2 ,  ... ,  \partial f_1/\partial x_n ;
      \partial f_2/\partial x_1 ,  \partial f_2/\partial x_2 ,  ... ,  \partial f_2/\partial x_n ;
      ... ;
      \partial f_n/\partial x_1 ,  \partial f_n/\partial x_2 ,  ... ,  \partial f_n/\partial x_n ]  evaluated at x = x_e

Near to x_e, the first-order term will dominate, and the stability properties (stable vs. unstable vs. saddle point vs. center) are determined by the eigenvalues of \partial f(x, u, t)/\partial x, which is sometimes called the Jacobian matrix.

Local stability is usually computable, because we can usually form the Jacobian matrix, even for non-linear systems.

Local stability is a limited result. For example, even for a stable fixed point, it may be difficult to determine the size of the basin of attraction.
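A minimal sketch of the local-stability test for a damped pendulum (the model and parameter values below are illustrative assumptions, not from the notes):

% Local stability of thetaddot = -sin(theta) - 0.5*thetadot at its equilibria
f = @(x) [ x(2) ; -sin(x(1)) - 0.5*x(2) ];     % x = [theta; thetadot] (for reference)
J = @(xe) [ 0, 1 ; -cos(xe(1)), -0.5 ];        % Jacobian of f evaluated at xe
eig(J([0; 0]))     % hanging equilibrium: complex pair in the LHP, a stable focus
eig(J([pi; 0]))    % inverted equilibrium: one positive real eigenvalue, a saddle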
3 Limit Cycles

Nonlinear systems can have trajectories which are closed curves that are the limit of neighboring trajectories: limit cycles.

Contrasting marginal stability (figure 8): a phase portrait for marginal stability also has closed curves, but these are not isolated, and are not the limit of a trajectory.

Consider different stability cases.

Establishing the existence of a limit cycle has no general solution. Limited solution for the 2D case:

  - Poincare's Index: for the region enclosed by a closed curve, write
        n = N - S
    where N is the number of centers, foci and nodes, and S is the number of enclosed saddle points.
  - For the closed curve to be a limit cycle, it is necessary but not sufficient that n = 1.

Summary on Stability

  - The local stability of a (sufficiently smooth) nonlinear system can be determined by examining the stability of the equilibrium points.
  - Nonlinear systems can have a new kind of behavior: a limit cycle.
  - Needed: a global stability result.
4 Lyapunov stability

Goal: address stability without solving the differential equation.

Lyapunov method:

  - Introduce an energy function (or, in general, a Lyapunov function).
  - Show that energy decays everywhere, except at the equilibrium point, where it is zero.

  - First published in Russian in 1892
  - Translated into French in 1907
  - First use in controls in 1944

Consider the second order system with nonlinear stiffness k(y) and damping b(\dot{y}):

\ddot{y} + b(\dot{y}) + k(y) = 0

(Figure: unit mass with nonlinear spring k(y) and nonlinear damper b(\dot{y}).)

Write the state space model:

x = [ x_1 ; x_2 ] = [ y ; \dot{y} ] ,    \dot{x} = [ x_2 ; -k(x_1) - b(x_2) ]

Assuming unit mass, the sum of potential and kinetic energy is given by:

E(x_1, x_2) = \frac{x_2^2}{2} + \int_0^{x_1} k(\sigma) d\sigma

Looking at the rate of change of energy in the system:

\frac{dE}{dt} = x_2 \dot{x}_2 + k(x_1) \dot{x}_1 = x_2 [ \dot{x}_2 + k(x_1) ]

But from the dynamics, \dot{x}_2 = -k(x_1) - b(x_2), and so dE/dt becomes:

\frac{dE}{dt} = x_2 [ -k(x_1) - b(x_2) + k(x_1) ] = -x_2 b(x_2)
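A minimal sketch illustrating the energy argument numerically (the particular nonlinearities k(y) = y^3 and b(v) = 0.4 v are illustrative assumptions, not from the notes):

% Energy decays along trajectories when b(x2)*x2 > 0
k = @(y) y.^3;    b = @(v) 0.4*v;
f = @(t,x) [ x(2) ; -k(x(1)) - b(x(2)) ];
[t, x] = ode45(f, [0 30], [1; 0]);
E = x(:,2).^2/2 + x(:,1).^4/4;        % kinetic + potential, with int k = y^4/4
plot(t, E); xlabel('time [sec]'); ylabel('E(x_1, x_2)');   % E decays monotonically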
Conclusions:

  - If b() = 0 and the system has only a spring, dE/dt = 0 and the total energy in the system remains constant.
  - If b > 0 for all x_2, the system is dissipative, and as long as E > 0, dE/dt < 0.

Observations:

  - For any initial condition giving energy E_0, the system never goes to a state with E(x) > E_0.
  - When b > 0 energy steadily decays, and the system converges to E = 0.

4.1 Generalization of Lyapunov's energy function

Introduce a scalar function of the system state, V(x), which is positive-definite:

V(x) > 0 ,    for all x != 0        (5)

Evaluate the Lyapunov derivative:

\frac{dV}{dt} = \frac{\partial V}{\partial x} \frac{dx}{dt}        (6)

and show:

\frac{dV}{dt} < 0 ,    for all x != 0        (7)

Which establishes the Bounded-input bounded-output (BIBO) stability:

V(t) <= V(t_0)    for all t > t_0        (8)

Challenge: how do we find V(x)?

  - Energy function
  - Clever choice
  - Typical choice:

        V(x) = x^T P x ,    P : positive definite matrix        (9)

Of course, failure to find a suitable V(x) does not prove instability.
4.2 Quadratic Forms

Physical energy is often quadratic in the state variables.

  - Consider V_c = \frac{1}{2} C V^2

Because we want a positive definite function V, a quadratic form is often chosen for a Lyapunov function. General quadratic form:

V(x) = x^T P x        (10)

where P is a matrix. It can be shown (see chapter) that P may be restricted to symmetric matrices without loss of generality. (Student exercise: what is the advantage of a symmetric matrix, in terms of the eigensystem of P?)

If P is positive-definite,

x^T P x > 0    for all x != 0

Requirement for a PD matrix:

  - All eigenvalues > 0.
  - P may be written P = R^T R.
    (Student exercise: prove that any P written P = R^T R is symmetric and positive (semi)definite, for arbitrary choice of R.)
  - P is positive-semidefinite if all eigenvalues >= 0, and some eigenvalues may be 0.

4.3 Lyapunov stability of a linear system

Given

\dot{x}(t) = A x(t)        (11)

introduce the Lyapunov function

V(x) = x^T P x        (12)

where P is a symmetric positive definite matrix. Then \dot{V}(x) is given by

\dot{V}(x) = \dot{x}^T P x + x^T P \dot{x} = x^T A^T P x + x^T P A x = -x^T Q x        (13)

where

-Q = A^T P + P A        (14)

Q positive definite proves the stability of system (11). Eqn (14) is called the matrix Lyapunov equation. For a stable linear system, Eqn (14) has a positive definite solution P for every positive definite Q.

Equations (11)-(14) show the concept applied to a linear system, and provide a starting point for studying nonlinear systems.
5 Summary

We've seen that stability is more complex for nonlinear systems:

  - It is a property of the system, and also of initial conditions and inputs (think of the stability of a plane in flight).
  - A system at equilibrium and perturbed can:
      - Return to equilibrium
      - Diverge
      - Go to a different equilibrium point
      - Go into a limit cycle

The Lyapunov stability method gives one approach for non-linear systems.
EE/ME 701: Advanced Linear Systems
Controllability and Observability of Linear Systems

Contents

1 Introduction
  1.1 Definitions
  1.2 Main facts about uncontrollable or unobservable modes
2 Tests for Controllability and Observability
  2.1 Controllability, discrete time system
  2.2 Controllability, continuous time system
      2.2.1 Polynomial expressions on matrices, and the Cayley-Hamilton Thrm
      2.2.2 Controllability for continuous-time systems
  2.3 Controllability example
  2.4 Observability, discrete time system
  2.5 Observability, continuous time system
  2.6 Observability example
  2.7 Popov-Belevitch-Hautus Tests
      2.7.1 PBH Test of Controllability
      2.7.2 PBH Test of Observability
  2.8 Controllability and Observability in modal coordinates
3 Controllability and observability example
  3.1 A badly placed sensor
  3.2 Testing Controllability and Observability
      3.2.1 A badly placed actuator
  3.3 Summary Controllability and Observability
      3.3.1 Uncontrollable and unobservable modes and rocketry
4 Additional topics in Bay
  4.1 Alternative definitions of Controllability and Observability
1 Introduction

No discussion of State-Variable models would be complete without a discussion of observability and controllability (R. E. Kalman [1, 2]).

Modes correspond to energy storage elements. When a pole and a zero cancel, the mode nonetheless remains in the system.

Example: a pole-zero cancellation (fig 1(A)).

  - The mode disappears from the transfer function.
  - The mode, and energy storage element, remains in the system:
      - Either the signal of the mode does not appear in the output (Unobservable, fig 1(B));
      - Or the input signal can't reach the mode (Uncontrollable, fig 1(C)).

Figure 1: A pole-zero cancellation can physically occur in either the input or output path.
  (A) System with pole-zero cancellation:  u(t) -> (s+3) / ((s+3)(s+4)) -> y(t)
  (B) Unobservable system (state hidden from output)
  (C) Uncontrollable system (state untouched by input)

Figure 1(B) shows an unobservable system. Figure 1(C) shows an uncontrollable system.

1.1 Definitions

Controllable System: The input signal reaches each of the modes, so that the system can be driven from any state, x_0, to the origin by suitable choice of input signal.

Observable System: The response of each mode reaches the output. An initial condition, x_0, can be determined by observing the output and input of the system.
1.2 Main facts about uncontrollable or unobservable modes

Even if a pole and zero cancel, the system still has that mode in its response.

Challenge: if a pole and zero cancel, it means that feedback control can do nothing about the mode:

  - If the mode is unstable, feedback control cannot stabilize the system.
  - If the mode is slow or lightly damped, it will remain slow or lightly damped.
  - If the mode is fast and well damped: being uncontrollable or unobservable is often not a problem.
      - Exception: rocket; the stress introduced by oscillation of an uncontrollable or unobservable mode can over-stress the structure.

How do uncontrollable or unobservable modes get excited?

  - Disturbances (don't go through zeros of the control input path),
  - Initial conditions,
  - Non-linearities, etc.

2 Tests for Controllability and Observability

2.1 Controllability, discrete time system

Consider example 3.10 in [Bay]. For the LTI discrete-time system:

x(k+1) = A x(k) + B u(k)
y(k) = C x(k) + D u(k)        (1)

Controllability, example 3.10: Given an arbitrary initial condition x(0), under what conditions will it be possible to find an input sequence, u(0), u(1), ..., u(l), which will drive the state vector to zero, x(l) = 0?

Solution: Consider the state for a few samples of the input.

At time k = 0:

x(1) = A x(0) + B u(0)        (2)

At time k = 1:

x(2) = A x(1) + B u(1)
     = A [ A x(0) + B u(0) ] + B u(1)
     = A^2 x(0) + A B u(0) + B u(1)        (3)
At time k = 2:

x(3) = A x(2) + B u(2)
     = A [ A^2 x(0) + A B u(0) + B u(1) ] + B u(2)
     = A^3 x(0) + A^2 B u(0) + A B u(1) + B u(2)        (4)

The ascending powers of A arise because x(k+1) depends on x(k) (recursion).

For 0 = x(l), rearranging Eqn (4) gives:

x(l) - A^l x(0) = -A^l x(0) = B u(l-1) + A B u(l-2) + ... + A^{l-1} B u(0)        (5)

The right hand side of (5) can be put in matrix-vector form, to give:

-A^l x(0) = [ B | A B | ... | A^{l-2} B | A^{l-1} B ] [ u(l-1) ; u(l-2) ; ... ; u(1) ; u(0) ]        (6)

Define the controllability matrix to be:

P = [ B | A B | ... | A^{l-2} B | A^{l-1} B ]        (7)

and also

u = [ u(l-1) ; u(l-2) ; ... ; u(1) ; u(0) ]

then

-A^l x(0) = P u        (8)

From Eqn (8), it is clear that if -A^l x(0) lies in the column space of P, then a solution u exists which is the control sequence that drives the initial state to the origin. Thus:

If the rank of P = n, then a control sequence u is guaranteed to exist; and the system is controllable.
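A minimal sketch of the rank test (7)-(8), for an illustrative 2-state discrete-time system (assumes the Control System Toolbox for ctrb):

% Controllability matrix and a deadbeat control sequence
A = [0.9 0.2; 0 0.5];   B = [1; 1];
P = ctrb(A, B);              % [B, A*B, ..., A^(n-1)*B]
rank(P)                      % = 2 = n, so the system is controllable
x0 = [1; -2];
u  = P \ (-A^2 * x0);        % solves -A^l x(0) = P u with l = n = 2
% note the stacking of Eqn (6): u = [u(1); u(0)], so the last element is applied first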
2.2 Controllability, continuous time system

Controllability for a continuous time system follows from a similar analysis, except that integrals and the Cayley-Hamilton theorem are required.

2.2.1 Polynomial expressions on matrices, and the Cayley-Hamilton Thrm

A polynomial expression p(\lambda) has the form (following Bay, chap 5):

p(\lambda) = \lambda^n + \alpha_{n-1} \lambda^{n-1} + ... + \alpha_1 \lambda + \alpha_0        (9)

A matrix polynomial expression has a similar form:

p(A) = A^n + \alpha_{n-1} A^{n-1} + ... + \alpha_1 A + \alpha_0 I        (10)

Cayley-Hamilton Theorem: A matrix satisfies its own characteristic equation.

THEOREM (Cayley-Hamilton theorem): If the characteristic polynomial of an arbitrary n x n matrix A is denoted \Delta(\lambda), computed as \Delta(\lambda) = det(A - \lambda I), then matrix A satisfies its own characteristic equation, denoted by \Delta(A) = 0.

\Delta(A) = A^n + \alpha_{n-1} A^{n-1} + ... + \alpha_1 A + \alpha_0 I = 0

so

A^n = -\alpha_{n-1} A^{n-1} - ... - \alpha_1 A - \alpha_0 I        (11)

where the \alpha_i are the coefficients of the characteristic equation.

In any polynomial expression in A, we can always replace A^n and higher powers with the right hand side of Eqn (11). As a result, all matrix polynomial expressions are equivalent to an expression of order n-1 or less.

2.2.2 Controllability for continuous-time systems

THEOREM (Controllability): An n-dimensional continuous-time LTI system is controllable if and only if the matrix

P = [ B | A B | ... | A^{n-2} B | A^{n-1} B ]        (12)

has rank n.

PROOF: Recall that the solution of the LTI equations is:

x(t) = e^{A (t - t_0)} x(t_0) + \int_{t_0}^{t} e^{A (t - \tau)} B u(\tau) d\tau        (13)

We want to establish whether a control signal u(t) exists such that x(t_1) = 0, so consider:

-\int_{t_0}^{t_1} e^{A (t_1 - \tau)} B u(\tau) d\tau = e^{A (t_1 - t_0)} x(t_0) = \zeta_1  in R^n        (14)

We don't need to solve for \zeta_1 for the proof, only to know it exists.

Term e^{A (t_1 - \tau)} has a polynomial expansion, so by the Cayley-Hamilton theorem, it can be expressed in terms of a low-order polynomial:

e^{A (t_1 - \tau)} = \sum_{i=1}^{n} \gamma_i(\tau) A^{n-i}        (15)

We don't need to solve for the \gamma_i(\tau) for the proof, only to know they exist.
From Eqns (14) and (15):

\zeta_1 = -\int_{t_0}^{t_1} \Big[ \sum_{i=1}^{n} \gamma_i(\tau) A^{n-i} \Big] B u(\tau) d\tau
        = -\int_{t_0}^{t_1} \Big[ \sum_{i=1}^{n} A^{n-i} B \gamma_i(\tau) \Big] u(\tau) d\tau
        = -A^{n-1} B \int_{t_0}^{t_1} \gamma_1(\tau) u(\tau) d\tau - ... - A B \int_{t_0}^{t_1} \gamma_{n-1}(\tau) u(\tau) d\tau - B \int_{t_0}^{t_1} \gamma_n(\tau) u(\tau) d\tau        (16)

Define:

\bar{u}_i = -\int_{t_0}^{t_1} \gamma_i(\tau) u(\tau) d\tau        (17)

Eqn (17) with Eqn (16) gives:

\zeta_1 = [ B | A B | ... | A^{n-2} B | A^{n-1} B ] [ \bar{u}_n ; \bar{u}_{n-1} ; ... ; \bar{u}_1 ] = P \bar{u}        (18)

As before, we don't need to be able to compute the \bar{u}_i, only to know they exist.

From Eqn (14), \zeta_1 is a vector depending on x(t_0); from Eqns (17) and (18), if P is full rank, then there exists a control input u(t), t in [t_0, t_1], to give some \bar{u}_i such that x(t_1) = 0. Thus, the system is guaranteed to be controllable.

2.3 Controllability example

Consider the electrical circuit given in figure 3:

Figure 2: RLC circuit example. (Source v_s(t), resistor R, inductor L carrying i_L(t), capacitor C with voltage v_C(t); output voltage v_x(t).)

The state equations are given as:

\frac{d}{dt} [ v_C(t) ; i_L(t) ] = [ -2/(RC)   1/C ;  -1/L   0 ] [ v_C(t) ; i_L(t) ] + [ 1/(RC) ; 1/L ] v_s(t)

v_x(t) = [ -1   0 ] [ v_C(t) ; i_L(t) ] + [ 1 ] v_s(t)        (19)

The Controllability matrix is given as:

P = [ B | A B ] = [ 1/(RC)    -2/(R^2 C^2) + 1/(LC) ;
                    1/L        -1/(R L C) ]        (20)

When P becomes rank-deficient, the determinant will be zero. So we can compute the determinant to test for values of R which make the system uncontrollable.
Note: The controllability matrix doesn't have to be square (we'll see this example with an extra input below). A determinant calculation can only be used when P is square.

|P| = \frac{1}{RC} \Big( \frac{-1}{RLC} \Big) - \frac{1}{L} \Big( \frac{-2}{R^2 C^2} + \frac{1}{LC} \Big) = \frac{1}{R^2 L C^2} - \frac{1}{L^2 C}        (21)

So rank(P) < 2 if

R = \sqrt{L / C}        (22)

What it means for the system to be uncontrollable: for arbitrary t_0, t_1 and x(t_0), we can not find a control input v_s(t), t in [t_0, t_1], to give x(t_1) = 0; in other words, to drive the state to 0.

What does this strange situation mean physically? It means that the transfer function has a pole-zero cancellation, and that the zero is effectively in the input path:

\frac{V_x(s)}{V_s(s)} = \frac{ s ( s + \frac{1}{RC} ) }{ s^2 + \frac{2}{RC} s + \frac{1}{LC} }        (23)

If R = \sqrt{L/C}, then L = R^2 C and the denominator factors into:

\frac{V_x(s)}{V_s(s)} = \frac{ s ( s + \frac{1}{RC} ) }{ ( s + \frac{1}{RC} ) ( s + \frac{1}{RC} ) } = \frac{ s }{ s + \frac{1}{RC} }

The second mode is not gone from the system, it is just unreachable from the input.

Consider adding another control input (one way to deal with an uncontrollable system):

Figure 3: RLC circuit example with second input. (The circuit of figure 2 with an added current source I_b(t).)

The state model becomes:

\frac{d}{dt} [ v_C(t) ; i_L(t) ] = [ -2/(RC)   1/C ;  -1/L   0 ] [ v_C(t) ; i_L(t) ] + [ 1/(RC)   1/C ;  1/L   0 ] [ v_s(t) ; I_b(t) ]        (24)

v_x(t) = [ -1   0 ] [ v_C(t) ; i_L(t) ] + [ 1 ] v_s(t)

The controllability matrix is:

P = [ B | A B ] = [ 1/(RC)   1/C  |  -2/(R^2 C^2) + 1/(LC)    -2/(R C^2) ;
                    1/L       0   |  -1/(R L C)               -1/(LC)    ]        (25)

Notice that P changes shape. It still has n rows, but now has 2n columns. A 2 x 4 matrix has to be quite special to have rank less than 2. In the present case, P is rank 2 as long as the two rows are independent. The zero in the (2,2) element assures the rows will be independent.
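A minimal numerical check of Eqns (20)-(23) (the component values are illustrative; assumes the Control System Toolbox):

% At R = sqrt(L/C) the single-input circuit loses controllability
L = 1e-3;   C = 1e-6;
R = sqrt(L/C);                          % the uncontrollable case, Eqn (22)
A = [-2/(R*C)  1/C; -1/L  0];
B = [1/(R*C); 1/L];
rank(ctrb(A, B))                        % = 1 < n = 2: uncontrollable
zpk(ss(A, B, [-1 0], 1))                % shows the zero at -1/(R*C) canceling a pole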
EE/ME 701: Advanced Linear Systems
Section 2.4.0
2.4 Observability, discrete time system
EE/ME 701: Advanced Linear Systems
Section 2.4.0
At time k = 0 :
Consider example 3.11 in [Bay]. For the LIT discrete-time system:
y (0) = C x (0) + D u (0)
x (k + 1) = A x (k) + B u (k)
(27)
(26)
At time k = 1 :
y (k) = C x (k) + D u (k)
y (1) = C x (1) + D u (1)
Observability, example 3.11: Given and arbitrary initial condition x (0),
when l samples of the inputs and outputs, u (k) and y (k), k {0..l},
are known, under what conditions will it be possible to determine
(observe) the initial condition x (0).
(28)
= C [A x (0) + B u (0)] + D u (1)
= C A x (0) +C B u (0) + D u (1)
At time k = 2 :
Duality: Duality situations are those where, with a systematic set of
changes or exchanges, one system turns out to have properties equivalent
to another. For example, circuit duality:
y (2) = C x (2) + D u (2)
i
h
= C A2 x (0) + A B u (0) + B u (1) + D u (2)
= C A2 x (0) +C A B u (0) +C B u (1) + D u (2)
Voltage sources Current Sources
Capacitors
Inductors
Resistances
Conductances
(29)
Generalizing, Eqn (29) becomes:
and the new dual circuit will behave identically to the original.
In controls there are several dualities, an important one is the
Observability / Controllability duality. As we will see, with an
appropriate transpose and C substituted for B, the properties carry over.
Solution:
C
y (1) C A
2
y (2)
= CA
..
..
.
.
C Al1
y (l 1)
y (0)
x (0) +
Big Term
(based on u (k))
(30)
Consider the output for a few samples of the output:
Observability (continued)

It doesn't matter what the big term is, as long as it is known. Write

\[
\begin{bmatrix} y(0) \\ y(1) \\ y(2) \\ \vdots \\ y(l-1) \end{bmatrix}
-
\underbrace{\left(\,\cdots\,\right)}_{\text{Big Term (based on } u(k)\text{)}}
= Q\, x(0)
\tag{31}
\]

where the observability matrix is given as:

\[
Q \;\triangleq\;
\begin{bmatrix} C \\ C A \\ C A^2 \\ \vdots \\ C A^{l-1} \end{bmatrix}
\tag{32}
\]

From Eqn (31), it is clear that if x(0) lies in the row space of Q, then by
knowing the left hand side it is possible to determine x(0). It is guaranteed
that x(0) lies in the row space of Q if the rank of Q is n.

If the rank of Q = n, then x(0) can be determined from known samples of the
input and output, and the system is observable.

2.5 Observability, continuous-time system

The proof that Q must be full rank for a continuous-time system to be
observable is omitted. It can be constructed by considering y(t) and its
derivatives.

2.6 Observability example

Using the circuit example,

\[
Q = \begin{bmatrix} C \\ C A \end{bmatrix}
  = \begin{bmatrix} -1 & 0 \\[4pt] \dfrac{2}{RC} & -\dfrac{1}{C} \end{bmatrix}
\tag{33}
\]

which is independent of L and full rank for any values of R and C.
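A symbolic spot-check of Eqn (33) (a sketch, not part of the notes; it assumes
the Symbolic Math Toolbox, with Rs, Ls, Cs standing in for the component
values R, L, C):

>> % Sketch: observability matrix of the RLC example, symbolically
>> syms Rs Ls Cs positive
>> A  = [-2/(Rs*Cs), 1/Cs; -1/Ls, 0];
>> Cm = [-1 0];                  % output matrix
>> Q  = [Cm; Cm*A];              % Eqn (32) with l = n = 2
>> simplify(det(Q))              % = 1/Cs: nonzero for any R, L, C, so rank(Q) = 2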
2.7 Popov-Belevitch-Hautus Tests

These tests for controllability and observability do not involve testing any
matrix for rank, but rather examine the left eigenvectors (eigenvectors) for
orthogonality with B (C).

Left Eigenvectors: Left eigenvectors are row vectors which have the property
that

\[
v^T A = \lambda\, v^T
\tag{34}
\]

For example,

\[
\begin{bmatrix} 1 & 1 \end{bmatrix}
\begin{bmatrix} 3 & 1 \\ 0 & 2 \end{bmatrix}
= 3 \begin{bmatrix} 1 & 1 \end{bmatrix}
\tag{35}
\]

so v^T = [1  1] is a left eigenvector. Left eigenvectors are row vectors, and
they are the (right) eigenvectors of A^T.

2.7.1 PBH Test of Controllability

LEMMA (PBH Test of Controllability): An LTI system is not controllable if and
only if there exists a left eigenvector v^T of A such that v^T B = 0^T.

PROOF: The text establishes the PBH test by direct proof. We will see that it
is a direct consequence of examining controllability in modal coordinates.

2.7.2 PBH Test of Observability

LEMMA (PBH Test of Observability): An LTI system is not observable if and
only if there exists an eigenvector v of A such that C v = 0.

PROOF: We will see that it is a direct consequence of examining observability
in modal coordinates.
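As a numerical illustration (a sketch, not part of the notes), the PBH
controllability test can be applied to the RLC example at the critical
resistance, here with L = 1 [H], C = 1 [F], R = 1 [Ohm]. Because the
eigenvalue is repeated at this R, the computed left eigenvectors are only
approximate, so the products below are only numerically (near) zero:

>> % Sketch: PBH controllability check via left eigenvectors
>> R = 1;  L = 1;  C = 1;
>> A = [-2/(R*C), 1/C; -1/L, 0];
>> B = [1/(R*C); 1/L];
>> [W, Lam] = eig(A.');     % columns of W are left eigenvectors of A (transposed)
>> W.' * B                  % entries (numerically) zero: system not controllable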
2.8 Controllability and Observability in modal coordinates

Recall that we can use the eigenvectors of A to transform the state variable
model onto a new basis (modal coordinates) in which the modes are decoupled.
The rotation to modal coordinates is given by

\[
V = \mathrm{Evecs}(A), \quad T = V^{-1}; \qquad
\widehat{A} = T\,A\,T^{-1}, \quad \widehat{B} = T\,B, \quad \widehat{C} = C\,T^{-1}
\]

Controllability: For any decoupled mode (all modes in modal coordinates) we
must have an input into the mode, in order for the mode to be controllable.

    The system is controllable if and only if all elements of the input
    matrix B are non-zero in modal coordinates.

Observability: For any decoupled mode (all modes in modal coordinates) we
must have an output from the mode, in order for the mode to be observable.

    The system is observable if and only if all elements of the output
    matrix C are non-zero in modal coordinates.

Controllability and Observability in modal coordinates, Example:

Consider the original circuit (figure 3) with L = 1 [H], C = 1 [F] and R = 2.
R \neq \sqrt{L/C}, so the system is controllable. The state model is given by:

\[
A = \begin{bmatrix} -1 & 1 \\ -1 & 0 \end{bmatrix}, \quad
B = \begin{bmatrix} 0.5 \\ 1.0 \end{bmatrix}, \quad
C = \begin{bmatrix} -1 & 0 \end{bmatrix}, \quad
D = \begin{bmatrix} 1 \end{bmatrix}
\]

The eigenvectors of A are given as:

>> A = [-2/(R*C), (1/C)
        -(1/L)     0]
>> B = [1/(R*C) ; (1/L)]
>> C = [ -1 0]; D = 1;
>> [V, U] = eig(A)
V =
   0.3536 - 0.6124i   0.3536 + 0.6124i
   0.7071             0.7071

Notice that the eigenvalues and eigenvectors are complex. This is fine. The
complex terms will come in complex-conjugate pairs, with the imaginary
components canceling in the output.
Controllability and Observability in modal coordinates, Example (continued)

Following the steps in section 4.4 to construct the T matrix (called the modal
matrix M by Bay), the transformation is given by:

>> T = inv(V)
T =
   0.0000 + 0.8165i   0.7071 - 0.4082i
  -0.0000 - 0.8165i   0.7071 + 0.4082i

>> Ahat = T * A * inv(T)
Ahat =
  -0.5000 + 0.8660i        0 + 0.0000i
  -0.0000 - 0.0000i  -0.5000 - 0.8660i

>> Bhat = T * B
Bhat =
   0.7071 - 0.0000i
   0.7071 + 0.0000i

>> Chat = C * inv(T)
Chat =
  -0.3536 + 0.6124i  -0.3536 - 0.6124i

The system is controllable and observable: Bhat has no zero elements, and
neither does Chat.

When R = 1 [Ohm]:

    This case is more difficult because of the repeated eigenvalue (the
    repeated pole at s = -1/(RC)).

    It is a coincidence that the pole-zero cancellation occurs at the same
    value of R as the double pole, and that this A matrix has only 1
    eigenvector.

    It is necessary to use the Jordan form.

>> VV = [V(:,1), V2]
VV =
   0.7071  -0.3536
   0.7071   0.3536

>> T = inv(VV)
T =
   0.7071   0.7071
  -1.4142   1.4142

>> Ahat = T * A * inv(T)
Ahat =
  -1.0000   1.0000
        0  -1.0000

>> Bhat = T * B
Bhat =
   1.4142
   0.0000

>> Chat = C * inv(T)
Chat =
  -0.7071   0.3536
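One way to construct the generalized eigenvector V2 used in VV above (a
sketch, not part of the notes): the repeated eigenvalue is known to be
-1/(RC) = -1, and pinv gives the minimum-norm solution of the defective
equation:

>> % Sketch: build the Jordan-form transformation for R = 1
>> R = 1;  L = 1;  C = 1;
>> A = [-2/(R*C), 1/C; -1/L, 0];  B = [1/(R*C); 1/L];
>> [V, U] = eig(A);                          % only one independent eigenvector
>> lambda  = -1/(R*C);                       % the repeated pole
>> V2 = pinv(A - lambda*eye(2)) * V(:,1);    % solves (A - lambda*I)*V2 = V(:,1)
>> T  = inv([V(:,1), V2]);
>> Bhat = T * B          % second entry (numerically) zero: mode 2 has no input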
With

\[
\dot{x}(t) = \widehat{A}\,x(t) + \widehat{B}\,u(t), \qquad
y(t) = \widehat{C}\,x(t) + D\,u(t)
\tag{36}
\]

Notice:

1. The repeated poles at -1 in \(\widehat{A}\).

2. \(\widehat{B}(2) = 0\) (there is no input into x(2) of the system:
   uncontrollable).

3. \(\widehat{C}\) has all non-zero elements (the system is observable).

A typical remedy for uncontrollable or unobservable modes is to re-engineer a
system to change an actuator or sensor.

3 Controllability and observability example

A double pendulum will be used to illustrate how uncontrollable or
unobservable modes can arise in a physical system.

The example shows that such modes arise with choices of where or how actuators
or sensors are placed.

The double pendulum, seen in figures 4 and 5:

    Has 4 energy storage elements (kinetic energy of each mass, potential
    energy of each mass);

    Is a 4th order system;

    Has 2 oscillatory modes.

Figure 4: Double pendulum: example system for unobservable and uncontrollable
modes (masses M1, M2 on links of length l1, l2).
Controllability and observability example (continued)

Mode 1: Motion together, \(\theta_2 = \theta_1\)
(Potential energy, no spring energy)

Mode 2: Motion in opposition, \(\theta_2 = -\theta_1\)
(Potential energy and spring energy, faster oscillation)

Figure 5: Double pendulum: two modes of the double pendulum.

The modeling equations are:

\[
M_1 l_1^2\, \ddot{\theta}_1(t) = \tau_1(t)
  = -M_1 g\, l_1\, \theta_1 - k(\theta_1 - \theta_2) - b\,\dot{\theta}_1
\]
\[
M_2 l_2^2\, \ddot{\theta}_2(t) = \tau_2(t)
  = -M_2 g\, l_2\, \theta_2 - k(\theta_2 - \theta_1) - b\,\dot{\theta}_2
\]

With state vector \(x(t) = [\,\dot{\theta}_1 \;\; \theta_1 \;\; \dot{\theta}_2 \;\; \theta_2\,]^T\)
we get the state variable model:

>> %% Double Pendulum, controllability/observability example
>> %% Setup parameters
>> M1 = 2; M2 = M1;     % Ball mass
>> l1 = 1; l2 = l1;     % Link length
>> b = 0.1;             % Damping factor
>> k = 20               % Spring stiffness
>> M1l2 = M1*l1^2       % Link M1 inertia
>> M2l2 = M2*l2^2       % Link M2 inertia
>> g = 9.8              % Gravity constant

>> Ap = [ -b/M1l2   -k/M1l2-g/l1    0         +k/M1l2 ;
           1         0              0          0 ;
           0        +k/M2l2        -b/M2l2    -k/M2l2-g/l2 ;
           0         0              1          0 ]
Ap =
         0  -19.8000         0   10.0000
    1.0000         0         0         0
         0   10.0000         0  -19.8000
         0         0    1.0000         0

>> OLPoles = eig(Ap)
OLPoles =  0.0000 + 5.4589i     %% Fast mode
           0.0000 - 5.4589i
          -0.0000 + 3.1305i     %% Slow mode
          -0.0000 - 3.1305i

And if we apply an input force on M1 and measure theta2 as illustrated in
figure 6, we get the input and output matrices:

>> Bp = [ l1/M1l2; 0; 0; 0]     %% Input is force on M1
>> Cp = [ 0 0 0 1 ]             %% Output is theta2
>> Dp = [ 0]
This model gives the step response of figure 7, where both modes are present.

Figure 6: Double pendulum with sensor and actuator (applied force u(t) on M1,
sensor measuring theta2(t) on M2).

Figure 7: Response of the first double pendulum model (step response,
amplitude vs. time in seconds).

3.1 Testing Controllability and Observability

1. Controllability: construct the controllability matrix

>> %% Controllability matrix
>> Ccontrol = [Bp, Ap*Bp, Ap*Ap*Bp, Ap*Ap*Ap*Bp]
Ccontrol =
    0.5000         0   -9.9000         0
         0    0.5000         0   -9.9000
         0         0    5.0000         0
         0         0         0    5.0000

>> rank(Ccontrol)
ans = 4

The system is controllable.

2. Observability: construct the observability matrix

>> %% Observability matrix
>> Oobserve = [Cp; Cp*Ap; Cp*Ap*Ap; Cp*Ap*Ap*Ap]
Oobserve =
         0         0         0    1.0000
         0         0    1.0000         0
         0   10.0000         0  -19.8000
   10.0000         0  -19.8000         0

>> rank(Oobserve)
ans = 4

The system is observable.
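For reference (a sketch, not part of the notes), the Control System Toolbox
builds the same matrices with the built-ins ctrb and obsv:

>> % Sketch: the same tests with Control System Toolbox built-ins
>> rank(ctrb(Ap, Bp))       % 4: controllable
>> rank(obsv(Ap, Cp))       % 4: observable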
3.2 A badly placed sensor

Now let's model a system with a position sensor placed in the middle of the
spring (rather than sensing position theta2):

Figure 8: Double pendulum with sensor moved to a new location (applied force
u(t) on M1, sensor y(t) at the middle of the spring k).

The output signal is given by:

\[
y(t) = \frac{l_1 \theta_1 + l_2 \theta_2}{2}, \qquad \text{or} \qquad
C_{p1} = \begin{bmatrix} 0 & l_1/2 & 0 & l_2/2 \end{bmatrix}
\]

>> Cp1 = [ 0 l1/2 0 l2/2 ]
Cp1 =
         0    0.5000         0    0.5000

The controllability matrix:

\[
P = \begin{bmatrix} B_p & A_p B_p & A_p^2 B_p & A_p^3 B_p \end{bmatrix}; \quad \text{unchanged}
\tag{37}
\]

Moving the sensor does not change the controllability of a system.

A badly placed sensor (continued)

The observability matrix:

>> Oobserve1 = [Cp1; Cp1*Ap; Cp1*Ap*Ap; Cp1*Ap*Ap*Ap]
Oobserve1 =
         0    0.5000         0    0.5000
    0.5000         0    0.5000         0
         0   -4.9000         0   -4.9000
   -4.9000         0   -4.9000         0

>> rank(Oobserve1)
ans = 2

Rank(Q) < n , so the system is unobservable.
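A short sketch (not part of the notes) that identifies which states are
hidden from this sensor: the unobservable subspace is the null space of the
observability matrix.

>> % Sketch: the unobservable subspace for the mid-spring sensor.  The null
>> % space of Oobserve1 spans states with theta2 = -theta1 and
>> % thetadot2 = -thetadot1, i.e., the out-of-phase (fast) mode.
>> null(Oobserve1)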
A badly placed sensor (continued)

Let's take a look at the poles and zeros.

Original system:

>> Sys0 = ss(Ap, Bp, Cp, Dp);
>> Poles0 = pole(Sys0)
Poles0 =  0.0000 + 5.4589i      %% Fast mode
          0.0000 - 5.4589i
         -0.0000 + 3.1305i      %% Slow mode
         -0.0000 - 3.1305i
>> Zeros0 = zero(Sys0)
Zeros0 = Empty matrix: 0-by-1   %% No zeros

System with the sensor in the middle of the spring:

>> Sys1 = ss(Ap, Bp, Cp1, Dp);
>> Poles1 = pole(Sys1);         %% Poles1: Unchanged.
>> Zeros1 = zero(Sys1)
Zeros1 = 0 + 5.4589i            %% Zeros overlap Fast mode
         0 - 5.4589i

The new sensor position introduces zeros which collide with the fast mode
poles.

This system is controllable but unobservable.

3.2.1 A badly placed actuator

Let's go back to the original system, and rather than applying a force to M1,
let's put in a linear motor that acts between M1 and M2.

Figure 9: Double pendulum with actuator moved to a new location (a linear
motor, f(t) = ks*va(t), acts between M1 and M2; the sensor measures theta2(t)).

The input signal is given by:

\[
B_{p2} = \begin{bmatrix} l_1/(M_1 l_1^2) \\ 0 \\ -l_2/(M_2 l_2^2) \\ 0 \end{bmatrix}
\]

The observability matrix:

\[
Q = \begin{bmatrix} C_p \\ C_p A_p \\ C_p A_p^2 \\ C_p A_p^3 \end{bmatrix}; \quad \text{unchanged}
\tag{38}
\]

Moving the actuator does not change the observability of a system.
A badly placed actuator (continued)

The controllability matrix:

>> Bp2 = [l1/M1l2; 0; -l2/M2l2; 0 ]
Bp2 =  0.5000
            0
      -0.5000
            0

>> Ccontrol2 = [Bp2, Ap*Bp2, Ap*Ap*Bp2, Ap*Ap*Ap*Bp2]
Ccontrol2 =
    0.5000         0  -14.9000         0
         0    0.5000         0  -14.9000
   -0.5000         0   14.9000         0
         0   -0.5000         0   14.9000

>> rank(Ccontrol2)
ans = 2

Rank(P) < n , so the system is uncontrollable.

Let's take a look at the poles and zeros. System with the actuator between
the links:

>> Sys2 = ss(Ap, Bp2, Cp, Dp);
>> Zeros2 = zero(Sys2)
Zeros2 = 0 + 3.1305i            %% Zeros overlap Slow mode
         0 - 3.1305i

The new actuator design introduces zeros which collide with the slow mode
poles.

This system is observable but uncontrollable; take a look at the system in
modal coordinates:

%% Rotate to modal coordinates
[V, P] = eig(Ap);
mpT  = V;
mpT1 = inv(mpT)
Am = mpT1 * Ap * mpT
   = diag([0+j5.5, 0-j5.5, 0+j3.1, 0-j3.1])

Input matrix of system with badly placed actuator:

Bm = mpT1 * Bp2
   = [3.6; 3.6; 0; 0]           %% slow mode uncontrollable

Output matrix of system with badly placed sensor:

Cm = Cp1 * mpT
   = [0, 0, 0+j0.22, 0-j0.22]   %% fast mode unobservable
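A final sketch (not part of the notes): a minimal realization drops the
cancelled mode; with the Control System Toolbox function minreal, we would
expect a 2-state model here:

>> % Sketch: the uncontrollable slow mode disappears from a minimal realization
>> Sys2min = minreal(ss(Ap, Bp2, Cp, Dp));
>> order(Sys2min)       % expect 2: the slow mode has been removed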
3.3 Summary: Controllability and Observability

The order of the system is based on the number of storage elements (or 1st
order differential equations). If a pole and zero cancel, it does not remove
an energy storage element from the system: the mode is still in there.

Controllability and observability are properties of the system, not the
controller. There is no way to fix a controllability or observability problem
by choosing a different controller.

Most often controllability or observability problems are fixed by changing
(or adding) an actuator, or changing (or adding) a sensor.

Unobservable or uncontrollable modes may not be a problem if they are
sufficiently fast and well damped.

Transfer functions provide little insight into controllability and
observability. State variable models were needed to see what was going on
(R.E. Kalman [1]).

3.3.1 Uncontrollable and unobservable modes and rocketry

Space launch vehicles are one type of system where controllability and
observability problems arise:

    Flexible structures;
    Restricted freedom to place sensors or actuators;
    The poles and zeros move around vigorously during launch, as the
    aerodynamic condition changes and mass changes.

Uncontrollable or unobservable modes must be very carefully considered for
rocketry:

    Open-loop unstable mode (without control, the rocket will fall over);
    Lightly damped structural modes:
        Can be excited by vibrations during launch;
        Moderate oscillations can (and did!) exceed structural limits.
4 Additional topics in Bay

Bay also addresses time-varying systems and alternative definitions of
Controllability and Observability.

4.1 Alternative definitions of Controllability and Observability

We have several alternative definitions for the concepts of Controllability
and Observability.

In most cases (except as noted), these are equivalent to basic Controllability
and Observability for Linear Time-Invariant systems.

The differences become more important (and subtle) for nonlinear or
time-varying systems. This is partly why it takes a Ph.D. to design control
for an air or space craft.

Definitions related to Controllability

Controllability:
    A system is controllable if there exists a u(t) to drive from an
    arbitrary state x0 to the origin.

Reachability:
    Can find u(t) to drive from an arbitrary state x0 to any second
    arbitrary state x1.

LTI systems: Controllability and reachability are equivalent for continuous
systems. For discrete systems, there are (defective) discrete systems which
are controllable in a trivial way; consider:

\[
x_{k+1} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} x_k
        + \begin{bmatrix} 0 \\ 0 \end{bmatrix} u_k
\]

(The state reaches the origin in one step no matter what u_k is, so the
system is trivially controllable, yet no other state can ever be reached.)

Reachability is actually the more interesting property, requiring
invertibility of the controllability matrix.

Stabilizability:
    A system is stabilizable if its uncontrollable modes, if any, are
    open-loop stable. Its controllable modes may be stable or unstable.
Definitions related to Observability
Observability:
    Can the initial state be determined?

Reconstructability:
    Can the final state be determined?
Detectability:
A system is detectable if its unobservable modes, if any, are open-loop
stable. Its observable modes may be stable or unstable.
Stabilizability and Detectability
Stabilizability and Detectability relate to whether a closed-loop system can
be stabilized.
For simple stability (as opposed to good performance), we don't need to
observe or control open-loop stable modes.
If any unobservable or uncontrollable modes are fast enough and well
damped, and not excessively excited by disturbances, we may also be
able to achieve good control.
References
[1] R.E. Kalman. On the general theory of control systems. In Proc. 1st Inter.
Congress on Automatic Control, pages 481-493. Moscow: IFAC, 1960.
[2] R.E. Kalman, Y.C. Ho, and K.S. Narendra. Controllability of Linear
Dynamical Systems, Vol. 1. New York: John Wiley and Sons, 1962.