
Chapter 2 Direct methods for solving systems of linear algebraic equations


• 2.1 Background;

• 2.2 Gaussian elimination;

• 2.3 Triangular factorization;

• 2.4 Two kinds of matrices;

• 2.5 Chasing method;

• 2.6 Norms and error analysis.

2.1 Background
2.1.1 Problem formulation

We shall introduce general algorithms for systems of linear equations. Given a
square matrix $A_{n\times n}$ and an arbitrary $n$-component column vector $b_{n\times 1}$, find a column
vector $x$ such that
$$Ax = b,$$
or in component form
$$
\begin{cases}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n = b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n = b_2 \\
\qquad\vdots \\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n = b_n,
\end{cases}
\tag{2.1}
$$
where the matrix
$$
A = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{pmatrix}
$$
is called the coefficient matrix of $Ax = b$, and the $n$-dimensional column vectors are
$$
b = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix}_{n\times 1}
\quad\text{and the unknown}\quad
x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}_{n\times 1}.
$$

Since the linear system is determined by the coefficient matrix $A$ and the right-hand-side
column vector $b$, the system can also be represented by the augmented matrix
$$
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} & b_1 \\
a_{21} & a_{22} & \cdots & a_{2n} & b_2 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn} & b_n
\end{pmatrix}.
$$
It is well known that this problem has one and only one solution if and only if the
determinant of the coefficient matrix $A$ is non-zero ($\det(A) \neq 0$, or $|A| \neq 0$), i.e., $A$ is
non-singular (invertible). In many circumstances, however, the exact solution $x$ cannot be
obtained easily, especially when $n$ is very large.
In addition, a system $Ax = b$ is said to be equivalent to another system $Bx = z$ if
they have the same solutions; in other words, any column vector satisfying $Ax = b$ also
satisfies $Bx = z$ and vice versa. Thus, in order to solve a system of equations, we may
instead solve any equivalent system.

2.1.2 Background of the problem

In many cases, it is inevitable to solve the linear algebraic equation system. For
example
(1) Polynomial interpolation,
(2) Numerical solution of ordinary differential equation or partial differential equation.

2.1.3 Cramer rule

In the linear algebra theory, there is one theoretical method-Cramer rule, which can
present a perfect result of x. Now let’s recall the method.
Denote the coefficient matrix A by
A = (a1 , a2 , · · · , an ),
where ai is n component column vector,
 
ai1
 ai2 
ai =  .
 
..
 . 
ain
If we replace the column ai in A by the right column vector b, a new matrix is derived
Di = (a1 , a2 , · · · , b, · · · , an )
= (a1 , a2 , · · · , Ax, · · · , an )
= A(e1 , e2 , · · · , x, · · · , en ), (2.2)
where ei is the i-th unit column vector
 
0

 0 

 .. 
 . 
ei =  

 1 

 .. 
 . 
0

2
whose i−th element is 1 and other elements are zeros.
Taking determinants for both sides of (2.2), we have

|Di | = |A|xi

which implies that

|Di |
xi = , i = 1, 2, · · · , n.
|A|

Example Apply Cramer's rule to the 3-dimensional system
$$
\begin{pmatrix} 1 & 2 & 0 \\ 1 & 0 & 2 \\ 0 & 2 & 1 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}
= \begin{pmatrix} 2 \\ 1 \\ -1 \end{pmatrix}.
$$
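As a quick numerical check of the rule, here is a minimal Python sketch using numpy determinants (the helper name `cramer_solve` is ours, not part of the notes):

```python
import numpy as np

def cramer_solve(A, b):
    """Solve Ax = b by Cramer's rule: x_i = |D_i| / |A|."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    detA = np.linalg.det(A)
    n = len(b)
    x = np.empty(n)
    for i in range(n):
        Di = A.copy()
        Di[:, i] = b          # replace column i by b to form D_i
        x[i] = np.linalg.det(Di) / detA
    return x

A = [[1, 2, 0], [1, 0, 2], [0, 2, 1]]
b = [2, 1, -1]
print(cramer_solve(A, b))     # expected: [ 7/3, -1/6, -2/3 ]
```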

It seems perfect to calculate all $x_i$, $i = 1, 2, \cdots, n$, this way. However, the expression is not
suitable for computation on a computer, because of the operation count of Cramer's
rule. To calculate all $x_i$, we have to calculate $n + 1$ determinants $|D_1|, |D_2|, \cdots, |D_n|, |A|$.
Every determinant of an $n \times n$ matrix, such as the determinant of $A$,
$$|A| = \sum_{(i_1, \cdots, i_n)} (-1)^{J(i_1, \cdots, i_n)} a_{1i_1} a_{2i_2} \cdots a_{ni_n},$$
where the sum runs over all $n!$ permutations $(i_1, \cdots, i_n)$ of $(1, \cdots, n)$ and $J(i_1, \cdots, i_n)$
denotes the number of inversions of the permutation, requires summing $n!$ terms, with $n - 1$
multiplications per term. In all, every determinant takes $(n-1)\,n!$ multiplications. For
Cramer's rule, the total operation count is
$$(n+1)(n-1)\,n! + n.$$
For $n = 20$, this is about $9.7 \times 10^{20}$ multiplications, which would take about $308{,}000$ years
even on a computer performing $1 \times 10^8$ multiplications per second.
Note that the CPU time required to perform a multiplication or division
on a computer is approximately the same, and considerably greater than that required
for an addition or subtraction. Thus additions and subtractions are neglected in operation
counts compared to multiplications and divisions.

2.2 Gaussian elimination


Direct methods are those methods that theoretically give the exact solution to the
system after a finite number of steps. In practice, of course, the solution obtained will
be contaminated by the round-off error that is involved with the arithmetic being used.
Analyzing the effect of this round-off error and determining ways to keep it under control
will be the main component of this chapter.
Now let us look for an appropriate method for solving the system.
The structure of a general coefficient matrix A is complicated, without any special pattern. But if
the coefficient matrix A is a diagonal, lower triangular, or upper triangular
matrix, can we get the solution of the system directly? Could these special cases provide
some inspiration for the general case?

2.2.1 Ideas of Gaussian elimination: assume $a_{ii} \neq 0$.
1. If $A$ is a diagonal matrix
$$A = \begin{pmatrix} a_{11} & 0 & \cdots & 0 \\ 0 & a_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_{nn} \end{pmatrix},$$
what is the solution of $Ax = b$? The linear system is
$$\begin{pmatrix} a_{11} & 0 & \cdots & 0 \\ 0 & a_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_{nn} \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix},$$
which implies that
$$
\begin{cases}
a_{11}x_1 = b_1 \\
a_{22}x_2 = b_2 \\
\qquad\vdots \\
a_{nn}x_n = b_n,
\end{cases}
$$
that is,
$$
\begin{cases}
x_1 = \frac{b_1}{a_{11}} \\
x_2 = \frac{b_2}{a_{22}} \\
\qquad\vdots \\
x_n = \frac{b_n}{a_{nn}}.
\end{cases}
$$
Each $x_i$ takes one division, so computing the whole solution $x$ takes $n$ divisions.
2. If the coefficient matrix $A$ is a lower triangular matrix, the system of linear equations
is
$$
\begin{pmatrix}
a_{11} & 0 & 0 & \cdots & 0 \\
a_{21} & a_{22} & 0 & \cdots & 0 \\
a_{31} & a_{32} & a_{33} & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & a_{n3} & \cdots & a_{nn}
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_n \end{pmatrix}
= \begin{pmatrix} b_1 \\ b_2 \\ b_3 \\ \vdots \\ b_n \end{pmatrix},
$$
or in component form
$$
\begin{cases}
a_{11}x_1 = b_1 \\
a_{21}x_1 + a_{22}x_2 = b_2 \\
a_{31}x_1 + a_{32}x_2 + a_{33}x_3 = b_3 \\
\qquad\vdots \\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n = b_n.
\end{cases}
$$
Considering the relation between the $n$ linear equations, we solve them one by one in order,
starting from the first one; this is called Forward Substitution. From the first equation,
we get
$$x_1 = \frac{b_1}{a_{11}}.$$
Substituting the calculated $x_1$ into the second equation yields
$$x_2 = \frac{b_2 - a_{21}x_1}{a_{22}}.$$
Thus, the general formula has the form
$$x_i = \frac{b_i - \sum_{k=1}^{i-1} a_{ik}x_k}{a_{ii}}, \quad i = 1, 2, \cdots, n.$$
Computing $x_i$ takes $i$ multiplications/divisions, so the whole process takes
$1 + 2 + \cdots + n = \frac{n(n+1)}{2}$ operations.
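Forward substitution translates directly into a short Python sketch (the name `forward_sub` is ours; it assumes a lower triangular $A$ with nonzero diagonal):

```python
import numpy as np

def forward_sub(A, b):
    """Solve Ax = b for lower triangular A by forward substitution."""
    n = len(b)
    x = np.zeros(n)
    for i in range(n):
        # x_i = (b_i - sum_{k<i} a_ik x_k) / a_ii
        x[i] = (b[i] - A[i, :i] @ x[:i]) / A[i, i]
    return x
```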
3. If $A$ is an upper triangular matrix, it follows that
$$
\begin{pmatrix}
a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\
0 & a_{22} & a_{23} & \cdots & a_{2n} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & a_{nn}
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}
= \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix},
$$
or in component form
$$
\begin{cases}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n = b_1 \\
\qquad a_{22}x_2 + \cdots + a_{2n}x_n = b_2 \\
\qquad\vdots \\
\qquad a_{n-1,n-1}x_{n-1} + a_{n-1,n}x_n = b_{n-1} \\
\qquad a_{nn}x_n = b_n.
\end{cases}
$$
Commencing from the last equation, we get $x_n = \frac{b_n}{a_{nn}}$. Substituting the known $x_n$ into
the next-to-last equation, it holds that
$$x_{n-1} = \frac{b_{n-1} - a_{n-1,n}x_n}{a_{n-1,n-1}}.$$
It can also be deduced that
$$x_i = \frac{b_i - \sum_{k=i+1}^{n} a_{ik}x_k}{a_{ii}}, \quad i = n, n-1, \cdots, 1,$$
referred to as Backward Substitution, with $\frac{n(n+1)}{2}$ multiplications.
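The mirror-image sketch for backward substitution, under the same assumptions:

```python
import numpy as np

def backward_sub(A, b):
    """Solve Ax = b for upper triangular A by backward substitution."""
    n = len(b)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        # x_i = (b_i - sum_{k>i} a_ik x_k) / a_ii
        x[i] = (b[i] - A[i, i+1:] @ x[i+1:]) / A[i, i]
    return x
```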
Example Use backward substitution to solve the linear system
$$
\begin{cases}
4x_1 - x_2 + 2x_3 + 3x_4 = 20 \\
\quad -2x_2 + 7x_3 - 4x_4 = -7 \\
\qquad\qquad 6x_3 + 5x_4 = 4 \\
\qquad\qquad\qquad 3x_4 = 6.
\end{cases}
$$
The answer is
$$x_4 = 2, \quad x_3 = -1, \quad x_2 = -4, \quad x_1 = 3.$$
Note that the condition $a_{ii} \neq 0$ is essential. If the requirement is not fulfilled, either no
solution exists or infinitely many solutions exist.

Example Solve the system
$$
\begin{cases}
\frac{1}{2}x_1 = \frac{1}{2} \\
x_1 - 2x_2 = 3 \\
x_1 + 2x_2 + x_3 = 0.
\end{cases}
$$
The answer is $x_1 = 1$, $x_2 = -1$, $x_3 = 1$.
Example Solve the upper triangular systems
$$
\begin{pmatrix}
3 & -2 & 1 & -1 \\
0 & 4 & -1 & 2 \\
0 & 0 & 2 & 3 \\
0 & 0 & 0 & 5
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix}
= \begin{pmatrix} 8 \\ -3 \\ 11 \\ 15 \end{pmatrix}
$$
and
$$
\begin{pmatrix}
5 & -3 & -7 & 1 \\
0 & 11 & 9 & 5 \\
0 & 0 & 3 & -13 \\
0 & 0 & 0 & 7
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix}
= \begin{pmatrix} -14 \\ 22 \\ -11 \\ 14 \end{pmatrix}.
$$
Example Solve the lower triangular system and find $\det(A)$:
$$
\begin{cases}
5x_1 = -10 \\
x_1 + 3x_2 = 4 \\
3x_1 + 4x_2 + 2x_3 = 2 \\
-x_1 + 3x_2 - 6x_3 - x_4 = 5.
\end{cases}
$$

4. Elementary operations
Let us come back to the general coefficient matrix $A$. What can we do to connect the
general case to the three special cases? Can we change the general, full matrix $A$ into a
triangular or diagonal matrix? If yes, we can use backward (or forward) substitution to
solve for $x$.
Assume an upper triangular matrix
$$
R = \begin{pmatrix}
r_{11} & r_{12} & \cdots & r_{1n} \\
 & r_{22} & \cdots & r_{2n} \\
 & & \ddots & \vdots \\
 & & & r_{nn}
\end{pmatrix}.
$$
Could we transform $Ax = b$ into $Rx = y$? Our purpose is
to solve $Ax = b$ for $x$. If the two systems $Ax = b$ and $Rx = y$ have exactly the same solution
$x$, we may instead solve $Rx = y$, where $R$ is upper triangular. What remains is how
to deduce $Rx = y$ from the given system $Ax = b$:
$$[A : b] \longrightarrow [R : y].$$
Recall the elementary row operations for a matrix:
(1) Interchange: the order of two rows can be changed.
(2) Scaling: one row can be multiplied by a nonzero constant.
(3) Replacement: a row can be replaced by the difference of itself and a multiple of
another row.
Using these transformations, the original system $[A : b]$ can be transformed into the
upper triangular system $[R : y]$.
The next question is whether the two systems are equivalent, i.e., whether these three
transformations leave the solution of $Ax = b$ unchanged.

Theorem 2.1. These three operations keep the solution of $Ax = b$ invariant.
Proof. (1) Interchange: For the augmented matrix $[A : b]$, if we change the order of two
equations, the solution does not change.
(2) Scaling: Multiplying one equation by a nonzero constant leaves the solution of
$Ax = b$ the same.
(3) Replacement: If an $n$-dimensional column vector $x$ satisfies the system, which means
that $x_1, x_2, \cdots, x_n$ solve each equation, then $x_1, x_2, \cdots, x_n$ also satisfy the
replaced system, for any constant $c \neq 0$:
$$
\begin{cases}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n = b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n = b_2 \\
\qquad\vdots \\
a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in}x_n = b_i \\
\qquad\vdots \\
(ca_{i1} + a_{j1})x_1 + (ca_{i2} + a_{j2})x_2 + \cdots + (ca_{in} + a_{jn})x_n = b_j + cb_i \\
\qquad\vdots \\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n = b_n.
\end{cases}
\tag{2.3}
$$
How about the reverse? If $x_1, x_2, \cdots, x_n$ solve (2.3), they also satisfy the system
(2.1), since (2.3) can be transformed back into (2.1) by an operation of the same type.

2.2.2 The derivation of Gaussian elimination

To solve $Ax = b$, transform $Ax = b$ into a new system $Rx = y$, where $R$ is an upper
triangular matrix. Then apply backward substitution to $Rx = y$ to get
the solution $x$.
Example Find the solutions $x_i$, $i = 1, 2, 3$, by using Gaussian elimination:
$$
\begin{cases}
x_1 + 2x_2 + x_3 = 0 & (1) \\
2x_1 + 2x_2 + 3x_3 = 3 & (2) \\
-x_1 - 3x_2 = 2 & (3)
\end{cases}
$$
Step 1: $(2)' \Leftarrow (2) - 2 \times (1)$, $(3)' \Leftarrow (3) + (1)$:
$$
\begin{cases}
x_1 + 2x_2 + x_3 = 0 & (1) \\
\quad -2x_2 + x_3 = 3 & (2)' \\
\quad -x_2 + x_3 = 2 & (3)'
\end{cases}
$$
Step 2: $(3)'' \Leftarrow (3)' - \frac{-1}{-2} \times (2)'$:
$$
\begin{cases}
x_1 + 2x_2 + x_3 = 0 & (1) \\
\quad -2x_2 + x_3 = 3 & (2)' \\
\qquad \frac{1}{2}x_3 = \frac{1}{2} & (3)''
\end{cases}
$$
Finally, we get an upper triangular coefficient matrix. The backward substitution method
yields
$$x_3 = 1, \quad x_2 = -1, \quad x_1 = 1.$$
Summarizing this process with the augmented matrix:
$$
[A : b] = \begin{pmatrix} 1 & 2 & 1 & 0 \\ 2 & 2 & 3 & 3 \\ -1 & -3 & 0 & 2 \end{pmatrix}
\to \begin{pmatrix} 1 & 2 & 1 & 0 \\ 0 & -2 & 1 & 3 \\ 0 & -1 & 1 & 2 \end{pmatrix}
\to \begin{pmatrix} 1 & 2 & 1 & 0 \\ 0 & -2 & 1 & 3 \\ 0 & 0 & \frac{1}{2} & \frac{1}{2} \end{pmatrix}.
$$

Ex. 1
$$
\begin{pmatrix} 0 & 5 & 1 \\ 2 & 1 & -1 \\ 4 & -1 & 1 \end{pmatrix} x
= \begin{pmatrix} 2 \\ 1 \\ 0 \end{pmatrix},
\qquad x = \begin{pmatrix} \frac{1}{6} \\[2pt] \frac{4}{9} \\[2pt] -\frac{2}{9} \end{pmatrix}.
$$
Ex. 2
$$
\begin{pmatrix} 1 & 0 & 1 \\ 2 & -1 & 0 \\ 1 & 2 & 1 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}
= \begin{pmatrix} 5 \\ 10 \\ 1 \end{pmatrix},
\qquad x = \begin{pmatrix} 4 \\ -2 \\ 1 \end{pmatrix}.
$$
Ex. 3
$$
\begin{pmatrix} 1 & -2 & 3 \\ -1 & 1 & 2 \\ 2 & 4 & 1 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}
= \begin{pmatrix} -20 \\ -8 \\ 9 \end{pmatrix},
\qquad x = \begin{pmatrix} 1 \\ 3 \\ -5 \end{pmatrix}.
$$

2.2.3 The general Gaussian elimination algorithm

Two steps. The first step, Elimination: transform $Ax = b$ into $Rx = y$.
The second step, Backward substitution: solve $Rx = y$ by backward substitution to get
$x$.
Let us summarize the process for a general system of dimension $n$.
Given $Ax = b$, in augmented matrix form
$$
[A^{(0)} : b^{(0)}] = \begin{pmatrix}
a_{11}^{(0)} & a_{12}^{(0)} & \cdots & a_{1n}^{(0)} & b_1^{(0)} \\
a_{21}^{(0)} & a_{22}^{(0)} & \cdots & a_{2n}^{(0)} & b_2^{(0)} \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
a_{n1}^{(0)} & a_{n2}^{(0)} & \cdots & a_{nn}^{(0)} & b_n^{(0)}
\end{pmatrix}.
$$

Elimination process:

Step 1: Eliminate $a_{21}^{(0)}, a_{31}^{(0)}, \cdots, a_{n1}^{(0)}$. Assuming $a_{11}^{(0)} \neq 0$, do:
the 2nd row $- \; a_{21}^{(0)}/a_{11}^{(0)} \times$ the first row $=$ the new 2nd row;
the 3rd row $- \; a_{31}^{(0)}/a_{11}^{(0)} \times$ the first row $=$ the new 3rd row;
$\cdots$
the $n$th row $- \; a_{n1}^{(0)}/a_{11}^{(0)} \times$ the first row $=$ the new $n$th row;
in general, the $i$th row $- \; a_{i1}^{(0)}/a_{11}^{(0)} \times$ the first row $=$ the new $i$th row, $i = 2, 3, \cdots, n$.
After these operations, we have an equivalent linear system in augmented matrix form
$$
[A^{(1)} : b^{(1)}] = \begin{pmatrix}
a_{11}^{(0)} & a_{12}^{(0)} & \cdots & a_{1n}^{(0)} & b_1^{(0)} \\
0 & a_{22}^{(1)} & \cdots & a_{2n}^{(1)} & b_2^{(1)} \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & a_{n2}^{(1)} & \cdots & a_{nn}^{(1)} & b_n^{(1)}
\end{pmatrix}.
$$
Step 2: If $a_{22}^{(1)} \neq 0$, then do:
the new 3rd row $=$ the 3rd row $- \; a_{32}^{(1)}/a_{22}^{(1)} \times$ the second row;
the new 4th row $=$ the 4th row $- \; a_{42}^{(1)}/a_{22}^{(1)} \times$ the second row;
$\cdots$
the new $n$th row $=$ the $n$th row $- \; a_{n2}^{(1)}/a_{22}^{(1)} \times$ the second row;
in general, the new $i$th row $=$ the $i$th row $- \; a_{i2}^{(1)}/a_{22}^{(1)} \times$ the second row, $i = 3, 4, \cdots, n$.
Then we get an equivalent system
$$
[A^{(2)} : b^{(2)}] = \begin{pmatrix}
a_{11}^{(0)} & a_{12}^{(0)} & a_{13}^{(0)} & \cdots & a_{1n}^{(0)} & b_1^{(0)} \\
0 & a_{22}^{(1)} & a_{23}^{(1)} & \cdots & a_{2n}^{(1)} & b_2^{(1)} \\
0 & 0 & a_{33}^{(2)} & \cdots & a_{3n}^{(2)} & b_3^{(2)} \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & a_{n3}^{(2)} & \cdots & a_{nn}^{(2)} & b_n^{(2)}
\end{pmatrix}.
$$
Following the same idea, after $n - 1$ steps we arrive at the final equivalent system, whose
coefficient matrix is upper triangular:
$$
[A^{(n-1)} : b^{(n-1)}] = \begin{pmatrix}
a_{11}^{(0)} & a_{12}^{(0)} & a_{13}^{(0)} & \cdots & a_{1n}^{(0)} & b_1^{(0)} \\
0 & a_{22}^{(1)} & a_{23}^{(1)} & \cdots & a_{2n}^{(1)} & b_2^{(1)} \\
0 & 0 & a_{33}^{(2)} & \cdots & a_{3n}^{(2)} & b_3^{(2)} \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & a_{nn}^{(n-1)} & b_n^{(n-1)}
\end{pmatrix}.
$$

The next step is the backward substitution to calculate the solution of $[A^{(n-1)} : b^{(n-1)}]$.
The algorithm is the following:
$$x_n = b_n^{(n-1)} / a_{nn}^{(n-1)};$$
$$x_k = \Bigl(b_k^{(k-1)} - \sum_{j=k+1}^{n} a_{kj}^{(k-1)} x_j\Bigr) \Big/ a_{kk}^{(k-1)}, \quad k = n-1, n-2, \cdots, 1.$$

Procedures in computer

Elimination process: $n - 1$ elimination steps.

For k = 1, 2, 3, · · · , n − 1 do
    For i = k + 1, k + 2, · · · , n (rows) do
        set l_ik = a_ik^(k−1) / a_kk^(k−1)
        For j = k + 1, k + 2, · · · , n, n + 1 (augmented matrix columns) do
            a_ij^(k) ⇐ a_ij^(k−1) − l_ik a_kj^(k−1)
        end do
    end do
end do

The computational complexity:
$$
\sum_{k=1}^{n-1} (n-k)\bigl(((n+1)-(k+1)+1) + 1\bigr)
= \sum_{k=1}^{n-1} (n-k)(n-k+2)
= O\!\left(\frac{n^3}{3}\right) \text{ multiplications.}
$$

Backward substitution process:

x_n = b_n^(n−1) / a_nn^(n−1)
For k = n − 1, n − 2, · · · , 1 do
    x_k = (b_k^(k−1) − Σ_{j=k+1}^{n} a_kj^(k−1) x_j) / a_kk^(k−1)
end do

The computational complexity:
$$\sum_{k=1}^{n} k = \frac{n(n+1)}{2} \text{ multiplications.}$$
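Putting elimination and backward substitution together, here is a minimal Python sketch of the whole method (`gauss_solve` is our own name; no pivoting, so all pivots are assumed nonzero):

```python
import numpy as np

def gauss_solve(A, b):
    """Solve Ax = b by Gaussian elimination (no pivoting) + backward substitution."""
    A = np.asarray(A, dtype=float).copy()
    b = np.asarray(b, dtype=float).copy()
    n = len(b)
    # Elimination: zero out column k below the diagonal, k = 0..n-2.
    for k in range(n - 1):
        for i in range(k + 1, n):
            l = A[i, k] / A[k, k]          # multiplier l_ik
            A[i, k:] -= l * A[k, k:]
            b[i] -= l * b[k]
    # Backward substitution on the resulting upper triangular system.
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - A[i, i+1:] @ x[i+1:]) / A[i, i]
    return x

print(gauss_solve([[1, 2, 1], [2, 2, 3], [-1, -3, 0]], [0, 3, 2]))
# the worked example above: expected [ 1. -1.  1.]
```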

The computational complexity of Cramer's rule is $(n+1)!(n-1) + n$, which is much
larger than the $O\!\left(\frac{n^3}{3}\right)$ of Gaussian elimination. Thus Gaussian elimination is far more
suitable for the computer. To summarize, three operations simplify the linear system:
(1) Equation $E_i$ can be multiplied by any nonzero constant $\lambda$, with the result used
in place of $E_i$. This operation is denoted by $(\lambda E_i) \to (E_i)$.
(2) Equation $E_j$ can be multiplied by any constant $\lambda$ and added to equation $E_i$, with
the resulting equation used in place of $E_i$. This operation is denoted by $(E_i + \lambda E_j) \to (E_i)$.
(3) Equations $E_i$ and $E_j$ can be transposed in order. This operation is denoted $(E_i) \leftrightarrow (E_j)$.
By a sequence of these operations, a linear system can be transformed into one that is
more easily solved and has the same solutions.
2.2.4 Pivoting Strategy

The condition for Gaussian elimination is $a_{kk}^{(k-1)} \neq 0$. Note, moreover, that the multiplier
$l_{ik} = \frac{a_{ik}}{a_{kk}}$ becomes huge when $a_{kk}$ is very small, and the resulting round-off error is very large.
Example. Apply Gaussian elimination to solve the system
$$
\begin{aligned}
E_1:&\quad 10^{-5}x_1 + 2x_2 = 1 \\
E_2:&\quad 2x_1 + 3x_2 = 2.
\end{aligned}
$$
On a 4-digit computer, calculate the approximate solution and compare it with the
exact solution
$$x_1 = 0.250001875\ldots, \quad x_2 = 0.499998749\ldots.$$
By the elimination process
$$
\begin{pmatrix} 10^{-5} & 2 & 1 \\ 2 & 3 & 2 \end{pmatrix}
\to
\begin{pmatrix} 10^{-5} & 2 & 1 \\ 0 & 3 - \frac{2 \times 2}{10^{-5}} & 2 - \frac{2 \times 1}{10^{-5}} \end{pmatrix},
$$
with 4-digit rounding
$$3 - \frac{4}{10^{-5}} = 3 - 4 \times 10^5 = -(400000 - 3) \doteq -0.4 \times 10^6,$$
$$2 - 2 \times 10^5 \doteq -0.2 \times 10^6,$$
the second component is obtained from
$$-0.4 \times 10^6 \, x_2 = -0.2 \times 10^6 \;\Rightarrow\; \tilde{x}_2 = 0.5.$$
Substituting $\tilde{x}_2$ into the first equation, we have
$$10^{-5}\tilde{x}_1 + 2\tilde{x}_2 = 1 \;\Rightarrow\; \tilde{x}_1 = 0.$$
There is a big difference between $x_1$ and $\tilde{x}_1$, though $\tilde{x}_2$ is near $x_2$. The reason is
$$x_1 \times 10^{-5} + 2x_2 = 1 \quad (1)$$
$$\tilde{x}_1 \times 10^{-5} + 2\tilde{x}_2 = 1 \quad (2)$$
Subtracting formula (2) from formula (1), the error relation is derived:
$$(x_1 - \tilde{x}_1) \times 10^{-5} + 2(x_2 - \tilde{x}_2) = 0
\;\Rightarrow\; x_1 - \tilde{x}_1 = -2 \times 10^5 (x_2 - \tilde{x}_2),$$
that is,
$$\Delta x_1 = -2 \times 10^5 \, \Delta x_2.$$
This shows that the error $\Delta x_1$ is the error $\Delta x_2$ magnified by a factor of $10^5$. Thus it is
necessary to avoid very small divisors. In Gaussian elimination the divisors are the diagonal
(pivot) elements, so a small pivot should be avoided by applying the three elementary
transformations, typically a row interchange that brings the largest available entry into the
pivot position.
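The 4-digit effect is easy to reproduce by simulating 4-significant-digit rounding in Python; the helper `r4` and the comparison below are our own illustration, not part of the notes:

```python
def r4(v):
    """Round v to 4 significant digits, mimicking a 4-digit computer."""
    return float(f"{v:.3e}")

# Without pivoting: pivot 1e-5, multiplier 2e5.
m   = r4(2 / 1e-5)
a22 = r4(3 - r4(m * 2))            # -> -0.4e6
b2  = r4(2 - r4(m * 1))            # -> -0.2e6
x2  = r4(b2 / a22)                 # -> 0.5
x1  = r4(r4(1 - r4(2 * x2)) / 1e-5)
print(x1, x2)                      # 0.0 0.5   (x1 is badly wrong)

# With a row interchange (partial pivoting): pivot 2, multiplier 5e-6.
m   = r4(1e-5 / 2)
a22 = r4(2 - r4(m * 3))            # ~ 2.0
b2  = r4(1 - r4(m * 2))            # ~ 1.0
x2  = r4(b2 / a22)                 # -> 0.5
x1  = r4(r4(2 - r4(3 * x2)) / 2)
print(x1, x2)                      # 0.25 0.5  (close to the exact solution)
```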

2.3 Matrix Factorization
In this section we focus on matrix factorization methods applied to a
linear system. If the coefficient matrix $A$ can be factorized as $A = LR$, where $L$ is lower
triangular and $R$ is upper triangular, then substituting this into the system $Ax = b$
gives
$$LRx = b.$$
Setting $Rx = y$, the system splits into two systems:
$$Ly = b \quad (1), \qquad Rx = y \quad (2).$$
Equation (1) is solved first to get $y$; then substituting $y$ into equation (2)
yields $x$. Equations (1) and (2) are easy to solve thanks to the special structures of $L$
and $R$; the computational complexity of each is $\frac{n(n+1)}{2}$.
The next question is how to realize the matrix factorization.

2.3.1 The derivation of matrix factorization

Recall the procedure of Gaussian elimination, whose main technique is
$$\text{the } i\text{th row} \; - \; \text{the } k\text{th row} \times \frac{a_{ik}}{a_{kk}},$$
where $k = 1, 2, \cdots, n-1$ and $i = k+1, k+2, \cdots, n$. In fact, this technique has an
equivalent operation: matrix multiplication on the left side. For example,
$$
\begin{pmatrix} a_{11} & a_{12} & a_{13} & b_1 \\ a_{21} & a_{22} & a_{23} & b_2 \\ a_{31} & a_{32} & a_{33} & b_3 \end{pmatrix}
\to
\begin{pmatrix} a_{11} & a_{12} & a_{13} & b_1 \\ 0 & a_{22} - \frac{a_{21}}{a_{11}}a_{12} & a_{23} - \frac{a_{21}}{a_{11}}a_{13} & b_2 - \frac{a_{21}}{a_{11}}b_1 \\ a_{31} & a_{32} & a_{33} & b_3 \end{pmatrix}
$$
corresponds to
$$
\begin{pmatrix} 1 & 0 & 0 \\ -\frac{a_{21}}{a_{11}} & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} a_{11} & a_{12} & a_{13} & b_1 \\ a_{21} & a_{22} & a_{23} & b_2 \\ a_{31} & a_{32} & a_{33} & b_3 \end{pmatrix}.
$$
Note that the inverse of the left matrix is
$$
\begin{pmatrix} 1 & 0 & 0 \\ \frac{a_{21}}{a_{11}} & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.
$$
Go back to the $n-1$ steps of Gaussian elimination.

Step 1: Transform $[A : b]$ into
$$
\begin{pmatrix}
a_{11}^{(0)} & a_{12}^{(0)} & \cdots & a_{1n}^{(0)} & b_1^{(0)} \\
0 & a_{22}^{(1)} & \cdots & a_{2n}^{(1)} & b_2^{(1)} \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & a_{n2}^{(1)} & \cdots & a_{nn}^{(1)} & b_n^{(1)}
\end{pmatrix}.
$$
We need to take these operations:
$$\text{the 2nd row} \; - \; \text{the 1st row} \times \frac{a_{21}^{(0)}}{a_{11}^{(0)}},$$
which can be realized by the matrix multiplication
$$
\begin{pmatrix}
1 & 0 & \cdots & 0 \\
-\frac{a_{21}^{(0)}}{a_{11}^{(0)}} & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{pmatrix}_{n\times n} [A : b].
$$
Based on the obtained matrix, the operation
$$\text{the 3rd row} \; - \; \text{the 1st row} \times \frac{a_{31}^{(0)}}{a_{11}^{(0)}}$$
can be done by
$$
\begin{pmatrix}
1 & 0 & 0 & \cdots & 0 \\
0 & 1 & 0 & \cdots & 0 \\
-\frac{a_{31}^{(0)}}{a_{11}^{(0)}} & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1
\end{pmatrix}
\begin{pmatrix}
1 & 0 & \cdots & 0 \\
-\frac{a_{21}^{(0)}}{a_{11}^{(0)}} & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{pmatrix} [A : b].
$$
Along this idea, after step 1 with $i = 2, 3, \cdots, n$, we have
$$
\begin{pmatrix}
1 & 0 & \cdots & 0 \\
-\frac{a_{21}^{(0)}}{a_{11}^{(0)}} & 1 & \cdots & 0 \\
-\frac{a_{31}^{(0)}}{a_{11}^{(0)}} & 0 & \ddots & \vdots \\
\vdots & \vdots & & \vdots \\
-\frac{a_{n1}^{(0)}}{a_{11}^{(0)}} & 0 & \cdots & 1
\end{pmatrix} [A : b]
= \begin{pmatrix}
a_{11}^{(0)} & a_{12}^{(0)} & \cdots & a_{1n}^{(0)} & b_1^{(0)} \\
0 & a_{22}^{(1)} & \cdots & a_{2n}^{(1)} & b_2^{(1)} \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & a_{n2}^{(1)} & \cdots & a_{nn}^{(1)} & b_n^{(1)}
\end{pmatrix}
= [A^{(1)} : b^{(1)}].
$$

Step 2: To cancel $a_{32}^{(1)}, a_{42}^{(1)}, \cdots, a_{n2}^{(1)}$ (make them zeros), we multiply the resulting
matrix on the left by another matrix:
$$
\begin{pmatrix}
1 & 0 & 0 & \cdots & 0 \\
0 & 1 & 0 & \cdots & 0 \\
0 & -\frac{a_{32}^{(1)}}{a_{22}^{(1)}} & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & -\frac{a_{n2}^{(1)}}{a_{22}^{(1)}} & 0 & \cdots & 1
\end{pmatrix} [A^{(1)} : b^{(1)}]
= \begin{pmatrix}
a_{11}^{(0)} & a_{12}^{(0)} & a_{13}^{(0)} & \cdots & a_{1n}^{(0)} & b_1^{(0)} \\
0 & a_{22}^{(1)} & a_{23}^{(1)} & \cdots & a_{2n}^{(1)} & b_2^{(1)} \\
0 & 0 & a_{33}^{(2)} & \cdots & a_{3n}^{(2)} & b_3^{(2)} \\
0 & 0 & a_{43}^{(2)} & \cdots & a_{4n}^{(2)} & b_4^{(2)} \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & a_{n3}^{(2)} & \cdots & a_{nn}^{(2)} & b_n^{(2)}
\end{pmatrix}.
$$

Finally, after $n - 1$ steps,
$$
\begin{pmatrix}
1 & 0 & 0 & \cdots & 0 \\
-\frac{a_{21}^{(0)}}{a_{11}^{(0)}} & 1 & 0 & \cdots & 0 \\
-\frac{a_{31}^{(0)}}{a_{11}^{(0)}} & -\frac{a_{32}^{(1)}}{a_{22}^{(1)}} & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
-\frac{a_{n1}^{(0)}}{a_{11}^{(0)}} & -\frac{a_{n2}^{(1)}}{a_{22}^{(1)}} & -\frac{a_{n3}^{(2)}}{a_{33}^{(2)}} & \cdots & 1
\end{pmatrix} [A^{(0)} : b^{(0)}]
= \begin{pmatrix}
a_{11}^{(0)} & a_{12}^{(0)} & a_{13}^{(0)} & \cdots & a_{1n}^{(0)} & b_1^{(0)} \\
0 & a_{22}^{(1)} & a_{23}^{(1)} & \cdots & a_{2n}^{(1)} & b_2^{(1)} \\
0 & 0 & a_{33}^{(2)} & \cdots & a_{3n}^{(2)} & b_3^{(2)} \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & a_{nn}^{(n-1)} & b_n^{(n-1)}
\end{pmatrix}
= [R : y].
$$
Denote by $L_k^{-1}$ the matrix accomplishing the operations of step $k$, $k = 1, 2, \cdots, n-1$.
The elimination process can then be represented as
$$L_{n-1}^{-1} \cdots L_2^{-1} L_1^{-1} [A^{(0)} : b^{(0)}] = [R : y],$$
$$L^{-1} [A^{(0)} : b^{(0)}] = [R : y],$$
$$[L^{-1} A^{(0)} : L^{-1} b^{(0)}] = [R : y].$$
It follows that
$$L^{-1} A = R \quad \text{and} \quad L^{-1} b = y,$$
that is,
$$A = LR \quad \text{and} \quad Ly = b.$$

2.3.2 Solving systems by matrix factorization

The matrix form of Gaussian elimination is $A = LR$, where $L$ is a unit lower
triangular matrix and $R$ is upper triangular. Therefore, we solve $Ax = b$, i.e. $LRx = b$,
in two steps:
Step 1: $Ly = b \Rightarrow y$;
Step 2: $Rx = y \Rightarrow x$.

Step 1: Based on the system
$$
\begin{pmatrix}
1 & & & & \\
l_{21} & 1 & & & \\
l_{31} & l_{32} & 1 & & \\
\vdots & \vdots & & \ddots & \\
l_{n1} & l_{n2} & \cdots & l_{n,n-1} & 1
\end{pmatrix}
\begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_n \end{pmatrix}
= \begin{pmatrix} b_1 \\ b_2 \\ b_3 \\ \vdots \\ b_n \end{pmatrix},
$$
we have
$$
\begin{aligned}
y_1 &= b_1, \\
y_2 &= b_2 - l_{21}y_1, \\
y_3 &= b_3 - l_{31}y_1 - l_{32}y_2, \\
&\;\;\vdots \\
y_k &= b_k - \sum_{j=1}^{k-1} l_{kj} y_j.
\end{aligned}
$$
Step 2: For the system with the upper triangular matrix,
$$
\begin{pmatrix}
r_{11} & r_{12} & r_{13} & \cdots & r_{1n} \\
 & r_{22} & r_{23} & \cdots & r_{2n} \\
 & & r_{33} & \cdots & r_{3n} \\
 & & & \ddots & \vdots \\
 & & & & r_{nn}
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_n \end{pmatrix}
= \begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_n \end{pmatrix},
$$
we get
$$
\begin{aligned}
x_n &= \frac{y_n}{r_{nn}}, \\
x_{n-1} &= \frac{y_{n-1} - r_{n-1,n}x_n}{r_{n-1,n-1}}, \\
&\;\;\vdots \\
x_k &= \frac{y_k - \sum_{j=k+1}^{n} r_{kj} x_j}{r_{kk}}.
\end{aligned}
$$
Moreover, if Gaussian elimination can be applied to $Ax = b$, then $A$ can be factorized
into the product of triangular matrices, $A = LR$. Of course, $L$ and $R$ can be extracted
from the Gaussian elimination process itself; however, that is complicated. We shall show
how the factorization $A = LR$ can be carried out directly, provided that no zero divisors
are encountered at any step of the computation. Notice that not every matrix has such
a factorization.

2.3.3 Factorization of matrix A (Doolittle factorization)

Suppose that $A$ can be factorized into $A = LR$; in matrix form,
$$
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{pmatrix}
=
\begin{pmatrix}
1 & & & & \\
l_{21} & 1 & & & \\
l_{31} & l_{32} & 1 & & \\
\vdots & \vdots & & \ddots & \\
l_{n1} & l_{n2} & l_{n3} & \cdots & 1
\end{pmatrix}
\begin{pmatrix}
r_{11} & r_{12} & r_{13} & \cdots & r_{1n} \\
 & r_{22} & r_{23} & \cdots & r_{2n} \\
 & & r_{33} & \cdots & r_{3n} \\
 & & & \ddots & \vdots \\
 & & & & r_{nn}
\end{pmatrix}.
$$
For every $a_{ij}$, it yields that
$$
a_{ij} = (l_{i1}, l_{i2}, \cdots, l_{i,i-1}, 1, 0, \cdots, 0)
\begin{pmatrix} r_{1j} \\ r_{2j} \\ \vdots \\ r_{jj} \\ 0 \\ \vdots \\ 0 \end{pmatrix}
= \sum_{k=1}^{n} l_{ik} r_{kj} = \sum_{k=1}^{\min(i,j)} l_{ik} r_{kj}.
$$

To make this clearer, let us look at the recurrence. The first row of $R$ is derived as follows:
$$a_{11} = r_{11} \Rightarrow r_{11} = a_{11}; \quad a_{12} = r_{12} \Rightarrow r_{12} = a_{12}; \quad \cdots; \quad a_{1n} = r_{1n} \Rightarrow r_{1j} = a_{1j}.$$
The next step is to calculate the first column of $L$:
$$a_{21} = l_{21}r_{11} \Rightarrow l_{21} = \frac{a_{21}}{r_{11}}, \quad a_{31} = l_{31}r_{11} \Rightarrow l_{31} = \frac{a_{31}}{r_{11}},
\quad\text{in general}\quad a_{i1} = l_{i1}r_{11} \Rightarrow l_{i1} = \frac{a_{i1}}{r_{11}}.$$
Based on the obtained row and column, the third step is for the second row of $R$:
$$
\begin{aligned}
a_{22} &= l_{21}r_{12} + r_{22} \Rightarrow r_{22} = a_{22} - l_{21}r_{12}, \\
a_{23} &= l_{21}r_{13} + r_{23} \Rightarrow r_{23} = a_{23} - l_{21}r_{13}, \\
&\;\;\vdots \\
a_{2n} &= l_{21}r_{1n} + r_{2n} \Rightarrow r_{2n} = a_{2n} - l_{21}r_{1n},
\end{aligned}
$$
that is, $r_{2j} = a_{2j} - l_{21}r_{1j}$. Then the second column of $L$ is
$$a_{32} = l_{31}r_{12} + l_{32}r_{22} \Rightarrow l_{32} = \frac{a_{32} - l_{31}r_{12}}{r_{22}},
\quad\text{in general}\quad l_{i2} = \frac{a_{i2} - l_{i1}r_{12}}{r_{22}}.$$
To derive a general algorithm for the LR-factorization of $A$, we start from the formula
$$a_{ij} = \sum_{k=1}^{\min(i,j)} l_{ik} r_{kj}.$$
Each step in this process determines one new row of $R$ and one new column of $L$.
At step $i$, we may assume that rows $1, 2, \cdots, i-1$ of $R$ have been computed and that
columns $1, 2, \cdots, i-1$ of $L$ have been computed. If $i \le j$, it follows that
$$a_{ij} = \sum_{k=1}^{i} l_{ik} r_{kj} = \sum_{k=1}^{i-1} l_{ik} r_{kj} + l_{ii} r_{ij} = \sum_{k=1}^{i-1} l_{ik} r_{kj} + r_{ij},$$
which implies that
$$r_{ij} = a_{ij} - \sum_{k=1}^{i-1} l_{ik} r_{kj}.$$
When $i > j$, we have
$$a_{ij} = \sum_{k=1}^{j} l_{ik} r_{kj} = \sum_{k=1}^{j-1} l_{ik} r_{kj} + l_{ij} r_{jj},$$
given the already obtained columns $1, 2, \cdots, j-1$ of $L$ and rows $1, 2, \cdots, j$ of $R$.
Before calculating the $j$th column of $L$, we have already obtained the $j$th row of $R$.
Based on this, we derive
$$l_{ij} = \frac{a_{ij} - \sum_{k=1}^{j-1} l_{ik} r_{kj}}{r_{jj}}.$$
In fact, it is not necessary to memorize all these formulae. The computer can store $L$ and $R$
in place of $A$ as follows:
$$
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{pmatrix}
\to
\begin{pmatrix}
r_{11} & r_{12} & r_{13} & \cdots & r_{1n} \\
l_{21} & r_{22} & r_{23} & \cdots & r_{2n} \\
l_{31} & l_{32} & r_{33} & \cdots & r_{3n} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
l_{n1} & l_{n2} & l_{n3} & \cdots & r_{nn}
\end{pmatrix}.
$$
The entries $r_{ij}$ and $l_{ij}$ are computed in the order of their storage places:
Step 1: $r_{1j} = a_{1j}$, $j = 1, 2, \cdots, n$;
Step 2: $l_{i1} = \dfrac{a_{i1}}{r_{11}}$, $i = 2, 3, \cdots, n$;
Step 3: $r_{2j} = a_{2j} - l_{21}r_{1j}$, $j = 2, 3, \cdots, n$;
Step 4: $l_{i2} = \dfrac{a_{i2} - l_{i1}r_{12}}{r_{22}}$, $i = 3, 4, \cdots, n$;
$\vdots$
The factorization $A = LR$ computed this way is called the Doolittle factorization.
Examples:
(1) $A = \begin{pmatrix} 2 & 2 & 3 \\ 4 & 7 & 7 \\ -2 & 4 & 5 \end{pmatrix}$
(2) $A = \begin{pmatrix} 2 & 4 & -6 \\ 1 & 5 & 3 \\ 1 & 3 & 2 \end{pmatrix}$
(3) $A = \begin{pmatrix} -5 & 2 & 1 \\ 1 & 0 & 3 \\ 3 & 1 & 6 \end{pmatrix}$
(4) $A = \begin{pmatrix} 1 & 0 & 3 \\ 3 & 1 & 6 \\ -5 & 2 & 1 \end{pmatrix}
= \begin{pmatrix} 1 & & \\ 3 & 1 & \\ -5 & 2 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 3 \\ & 1 & -3 \\ & & 22 \end{pmatrix}$
(5) $A = \begin{pmatrix} 4 & 2 & 1 \\ 2 & 5 & -2 \\ 1 & -2 & 7 \end{pmatrix}$
(6) $A = \begin{pmatrix} 1 & -2 & 7 \\ 4 & 2 & 1 \\ 2 & 5 & -2 \end{pmatrix}$
(7) $A = \begin{pmatrix} 1 & 1 & 0 & 4 \\ 2 & -1 & 5 & 0 \\ 5 & 2 & 1 & 0 \\ -3 & 0 & 2 & 6 \end{pmatrix}$
After the factorization, we substitute the Doolittle factorization into $Ax = b$.
Step 1: Factorize $A$ into $A = LR$ to obtain $L$ and $R$;
Step 2: Substituting $A = LR$ into $Ax = b$ yields $LRx = b$. Setting $Rx = y$,
we solve $y$ from $Ly = b$;
Step 3: From $Rx = y$, the solution $x$ is obtained.
Recalling the calculation formula for $y$, we can combine Steps 1 and 2 in the same
storage scheme:
$$
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} & b_1 \\
a_{21} & a_{22} & \cdots & a_{2n} & b_2 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn} & b_n
\end{pmatrix}
\to
\begin{pmatrix}
r_{11} & r_{12} & \cdots & r_{1n} & y_1 \\
l_{21} & r_{22} & \cdots & r_{2n} & y_2 \\
l_{31} & l_{32} & \cdots & r_{3n} & y_3 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
l_{n1} & l_{n2} & \cdots & r_{nn} & y_n
\end{pmatrix},
$$
which results in
$$
\begin{cases}
y_1 = b_1, \\
y_2 = b_2 - l_{21}y_1, \\
\quad\vdots \\
y_n = b_n - l_{n1}y_1 - l_{n2}y_2 - \cdots - l_{n,n-1}y_{n-1}.
\end{cases}
$$
This is the matrix factorization method for $Ax = b$.


Solve the following systems by matrix factorization.
Example 1
$$
\left(\begin{array}{cccc:c}
1 & 2 & 3 & -4 & -2 \\
-3 & -4 & -12 & 13 & 5 \\
2 & 10 & 0 & -3 & 10 \\
4 & 14 & 9 & -13 & 7
\end{array}\right);
$$
Example 2
$$
\left(\begin{array}{cccc:c}
1 & 3 & 5 & 7 & 1 \\
2 & -1 & 3 & 5 & 2 \\
0 & 0 & 2 & 5 & 3 \\
-2 & -6 & -3 & 1 & 4
\end{array}\right);
$$
Example 3
$$
\left(\begin{array}{cccc:c}
4 & 8 & 4 & 0 & 8 \\
1 & 5 & 4 & -3 & -4 \\
1 & 4 & 7 & 2 & 10 \\
1 & 3 & 0 & -2 & -4
\end{array}\right).
$$
The algorithm code for carrying out Dolittle’s factorization is as follows

Input (n, (aij ), b)


for k = 1 to n do
lkk ← 1
for j = k to n do
k−1
X
rkj ← akj − lks rsj
s=1
k−1
X
yj ← bj − lks ys
s=1
end do
for i = k + 1 to n do
aik − k−1
P
s=1 lis rsk
lik ←
rkk
end do
end do
output (lij ), (rij )

Counting operations:
n
n3
X  
[2(k − 1)(n − k + 1) + k(n − k)] ∼ O
3
k=1
which is the same as Gaussian elimination. In fact, it is the matrix form of Gaussian
elimination.
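A direct Python transcription of the algorithm above (`doolittle` and `lr_solve` are our own names; no pivoting, so every $r_{kk}$ is assumed nonzero):

```python
import numpy as np

def doolittle(A):
    """Doolittle factorization A = LR with unit lower triangular L."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    L, R = np.eye(n), np.zeros((n, n))
    for k in range(n):
        # Row k of R: r_kj = a_kj - sum_{s<k} l_ks r_sj
        for j in range(k, n):
            R[k, j] = A[k, j] - L[k, :k] @ R[:k, j]
        # Column k of L: l_ik = (a_ik - sum_{s<k} l_is r_sk) / r_kk
        for i in range(k + 1, n):
            L[i, k] = (A[i, k] - L[i, :k] @ R[:k, k]) / R[k, k]
    return L, R

def lr_solve(L, R, b):
    """Solve LRx = b by one forward and one backward substitution."""
    n = len(b)
    y = np.zeros(n)
    for k in range(n):
        y[k] = b[k] - L[k, :k] @ y[:k]
    x = np.zeros(n)
    for k in range(n - 1, -1, -1):
        x[k] = (y[k] - R[k, k+1:] @ x[k+1:]) / R[k, k]
    return x

L, R = doolittle([[1, 0, 3], [3, 1, 6], [-5, 2, 1]])   # example (4) above
print(R[2, 2])   # expected 22, matching the factorization shown in example (4)
```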

2.4 Two kinds of matrices
(1) Diagonally dominant matrix. The $n \times n$ matrix $A$ is said to be diagonally
dominant when
$$|a_{ii}| \ge \sum_{\substack{j=1 \\ j \neq i}}^{n} |a_{ij}|$$
holds for each $i = 1, 2, \cdots, n$; it is strictly diagonally dominant when the inequality is
strict for each $i$, that is, when
$$|a_{ii}| > \sum_{\substack{j=1 \\ j \neq i}}^{n} |a_{ij}|.$$

Examples
$$A = \begin{pmatrix} 7 & 2 & 0 \\ 3 & 5 & -1 \\ 0 & 5 & -6 \end{pmatrix} \to \text{strictly diagonally dominant};$$
$$A = \begin{pmatrix} 6 & 4 & -3 \\ 4 & -2 & 0 \\ -3 & 0 & 1 \end{pmatrix} \to \text{not diagonally dominant}.$$
Note that a strictly diagonally dominant matrix $A$ is nonsingular, and Gaussian elimination
can be applied to it with no need for a pivoting strategy.
(2) Positive definite matrix
A matrix $A$ is said to be positive definite if it is symmetric and $x^T A x > 0$ for
every $n$-dimensional vector $x \neq 0$.
To be precise,
$$
x^T A x = (x_1, \cdots, x_n)
\begin{pmatrix}
a_{11} & \cdots & a_{1n} \\
a_{21} & \cdots & a_{2n} \\
\vdots & & \vdots \\
a_{n1} & \cdots & a_{nn}
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}
= \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j > 0.
$$
Example Show that the matrix
$$A = \begin{pmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{pmatrix}$$
is positive definite.
Solution. Suppose that $x$ is any 3-dimensional column vector. Then it follows that
$$x^T A x = 2x_1^2 - 2x_1x_2 + 2x_2^2 - 2x_2x_3 + 2x_3^2.$$
Rearranging the terms gives
$$x^T A x = x_1^2 + (x_1 - x_2)^2 + (x_2 - x_3)^2 + x_3^2,$$
which implies that
$$x^T A x > 0$$
unless $x_1 = x_2 = x_3 = 0$.
As this shows, checking positive definiteness directly from the definition is difficult.
Fortunately, there are more easily verified criteria.

Theorem 2.2. If $A$ is a symmetric positive definite matrix, then $A$ can be factorized in
the form $\tilde{L}\tilde{L}^T$, where $\tilde{L}$ is a lower triangular matrix. This is called the Cholesky
factorization. The matrix form is
$$
A = \begin{pmatrix}
l_{11} & & & \\
l_{21} & l_{22} & & \\
\vdots & \vdots & \ddots & \\
l_{n1} & l_{n2} & \cdots & l_{nn}
\end{pmatrix}
\begin{pmatrix}
l_{11} & l_{21} & \cdots & l_{n1} \\
 & l_{22} & \cdots & l_{n2} \\
 & & \ddots & \vdots \\
 & & & l_{nn}
\end{pmatrix}. \tag{2.4}
$$
This factorization is a special case of the Doolittle factorization. The process of solving a
system by Cholesky factorization is as follows.
Step 1: Transform
$$
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} & b_1 \\
a_{21} & a_{22} & \cdots & a_{2n} & b_2 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn} & b_n
\end{pmatrix}
\to
\begin{pmatrix}
l_{11} & l_{21} & \cdots & l_{n1} & y_1 \\
l_{21} & l_{22} & \cdots & l_{n2} & y_2 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
l_{n1} & l_{n2} & \cdots & l_{nn} & y_n
\end{pmatrix}
$$
to calculate all $l_{ij}$ and $y_j$.
Step 2: Solve $x$ from
$$\tilde{L}^T x = y$$
by backward substitution. Comparing entries in (2.4) gives
$$
\begin{cases}
l_{ii} = \Bigl(a_{ii} - \sum_{k=1}^{i-1} l_{ik}^2\Bigr)^{\frac{1}{2}}, \\[4pt]
l_{ij} = \Bigl(a_{ij} - \sum_{k=1}^{j-1} l_{ik} l_{jk}\Bigr) \Big/ l_{jj}, \quad i > j.
\end{cases}
$$
Examples Solve these systems by Cholesky factorization.
(1)
$$
\begin{pmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}
= \begin{pmatrix} 3 \\ -3 \\ 1 \end{pmatrix}.
$$
(2)
$$
\begin{pmatrix} 4 & 1 & 1 & 1 \\ 1 & 3 & -1 & 1 \\ 1 & -1 & 2 & 0 \\ 1 & 1 & 0 & 2 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix}
= \begin{pmatrix} 0.65 \\ 0.05 \\ 0 \\ 0.5 \end{pmatrix}.
$$
(3)
$$
\left(\begin{array}{cccc:c}
4 & 1 & -1 & 0 & 7 \\
1 & 3 & -1 & 0 & 8 \\
-1 & -1 & 5 & 2 & -4 \\
0 & 0 & 2 & 4 & 6
\end{array}\right).
$$
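A minimal Python sketch of this procedure (`cholesky_solve` is our own name), applied to example (1), whose exact solution works out to $x = (1, -1, 0)^T$:

```python
import numpy as np

def cholesky_solve(A, b):
    """Solve Ax = b for symmetric positive definite A via A = L L^T."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    n = len(b)
    L = np.zeros((n, n))
    for i in range(n):
        for j in range(i):
            L[i, j] = (A[i, j] - L[i, :j] @ L[j, :j]) / L[j, j]
        L[i, i] = np.sqrt(A[i, i] - L[i, :i] @ L[i, :i])
    y = np.zeros(n)                      # forward solve: L y = b
    for k in range(n):
        y[k] = (b[k] - L[k, :k] @ y[:k]) / L[k, k]
    x = np.zeros(n)                      # backward solve: L^T x = y
    for k in range(n - 1, -1, -1):
        x[k] = (y[k] - L[k+1:, k] @ x[k+1:]) / L[k, k]
    return x

print(cholesky_solve([[2, -1, 0], [-1, 2, -1], [0, -1, 2]], [3, -3, 1]))
# expected: [ 1. -1.  0.]
```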

2.5 Chasing method

Suppose the coefficient matrix $A = (a_{ij})$ has the special structure $a_{ij} = 0$ for all pairs
$(i, j)$ satisfying $|i - j| > 1$. Then in the $i$th row only $a_{i,i-1}, a_{i,i}, a_{i,i+1}$ can be different
from 0, and three vectors are enough to store the nonzero elements.
Consider a system with such a tridiagonal matrix,
$$Ax = d,$$
where the invertible matrix $A$ has the form
$$
A = \begin{pmatrix}
b_1 & c_1 & & & \\
a_2 & b_2 & c_2 & & \\
 & \ddots & \ddots & \ddots & \\
 & & a_{n-1} & b_{n-1} & c_{n-1} \\
 & & & a_n & b_n
\end{pmatrix}
$$
with the conditions $|b_1| > |c_1|$, $|b_i| \ge |a_i| + |c_i|$, $i = 2, \cdots, n-1$, and $|b_n| > |a_n|$.
Under these conditions no pivoting technique is needed, since each $|b_i|$ dominates $|a_i| + |c_i|$.
Definition The Doolittle factorization applied to solve a tridiagonal system is called the
Chasing method.
A tridiagonal matrix is factorized by the Doolittle method as
$$
A = \begin{pmatrix}
b_1 & c_1 & & & \\
a_2 & b_2 & c_2 & & \\
 & \ddots & \ddots & \ddots & \\
 & & a_{n-1} & b_{n-1} & c_{n-1} \\
 & & & a_n & b_n
\end{pmatrix}
= \begin{pmatrix}
1 & & & & \\
l_2 & 1 & & & \\
 & l_3 & 1 & & \\
 & & \ddots & \ddots & \\
 & & & l_n & 1
\end{pmatrix}
\begin{pmatrix}
r_1 & c_1 & & & \\
 & r_2 & c_2 & & \\
 & & \ddots & \ddots & \\
 & & & r_{n-1} & c_{n-1} \\
 & & & & r_n
\end{pmatrix}
= LR.
$$

Comparing entries of $A$ and $LR$, we get
$$r_1 = b_1; \qquad l_i = \frac{a_i}{r_{i-1}}, \quad r_i = b_i - l_i c_{i-1}, \quad i = 2, \cdots, n,$$
and
$$LRx = d,$$
that is,
$$Ly = d \Rightarrow y, \qquad Rx = y \Rightarrow x.$$
In component form,
$$
\begin{pmatrix}
1 & & & & \\
l_2 & 1 & & & \\
 & l_3 & 1 & & \\
 & & \ddots & \ddots & \\
 & & & l_n & 1
\end{pmatrix}
\begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ \vdots \\ y_n \end{pmatrix}
= \begin{pmatrix} d_1 \\ d_2 \\ d_3 \\ \vdots \\ d_n \end{pmatrix}
$$
with
$$y_1 = d_1, \qquad y_i = d_i - l_i y_{i-1}, \quad i = 2, \cdots, n,$$
and
$$
\begin{pmatrix}
r_1 & c_1 & & & \\
 & r_2 & c_2 & & \\
 & & \ddots & \ddots & \\
 & & & r_{n-1} & c_{n-1} \\
 & & & & r_n
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_{n-1} \\ x_n \end{pmatrix}
= \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_{n-1} \\ y_n \end{pmatrix}
$$
with
$$x_n = \frac{y_n}{r_n}, \qquad x_i = \frac{y_i - c_i x_{i+1}}{r_i}, \quad i = n-1, n-2, \cdots, 1.$$

Here is the algorithm:

input n, (a_i), (b_i), (c_i), (d_i)
set r_1 = b_1, y_1 = d_1
for i = 2 to n do
    l_i = a_i / r_{i−1};
    r_i = b_i − l_i c_{i−1};
    y_i = d_i − l_i y_{i−1};
end do
x_n = y_n / r_n
for i = n − 1 to 1 do
    x_i = (y_i − c_i x_{i+1}) / r_i;
end do
output x
Examples Solve the system $Ax = d$.
(1)
$$
A = \begin{pmatrix}
1 & 2 & 0 & 0 & 0 \\
2 & 3 & 1 & 0 & 0 \\
0 & -3 & 4 & 2 & 0 \\
0 & 0 & 4 & 7 & 1 \\
0 & 0 & 0 & -5 & 6
\end{pmatrix}, \quad
d = \begin{pmatrix} 5 \\ 9 \\ 2 \\ 19 \\ -4 \end{pmatrix}, \quad
x = \begin{pmatrix} 1 \\ 2 \\ 1 \\ 2 \\ 1 \end{pmatrix}.
$$
(2)
$$
A = \begin{pmatrix}
2 & -1 & 0 & 0 \\
-1 & 2 & -1 & 0 \\
0 & -1 & 2 & -1 \\
0 & 0 & -1 & 2
\end{pmatrix}, \quad
d = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 1 \end{pmatrix}.
$$

Homework 2 Write and test a program for the Chasing method.

Test example: Let $A$ be the $10 \times 10$ tridiagonal matrix given by $a_{ii} = 2$ and
$a_{i,i-1} = a_{i-1,i} = -1$ for each $i = 2, \cdots, 10$. Let $d$ be the ten-dimensional column
vector given by $d_1 = d_{10} = 1$ and $d_i = 0$ for each $i = 2, 3, \cdots, 9$.
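As a starting point for the homework, a minimal Python sketch of the Chasing method (`chase` is our own name). For the test matrix above, the exact solution is the all-ones vector, which makes the program easy to check:

```python
import numpy as np

def chase(a, b, c, d):
    """Chasing method for a tridiagonal system.
    a: subdiagonal (a[0] unused), b: diagonal, c: superdiagonal, d: right side."""
    n = len(b)
    r, y = np.zeros(n), np.zeros(n)
    r[0], y[0] = b[0], d[0]
    for i in range(1, n):                # factorization + forward sweep
        l = a[i] / r[i - 1]
        r[i] = b[i] - l * c[i - 1]
        y[i] = d[i] - l * y[i - 1]
    x = np.zeros(n)                      # backward sweep
    x[-1] = y[-1] / r[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (y[i] - c[i] * x[i + 1]) / r[i]
    return x

n = 10
a = -np.ones(n); b = 2 * np.ones(n); c = -np.ones(n)
d = np.zeros(n); d[0] = d[-1] = 1
print(chase(a, b, c, d))                 # expected: ten 1's
```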
The discussion of the Doolittle and Cholesky factorizations is partly of historical interest; the
Cholesky procedure, however, remains particularly well suited for symmetric positive definite
systems.

2.6 Norms and error analysis

To discuss the errors in numerical problems involving vectors or matrices, it is useful
to employ norms. The direct methods in this chapter, Gaussian elimination and
triangular factorization, involve only round-off errors. In the process of solving the system
$$Ax = b$$
numerically, we only get an approximation $\tilde{x}$, not the exact solution $x$. Two questions
arise directly:

(1) How do we measure the error vector $\Delta x = x - \tilde{x}$?

(2) What affects the error?


1. Vector norms
Let $\mathbb{R}^n$ denote the set of all $n$-dimensional vectors with real entries.
Definition. A norm of a vector $x$ on $\mathbb{R}^n$ is a function $\|x\|$ from $\mathbb{R}^n \to \mathbb{R}$ such
that
(1) $\|x\| \ge 0$, and $\|x\| = 0 \Leftrightarrow x = 0$;
(2) $\|\alpha x\| = |\alpha|\,\|x\|$, $\forall \alpha \in \mathbb{R}$, $x \in \mathbb{R}^n$;
(3) $\|x + y\| \le \|x\| + \|y\|$, $\forall x, y \in \mathbb{R}^n$.
From (3) it is deduced that
$$\bigl|\,\|x\| - \|y\|\,\bigr| \le \|x - y\|.$$
Consider $\|x\|$ as the length or magnitude of the vector $x$. A norm on a vector space
generalizes the notion of absolute value, $|r|$, of a real or complex number.
Three familiar norms on $\mathbb{R}^n$ are:
$l_1$-norm:
$$\|x\|_1 = \sum_{i=1}^{n} |x_i|;$$
$l_2$-norm:
$$\|x\|_2 = \Bigl(\sum_{i=1}^{n} x_i^2\Bigr)^{\frac{1}{2}};$$
$l_\infty$-norm:
$$\|x\|_\infty = \max_{1 \le i \le n} |x_i|.$$

Example Using the $l_\infty$ norm $\|\cdot\|_\infty$, compare the lengths of the following two vectors
in $\mathbb{R}^3$:
$$x_1 = \begin{pmatrix} 2 \\ -5 \\ 3 \end{pmatrix} \quad\text{and}\quad x_2 = \begin{pmatrix} 0 \\ 5 \\ 5 \end{pmatrix}.$$
Then repeat the comparison in the norms $\|\cdot\|_1$ and $\|\cdot\|_2$.
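The comparison is quick to verify with numpy:

```python
import numpy as np

x1 = np.array([2, -5, 3])
x2 = np.array([0, 5, 5])
for p in (1, 2, np.inf):
    print(p, np.linalg.norm(x1, p), np.linalg.norm(x2, p))
# ||.||_1: 10 vs 10 (equal); ||.||_2: sqrt(38) vs sqrt(50); ||.||_inf: 5 vs 5 (equal)
```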


Given the definition of a vector norm, we can estimate the magnitude of $\|x - \tilde{x}\|$
in any norm.
2. Matrix norm
Definition: Let $\mathbb{R}^{n \times n}$ denote the set of all $n \times n$ real matrices. A norm of $A$, denoted
$\|A\|$, is a function from $\mathbb{R}^{n \times n} \to \mathbb{R}$ with the following properties:
(1) $\|A\| \ge 0$, and $\|A\| = 0$ if and only if $A = 0$;
(2) $\|\alpha A\| = |\alpha|\,\|A\|$, $\forall \alpha \in \mathbb{R}$, $\forall A \in \mathbb{R}^{n \times n}$;
(3) $\|A + B\| \le \|A\| + \|B\|$;
(4) $\|AB\| \le \|A\|\,\|B\|$.
From property (4) it is deduced that $\|A^m\| \le \|A\|^m$.
We also obtain $\|A\| - \|B\| \le \|A - B\|$ from property (3).
Three popular matrix norms are:
1-norm (column norm):
$$\|A\|_1 = \max_{1 \le j \le n} \sum_{i=1}^{n} |a_{ij}|;$$
$\infty$-norm (row norm):
$$\|A\|_\infty = \max_{1 \le i \le n} \sum_{j=1}^{n} |a_{ij}|;$$
2-norm (spectral norm):
$$\|A\|_2 = \sqrt{\lambda_{\max}(A^T A)},$$
where $\lambda_{\max}$ is the maximum eigenvalue of the matrix $A^T A$.

Example Find the 1-norm, 2-norm, and $\infty$-norm of $A$ and $B$:
$$A = \begin{pmatrix} 1 & -2 \\ 3 & 4 \end{pmatrix}, \qquad B = \begin{pmatrix} 1 & 1 \\ -3 & 3 \end{pmatrix}.$$
Solution. $\|A\|_1 = \max\{4, 6\} = 6$,
$\|A\|_\infty = \max\{3, 7\} = 7$,
$$A^T A = \begin{pmatrix} 10 & 10 \\ 10 & 20 \end{pmatrix},$$
$$\det(A^T A - \lambda I) = \begin{vmatrix} 10 - \lambda & 10 \\ 10 & 20 - \lambda \end{vmatrix} = \lambda^2 - 30\lambda + 100 = 0,$$
$$\lambda_{1,2} = 15 \pm 5\sqrt{5},$$
so $\lambda_{\max} = 15 + 5\sqrt{5}$ and $\|A\|_2 = \sqrt{15 + 5\sqrt{5}} \approx 5.12$.
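Both matrices can be checked with numpy:

```python
import numpy as np

A = np.array([[1, -2], [3, 4]])
B = np.array([[1, 1], [-3, 3]])
for M, name in ((A, "A"), (B, "B")):
    print(name,
          np.linalg.norm(M, 1),       # max column sum
          np.linalg.norm(M, np.inf),  # max row sum
          np.linalg.norm(M, 2))       # sqrt of lambda_max(M^T M)
# A: 6.0, 7.0, 5.116...;  B: 4.0, 6.0, 4.242... (= 3*sqrt(2))
```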

3. Properties of matrix norms

(1) For any matrix $A \in \mathbb{R}^{n \times n}$,
$$\rho(A) \le \|A\|,$$
where $\rho(A) = \max_{1 \le i \le n} |\lambda_i|$ is the spectral radius of $A$.

Proof. Let $\lambda$ be any eigenvalue of $A$ with eigenvector $x \neq 0$. Then
$$Ax = \lambda x,$$
$$\|A\|\,\|x\| \ge \|Ax\| = |\lambda|\,\|x\|,$$
$$|\lambda| \le \|A\|,$$
$$\rho(A) = \max |\lambda| \le \|A\|.$$
(2) If $\|B\| < 1$, then the matrix $I - B$ is nonsingular and
$$\|(I - B)^{-1}\| \le \frac{1}{1 - \|B\|}.$$
(3) This is a special case of (2):
if $\|A^{-1}\Delta A\| < 1$, then $I - A^{-1}\Delta A$ is nonsingular, and
$$\|(I - A^{-1}\Delta A)^{-1}\| \le \frac{1}{1 - \|A^{-1}\Delta A\|}.$$
4. Analysis of round-off error

Consider a linear system $Ax = b$ whose exact solution is $x$, and an approximate solution
$\tilde{x}$ produced by a direct method for $Ax = b$. Regard $\tilde{x}$ as the exact solution of a
perturbed linear system $\tilde{A}\tilde{x} = \tilde{b}$.
Denote
$$\Delta x = x - \tilde{x}, \quad \Delta A = A - \tilde{A}, \quad \Delta b = b - \tilde{b}.$$
Assume the perturbations $\Delta A$ and $\Delta b$ are very small and $\|A^{-1}\Delta A\| < 1$. Then the
linear system $\tilde{A}\tilde{x} = \tilde{b}$ can be written
$$(A - \Delta A)(x - \Delta x) = b - \Delta b. \tag{2.5}$$

Subtracting equation (2.5) from $Ax = b$:
$$
\begin{aligned}
Ax - (A - \Delta A)(x - \Delta x) &= b - (b - \Delta b) \\
A\Delta x + \Delta A\,x - \Delta A\,\Delta x &= \Delta b \\
(A - \Delta A)\Delta x &= \Delta b - \Delta A\,x \\
A(I - A^{-1}\Delta A)\Delta x &= \Delta b - \Delta A\,x \\
(I - A^{-1}\Delta A)\Delta x &= A^{-1}(\Delta b - \Delta A\,x) \\
\Delta x &= (I - A^{-1}\Delta A)^{-1} A^{-1} (\Delta b - \Delta A\,x).
\end{aligned}
$$
Taking norms on both sides of the above equation, it follows that
$$\|\Delta x\| \le \|(I - A^{-1}\Delta A)^{-1}\|\,\|A^{-1}\|\,\|\Delta b - \Delta A\,x\|. \tag{2.6}$$
By the assumption $\|A^{-1}\Delta A\| < 1$, we have
$$\|(I - A^{-1}\Delta A)^{-1}\| \le \frac{1}{1 - \|A^{-1}\Delta A\|}. \tag{2.7}$$
Substituting (2.7) into the inequality (2.6), we get
$$\|\Delta x\| \le \frac{\|A^{-1}\|}{1 - \|A^{-1}\Delta A\|}\bigl(\|\Delta b\| + \|\Delta A\|\,\|x\|\bigr),$$
which yields
$$
\begin{aligned}
\|\delta x\| = \frac{\|\Delta x\|}{\|x\|}
&\le \frac{\|A^{-1}\|\,\|A\|}{1 - \|A^{-1}\Delta A\|}\left(\frac{\|\Delta b\|}{\|A\|\,\|x\|} + \frac{\|\Delta A\|}{\|A\|}\right) \\
&\le \frac{\|A^{-1}\|\,\|A\|}{1 - \|A^{-1}\Delta A\|}\left(\frac{\|\Delta b\|}{\|b\|} + \frac{\|\Delta A\|}{\|A\|}\right)
\qquad (\text{since } \|b\| \le \|A\|\,\|x\|) \\
&\le \frac{\|A^{-1}\|\,\|A\|}{1 - \|A^{-1}\|\,\|A\|\,\frac{\|\Delta A\|}{\|A\|}}\bigl(\|\delta b\| + \|\delta A\|\bigr)
= \frac{k}{1 - k\,\|\delta A\|}\bigl(\|\delta b\| + \|\delta A\|\bigr),
\end{aligned}
$$
where $k = \|A^{-1}\|\,\|A\|$.
It is said that
$$\mathrm{Cond}(A) = k = \|A\|\,\|A^{-1}\|$$
is the condition number of the coefficient matrix $A$.
If the relative errors $\delta A$ and $\delta b$ are very small, then the magnitude of $\delta x$ depends
on the condition number $k$, which is related only to $A$ itself.
If $k$ is very large, $\|\delta x\|$ can become large;
if $k$ is small, $\|\delta x\|$ stays small, in which case $Ax = b$ is said to be a well-conditioned
system. In fact,
$$k = \|A^{-1}\|\,\|A\| \ge \|A^{-1}A\| = \|I\| = 1 \;\Rightarrow\; k \ge 1.$$
Whenever we encounter a linear system, we should check whether the condition
number $k$ of $A$ is very large. The direct methods of this chapter are suitable for
well-conditioned systems; if $k$ is very large, other special techniques are required.
Example Find the condition number of the matrix
$$A = \begin{pmatrix} 1 & 1+\varepsilon \\ 1-\varepsilon & 1 \end{pmatrix},$$
where $\varepsilon$ is a small positive number.

It is easy to calculate the inverse of $A$ (note $\det A = \varepsilon^2$):
$$A^{-1} = \frac{1}{\varepsilon^2}\begin{pmatrix} 1 & -(1+\varepsilon) \\ \varepsilon - 1 & 1 \end{pmatrix},$$
so
$$
\mathrm{Cond}_\infty(A) = \|A\|_\infty \|A^{-1}\|_\infty
= (2 + \varepsilon) \times \frac{1}{\varepsilon^2}(2 + \varepsilon)
= \frac{(2+\varepsilon)^2}{\varepsilon^2}
= \frac{4}{\varepsilon^2} + \frac{4}{\varepsilon} + 1.
$$
When $\varepsilon \to 0$, $\mathrm{Cond}_\infty(A) \to \infty$.
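Numerically, numpy shows the blow-up; here with $\varepsilon = 10^{-3}$ as an illustrative value:

```python
import numpy as np

eps = 1e-3
A = np.array([[1, 1 + eps], [1 - eps, 1]])
print(np.linalg.cond(A, np.inf))   # ~ (2 + eps)^2 / eps^2 ≈ 4.004e6
```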
