
Indian Institute of Technology Roorkee

MAB-103: Numerical Methods


Unit-II: Solutions of systems of linear equations, Session 2025-26
Consider the simplest example
\[
ax = b,
\]
where a and b are given real numbers with a ≠ 0, whose solution is
\[
x = a^{-1} b.
\]
Next, we consider a generalisation of this elementary problem:

Let A be an n × n matrix with a_{ij} as its entry in row i and column j, and b a given column vector of size n with jth entry b_j; find a column vector x of size n such that Ax = b.

Denoting by x_i the ith entry of the vector x, we can also write Ax = b in the following expanded form:
\[
\begin{aligned}
a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n &= b_1\\
a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n &= b_2\\
&\;\;\vdots\\
a_{n1} x_1 + a_{n2} x_2 + \cdots + a_{nn} x_n &= b_n
\end{aligned}
\]
Recall that in order to ensure that for real numbers a and b the single linear equation ax = b has a unique solution, we need to assume that a ≠ 0. In the case of the above system of n linear equations in n unknowns we shall have to make an analogous assumption on the matrix A (namely, that A is nonsingular).

0.1 Gaussian elimination and backward substitution


We showcase the following in this section:
• How to solve linear equations when A is in upper triangular form. The algorithm is called
backward substitution.
• How to transform a general system of linear equations into an upper triangular form, to
which backward substitution can be applied. The algorithm is called Gaussian elimination.

0.1.1 Backward substitution


In specific instances, the matrix A exhibits structural properties that render the solution computationally efficient. For example, when A is diagonal, written as
\[
A = \begin{pmatrix}
a_{11} & & & \\
 & a_{22} & & \\
 & & \ddots & \\
 & & & a_{nn}
\end{pmatrix}
\]
(by this notation we mean that the off-diagonal elements are all 0), then the linear equations are uncoupled, reading a_{ii} x_i = b_i, and the solution is obviously
\[
x_i = \frac{b_i}{a_{ii}}, \qquad i = 1, 2, \ldots, n.
\]

A more complex example of a special structure is an upper triangular matrix
\[
A = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n}\\
       & a_{22} & \cdots & a_{2n}\\
       &        & \ddots & \vdots\\
       &        &        & a_{nn}
\end{pmatrix}
\]
where all elements below the main diagonal are zero: a_{ij} = 0 for all i > j. In this case each equation is possibly coupled only to those following it but not to those preceding it. Thus, we can solve an upper triangular system of equations backwards.

• Algorithm: Backward Substitution.

Given an upper triangular matrix A and a right-hand-side b,

for k = n : -1 : 1
    x_k = ( b_k - \sum_{j=k+1}^{n} a_{kj} x_j ) / a_{kk}
end

• Cost of backward substitution


What is the cost of this algorithm? In a simplistic way we just count each floating point operation (such as + and ∗) as a flop. The number of flops required here is
\[
1 + \sum_{k=1}^{n-1} \bigl( (n-k) + (n-k) + 1 \bigr) = n^2 .
\]
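As an illustration, here is a minimal Python/NumPy sketch of backward substitution. The function name back_substitution and the choice of NumPy are our own, not part of the notes.

```python
import numpy as np

def back_substitution(U, b):
    """Solve U x = b for an upper triangular matrix U with nonzero diagonal."""
    n = len(b)
    x = np.zeros(n)
    for k in range(n - 1, -1, -1):           # k = n-1, ..., 0 (0-based indexing)
        s = np.dot(U[k, k + 1:], x[k + 1:])  # sum_{j>k} a_kj * x_j
        x[k] = (b[k] - s) / U[k, k]
    return x

U = np.array([[2.0, 1.0, 1.0],
              [0.0, 3.0, 2.0],
              [0.0, 0.0, 4.0]])
b = np.array([7.0, 8.0, 4.0])
print(back_substitution(U, b))   # should agree with np.linalg.solve(U, b)
```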

0.1.2 Forward substitution


Similarly, a lower triangular matrix has the special structure
\[
A = \begin{pmatrix}
a_{11} & 0      & \cdots & 0\\
a_{21} & a_{22} & \cdots & 0\\
\vdots & \vdots & \ddots & \vdots\\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{pmatrix}
\]
where all elements above the main diagonal are zero: a_{ij} = 0 for all i < j. The situation is similar to the upper triangular case, except that we apply forward substitution to compute a solution. The algorithm is given below.

• Algorithm: Forward Substitution.

Given a lower triangular matrix A and a right-hand-side b,

for k = 1 : n
    x_k = ( b_k - \sum_{j=1}^{k-1} a_{kj} x_j ) / a_{kk}
end
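A matching Python sketch for forward substitution (again, the function name is our own choice); its cost is likewise n² flops:

```python
import numpy as np

def forward_substitution(L, b):
    """Solve L x = b for a lower triangular matrix L with nonzero diagonal."""
    n = len(b)
    x = np.zeros(n)
    for k in range(n):                   # k = 0, 1, ..., n-1
        s = np.dot(L[k, :k], x[:k])      # sum_{j<k} a_kj * x_j
        x[k] = (b[k] - s) / L[k, k]
    return x

L = np.array([[2.0, 0.0, 0.0],
              [1.0, 3.0, 0.0],
              [1.0, 2.0, 4.0]])
b = np.array([4.0, 8.0, 12.0])
print(forward_substitution(L, b))   # should agree with np.linalg.solve(L, b)
```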

0.1.3 Gaussian elimination
We use elementary row transformations to reduce the given linear system to an equivalent upper
triangular form. Then we solve the resulting system using backward substitution. The reason this
works is that the solution to our problem Ax = b is invariant to

• multiplication of a row by a constant

• subtraction of a multiple of one row from another, and

• row interchanges.

These claims can be easily verified directly. Simply recall that Ax = b is a concise form for the
system of equations

\[
\begin{aligned}
a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n &= b_1\\
a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n &= b_2\\
&\;\;\vdots\\
a_{n1} x_1 + a_{n2} x_2 + \cdots + a_{nn} x_n &= b_n
\end{aligned}
\]

and check what each of these operations amounts to.


Performing each of these operations can be recorded directly on the augmented matrix
\[
(A \mid b) = \left(\begin{array}{cccc|c}
a_{11} & a_{12} & \cdots & a_{1n} & b_1\\
a_{21} & a_{22} & \cdots & a_{2n} & b_2\\
\vdots & \vdots & \ddots & \vdots & \vdots\\
a_{n1} & a_{n2} & \cdots & a_{nn} & b_n
\end{array}\right)
\]

Eliminating one column at a time

• Eliminate the first column elements below the main diagonal:

  – Subtract (a_{21}/a_{11}) × the first row from the second row.
  – Subtract (a_{31}/a_{11}) × the first row from the third row.
    ⋮
  – Subtract (a_{n1}/a_{11}) × the first row from the nth row.

This produces
\[
(A^{(1)} \mid b^{(1)}) = \left(\begin{array}{cccc|c}
a_{11} & a_{12}       & \cdots & a_{1n}       & b_1\\
0      & a_{22}^{(1)} & \cdots & a_{2n}^{(1)} & b_2^{(1)}\\
\vdots & \vdots       & \ddots & \vdots       & \vdots\\
0      & a_{n2}^{(1)} & \cdots & a_{nn}^{(1)} & b_n^{(1)}
\end{array}\right)
\]

• Next, consider the (n − 1) × n submatrix of (A^{(1)} | b^{(1)}) obtained by ignoring the first row and column. These are the elements that have been modified by the first stage of the process, described above, but not set to 0. Apply exactly the same step as before to this submatrix; i.e., subtract (a_{32}^{(1)} / a_{22}^{(1)}) × the second row from the third row, etc.
After this second stage we have an augmented matrix of the form
\[
(A^{(2)} \mid b^{(2)}) = \left(\begin{array}{ccccc|c}
a_{11} & a_{12}       & a_{13}       & \cdots & a_{1n}       & b_1\\
0      & a_{22}^{(1)} & a_{23}^{(1)} & \cdots & a_{2n}^{(1)} & b_2^{(1)}\\
0      & 0            & a_{33}^{(2)} & \cdots & a_{3n}^{(2)} & b_3^{(2)}\\
\vdots & \vdots       & \vdots       & \ddots & \vdots       & \vdots\\
0      & 0            & a_{n3}^{(2)} & \cdots & a_{nn}^{(2)} & b_n^{(2)}
\end{array}\right)
\]
The superscripts in the above expression show for each element at which stage it has last been modified.
• Repeat the process. After n − 1 such stages we obtain an upper triangular system
\[
(A^{(n-1)} \mid b^{(n-1)}) = \left(\begin{array}{ccccc|c}
a_{11} & a_{12}       & a_{13}       & \cdots & a_{1n}         & b_1\\
0      & a_{22}^{(1)} & a_{23}^{(1)} & \cdots & a_{2n}^{(1)}   & b_2^{(1)}\\
0      & 0            & a_{33}^{(2)} & \cdots & a_{3n}^{(2)}   & b_3^{(2)}\\
\vdots & \vdots       & \vdots       & \ddots & \vdots         & \vdots\\
0      & 0            & 0            & \cdots & a_{nn}^{(n-1)} & b_n^{(n-1)}
\end{array}\right)
\]
This procedure leads to the Gaussian elimination algorithm in its simplest form, given below. In stating the Gaussian elimination algorithm we have not bothered to zero out the elements below the main diagonal in the modified matrix A: this part of the matrix is not considered in the ensuing backward substitution. Of course, for this algorithm to work we need to assume that all those pivotal elements a_{kk}^{(k-1)} by which we divide are nonzero and in fact not very close to 0 either. We deal with the question of ensuring this later.
– Algorithm: Gaussian Elimination.
Given a real, nonsingular n × n matrix A and a vector b of size n, first transform into upper triangular form,

for k = 1 : n-1
    for i = k+1 : n
        l_{ik} = a_{ik} / a_{kk}
        for j = k+1 : n
            a_{ij} = a_{ij} - l_{ik} a_{kj}
        end
        b_i = b_i - l_{ik} b_k
    end
end

• The cost of the Gaussian elimination algorithm in terms of operation count is approximately
\[
\sum_{k=1}^{n-1} \bigl( (n-k) + 2(n-k)(n-k+1) \bigr) \approx 2\sum_{k=1}^{n-1} (n-k)^2 = \frac{2}{3}n^3 + O(n^2)
\]
flops.
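For concreteness, here is a small NumPy sketch of this naive elimination (no pivoting), assuming all pivots are safely nonzero. Combined with the back_substitution sketch above it solves Ax = b; the function name is our own.

```python
import numpy as np

def gaussian_elimination(A, b):
    """Reduce (A, b) to an equivalent upper triangular system (no pivoting).

    Assumes every pivot a_kk encountered is nonzero.
    """
    A = A.astype(float).copy()
    b = b.astype(float).copy()
    n = len(b)
    for k in range(n - 1):
        for i in range(k + 1, n):
            l = A[i, k] / A[k, k]            # multiplier l_ik
            A[i, k + 1:] -= l * A[k, k + 1:]
            A[i, k] = 0.0                    # zero out explicitly (optional)
            b[i] -= l * b[k]
    return A, b

A = np.array([[1.0, -1.0, 3.0],
              [1.0,  1.0, 0.0],
              [3.0, -2.0, 1.0]])
b = np.array([1.0, 2.0, 3.0])
U, c = gaussian_elimination(A, b)
# x = back_substitution(U, c) would then solve Ax = b
```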

0.2 LU Decomposition
In this section we show that the process of Gaussian elimination in fact decomposes the matrix A into a product LU of a lower triangular matrix L and an upper triangular matrix U. Let us continue to assume that the elements a_{kk}^{(k-1)} encountered in the Gaussian elimination process are all bounded safely away from 0. We will remove this assumption later, when we discuss pivoting.
• Elementary lower triangular matrices:
Consider the first stage of Gaussian elimination described above. These are elementary row operations applied to the matrix A to zero out its first column below the (1, 1) entry. These operations are also applied to the right-hand-side vector b. Note, however, that the operations on b can actually be done at a later time because they do not affect those done on A. We can capture the effect of this first stage by defining the elementary n × n lower triangular matrix
\[
M^{(1)} = \begin{pmatrix}
1       &   &   &        & \\
-l_{21} & 1 &   &        & \\
-l_{31} &   & 1 &        & \\
\vdots  &   &   & \ddots & \\
-l_{n1} &   &   &        & 1
\end{pmatrix}
\]

Then the effect of this first stage is seen to produce

A^{(1)} = M^{(1)} A and b^{(1)} = M^{(1)} b.

Likewise, the effect of the second stage of Gaussian elimination, which is to zero out the second column of A^{(1)} below the main diagonal, can be written as

A^{(2)} = M^{(2)} A^{(1)} = M^{(2)} M^{(1)} A and b^{(2)} = M^{(2)} b^{(1)} = M^{(2)} M^{(1)} b,

where
\[
M^{(2)} = \begin{pmatrix}
1 &         &   &   &        & \\
  & 1       &   &   &        & \\
  & -l_{32} & 1 &   &        & \\
  & -l_{42} &   & 1 &        & \\
  & \vdots  &   &   & \ddots & \\
  & -l_{n2} &   &   &        & 1
\end{pmatrix}
\]

• Obtaining the LU decomposition:

This procedure is continued, and after n − 1 such stages it transforms the matrix A into an upper triangular matrix, which we call U, by

U = A^{(n-1)} = M^{(n-1)} ⋯ M^{(2)} M^{(1)} A.

Likewise,

b^{(n-1)} = M^{(n-1)} ⋯ M^{(2)} M^{(1)} b.

Multiplying U by [M^{(n-1)}]^{-1}, then by [M^{(n-2)}]^{-1}, etc., we obtain

A = LU,

where

L = [M^{(1)}]^{-1} ⋯ [M^{(n-2)}]^{-1} [M^{(n-1)}]^{-1}.

• Let us verify the above claims for the matrix
\[
A = \begin{pmatrix} 1 & -1 & 3\\ 1 & 1 & 0\\ 3 & -2 & 1 \end{pmatrix}
\]
  – The Gaussian elimination process for the first column yields l_{21} = 1/1 = 1 and l_{31} = 3/1 = 3, so
\[
M^{(1)} = \begin{pmatrix} 1 & 0 & 0\\ -1 & 1 & 0\\ -3 & 0 & 1 \end{pmatrix}, \qquad
A^{(1)} = M^{(1)} A = \begin{pmatrix} 1 & -1 & 3\\ 0 & 2 & -3\\ 0 & 1 & -8 \end{pmatrix}
\]
  – The Gaussian elimination process for the second column yields l_{32} = 1/2, so
\[
M^{(2)} = \begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & -1/2 & 1 \end{pmatrix}, \qquad
U = A^{(2)} = M^{(2)} A^{(1)} = \begin{pmatrix} 1 & -1 & 3\\ 0 & 2 & -3\\ 0 & 0 & -6.5 \end{pmatrix}
\]
  – Note that
\[
[M^{(1)}]^{-1} = \begin{pmatrix} 1 & 0 & 0\\ 1 & 1 & 0\\ 3 & 0 & 1 \end{pmatrix}, \qquad
[M^{(2)}]^{-1} = \begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 1/2 & 1 \end{pmatrix},
\]
\[
[M^{(1)}]^{-1} [M^{(2)}]^{-1} = \begin{pmatrix} 1 & 0 & 0\\ 1 & 1 & 0\\ 3 & 1/2 & 1 \end{pmatrix} = L,
\]
and
\[
LU = \begin{pmatrix} 1 & 0 & 0\\ 1 & 1 & 0\\ 3 & 1/2 & 1 \end{pmatrix}
\begin{pmatrix} 1 & -1 & 3\\ 0 & 2 & -3\\ 0 & 0 & -6.5 \end{pmatrix}
= \begin{pmatrix} 1 & -1 & 3\\ 1 & 1 & 0\\ 3 & -2 & 1 \end{pmatrix} = A.
\]
  – This explicitly specifies the LU decomposition of the given matrix A.

• Rather than constructing L and U by using the elimination steps, it is possible to solve directly for these matrices. Let us illustrate the direct computation of L and U in the case of n = 3. Write A = LU as
\[
\begin{pmatrix} a_{11} & a_{12} & a_{13}\\ a_{21} & a_{22} & a_{23}\\ a_{31} & a_{32} & a_{33} \end{pmatrix}
=
\begin{pmatrix} 1 & 0 & 0\\ l_{21} & 1 & 0\\ l_{31} & l_{32} & 1 \end{pmatrix}
\begin{pmatrix} u_{11} & u_{12} & u_{13}\\ 0 & u_{22} & u_{23}\\ 0 & 0 & u_{33} \end{pmatrix}
\]
The above procedure of LU decomposition, in which L is taken to be unit lower triangular, is called Doolittle's method.

• We can also write A = LU as
\[
\begin{pmatrix} a_{11} & a_{12} & a_{13}\\ a_{21} & a_{22} & a_{23}\\ a_{31} & a_{32} & a_{33} \end{pmatrix}
=
\begin{pmatrix} l_{11} & 0 & 0\\ l_{21} & l_{22} & 0\\ l_{31} & l_{32} & l_{33} \end{pmatrix}
\begin{pmatrix} 1 & u_{12} & u_{13}\\ 0 & 1 & u_{23}\\ 0 & 0 & 1 \end{pmatrix}
\]
The above procedure of LU decomposition, in which U is taken to be unit upper triangular, is called Crout's method.

• Algorithm: Solving Ax = b by LU Decomposition.

Given a real nonsingular matrix A, apply LU decomposition first: A = LU. Given also a right-hand-side vector b:

  – Forward substitution: solve Ly = b.

  – Backward substitution: solve Ux = y.

(A small code sketch of this procedure follows.)
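Below is a minimal Python sketch of this procedure using a hand-rolled Doolittle factorization without pivoting; the function names are our own choices, and in practice one would call a library routine such as scipy.linalg.lu_factor. The example system is the one solved by Doolittle's method later in these notes.

```python
import numpy as np

def doolittle_lu(A):
    """Doolittle LU factorization (L unit lower triangular), no pivoting."""
    n = A.shape[0]
    L = np.eye(n)
    U = np.zeros((n, n))
    for k in range(n):
        # k-th row of U
        U[k, k:] = A[k, k:] - L[k, :k] @ U[:k, k:]
        # k-th column of L (below the diagonal)
        L[k + 1:, k] = (A[k + 1:, k] - L[k + 1:, :k] @ U[:k, k]) / U[k, k]
    return L, U

def lu_solve(L, U, b):
    """Solve LUx = b by forward then backward substitution."""
    n = len(b)
    y = np.zeros(n)
    for k in range(n):                       # Ly = b (L has unit diagonal)
        y[k] = b[k] - L[k, :k] @ y[:k]
    x = np.zeros(n)
    for k in range(n - 1, -1, -1):           # Ux = y
        x[k] = (y[k] - U[k, k + 1:] @ x[k + 1:]) / U[k, k]
    return x

A = np.array([[1.0, 1.0, 1.0],
              [4.0, 3.0, -1.0],
              [3.0, 5.0, 3.0]])
b = np.array([1.0, 6.0, 4.0])
L, U = doolittle_lu(A)
print(lu_solve(L, U, b))   # compare with np.linalg.solve(A, b)
```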

• Symmetrizing the LU decomposition

Consider the LU decomposition A = LU of a symmetric positive definite matrix, and factor out the diagonal of U:
\[
U = \begin{pmatrix}
u_{11} & & & \\
 & u_{22} & & \\
 & & \ddots & \\
 & & & u_{nn}
\end{pmatrix}
\begin{pmatrix}
1 & u_{12}/u_{11} & u_{13}/u_{11} & \cdots & u_{1n}/u_{11}\\
 & 1 & u_{23}/u_{22} & \cdots & u_{2n}/u_{22}\\
 & & \ddots & & \vdots\\
 & & & & 1
\end{pmatrix}
= D\tilde{U},
\]
so the LU decomposition reads

A = L D \tilde{U}.

But transposing it we have

A = A^T = \tilde{U}^T D L^T.

Since the decomposition is unique we must have \tilde{U}^T = L, i.e., the decomposition reads

A = L D L^T,

where L is unit lower triangular and D is diagonal with positive elements u_{kk}. It is also possible to write D = D^{1/2} D^{1/2} with

D^{1/2} = diag(\sqrt{u_{11}}, \sqrt{u_{22}}, \ldots, \sqrt{u_{nn}}),

whence the LU decomposition is written as

A = G G^T

with G = L D^{1/2} a lower triangular matrix. This is called the Cholesky decomposition; a short code sketch follows.
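A short NumPy sketch of the Cholesky factorization A = GGᵀ (our own illustration, with a made-up test matrix; NumPy also provides np.linalg.cholesky):

```python
import numpy as np

def cholesky(A):
    """Cholesky factorization A = G @ G.T for a symmetric positive definite A."""
    n = A.shape[0]
    G = np.zeros((n, n))
    for j in range(n):
        # diagonal entry: g_jj = sqrt(a_jj - sum_k g_jk^2)
        G[j, j] = np.sqrt(A[j, j] - G[j, :j] @ G[j, :j])
        # entries below the diagonal in column j
        G[j + 1:, j] = (A[j + 1:, j] - G[j + 1:, :j] @ G[j, :j]) / G[j, j]
    return G

A = np.array([[4.0, 2.0, 2.0],
              [2.0, 5.0, 3.0],
              [2.0, 3.0, 6.0]])
G = cholesky(A)
print(np.allclose(G @ G.T, A))   # True
```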

• Next we modify the algorithm of Gaussian elimination by pivoting, thus making it generally
stable. The process of Gaussian elimination (or LU decomposition) described above cannot
be applied unmodified to all nonsingular matrices.

– For instance, the matrix
\[
A = \begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix}
\]
is clearly nonsingular, and yet a_{11} = 0, so the algorithm breaks down immediately. Here is another example, slightly more subtle.
– Consider the system
\[
\begin{aligned}
x_1 + x_2 + x_3 &= 1\\
x_1 + x_2 + 2x_3 &= 2\\
x_1 + 2x_2 + 2x_3 &= 1.
\end{aligned}
\]
The matrix A defined by these equations is nonsingular (indeed, the unique solution is x = (1, −1, 1)ᵀ), also all elements of A are nonzero, and yet Gaussian elimination yields after the first stage
\[
\left(\begin{array}{ccc|c} 1 & 1 & 1 & 1\\ 1 & 1 & 2 & 2\\ 1 & 2 & 2 & 1 \end{array}\right)
\Longrightarrow
\left(\begin{array}{ccc|c} 1 & 1 & 1 & 1\\ 0 & 0 & 1 & 1\\ 0 & 1 & 1 & 0 \end{array}\right)
\]
Next, a_{22}^{(1)} = 0 and we cannot proceed further.
In this case there is an obvious remedy: simply interchange the second and the third rows! This yields
\[
\left(\begin{array}{ccc|c} 1 & 1 & 1 & 1\\ 1 & 2 & 2 & 1\\ 1 & 1 & 2 & 2 \end{array}\right)
\Longrightarrow
\left(\begin{array}{ccc|c} 1 & 1 & 1 & 1\\ 0 & 1 & 1 & 0\\ 0 & 0 & 1 & 1 \end{array}\right)
\]
and backward substitution proceeds as usual.


– In the above examples the process broke down because a zero pivotal element a_{kk}^{(k-1)} was encountered. But a_{kk}^{(k-1)} does not have to exactly equal 0 for trouble to arise: undesirable accumulation of roundoff error may arise also when a_{kk}^{(k-1)} is near 0.
– Consider a perturbation of the previous example that reads
\[
\begin{aligned}
x_1 + x_2 + x_3 &= 1\\
x_1 + 1.0001x_2 + 2x_3 &= 2\\
x_1 + 2x_2 + 2x_3 &= 1.
\end{aligned}
\]
The exact solution, correct to 5 digits, is x ≈ (1, −1.0001, 1.0001)ᵀ. Now, Gaussian elimination in exact arithmetic can be completed and yields
\[
\left(\begin{array}{ccc|c} 1 & 1 & 1 & 1\\ 1 & 1.0001 & 2 & 2\\ 1 & 2 & 2 & 1 \end{array}\right)
\Longrightarrow
\left(\begin{array}{ccc|c} 1 & 1 & 1 & 1\\ 0 & 0.0001 & 1 & 1\\ 0 & 1 & 1 & 0 \end{array}\right)
\Longrightarrow
\left(\begin{array}{ccc|c} 1 & 1 & 1 & 1\\ 0 & 0.0001 & 1 & 1\\ 0 & 0 & -9999 & -10000 \end{array}\right)
\]

Assume next that we are using floating point arithmetic with base β = 10 and precision
t = 3. Then backward substitution gives

x3 = 1, x2 = 0, x1 = 0.

On the other hand, if we interchange the second and third rows we obtain
\[
\left(\begin{array}{ccc|c} 1 & 1 & 1 & 1\\ 1 & 2 & 2 & 1\\ 1 & 1.0001 & 2 & 2 \end{array}\right)
\Longrightarrow
\left(\begin{array}{ccc|c} 1 & 1 & 1 & 1\\ 0 & 1 & 1 & 0\\ 0 & 0.0001 & 1 & 1 \end{array}\right)
\Longrightarrow
\left(\begin{array}{ccc|c} 1 & 1 & 1 & 1\\ 0 & 1 & 1 & 0\\ 0 & 0 & 0.9999 & 1 \end{array}\right)
\]

Now backward substitution with 3 decimal digits gives

x3 = 1.000, x2 = −1.000, x1 = 1.000

which is correct, in that the difference between this solution and the exact solution is
less than the rounding unit.

• Partial pivoting
The problem highlighted in these simple examples is real and general. Recall from the last unit that roundoff errors may be unduly magnified if divided by a small value. Here, if a pivotal element a_{kk}^{(k-1)} is small in magnitude, then roundoff errors are amplified, both in subsequent stages of the Gaussian elimination process and (even more apparently) in the backward substitution phase.
These simple examples also suggest a strategy to resolve the difficulty, called partial pivoting: as the elimination proceeds for k = 1, ..., n − 1, at each stage k choose q = q(k) as the smallest integer for which
\[
|a_{qk}^{(k-1)}| = \max_{k \le i \le n} |a_{ik}^{(k-1)}|,
\]
and interchange rows k and q. Then proceed with the elimination process. (A code sketch of Gaussian elimination with partial pivoting is given after this list.)

• We could modify the partial pivoting strategy into one called scaled partial pivoting. Thus, define initially for each row i of A a size
\[
s_i = \max_{1 \le j \le n} |a_{ij}|.
\]
Then, at each stage k of the Gaussian elimination procedure, choose q = q(k) as the smallest integer for which
\[
\frac{|a_{qk}^{(k-1)}|}{s_q} = \max_{k \le i \le n} \frac{|a_{ik}^{(k-1)}|}{s_i}
\]
and interchange rows k and q.

• A strategy that does guarantee stability is complete pivoting: at each stage k choose q and r as the smallest integers for which
\[
|a_{qr}^{(k-1)}| = \max_{k \le i, j \le n} |a_{ij}^{(k-1)}|,
\]
and interchange row q with row k and column r with column k. However, this strategy is significantly more expensive than partial pivoting.
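Here is a minimal NumPy sketch of Gaussian elimination with partial pivoting as described above; the function name is our own. The test system is the worked example that follows, so the printed solution should be approximately (0.143, 0.692, −0.174).

```python
import numpy as np

def gauss_elim_partial_pivot(A, b):
    """Solve Ax = b using Gaussian elimination with partial pivoting."""
    A = A.astype(float).copy()
    b = b.astype(float).copy()
    n = len(b)
    for k in range(n - 1):
        # pivot: row with the largest |a_ik| among rows i >= k
        q = k + np.argmax(np.abs(A[k:, k]))
        if q != k:
            A[[k, q]] = A[[q, k]]
            b[[k, q]] = b[[q, k]]
        for i in range(k + 1, n):
            l = A[i, k] / A[k, k]
            A[i, k:] -= l * A[k, k:]
            b[i] -= l * b[k]
    # backward substitution
    x = np.zeros(n)
    for k in range(n - 1, -1, -1):
        x[k] = (b[k] - A[k, k + 1:] @ x[k + 1:]) / A[k, k]
    return x

A = np.array([[0.0, 3.0, 5.0],
              [3.0, -4.0, 0.0],
              [5.0, 0.0, 6.0]])
b = np.array([1.20736, -2.34066, -0.329193])
print(gauss_elim_partial_pivot(A, b))
```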

Solution Using Gaussian Elimination with Partial Pivoting
Given the system:
\[
\begin{aligned}
3x_2 + 5x_3 &= 1.20736\\
3x_1 - 4x_2 &= -2.34066\\
5x_1 + 6x_3 &= -0.329193
\end{aligned}
\]

Step 1: Augmented Matrix
\[
\left(\begin{array}{ccc|c} 0 & 3 & 5 & 1.20736\\ 3 & -4 & 0 & -2.34066\\ 5 & 0 & 6 & -0.329193 \end{array}\right)
\]

Step 2: Partial Pivoting (Swap Rows)

Swap row 1 and row 3:
\[
\left(\begin{array}{ccc|c} 5 & 0 & 6 & -0.329193\\ 3 & -4 & 0 & -2.34066\\ 0 & 3 & 5 & 1.20736 \end{array}\right)
\]

Step 3: Eliminate x_1 from Row 2

Multiplier: m_{21} = 3/5 = 0.6
Row 2 ← Row 2 − 0.6 × Row 1:
\[
\left(\begin{array}{ccc|c} 5 & 0 & 6 & -0.329193\\ 0 & -4 & -3.6 & -2.1431442\\ 0 & 3 & 5 & 1.20736 \end{array}\right)
\]

Step 4: Eliminate x_2 from Row 3

Multiplier: m_{32} = 3/(−4) = −0.75
Row 3 ← Row 3 − (−0.75) × Row 2:
\[
\left(\begin{array}{ccc|c} 5 & 0 & 6 & -0.329193\\ 0 & -4 & -3.6 & -2.1431442\\ 0 & 0 & 2.3 & -0.39999 \end{array}\right)
\]

Step 5: Back Substitution
\[
\begin{aligned}
x_3 &= \frac{-0.39999}{2.3} \approx -0.17391 \approx -0.174\\
x_2 &= \frac{-2.14314 - (-3.6)(-0.174)}{-4} \approx 0.69238 \approx 0.692\\
x_1 &= \frac{-0.329193 - 6(-0.174)}{5} \approx 0.14296 \approx 0.143
\end{aligned}
\]

Solution of the Linear System Using Gaussian Elimination with Scaling
We solve the system:
\[
\begin{aligned}
3x + 2y + 100z &= 105\\
-x + 3y + 100z &= 102\\
x + 2y - z &= 2
\end{aligned}
\]

Step 1: Augmented Matrix
\[
\left(\begin{array}{ccc|c} 3 & 2 & 100 & 105\\ -1 & 3 & 100 & 102\\ 1 & 2 & -1 & 2 \end{array}\right)
\]

Step 2: Scaling Factors

Compute the maximum absolute value in each row:
\[
s_1 = \max(|3|, |2|, |100|) = 100, \qquad
s_2 = \max(|-1|, |3|, |100|) = 100, \qquad
s_3 = \max(|1|, |2|, |-1|) = 2
\]

Step 3: Partial Pivoting with Scaling

For Column 1, compute the scaled ratios:
\[
\frac{|3|}{100} = 0.03, \qquad \frac{|-1|}{100} = 0.01, \qquad \frac{|1|}{2} = 0.5
\]
Row 3 has the largest ratio ⇒ swap Row 1 and Row 3.

Dividing each row by its scale factor and then swapping gives the new matrix:
\[
\left(\begin{array}{ccc|c} 0.03 & 0.02 & 1 & 1.05\\ -0.01 & 0.03 & 1 & 1.02\\ 0.5 & 1 & -0.5 & 1 \end{array}\right)
\longrightarrow
\left(\begin{array}{ccc|c} 0.5 & 1 & -0.5 & 1\\ -0.01 & 0.03 & 1 & 1.02\\ 0.03 & 0.02 & 1 & 1.05 \end{array}\right)
\]

Step 4: Eliminate x from Rows 2 and 3

• Row 2 ← Row 2 + 0.02 × Row 1:
\[
\left(\begin{array}{ccc|c} 0.5 & 1 & -0.5 & 1\\ 0 & 0.05 & 0.99 & 1.04\\ 0.03 & 0.02 & 1 & 1.05 \end{array}\right)
\]

• Row 3 ← Row 3 − 0.06 × Row 1:
\[
\left(\begin{array}{ccc|c} 0.5 & 1 & -0.5 & 1\\ 0 & 0.05 & 0.99 & 1.04\\ 0 & -0.04 & 1.03 & 0.99 \end{array}\right)
\]

Step 5: Partial Pivoting for Column 2
Scaled ratios for Column 2:
\[
\frac{|0.05|}{0.99} \approx 0.05, \qquad \frac{|-0.04|}{1.03} \approx 0.04
\]
No row swap is needed.

Step 6: Eliminate y from Row 3

Row 3 ← Row 3 + (4/5) × Row 2:
\[
\left(\begin{array}{ccc|c} 0.5 & 1 & -0.5 & 1\\ 0 & 0.05 & 0.99 & 1.04\\ 0 & 0 & 1.822 & 1.822 \end{array}\right)
\]

Step 7: Back Substitution
\[
\begin{aligned}
1.822z &= 1.822 &&\Rightarrow z = 1\\
0.05y + 0.99(1) &= 1.04 &&\Rightarrow y = 1\\
0.5x + 1(1) - 0.5(1) &= 1 &&\Rightarrow x = 1
\end{aligned}
\]

Solution Using Doolittle's LU Decomposition

Given the system:
\[
\begin{aligned}
x + y + z &= 1\\
4x + 3y - z &= 6\\
3x + 5y + 3z &= 4
\end{aligned}
\]

Step 1: Matrix Representation
\[
A = \begin{pmatrix} 1 & 1 & 1\\ 4 & 3 & -1\\ 3 & 5 & 3 \end{pmatrix}, \qquad
b = \begin{pmatrix} 1\\ 6\\ 4 \end{pmatrix}
\]

Step 2: LU Decomposition
Find L and U such that A = LU where:
\[
L = \begin{pmatrix} 1 & 0 & 0\\ l_{21} & 1 & 0\\ l_{31} & l_{32} & 1 \end{pmatrix}, \qquad
U = \begin{pmatrix} u_{11} & u_{12} & u_{13}\\ 0 & u_{22} & u_{23}\\ 0 & 0 & u_{33} \end{pmatrix}
\]

Step 3: Compute Elements
\[
\begin{gathered}
u_{11} = 1, \quad u_{12} = 1, \quad u_{13} = 1,\\
l_{21} = 4, \quad l_{31} = 3,\\
u_{22} = -1, \quad u_{23} = -5,\\
l_{32} = -2, \quad u_{33} = -10.
\end{gathered}
\]

Step 4: Forward Substitution (Ly = b)
\[
y_1 = 1, \qquad y_2 = 2, \qquad y_3 = 5
\]

Step 5: Backward Substitution (Ux = y)
\[
z = -0.5, \qquad y = 0.5, \qquad x = 1
\]

Final Solution
x = 1, y = 0.5, z = −0.5

Problem
Assume the LU decomposition of a square matrix A ∈ R^{500×500} takes 5 seconds. How many systems of the form
\[
A x_i = b_i, \qquad i = 1, 2, \ldots, k
\]
can be solved in the next 6 seconds after the decomposition is complete?

Solution
Once the LU decomposition is computed, each system Ax = b can be solved by performing:
1. Forward substitution to solve Ly = b
2. Backward substitution to solve Ux = y
Each substitution step is an O(n²) operation, so the total cost to solve one system is proportional to n². Let c denote the constant of proportionality (the time per flop). Then the LU decomposition time is approximately
\[
T_{LU} = \frac{2}{3} n^3 \cdot c.
\]
Given that T_{LU} = 5 seconds for n = 500, we solve for c:
\[
5 = \frac{2}{3} \cdot (500)^3 \cdot c \quad\Rightarrow\quad c = \frac{5 \cdot 3}{2 \cdot 500^3} = \frac{15}{250 \times 10^6} = 6 \times 10^{-8}.
\]
Now, the time required to solve one linear system is:
\[
T_{solve} = 2n^2 \cdot c = 2 \cdot (500)^2 \cdot 6 \times 10^{-8} = 2 \cdot 250{,}000 \cdot 6 \times 10^{-8} = 0.03 \text{ seconds}.
\]
Thus, the number of systems that can be solved in 6 seconds is:
\[
k = \frac{6}{0.03} = 200.
\]

Conclusion
200 systems can be solved in 6 seconds after the LU decomposition is complete.

Solution Using Crout's LU Decomposition

Given the system:
\[
\begin{aligned}
3x + 2y + 4z &= 7\\
2x + y + z &= 4\\
3x + 5y + 3z &= 3
\end{aligned}
\]

Step 1: Matrix Representation
\[
A = \begin{pmatrix} 3 & 2 & 4\\ 2 & 1 & 1\\ 3 & 5 & 3 \end{pmatrix}, \qquad
b = \begin{pmatrix} 7\\ 4\\ 3 \end{pmatrix}
\]

Step 2: LU Decomposition
Find L and U such that A = LU where:
\[
L = \begin{pmatrix} l_{11} & 0 & 0\\ l_{21} & l_{22} & 0\\ l_{31} & l_{32} & l_{33} \end{pmatrix}, \qquad
U = \begin{pmatrix} 1 & u_{12} & u_{13}\\ 0 & 1 & u_{23}\\ 0 & 0 & 1 \end{pmatrix}
\]

Step 3: Compute Elements
\[
\begin{gathered}
l_{11} = 3, \quad l_{21} = 2, \quad l_{31} = 3,\\
u_{12} = \tfrac{2}{3} \approx 0.6667, \quad u_{13} = \tfrac{4}{3} \approx 1.3333,\\
l_{22} = -\tfrac{1}{3} \approx -0.3333, \quad l_{32} = 3,\\
u_{23} = 5, \quad l_{33} = -16.
\end{gathered}
\]

Step 4: Forward Substitution (Ly = b)
\[
y_1 = \tfrac{7}{3} \approx 2.3333, \qquad y_2 = 2, \qquad y_3 = \tfrac{5}{8} = 0.625
\]

Step 5: Backward Substitution (Ux = y)
\[
z = 0.625, \qquad y = -1.125, \qquad x = 2.25
\]

Final Solution
x = 2.25, y = −1.125, z = 0.625

Gaussian Elimination with Scaled Partial Pivoting (without explicit row scaling)

We solve the same system as before (3x + 2y + 100z = 105, −x + 3y + 100z = 102, x + 2y − z = 2) once more, this time keeping the original rows and using the scale factors only to choose pivots. The final solution is again x = 1, y = 1, z = 1.

Step 1: Augmented Matrix
\[
\left(\begin{array}{ccc|c} 3 & 2 & 100 & 105\\ -1 & 3 & 100 & 102\\ 1 & 2 & -1 & 2 \end{array}\right)
\]

Step 2: Scaling Factors

Compute the maximum absolute value in each row:
\[
s_1 = \max(|3|, |2|, |100|) = 100, \qquad
s_2 = \max(|-1|, |3|, |100|) = 100, \qquad
s_3 = \max(|1|, |2|, |-1|) = 2
\]

Step 3: Partial Pivoting with Scaling

For Column 1, compute the scaled ratios:
\[
\frac{|3|}{100} = 0.03, \qquad \frac{|-1|}{100} = 0.01, \qquad \frac{|1|}{2} = 0.5
\]
Row 3 has the largest ratio ⇒ swap Row 1 and Row 3 (the scale factors are interchanged along with the rows).

New matrix:
\[
\left(\begin{array}{ccc|c} 1 & 2 & -1 & 2\\ -1 & 3 & 100 & 102\\ 3 & 2 & 100 & 105 \end{array}\right)
\]

Step 4: Eliminate x from Rows 2 and 3

• Row 2 ← Row 2 + Row 1:
\[
\left(\begin{array}{ccc|c} 1 & 2 & -1 & 2\\ 0 & 5 & 99 & 104\\ 3 & 2 & 100 & 105 \end{array}\right)
\]

• Row 3 ← Row 3 − 3 × Row 1:
\[
\left(\begin{array}{ccc|c} 1 & 2 & -1 & 2\\ 0 & 5 & 99 & 104\\ 0 & -4 & 103 & 99 \end{array}\right)
\]

Step 5: Partial Pivoting for Column 2
Scaled ratios for Column 2 (both remaining rows carry scale factor 100):
\[
\frac{|5|}{100} = 0.05, \qquad \frac{|-4|}{100} = 0.04
\]
No row swap is needed.

Step 6: Eliminate y from Row 3

Row 3 ← Row 3 + (4/5) × Row 2:
\[
\left(\begin{array}{ccc|c} 1 & 2 & -1 & 2\\ 0 & 5 & 99 & 104\\ 0 & 0 & 182.2 & 182.2 \end{array}\right)
\]

Step 7: Back Substitution
\[
\begin{aligned}
182.2z &= 182.2 &&\Rightarrow z = 1\\
5y + 99(1) &= 104 &&\Rightarrow y = 1\\
x + 2(1) - 1(1) &= 2 &&\Rightarrow x = 1
\end{aligned}
\]

Verification
Substitute (x, y, z) = (1, 1, 1) into original equations:

3(1) + 2(1) + 100(1) = 105

−1(1) + 3(1) + 100(1) = 102

1(1) + 2(1) − 1(1) = 2

All equations are satisfied.

Vector Norms
A vector norm is a function ‖·‖ from Rⁿ to R satisfying:

1. Positive definiteness: ‖x‖ ≥ 0, and ‖x‖ = 0 ⇐⇒ x = 0.

2. Homogeneity: ‖αx‖ = |α| ‖x‖.

3. Triangle inequality: ‖x + y‖ ≤ ‖x‖ + ‖y‖.

Common vector norms:
\[
\|x\|_1 = \sum_{i=1}^{n} |x_i|, \qquad
\|x\|_2 = \left( \sum_{i=1}^{n} |x_i|^2 \right)^{1/2}, \qquad
\|x\|_\infty = \max_{1 \le i \le n} |x_i|.
\]

Matrix Norms
A matrix norm ‖A‖ satisfies:

1. Positive definiteness: ‖A‖ ≥ 0, and ‖A‖ = 0 ⇐⇒ A = 0.

2. Homogeneity: ‖αA‖ = |α| ‖A‖.

3. Triangle inequality: ‖A + B‖ ≤ ‖A‖ + ‖B‖.

4. Submultiplicativity: ‖AB‖ ≤ ‖A‖ ‖B‖.

The induced (operator) norm based on a vector norm ‖·‖_v is:
\[
\|A\| = \max_{x \ne 0} \frac{\|Ax\|_v}{\|x\|_v}.
\]

Common induced norms:
\[
\|A\|_1 = \max_{1 \le j \le n} \sum_{i=1}^{n} |a_{ij}|, \qquad
\|A\|_\infty = \max_{1 \le i \le n} \sum_{j=1}^{n} |a_{ij}|, \qquad
\|A\|_2 = \sqrt{\rho(A^T A)}.
\]

Vector Norms Example

Let x = (3, −4, 12)ᵀ. Then
\[
\|x\|_1 = |3| + |-4| + |12| = 19, \qquad
\|x\|_2 = \sqrt{3^2 + (-4)^2 + 12^2} = \sqrt{169} = 13, \qquad
\|x\|_\infty = \max(|3|, |-4|, |12|) = 12.
\]

Matrix Norms Example

Let
\[
A = \begin{pmatrix} 1 & -2 & 3\\ -4 & 5 & -6 \end{pmatrix}.
\]
Absolute column sums: 5, 7, 9 ⇒ ‖A‖₁ = 9.
Absolute row sums: 6, 15 ⇒ ‖A‖∞ = 15.
Spectral norm: ‖A‖₂ ≈ 9.508.
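These values are easy to check with NumPy (a small sketch using the vectors and matrices above):

```python
import numpy as np

x = np.array([3.0, -4.0, 12.0])
print(np.linalg.norm(x, 1), np.linalg.norm(x, 2), np.linalg.norm(x, np.inf))
# 19.0 13.0 12.0

A = np.array([[1.0, -2.0, 3.0],
              [-4.0, 5.0, -6.0]])
print(np.linalg.norm(A, 1), np.linalg.norm(A, np.inf), np.linalg.norm(A, 2))
# 9.0 15.0 ~9.508  (max column sum, max row sum, largest singular value)
```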

Condition Number of a Matrix
Let A be a nonsingular matrix. The condition number of A with respect to a matrix norm ‖·‖ is defined as
\[
\kappa(A) = \|A\| \cdot \|A^{-1}\|.
\]
In the 2-norm (spectral norm) this becomes
\[
\kappa_2(A) = \frac{\sigma_{\max}(A)}{\sigma_{\min}(A)},
\]
where σ_max(A) and σ_min(A) are the largest and smallest singular values of A, respectively.
A matrix with κ(A) ≈ 1 is called well-conditioned, while a matrix with a large condition number is called ill-conditioned. The condition number provides an upper bound on the relative error in the solution of Ax = b under a perturbation δb of the right-hand side:
\[
\frac{\|\delta x\|}{\|x\|} \lesssim \kappa(A) \cdot \frac{\|\delta b\|}{\|b\|}.
\]

Example 1: Well-Conditioned Matrix

Let
\[
A = \begin{pmatrix} 2 & 0\\ 0 & 1 \end{pmatrix}.
\]
We have
\[
\|A\|_2 = \max(|2|, |1|) = 2, \qquad
A^{-1} = \begin{pmatrix} 1/2 & 0\\ 0 & 1 \end{pmatrix}, \qquad
\|A^{-1}\|_2 = 1.
\]
Thus,
\[
\kappa_2(A) = 2 \times 1 = 2.
\]

Example 2: Ill-Conditioned Matrix

Let
\[
B = \begin{pmatrix} 1 & 0.99\\ 0.99 & 0.98 \end{pmatrix}.
\]
The determinant is
\[
\det(B) = 1 \cdot 0.98 - 0.99 \cdot 0.99 = -0.0001,
\]
which is very small, so B⁻¹ will have large entries. Numerically,
\[
\|B\|_2 \approx 1.99, \qquad \|B^{-1}\|_2 \approx 2 \times 10^{4},
\]
so
\[
\kappa_2(B) \approx 4 \times 10^{4}.
\]

Example 3: Identity Matrix

For
\[
I = \begin{pmatrix} 1 & 0\\ 0 & 1 \end{pmatrix},
\]
we have
\[
\|I\|_2 = 1, \qquad \|I^{-1}\|_2 = 1, \qquad \kappa_2(I) = 1.
\]
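As a quick sanity check, NumPy can compute these condition numbers directly (a sketch using the matrices from Examples 1 and 2):

```python
import numpy as np

A = np.array([[2.0, 0.0], [0.0, 1.0]])
B = np.array([[1.0, 0.99], [0.99, 0.98]])

print(np.linalg.cond(A, 2))   # 2.0
print(np.linalg.cond(B, 2))   # roughly 3.9e4, i.e. ill-conditioned
```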

Definition
A square matrix A = [a_{ij}] ∈ R^{n×n} is called strictly diagonally dominant if:
\[
|a_{ii}| > \sum_{\substack{j=1\\ j \ne i}}^{n} |a_{ij}|, \qquad \text{for all } i = 1, 2, \ldots, n.
\]
That is, in each row of A, the magnitude of the diagonal entry is strictly greater than the sum of the magnitudes of the other (non-diagonal) entries.

Interpretation
Intuitively, strict diagonal dominance means that in each equation of the linear system Ax = b the coefficient of its own variable dominates, in magnitude, the coefficients of the other variables in that equation. This property often guarantees stability and convergence of iterative methods such as the Gauss–Seidel and Jacobi methods.

Example
Consider the matrix:
\[
A = \begin{pmatrix} 4 & -1 & 0\\ -1 & 4 & -1\\ 0 & -1 & 3 \end{pmatrix}.
\]
Check strict diagonal dominance row by row:
\[
|4| > |-1| + |0| = 1 \;(\text{True}), \qquad
|4| > |-1| + |-1| = 2 \;(\text{True}), \qquad
|3| > |0| + |-1| = 1 \;(\text{True}).
\]
Since all three inequalities hold, A is strictly diagonally dominant.

Non-example
The matrix
\[
B = \begin{pmatrix} 2 & 3\\ 1 & 1 \end{pmatrix}
\]
fails the first row test:
\[
|2| \not> |3| \quad\Rightarrow\quad \text{Not strictly diagonally dominant.}
\]

Applications
Strictly diagonally dominant matrices are important because:
• They guarantee nonsingularity (invertibility).
• They ensure convergence of iterative solvers.
• They often arise from discretizations of PDEs (e.g., Laplace equation).
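The row-wise condition is a one-line check in NumPy (a small sketch; the function name is our own):

```python
import numpy as np

def is_strictly_diagonally_dominant(A):
    """Return True if |a_ii| > sum_{j != i} |a_ij| for every row i."""
    D = np.abs(np.diag(A))
    off = np.sum(np.abs(A), axis=1) - D   # row sums of off-diagonal magnitudes
    return bool(np.all(D > off))

A = np.array([[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 3.0]])
B = np.array([[2.0, 3.0], [1.0, 1.0]])
print(is_strictly_diagonally_dominant(A), is_strictly_diagonally_dominant(B))  # True False
```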

Definition
A real symmetric matrix A ∈ R^{n×n} is said to be positive definite if:
\[
x^T A x > 0 \qquad \forall\, x \in \mathbb{R}^n,\ x \ne 0.
\]
Similarly, A is positive semi-definite if:
\[
x^T A x \ge 0 \qquad \forall\, x \in \mathbb{R}^n.
\]

Equivalent Conditions
For a real symmetric matrix A, the following are equivalent:

1. xᵀAx > 0 for all nonzero x.

2. All eigenvalues λᵢ > 0.

3. All leading principal minors are positive (Sylvester's criterion): det(A_k) > 0 for k = 1, 2, ..., n, where A_k is the top-left k × k submatrix of A.

Example
Consider the matrix:
\[
A = \begin{pmatrix} 2 & -1\\ -1 & 2 \end{pmatrix}.
\]

Quadratic form check

Let x = (x₁, x₂)ᵀ. Then:
\[
x^T A x = 2x_1^2 - 2x_1 x_2 + 2x_2^2 = (x_1 - x_2)^2 + x_1^2 + x_2^2 > 0
\]
for all nonzero x.

Eigenvalue check
We solve det(A − λI) = 0:
\[
\det \begin{pmatrix} 2-\lambda & -1\\ -1 & 2-\lambda \end{pmatrix} = (2-\lambda)^2 - 1 = 0,
\]
which simplifies to:
\[
\lambda^2 - 4\lambda + 3 = 0.
\]
Thus:
\[
\lambda = 1,\ 3 > 0.
\]
All eigenvalues are positive.

Principal minors check
\[
\det([2]) = 2 > 0, \qquad \det(A) = (2)(2) - (-1)(-1) = 4 - 1 = 3 > 0.
\]
Since all leading principal minors are positive, A is positive definite.
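In code, positive definiteness of a symmetric matrix is usually checked via its eigenvalues or by attempting a Cholesky factorization (a minimal sketch using the matrix above):

```python
import numpy as np

A = np.array([[2.0, -1.0], [-1.0, 2.0]])

# Eigenvalue test: all eigenvalues of a symmetric matrix must be positive.
print(np.linalg.eigvalsh(A))              # [1. 3.]
print(np.all(np.linalg.eigvalsh(A) > 0))  # True

# Cholesky test: np.linalg.cholesky succeeds only for positive definite matrices.
try:
    np.linalg.cholesky(A)
    print("positive definite")
except np.linalg.LinAlgError:
    print("not positive definite")
```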

0.3 Iterative methods


Loosely speaking, discretization of one- and two-dimensional problems yields linear systems that
are small enough to be efficiently solved with direct methods. However, three-dimensional problems usually lead to linear systems that are too large and expensive for direct methods. Instead, cheaper iterative methods must be utilized. Unlike direct methods, iterative methods do not have
a fixed number of floating point operations attached to them for computing the solution x to a
linear system. Instead, a sequence of approximations x(k) is sought successively, such that x(k) → x
in the limit k → ∞. Of course, the unspoken hope is that convergence in one way or another will
be reached with only a small number of iterations.

0.3.1 Basic Iterative methods


It is actually quite simple to create a framework for a set of basic iterative methods. To this end
consider, again, the linear system Ax = b, and let us split A into

A=M −K

where M is any non-singular matrix, and K the remainder K = M − A. This gives

(M − K)x = b
M x = Kx + b
x = M −1 Kx + M −1 b

Now, if we have a starting guess x(0) for x, this suggests the iteration

x(k+1) = M −1 Kx(k) + M −1 b.

• Although we do not know yet for which linear systems the above iteration scheme converges, if any, let us tacitly take it as our basic iterative method for solving linear systems.

• For this algorithm to be computationally practical, it is important that the splitting of A


is chosen such that M −1 K and M −1 b are easy to calculate, or at least their action on any
given vector. Recall that we do not want to compute any inverses.

• In the following we shall study splittings of A of the form

A=D−L−U

where D is the diagonal of A, and −U and −L the strictly upper and lower triangular part
of A, respectively. This leads to two classical iterative methods, known as the Jacobi and
the Gauss-Seidel methods.

0.3.2 The Jacobi Method
Jacobi iteration is defined by choosing M = D and K = L + U, which gives the iteration scheme
\[
x^{(k+1)} = D^{-1}(L + U)x^{(k)} + D^{-1}b.
\]
We observe that D is easy to invert, since it is a diagonal matrix. For example, for n = 3 we have
\[
\begin{pmatrix} x_1^{k+1}\\ x_2^{k+1}\\ x_3^{k+1} \end{pmatrix}
=
\begin{pmatrix} a_{11} & & \\ & a_{22} & \\ & & a_{33} \end{pmatrix}^{-1}
\begin{pmatrix} 0 & -a_{12} & -a_{13}\\ -a_{21} & 0 & -a_{23}\\ -a_{31} & -a_{32} & 0 \end{pmatrix}
\begin{pmatrix} x_1^{k}\\ x_2^{k}\\ x_3^{k} \end{pmatrix}
+
\begin{pmatrix} a_{11} & & \\ & a_{22} & \\ & & a_{33} \end{pmatrix}^{-1}
\begin{pmatrix} b_1\\ b_2\\ b_3 \end{pmatrix}.
\]
Hence, it gives
\[
\begin{aligned}
x_1^{k+1} &= a_{11}^{-1}(-a_{12}x_2^{k} - a_{13}x_3^{k} + b_1),\\
x_2^{k+1} &= a_{22}^{-1}(-a_{21}x_1^{k} - a_{23}x_3^{k} + b_2),\\
x_3^{k+1} &= a_{33}^{-1}(-a_{31}x_1^{k} - a_{32}x_2^{k} + b_3).
\end{aligned}
\]
Let's solve the following system of linear equations using the Jacobi method:
\[
\begin{aligned}
4x_1 - x_2 + 0x_3 &= 15,\\
-x_1 + 4x_2 - x_3 &= 10,\\
0x_1 - x_2 + 4x_3 &= 10.
\end{aligned}
\]
This corresponds to the matrix form Ax = b, where
\[
A = \begin{pmatrix} 4 & -1 & 0\\ -1 & 4 & -1\\ 0 & -1 & 4 \end{pmatrix}, \qquad
x = \begin{pmatrix} x_1\\ x_2\\ x_3 \end{pmatrix}, \qquad
b = \begin{pmatrix} 15\\ 10\\ 10 \end{pmatrix}.
\]
The Jacobi method updates each component of x iteratively as follows:
\[
\begin{aligned}
x_1^{(k+1)} &= \tfrac{1}{4}\bigl(15 + x_2^{(k)}\bigr),\\
x_2^{(k+1)} &= \tfrac{1}{4}\bigl(10 + x_1^{(k)} + x_3^{(k)}\bigr),\\
x_3^{(k+1)} &= \tfrac{1}{4}\bigl(10 + x_2^{(k)}\bigr).
\end{aligned}
\]
Given an initial guess x^{(0)} = (0, 0, 0)ᵀ, the iterations proceed until the solution converges.
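A compact NumPy sketch of the Jacobi iteration for this system (the function name, tolerance, and iteration cap are our own choices):

```python
import numpy as np

def jacobi(A, b, x0, tol=1e-8, max_iter=500):
    """Jacobi iteration x_(k+1) = D^{-1}((L+U) x_k + b), stopping on a small update."""
    D = np.diag(A)
    R = A - np.diagflat(D)            # off-diagonal part of A, i.e. -(L + U)
    x = x0.astype(float)
    for _ in range(max_iter):
        x_new = (b - R @ x) / D
        if np.linalg.norm(x_new - x, np.inf) < tol:
            return x_new
        x = x_new
    return x

A = np.array([[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]])
b = np.array([15.0, 10.0, 10.0])
print(jacobi(A, b, np.zeros(3)))   # approx (4.911, 4.643, 3.661)
```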

0.3.3 The Gauss-Seidel Method
In the Gauss-Seidel method we take M = D − L and K = U, which gives the iteration scheme
\[
x^{(k+1)} = (D - L)^{-1}U x^{(k)} + (D - L)^{-1}b.
\]
We observe that since D − L has a lower triangular structure, the effect of (D − L)^{-1} can be computed by forward substitution. For example, for n = 3 we have
\[
\begin{aligned}
x_1^{k+1} &= a_{11}^{-1}(-a_{12}x_2^{k} - a_{13}x_3^{k} + b_1),\\
x_2^{k+1} &= a_{22}^{-1}(-a_{21}x_1^{k+1} - a_{23}x_3^{k} + b_2),\\
x_3^{k+1} &= a_{33}^{-1}(-a_{31}x_1^{k+1} - a_{32}x_2^{k+1} + b_3).
\end{aligned}
\]

Consider the linear system:
\[
\begin{aligned}
4x_1 - x_2 + 0x_3 &= 15,\\
-x_1 + 4x_2 - x_3 &= 10,\\
0x_1 - x_2 + 4x_3 &= 10.
\end{aligned}
\]
This can be written in matrix form as:
\[
\begin{pmatrix} 4 & -1 & 0\\ -1 & 4 & -1\\ 0 & -1 & 4 \end{pmatrix}
\begin{pmatrix} x_1\\ x_2\\ x_3 \end{pmatrix}
=
\begin{pmatrix} 15\\ 10\\ 10 \end{pmatrix}.
\]
Using the Gauss-Seidel method, the iterative updates for each variable are:
\[
\begin{aligned}
x_1^{(k+1)} &= \tfrac{1}{4}\bigl(15 + x_2^{(k)}\bigr),\\
x_2^{(k+1)} &= \tfrac{1}{4}\bigl(10 + x_1^{(k+1)} + x_3^{(k)}\bigr),\\
x_3^{(k+1)} &= \tfrac{1}{4}\bigl(10 + x_2^{(k+1)}\bigr).
\end{aligned}
\]
Starting with an initial guess x^{(0)} = (0, 0, 0)ᵀ, the iterations proceed as follows:

• Iteration 1:
\[
\begin{aligned}
x_1^{(1)} &= \tfrac{1}{4}(15 + 0) = 3.75,\\
x_2^{(1)} &= \tfrac{1}{4}(10 + 3.75 + 0) = 3.4375,\\
x_3^{(1)} &= \tfrac{1}{4}(10 + 3.4375) = 3.359375.
\end{aligned}
\]

• Iteration 2:
\[
\begin{aligned}
x_1^{(2)} &= \tfrac{1}{4}(15 + 3.4375) = 4.609375,\\
x_2^{(2)} &= \tfrac{1}{4}(10 + 4.609375 + 3.359375) = 4.4921875,\\
x_3^{(2)} &= \tfrac{1}{4}(10 + 4.4921875) = 3.623046875.
\end{aligned}
\]

• Continue iterating until convergence (see the sketch below).
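A NumPy sketch of the Gauss-Seidel iteration for the same system (again, the function name and tolerance are our own choices):

```python
import numpy as np

def gauss_seidel(A, b, x0, tol=1e-8, max_iter=500):
    """Gauss-Seidel iteration: use already-updated components within each sweep."""
    n = len(b)
    x = x0.astype(float)
    for _ in range(max_iter):
        x_old = x.copy()
        for i in range(n):
            s = A[i, :i] @ x[:i] + A[i, i + 1:] @ x[i + 1:]
            x[i] = (b[i] - s) / A[i, i]
        if np.linalg.norm(x - x_old, np.inf) < tol:
            break
    return x

A = np.array([[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]])
b = np.array([15.0, 10.0, 10.0])
print(gauss_seidel(A, b, np.zeros(3)))   # approx (4.911, 4.643, 3.661)
```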

0.3.4 Convergence Analysis
We now return to the question of convergence of our basic iterative method. By inspection, we see that it can be rewritten as
\[
x^{(k+1)} = Hx^{(k)} + c \tag{1}
\]
where H = M^{-1}K is the iteration matrix, and c = M^{-1}b. Let e^{(k)} = x^{(k)} − x be the error after k iterations. A relation between the errors in successive approximations can be derived by subtracting x = Hx + c from (1):
\[
e^{(k+1)} = x^{(k+1)} - x = H(x^{(k)} - x) = \cdots = H^{k+1}(x^{(0)} - x) = H^{k+1}e^{(0)}.
\]
For convergence, we require H^{k+1}e → 0 as k → ∞ for any e. It turns out that this requirement is equivalent to ρ(H) < 1, where ρ(H) = max_{1≤i≤n} |λ_i(H)| is the so-called spectral radius of H, that is, the magnitude of the extremal eigenvalue of H.

• Both the Jacobi and Gauss-Seidel methods converge if A is strictly diagonally dominant.

• The Gauss-Seidel method also converges if A is symmetric positive definite (SPD).

Convergence of the Jacobi Method

Given the system
\[
Ax = b, \qquad A \in \mathbb{R}^{n \times n},
\]
write
\[
A = D - L - U,
\]
where D is diagonal, L is strictly lower triangular, and U is strictly upper triangular.
The Jacobi iteration is
\[
x^{(k+1)} = D^{-1}(L + U)x^{(k)} + D^{-1}b,
\]
with iteration matrix
\[
H_J = D^{-1}(L + U).
\]

Convergence Condition
The Jacobi method converges if and only if
\[
\rho(H_J) < 1,
\]
where ρ(·) denotes the spectral radius.

Sufficient Condition
If A is strictly diagonally dominant, i.e.,
\[
|a_{ii}| > \sum_{j \ne i} |a_{ij}| \qquad \forall i,
\]
then
\[
\|H_J\|_\infty = \max_i \frac{\sum_{j \ne i} |a_{ij}|}{|a_{ii}|} < 1
\]
and hence ρ(H_J) ≤ ‖H_J‖∞ < 1, guaranteeing convergence.

Convergence of the Gauss-Seidel Method
For the Gauss-Seidel method, the iteration matrix is given by:
\[
H_{GS} = (D - L)^{-1} U,
\]
where A = D − L − U is the decomposition of matrix A into its diagonal (D), strictly lower triangular (L), and strictly upper triangular (U) parts.
The eigenvalues λ of H_{GS} satisfy the characteristic equation:
\[
\det(H_{GS} - \lambda I) = \det\bigl((D - L)^{-1}U - \lambda I\bigr) = 0.
\]
Using properties of determinants, this can be rewritten as:
\[
\det\bigl((D - L)^{-1}(U - \lambda(D - L))\bigr) = 0.
\]
Since det((D − L)^{-1}) ≠ 0, this simplifies to:
\[
\det(U - \lambda(D - L)) = 0.
\]
Multiplying by (−1) and rearranging terms, we obtain:
\[
\det(\lambda D - \lambda L - U) = 0.
\]
Let us denote:
\[
A_\lambda = \lambda D - \lambda L - U.
\]

Key Observation: Diagonal Dominance

If A = D − L − U is strictly diagonally dominant, then for any λ with |λ| ≥ 1, the matrix A_λ remains strictly diagonally dominant.
Proof of Diagonal Dominance: For each row i, the diagonal entry of A_λ is λa_{ii}, and the off-diagonal entries are scaled by λ (for the strictly lower triangular part) or remain unchanged (for the strictly upper triangular part). Since |λ| ≥ 1, the magnitude of the diagonal entry |λa_{ii}| grows at least as fast as the sum of the magnitudes of the off-diagonal entries. Thus:
\[
|\lambda a_{ii}| > |\lambda| \sum_{j<i} |a_{ij}| + \sum_{j>i} |a_{ij}| \ge \sum_{j \ne i} |(A_\lambda)_{ij}|,
\]
which ensures that A_λ is strictly diagonally dominant.

Implications for Eigenvalues

A strictly diagonally dominant matrix is nonsingular (its determinant cannot be zero). Therefore:
\[
\det(A_\lambda) \ne 0 \quad \text{for } |\lambda| \ge 1.
\]
However, the eigenvalue equation requires det(A_λ) = 0. This leads to a contradiction unless |λ| < 1.
Conclusion: All eigenvalues λ of H_{GS} must satisfy |λ| < 1, which guarantees the convergence of the Gauss-Seidel method for strictly diagonally dominant matrices.

Extension to the Jacobi Method
The proof for the Jacobi method follows a similar reasoning. The iteration matrix for Jacobi is:
\[
H_J = D^{-1}(L + U).
\]
The eigenvalue analysis leads to:
\[
\det(\lambda D - L - U) = 0.
\]
If A is strictly diagonally dominant, then for |λ| ≥ 1, the matrix λD − L − U is also strictly diagonally dominant and thus nonsingular. Hence, |λ| < 1, proving convergence.
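These spectral radii are easy to check numerically for the example system used above (a sketch; the sign convention A = D − L − U is the one used in these notes):

```python
import numpy as np

A = np.array([[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]])
D = np.diag(np.diag(A))
L = -np.tril(A, -1)      # strictly lower part, so that A = D - L - U
U = -np.triu(A, 1)       # strictly upper part

H_J  = np.linalg.solve(D, L + U)       # Jacobi iteration matrix D^{-1}(L+U)
H_GS = np.linalg.solve(D - L, U)       # Gauss-Seidel iteration matrix (D-L)^{-1}U

spectral_radius = lambda H: max(abs(np.linalg.eigvals(H)))
print(spectral_radius(H_J), spectral_radius(H_GS))   # both < 1, so both methods converge
```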

0.4 Successive over-relaxation (SOR) method

• The SOR method is an enhancement of the Gauss-Seidel method, introducing a relaxation factor to accelerate convergence.

• The SOR method builds upon the Gauss-Seidel method, which iteratively updates each component of the solution vector by solving the corresponding equation assuming all other components remain fixed.

• Relaxation Factor (ω): SOR introduces a relaxation factor ω (with 0 < ω < 2) to modify the update rule, allowing for over-relaxation and faster convergence. If ω = 1, the method reduces to the Gauss-Seidel method. If 1 < ω < 2, the method is over-relaxed, often leading to faster convergence. If 0 < ω < 1, the method is under-relaxed. For ω outside the interval (0, 2) the method cannot converge.

• SOR Algorithm: Given a system Ax = b, where A is decomposed as A = D − L − U (with D being the diagonal, L the strictly lower triangular, and U the strictly upper triangular parts of A), the SOR iteration is defined by:
\[
x^{k+1} = Hx^{k} + c,
\]
where
\[
H = (D - \omega L)^{-1}\bigl((1 - \omega)D + \omega U\bigr), \qquad c = \omega (D - \omega L)^{-1} b.
\]
The alternative (componentwise) form is as follows:
\[
x_i^{k+1} = (1 - \omega)x_i^{k} + \frac{\omega}{a_{ii}}\Bigl(b_i - \sum_{j<i} a_{ij}x_j^{k+1} - \sum_{j>i} a_{ij}x_j^{k}\Bigr)
\]
for i = 1, 2, ..., n, where x_i^{k+1} is the updated value of the ith component of the solution at iteration k + 1.
• Consider the system of linear equations:
\[
\begin{pmatrix} 4 & -1 & 0\\ -1 & 4 & -1\\ 0 & -1 & 4 \end{pmatrix}
\begin{pmatrix} x_1\\ x_2\\ x_3 \end{pmatrix}
=
\begin{pmatrix} 15\\ 10\\ 10 \end{pmatrix}
\]
We apply the Successive Over-Relaxation (SOR) method with an initial guess x^{(0)} = (0, 0, 0)ᵀ and relaxation factor ω = 1.25.

Iteration 1
\[
\begin{aligned}
x_1^{(1)} &= (1 - 1.25)\cdot 0 + \tfrac{1.25}{4}\bigl(15 - (-1)\cdot 0 - 0\bigr) = 4.6875\\
x_2^{(1)} &= (1 - 1.25)\cdot 0 + \tfrac{1.25}{4}\bigl(10 - (-1)\cdot 4.6875 - (-1)\cdot 0\bigr) = 4.58984375\\
x_3^{(1)} &= (1 - 1.25)\cdot 0 + \tfrac{1.25}{4}\bigl(10 - (-1)\cdot 4.58984375\bigr) = 4.559326171875
\end{aligned}
\]
Thus,
\[
x^{(1)} = \begin{pmatrix} 4.6875\\ 4.58984375\\ 4.559326171875 \end{pmatrix}.
\]

Iteration 2
\[
\begin{aligned}
x_1^{(2)} &= (1 - 1.25)\cdot 4.6875 + \tfrac{1.25}{4}\bigl(15 - (-1)\cdot 4.58984375 - 0\bigr) = 4.949951171875\\
x_2^{(2)} &= (1 - 1.25)\cdot 4.58984375 + \tfrac{1.25}{4}\bigl(10 - (-1)\cdot 4.949951171875 - (-1)\cdot 4.559326171875\bigr) = 4.949188232421875\\
x_3^{(2)} &= (1 - 1.25)\cdot 4.559326171875 + \tfrac{1.25}{4}\bigl(10 - (-1)\cdot 4.949188232421875\bigr) \approx 3.531789779663086
\end{aligned}
\]
Thus,
\[
x^{(2)} \approx \begin{pmatrix} 4.949951171875\\ 4.949188232421875\\ 3.531789779663086 \end{pmatrix}.
\]
Continuing in this way, the iterates approach the exact solution x ≈ (4.9107, 4.6429, 3.6607)ᵀ.
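A short NumPy sketch of the componentwise SOR update used above (function name, tolerance, and iteration cap are our own choices); its first two sweeps reproduce the iterates shown above:

```python
import numpy as np

def sor(A, b, omega, x0, tol=1e-10, max_iter=1000):
    """Successive over-relaxation: a Gauss-Seidel sweep blended with the old iterate."""
    n = len(b)
    x = x0.astype(float)
    for _ in range(max_iter):
        x_old = x.copy()
        for i in range(n):
            s = A[i, :i] @ x[:i] + A[i, i + 1:] @ x[i + 1:]
            x[i] = (1 - omega) * x[i] + (omega / A[i, i]) * (b[i] - s)
        if np.linalg.norm(x - x_old, np.inf) < tol:
            break
    return x

A = np.array([[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]])
b = np.array([15.0, 10.0, 10.0])
print(sor(A, b, omega=1.25, x0=np.zeros(3)))   # approx (4.9107, 4.6429, 3.6607)
```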

