KEMBAR78
Solving Linear Equation Systems | PDF | System Of Linear Equations | Equations
0% found this document useful (0 votes)
42 views12 pages

Solving Linear Equation Systems

Uploaded by

eacar.manuel
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
42 views12 pages

Solving Linear Equation Systems

Uploaded by

eacar.manuel
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 12

BASIC MATRIX ALGEBRA

Solving linear systems of equations


Data Science Institute, Hasselt University

Contents
Theory . . . . . . . . . . . . . . . . . . . . 1

Examples . . . . . . . . . . . . . . . . . . 4

R-software . . . . . . . . . . . . . . . . . . 6

Exercises . . . . . . . . . . . . . . . . . . . 7

Advanced: Statistical Applications . . 7

Acknowledgment . . . . . . . . . . . . . 9

Solutions Exercises . . . . . . . . . . . . 10

Theory
Many problems in statistics, engineering, etc reduce to the solution of an equation or a set of
equations. An equation is a type of mathematical expression which contains one or more un-
known quantities which you will be required to find. The unknown quantities are referred to as
parameter estimates in statistics.

It is quite hard to solve non-linear systems of equations, while linear systems are quite easy
to study. There are numerical techniques which help to approximate non-linear systems with
linear ones in the hope that the solutions of the linear systems are close enough to the solutions
of the non-linear systems. We will not discuss this here. Instead, we will focus our attention on
linear systems.

Linear system of equations

Definition
The equation
ax + by + cz + dw = h,
where a, b, c, d, and h are known numbers, while x, y, z, and w are unknown numbers, is
called a linear equation. If h = 0, the linear equation is said to be homogeneous. A linear
system is a set of linear equations and a homogeneous linear system is a set of homogeneous
linear equations.

For example, 
2x − 3y = 1
x + 3y = −2

1
Data Science Institute, Hasselt University

and

x+y−z =1
x + 3y + 3z = −2
are linear systems, while

2x − 3y 2 = −1

x+y+z =1
is a non-linear system (because of y 2 ).

The system

 2x −3y −3z +w =0
x +3y =0
x −y +w =0

is an homogeneous linear system.

Generally, when more than one unknown needs to be found, additional equations are neces-
sary. To completely determine two unknowns x and y, at least two equations are needed. To
make it clear that these connections must hold simultaneously, they are written in a system of
equations. In this document, we discuss methods used to solve such systems of linear equations.
For the sake of simplicity, we will restrict ourselves to three, at most four, unknowns. The reader
interested in the case of more unknowns may easily extend the following ideas.

Matrix representation of a linear system

Matrices are helpful in rewriting a linear system in a very simple form. The algebraic prop-
erties of matrices may then be used to solve systems. First, consider the linear system


 ax +by +cz +dw = e
f x +gy +hz +iw = j


 kx +ly +mz +nw = p
qx +ry +sz +tw = u

Set the matrices


     
a b c d e x
 f g h i   j
 and X =  y 
  
A=
 k
, C = 
l m n   p   z 
q r s t u w
Using matrix multiplications, we can rewrite the linear system above as the matrix equation
A.X = C.
As you can see this is far nicer than the equations. But sometimes it is worth to solve the system
directly without going through the matrix form. The matrix A is called the matrix coefficient
of the linear system. The matrix C is called the non-homogeneous term. When C = O,
the linear system is homogeneous. The matrix X is the unknown matrix. Its entries are the
unknowns of the linear system. The augmented matrix associated with the system is the
matrix [A|C], where
 
a b c d e
 f g h i j 
[A|C] = 
 k l m n p 

q r s t u

In general, if the linear system has n equations with m unknowns, then the coefficients matrix
will be a n × m matrix and the augmented matrix an n × (m + 1) matrix. Now we turn our
attention to the solutions of a system.

2
Data Science Institute, Hasselt University

Solving linear systems of equations

Definition
Two linear systems with n unknowns are said to be equivalent if and only if they have the
same set of solutions.

This definition is important since the idea behind solving a system is to find an equivalent sys-
tem which is easy to solve. You may wonder how we will come up with such system? Easy,
we do that through elementary operations. Indeed, it is clear that if we interchange two
equations, the new system is still equivalent to the old one. If we multiply an equation with a
nonzero number, we obtain a new system still equivalent to the old one. And finally replacing
one equation with the sum of two equations, we again obtain an equivalent system. These op-
erations are called elementary operations on systems. Let us see how it works in a particular case.

Consider the linear system



 x +y +z =0
x −2y +2z =4
x +2y −z =2

The idea is to keep the first equation and work on the last two. In doing that, we will try to kill
one of the unknowns and solve for the other two. For example, if we keep the first and second
equation, and subtract the first one from the last one, we get the equivalent system

 x +y +z = 0
x −2y +2z = 4
y −2z = 2

Next we keep the first and the last equation, and we subtract the first from the second. We get
the equivalent system

 x +y +z = 0
−3y +z = 4
y −2z = 2

Now we focus on the second and the third equation. We repeat the same procedure. Try to kill
one of the two unknowns (y or z). Indeed, we keep the first and second equation, and we add
the second to the third after multiplying it by 3. We get

 x +y +z = 0
−3y +z = 4
−5z = 10

This obviously implies z = −2. From the second equation, we get y = −2, and finally from the
first equation we get x = 4. Therefore, the linear system has one solution

x = 4, y = −2, z = −2

Going from the last equation to the first while solving for the unknowns is called back-solving.

Keep in mind that linear systems for which the matrix coefficient is upper-triangular are easy
to solve. This is particularly true, if the matrix is in echelon form. So the trick is to perform
elementary operations to transform the initial linear system into another one for which the co-
efficient matrix is in echelon form. Using our knowledge about matrices, is there anyway we
can rewrite what we did above in matrix form which will make our notation (or representation)
easier? Indeed, consider the augmented matrix
 
1 1 1 0
 1 −2 2 4 
1 2 −1 2

3
Data Science Institute, Hasselt University

Let us perform some elementary row operations on this matrix. Indeed, if we keep the first and
second row, and subtract the first one from the last one we get
 
1 1 1 0
 1 −2 2 4 
0 1 −2 2
Next we keep the first and the last rows, and we subtract the first from the second. We get
 
1 1 1 0
 0 −3 1 4 
0 1 −2 2
Then we keep the first and second row, and we add the second to the third after multiplying it
by 3 to get
 
1 1 1 0
 0 −3 1 4 
0 0 −5 10
This is a triangular matrix which is not in echelon form. The linear system for which this matrix
is an augmented one is

 x +y +z = 0
−3y +z = 4
−5z = 10

As you can see we obtained the same system as before. In fact, we followed the same elementary
operations performed above. In every step the new matrix was exactly the augmented matrix
associated to the new system. This shows that instead of writing the systems over and over
again, it is easy to play around with the elementary row operations and once we obtain a trian-
gular matrix, write the associated linear system and then solve it. This is known as Gaussian
elimination. Let us summarize the procedure:

Gaussian Elimination. Consider a linear system


1 Construct the augmented matrix for the system;
2 Use elementary row operations to transform the augmented matrix into a triangular one;
3 Write down the new linear system for which the triangular matrix is the associated aug-
mented matrix;
4 Solve the new system. You may need to assign some parametric values to some unknowns,
and then apply the method of back substitution to solve the new system.

Examples
Example 1: Solve the following system via Gaussian elimination


 2x −3y −z +2w +3v = 4
4x −4y −z +4w +11v = 4

 2x
 −5y −2z +2w −v = 9
2y +z +4v = −5

The augmented matrix is


 
2 −3 −1 2 3 4
 4
 −4 −1 4 11 4 

 2 −5 −2 2 −1 9 
0 2 1 0 4 −5

4
Data Science Institute, Hasselt University

We use elementary row operations to transform this matrix into a triangular one. We keep the
first row and use it to produce all zeros elsewhere in the first column. We have
 
2 −3 −1 2 3 4
 0
 2 1 0 5 −4  
 0 −2 −1 0 −4 5 
0 2 1 0 4 −5
Next we keep the first and second row and try to have zeros in the second column. We get
 
2 −3 −1 2 3 4
 0
 2 1 0 5 −4  
 0 0 0 0 1 1 
0 0 0 0 −1 −1
Next we keep the first three rows. We add the last one to the third to get
 
2 −3 −1 2 3 4
 0
 2 1 0 5 −4  
 0 0 0 0 1 1 
0 0 0 0 0 0
This is a triangular matrix. Its associated system is

 2x −3y −z +2w +3v = 4
2y +z +5v = −4
v = 1

Clearly we have v = 1. Set z = s and w = t, then we have


1 5 9 1
y = −2 − z − v = − − s
2 2 2 2
The first equation implies
3 1 3
x=2+ y+ z−w− v
2 2 2
Using algebraic manipulations, we get
25 1
x=− − s−t
4 4
Putting everything together, we have

− 25 1
   
x 4 − 4s − t
 y

 
  − 2 − 21 s
9 

 z = s .
   
 w   t 
v 1

Example 2. Use Gaussian elimination to solve the linear system



x −y = 4
2x −2y = −4

The associated augmented matrix is


 
1 −1 4
2 −2 −4
We keep the first row and subtract the first row multiplied by 2 from the second row. We get
 
1 −1 4
0 0 −12

5
Data Science Institute, Hasselt University

This is a triangular matrix. The associated system is



x −y = 4
0 = −12
Clearly the second equation implies that this system has no solution. Therefore, this linear
system has no solution.

Definition
A linear system is called inconsistent or overdetermined if it does not have a solu-
tion. In other words, the set of solutions is empty. Otherwise the linear system is called
consistent.

Following the example above, we see that if we perform elementary row operations on the
augmented matrix of the system and get a matrix with one of its rows equal to (0, 0, . . . , c),
where c ̸= 0, then the system is inconsistent.

R-software
R is a free software environment for statistical computing and graphics. It compiles and runs on
a wide variety of UNIX platforms, Windows and MacOS. You can download this software for
free from the website (https://www.r-project.org/). You can also use the basic functions of
R without installing the software using R-Fiddle (http://www.r-fiddle.org/). R-Fiddle is a
tool that provides you with a free and powerful environment to write and run R-code right inside
your browser.

Solve the following system of equations:



 x +2y +3z = 20
2x +5y +9z = 100
5x +7y +8z = 200

The corresponding matrices are:


   
1 2 3 20
A= 2 5 9 , C =  100 
5 7 8 200
##### Solve the system using R
> A <- matrix(data=c(1, 2, 3, 2, 5, 9, 5, 7, 8), nrow=3, ncol=3, byrow=TRUE)
> A # To print the matrix A
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 2 5 9
[3,] 5 7 8
> b <- matrix(data=c(20, 100, 200), nrow=3, ncol=1, byrow=FALSE)
> b ## To print the matrix b
[,1]
[1,] 20
[2,] 100
[3,] 200
> solve(A, b) # Function used to solve the system of equations
[,1]
[1,] 320
[2,] -360
[3,] 140

6
Data Science Institute, Hasselt University

Exercises
Solve the following systems of linear equations (according to the values of the parameters).

1.  6. 
 x −y +z =4  x +y +z = 6
2x −y =6 x −y −z = −4
x +y −z =6 3x +y +z = 8
 

2.  7.
 x −y +z =4 
 x −y −z = −2
2x −y +2z =6 2y +4z = a
x +y −3z =0

x +y +az = 1

3. 
 0.4x +0.6y =1 8. 
0.5x +y/3 +z/6 =1  x +ay −3z = 1

x +y −z =1 y −3z = a
x +y +2z = a

4. 

 x +y +5z +w =1
y +2z +w =0


 x +3y +7z +2w =1
x +y +5z +2w =1

5. 
 2x +y +3z =1
2x +6y +8z =3
6x +8y +18z =5

The first five exercises can be solved using R or R-Fiddle, but a manual calculation is recom-
mended as a preparation for the last three exercises, which can only be solved manually.

Advanced: Statistical Applications


The classical linear regression model is given by
yi = β0 + x1i β1 + . . . + xpi βp + εi ,
for i = 1, . . . , n. The parameters β = (β0 , β1 , . . . , βp )T can be estimated using the method of
least squares which corresponds to
X n
minimize ri2 ,
β
i=1
where ri = yi − β0 − x1i β1 − . . . − xpi βp are the residuals.

Let us denote the vector of the least squares estimated regression coefficients b0 , b1 , . . . , bp as b:
 
b0
 b1 
b= .
 
 ..


bp−1
The least square normal equations for the general linear regression model are:

X ′ Xb = X ′ Y
which is similar to the matrix representation of a system of linear equations shown before. The
Gaussian elimination, procedure presented above, can be used to solve this problem. However,
some other appropriate techniques are implemented in most of statistical software to solve the
normal equations in efficient ways.

7
Data Science Institute, Hasselt University

Illustration

Given the data:


Y1 X1 X2 X3 X4 X5
125 113 13 18 25 11
158 115 39 18 59 30
207 126 52 50 62 53
182 119 29 43 50 29
196 107 50 37 65 56
175 135 64 19 79 49
145 111 11 27 17 14
144 130 22 23 31 17
160 122 30 18 34 22
175 114 51 11 58 40
151 121 27 15 29 31
161 105 41 22 53 39
200 131 51 52 75 36
173 123 37 36 44 27
175 121 23 48 27 20
162 120 43 15 65 36
155 109 38 19 62 37
230 130 62 56 75 50
162 134 28 30 36 20
153 124 30 25 41 33
Use the Gaussian elimination method to find the regression coefficients of a statistical model in
which Y1 is the dependent variable and two independent variables are X1 and X2.
We fit a regression model of the form

Y1i = β0 + X1i β1 + X2i β2 + εi ,

The corresponding matrix representation is: X ′ Xb = X ′ Y where X = [1, X1 , X2 ].

> #### Solution


> data = matrix(c(125, 113, 13, 18, 25, 11,
+ 158, 115, 39, 18, 59, 30,
+ 207, 126, 52, 50, 62, 53,
+ 182, 119, 29, 43, 50, 29,
+ 196, 107, 50, 37, 65, 56,
+ 175, 135, 64, 19, 79, 49,
+ 145, 111, 11, 27, 17, 14,
+ 144, 130, 22, 23, 31, 17,
+ 160, 122, 30, 18, 34, 22,
+ 175, 114, 51, 11, 58, 40,
+ 151, 121, 27, 15, 29, 31,
+ 161, 105, 41, 22, 53, 39,
+ 200, 131, 51, 52, 75, 36,
+ 173, 123, 37, 36, 44, 27,
+ 175, 121, 23, 48, 27, 20,
+ 162, 120, 43, 15, 65, 36,
+ 155, 109, 38, 19, 62, 37,
+ 230, 130, 62, 56, 75, 50,
+ 162, 134, 28, 30, 36, 20,
+ 153, 124, 30, 25, 41, 33),20,6,byrow=T);
>
> data = as.data.frame(data);
> names(data) = c(’Y1’,’X1’,’X2’,’X3’,’X4’,’X5’);

8
Data Science Institute, Hasselt University

> Y<-data[,1]
> X<-as.matrix(data.frame(inter=rep(1,nrow(data)),data[,2:3]))
> XX<-t(X)%*%X
> XX
inter X1 X2
inter 20 2410 741
X1 2410 291916 89923
X2 741 89923 31667
>
> XY<-t(X)%*%Y
> XY
[,1]
inter 3389
X1 409662
X2 130863
>
> solve(XX,XY)
[,1]
inter 82.923571
X1 0.347247
X2 1.206023
> #### Solution using statistical package
> RES = lm(formula = Y ~ X1 + X2, data = data);
> summary(RES)

Call:
lm(formula = Y ~ X1 + X2, data = data)

Residuals:
Min 1Q Median 3Q Max
-31.987 -10.849 -3.986 11.604 27.161

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 82.9236 51.1410 1.621 0.123316
X1 0.3472 0.4362 0.796 0.437013
X2 1.2060 0.2613 4.616 0.000246 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 16.42 on 17 degrees of freedom


Multiple R-squared: 0.5989,Adjusted R-squared: 0.5517
F-statistic: 12.69 on 2 and 17 DF, p-value: 0.0004245

Acknowledgment
This contribution was written by M. Aerts, L. Garcia Barrado, T. Bigirumurame, N. Cosemans
& Y. Vandendijck.

Last update: January, 2025

Elaborate reading material includes


• Chapter “Systems of Linear Equations” in “Fundamentals of Matrix Algebra” by Hartman
G. (3th edition in 2011)

9
Data Science Institute, Hasselt University

Solutions Exercises
1.      
1 −1 1 4 1 −1 1 4 1 −1 1 4
 2 −1 0 6 ∼ 0 1 −2 −2  ∼  0 1 −2 −2 
1 1 −1 6 0 2 −2 2 0 0 2 6

 
 x −y +z = 4  x = 5
⇒ y −2z = −2 ⇔ y = 4
2z = 6 z = 3
 

Using R:

> A = matrix(c(1,-1,1,2,-1,0,1,1,-1),3,3,byrow=T);
> b = matrix(c(4,6,6),3,1);
>
> solve(A,b)
[,1]
[1,] 5
[2,] 4
[3,] 3

2.      
1 −1 1 4 1 −1 1 4 1 −1 1 4
 2 −1 2 6 ∼ 0 1 0 −2  ∼  0 1 0 −2 
1 1 −3 0 0 2 −4 −4 0 0 −4 0

 
 x −y +z = 4  x = 2
⇒ y = −2 ⇔ y = −2
−4z = 0 z = 0
 

Using R:

> A = matrix(c(1,-1,1,2,-1,2,1,1,-3),3,3,byrow=T);
> b = matrix(c(4,6,0),3,1);
>
> solve(A,b)
[,1]
[1,] 2
[2,] -2
[3,] 0

3.
2 3
       
5 5 0 1 1 1 −1 1 1 1 −1 1 1 1 −1 1
 1
2
1
3
1
6 1  ∼ *  12 1
3
1
6 1 ∼ 0 − 16 4
6
1
2
∼ 0 − 61 4
6
1
2

2 3 1 2 3 6 6
1 1 −1 1 5 5 0 1 0 5 5 5 0 0 5 5

 
 x +y −z = 1  x = 1
⇒ − 16 y + 46 z = 1
2 ⇔ y = 1
6 6
5z = z = 1
 
5

*
Tip: Switch first and third row for easier calculations.

10
Data Science Institute, Hasselt University

Using R:

> A = matrix(c(0.4,0.6,0,0.5,1/3,1/6,1,1,-1),3,3,byrow=T);
> b = matrix(c(1,1,1),3,1);
>
> solve(A,b)
[,1]
[1,] 1
[2,] 1
[3,] 1

4.      
1 1 5 1 1 1 1 5 1 1 1 1 5 1 1
 0 1 2 1 0   0 1 2 1 0   0 1 2 1 0 
 ∼ ∼ 
 1 3 7 2 1   0 2 2 1 0   0 0 −2 −1 0 
1 1 5 2 1 0 0 0 1 0 0 0 0 1 0

 

 x +y +5z +w = 1 
 x = 1
y +2z +w = 0 y = 0
 
⇒ ⇔

 −2z −w = 0 
 z = 0
w = 0 w = 0
 

Using R:

> A = matrix(c(1,1,5,1,0,1,2,1,1,3,7,2,1,1,5,2),4,4,byrow=T);
> b = matrix(c(1,0,1,1),4,1);
>
> solve(A,b)
[,1]
[1,] 1
[2,] 0
[3,] 0
[4,] 0

5.      
2 1 3 1 2 1 3 1 2 1 3 1
 2 6 8 3 ∼ 0 5 5 2 ∼ 0 5 5 2 
6 8 18 5 0 5 9 2 0 0 4 0

  3
 2x +y +3z = 1  x = 10
2
⇒ 5y +5z = 2 ⇔ y = 5
4z = 0 z = 0
 

Using R:

> A = matrix(c(2,1,3,2,6,8,6,8,18),3,3,byrow=T);
> b = matrix(c(1,3,5),3,1);
>
> solve(A,b)
[,1]
[1,] 3.000000e-01
[2,] 4.000000e-01
[3,] -5.551115e-17

11
Data Science Institute, Hasselt University

6.      
1 1 1 6 1 1 1 6 1 1 1 6
 1 −1 −1 −4  ∼  0 −2 −2 −10  ∼  0 −2 −2 −10 
3 1 1 8 0 −2 −2 −10 0 0 0 0


  x = 1
x +y +z = 6
⇒ ⇔ y = 5−α
−2y −2z = −10
z = α (α ∈ R)

7.
     
1 −1 −1 −2 1 −1 −1 −2 1 −1 −1 −2
 0 2 4 a ∼ 0 2 4 a ∼ 0 2 4 a 
1 1 a 1 0 2 a+1 3 0 0 a−3 3−a

For a = 3 
 x = − 12 − α
y = 32 − 2α
z = α (α ∈ R)

For a ̸= 3 
 x = 21 a − 1
y = 12 a + 2
z = −1

8.
     
1 a −3 1 1 a −3 1 1 a −3 1
 0 1 −3 a ∼ 0 1 −3 a ∼* 0 1 −3 a 
2
1 1 2 a 0 1−a 5 a−1 0 0 8 − 3a a −1

*
a ̸= 1

8
For a = 3 the system has no solution.

8
For a ̸= 3 
−5a2 +5
 x
 = 8−3a
8a−3
y = 8−3a
a2 −1

 z = 8−3a

Check: for a = 1  
 x +y −3z = 1  x = 0
⇒ y −3z = 1 ⇔ y = 1
5z = 0 z = 0
 

Note this is solution falls under the solution of a ̸= 83 .

12

You might also like