Solving Linear Equation Systems
Solving Linear Equation Systems
Contents
Theory . . . . . . . . . . . . . . . . . . . . 1
Examples . . . . . . . . . . . . . . . . . . 4
R-software . . . . . . . . . . . . . . . . . . 6
Exercises . . . . . . . . . . . . . . . . . . . 7
Acknowledgment . . . . . . . . . . . . . 9
Solutions Exercises . . . . . . . . . . . . 10
Theory
Many problems in statistics, engineering, etc reduce to the solution of an equation or a set of
equations. An equation is a type of mathematical expression which contains one or more un-
known quantities which you will be required to find. The unknown quantities are referred to as
parameter estimates in statistics.
It is quite hard to solve non-linear systems of equations, while linear systems are quite easy
to study. There are numerical techniques which help to approximate non-linear systems with
linear ones in the hope that the solutions of the linear systems are close enough to the solutions
of the non-linear systems. We will not discuss this here. Instead, we will focus our attention on
linear systems.
Definition
The equation
ax + by + cz + dw = h,
where a, b, c, d, and h are known numbers, while x, y, z, and w are unknown numbers, is
called a linear equation. If h = 0, the linear equation is said to be homogeneous. A linear
system is a set of linear equations and a homogeneous linear system is a set of homogeneous
linear equations.
For example,
2x − 3y = 1
x + 3y = −2
1
Data Science Institute, Hasselt University
and
x+y−z =1
x + 3y + 3z = −2
are linear systems, while
2x − 3y 2 = −1
x+y+z =1
is a non-linear system (because of y 2 ).
The system
2x −3y −3z +w =0
x +3y =0
x −y +w =0
Generally, when more than one unknown needs to be found, additional equations are neces-
sary. To completely determine two unknowns x and y, at least two equations are needed. To
make it clear that these connections must hold simultaneously, they are written in a system of
equations. In this document, we discuss methods used to solve such systems of linear equations.
For the sake of simplicity, we will restrict ourselves to three, at most four, unknowns. The reader
interested in the case of more unknowns may easily extend the following ideas.
Matrices are helpful in rewriting a linear system in a very simple form. The algebraic prop-
erties of matrices may then be used to solve systems. First, consider the linear system
ax +by +cz +dw = e
f x +gy +hz +iw = j
kx +ly +mz +nw = p
qx +ry +sz +tw = u
q r s t u
In general, if the linear system has n equations with m unknowns, then the coefficients matrix
will be a n × m matrix and the augmented matrix an n × (m + 1) matrix. Now we turn our
attention to the solutions of a system.
2
Data Science Institute, Hasselt University
Definition
Two linear systems with n unknowns are said to be equivalent if and only if they have the
same set of solutions.
This definition is important since the idea behind solving a system is to find an equivalent sys-
tem which is easy to solve. You may wonder how we will come up with such system? Easy,
we do that through elementary operations. Indeed, it is clear that if we interchange two
equations, the new system is still equivalent to the old one. If we multiply an equation with a
nonzero number, we obtain a new system still equivalent to the old one. And finally replacing
one equation with the sum of two equations, we again obtain an equivalent system. These op-
erations are called elementary operations on systems. Let us see how it works in a particular case.
The idea is to keep the first equation and work on the last two. In doing that, we will try to kill
one of the unknowns and solve for the other two. For example, if we keep the first and second
equation, and subtract the first one from the last one, we get the equivalent system
x +y +z = 0
x −2y +2z = 4
y −2z = 2
Next we keep the first and the last equation, and we subtract the first from the second. We get
the equivalent system
x +y +z = 0
−3y +z = 4
y −2z = 2
Now we focus on the second and the third equation. We repeat the same procedure. Try to kill
one of the two unknowns (y or z). Indeed, we keep the first and second equation, and we add
the second to the third after multiplying it by 3. We get
x +y +z = 0
−3y +z = 4
−5z = 10
This obviously implies z = −2. From the second equation, we get y = −2, and finally from the
first equation we get x = 4. Therefore, the linear system has one solution
x = 4, y = −2, z = −2
Going from the last equation to the first while solving for the unknowns is called back-solving.
Keep in mind that linear systems for which the matrix coefficient is upper-triangular are easy
to solve. This is particularly true, if the matrix is in echelon form. So the trick is to perform
elementary operations to transform the initial linear system into another one for which the co-
efficient matrix is in echelon form. Using our knowledge about matrices, is there anyway we
can rewrite what we did above in matrix form which will make our notation (or representation)
easier? Indeed, consider the augmented matrix
1 1 1 0
1 −2 2 4
1 2 −1 2
3
Data Science Institute, Hasselt University
Let us perform some elementary row operations on this matrix. Indeed, if we keep the first and
second row, and subtract the first one from the last one we get
1 1 1 0
1 −2 2 4
0 1 −2 2
Next we keep the first and the last rows, and we subtract the first from the second. We get
1 1 1 0
0 −3 1 4
0 1 −2 2
Then we keep the first and second row, and we add the second to the third after multiplying it
by 3 to get
1 1 1 0
0 −3 1 4
0 0 −5 10
This is a triangular matrix which is not in echelon form. The linear system for which this matrix
is an augmented one is
x +y +z = 0
−3y +z = 4
−5z = 10
As you can see we obtained the same system as before. In fact, we followed the same elementary
operations performed above. In every step the new matrix was exactly the augmented matrix
associated to the new system. This shows that instead of writing the systems over and over
again, it is easy to play around with the elementary row operations and once we obtain a trian-
gular matrix, write the associated linear system and then solve it. This is known as Gaussian
elimination. Let us summarize the procedure:
Examples
Example 1: Solve the following system via Gaussian elimination
2x −3y −z +2w +3v = 4
4x −4y −z +4w +11v = 4
2x
−5y −2z +2w −v = 9
2y +z +4v = −5
4
Data Science Institute, Hasselt University
We use elementary row operations to transform this matrix into a triangular one. We keep the
first row and use it to produce all zeros elsewhere in the first column. We have
2 −3 −1 2 3 4
0
2 1 0 5 −4
0 −2 −1 0 −4 5
0 2 1 0 4 −5
Next we keep the first and second row and try to have zeros in the second column. We get
2 −3 −1 2 3 4
0
2 1 0 5 −4
0 0 0 0 1 1
0 0 0 0 −1 −1
Next we keep the first three rows. We add the last one to the third to get
2 −3 −1 2 3 4
0
2 1 0 5 −4
0 0 0 0 1 1
0 0 0 0 0 0
This is a triangular matrix. Its associated system is
2x −3y −z +2w +3v = 4
2y +z +5v = −4
v = 1
− 25 1
x 4 − 4s − t
y
− 2 − 21 s
9
z = s .
w t
v 1
5
Data Science Institute, Hasselt University
Definition
A linear system is called inconsistent or overdetermined if it does not have a solu-
tion. In other words, the set of solutions is empty. Otherwise the linear system is called
consistent.
Following the example above, we see that if we perform elementary row operations on the
augmented matrix of the system and get a matrix with one of its rows equal to (0, 0, . . . , c),
where c ̸= 0, then the system is inconsistent.
R-software
R is a free software environment for statistical computing and graphics. It compiles and runs on
a wide variety of UNIX platforms, Windows and MacOS. You can download this software for
free from the website (https://www.r-project.org/). You can also use the basic functions of
R without installing the software using R-Fiddle (http://www.r-fiddle.org/). R-Fiddle is a
tool that provides you with a free and powerful environment to write and run R-code right inside
your browser.
6
Data Science Institute, Hasselt University
Exercises
Solve the following systems of linear equations (according to the values of the parameters).
1. 6.
x −y +z =4 x +y +z = 6
2x −y =6 x −y −z = −4
x +y −z =6 3x +y +z = 8
2. 7.
x −y +z =4
x −y −z = −2
2x −y +2z =6 2y +4z = a
x +y −3z =0
x +y +az = 1
3.
0.4x +0.6y =1 8.
0.5x +y/3 +z/6 =1 x +ay −3z = 1
x +y −z =1 y −3z = a
x +y +2z = a
4.
x +y +5z +w =1
y +2z +w =0
x +3y +7z +2w =1
x +y +5z +2w =1
5.
2x +y +3z =1
2x +6y +8z =3
6x +8y +18z =5
The first five exercises can be solved using R or R-Fiddle, but a manual calculation is recom-
mended as a preparation for the last three exercises, which can only be solved manually.
Let us denote the vector of the least squares estimated regression coefficients b0 , b1 , . . . , bp as b:
b0
b1
b= .
..
bp−1
The least square normal equations for the general linear regression model are:
X ′ Xb = X ′ Y
which is similar to the matrix representation of a system of linear equations shown before. The
Gaussian elimination, procedure presented above, can be used to solve this problem. However,
some other appropriate techniques are implemented in most of statistical software to solve the
normal equations in efficient ways.
7
Data Science Institute, Hasselt University
Illustration
8
Data Science Institute, Hasselt University
> Y<-data[,1]
> X<-as.matrix(data.frame(inter=rep(1,nrow(data)),data[,2:3]))
> XX<-t(X)%*%X
> XX
inter X1 X2
inter 20 2410 741
X1 2410 291916 89923
X2 741 89923 31667
>
> XY<-t(X)%*%Y
> XY
[,1]
inter 3389
X1 409662
X2 130863
>
> solve(XX,XY)
[,1]
inter 82.923571
X1 0.347247
X2 1.206023
> #### Solution using statistical package
> RES = lm(formula = Y ~ X1 + X2, data = data);
> summary(RES)
Call:
lm(formula = Y ~ X1 + X2, data = data)
Residuals:
Min 1Q Median 3Q Max
-31.987 -10.849 -3.986 11.604 27.161
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 82.9236 51.1410 1.621 0.123316
X1 0.3472 0.4362 0.796 0.437013
X2 1.2060 0.2613 4.616 0.000246 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Acknowledgment
This contribution was written by M. Aerts, L. Garcia Barrado, T. Bigirumurame, N. Cosemans
& Y. Vandendijck.
9
Data Science Institute, Hasselt University
Solutions Exercises
1.
1 −1 1 4 1 −1 1 4 1 −1 1 4
2 −1 0 6 ∼ 0 1 −2 −2 ∼ 0 1 −2 −2
1 1 −1 6 0 2 −2 2 0 0 2 6
x −y +z = 4 x = 5
⇒ y −2z = −2 ⇔ y = 4
2z = 6 z = 3
Using R:
> A = matrix(c(1,-1,1,2,-1,0,1,1,-1),3,3,byrow=T);
> b = matrix(c(4,6,6),3,1);
>
> solve(A,b)
[,1]
[1,] 5
[2,] 4
[3,] 3
2.
1 −1 1 4 1 −1 1 4 1 −1 1 4
2 −1 2 6 ∼ 0 1 0 −2 ∼ 0 1 0 −2
1 1 −3 0 0 2 −4 −4 0 0 −4 0
x −y +z = 4 x = 2
⇒ y = −2 ⇔ y = −2
−4z = 0 z = 0
Using R:
> A = matrix(c(1,-1,1,2,-1,2,1,1,-3),3,3,byrow=T);
> b = matrix(c(4,6,0),3,1);
>
> solve(A,b)
[,1]
[1,] 2
[2,] -2
[3,] 0
3.
2 3
5 5 0 1 1 1 −1 1 1 1 −1 1 1 1 −1 1
1
2
1
3
1
6 1 ∼ * 12 1
3
1
6 1 ∼ 0 − 16 4
6
1
2
∼ 0 − 61 4
6
1
2
2 3 1 2 3 6 6
1 1 −1 1 5 5 0 1 0 5 5 5 0 0 5 5
x +y −z = 1 x = 1
⇒ − 16 y + 46 z = 1
2 ⇔ y = 1
6 6
5z = z = 1
5
*
Tip: Switch first and third row for easier calculations.
10
Data Science Institute, Hasselt University
Using R:
> A = matrix(c(0.4,0.6,0,0.5,1/3,1/6,1,1,-1),3,3,byrow=T);
> b = matrix(c(1,1,1),3,1);
>
> solve(A,b)
[,1]
[1,] 1
[2,] 1
[3,] 1
4.
1 1 5 1 1 1 1 5 1 1 1 1 5 1 1
0 1 2 1 0 0 1 2 1 0 0 1 2 1 0
∼ ∼
1 3 7 2 1 0 2 2 1 0 0 0 −2 −1 0
1 1 5 2 1 0 0 0 1 0 0 0 0 1 0
x +y +5z +w = 1
x = 1
y +2z +w = 0 y = 0
⇒ ⇔
−2z −w = 0
z = 0
w = 0 w = 0
Using R:
> A = matrix(c(1,1,5,1,0,1,2,1,1,3,7,2,1,1,5,2),4,4,byrow=T);
> b = matrix(c(1,0,1,1),4,1);
>
> solve(A,b)
[,1]
[1,] 1
[2,] 0
[3,] 0
[4,] 0
5.
2 1 3 1 2 1 3 1 2 1 3 1
2 6 8 3 ∼ 0 5 5 2 ∼ 0 5 5 2
6 8 18 5 0 5 9 2 0 0 4 0
3
2x +y +3z = 1 x = 10
2
⇒ 5y +5z = 2 ⇔ y = 5
4z = 0 z = 0
Using R:
> A = matrix(c(2,1,3,2,6,8,6,8,18),3,3,byrow=T);
> b = matrix(c(1,3,5),3,1);
>
> solve(A,b)
[,1]
[1,] 3.000000e-01
[2,] 4.000000e-01
[3,] -5.551115e-17
11
Data Science Institute, Hasselt University
6.
1 1 1 6 1 1 1 6 1 1 1 6
1 −1 −1 −4 ∼ 0 −2 −2 −10 ∼ 0 −2 −2 −10
3 1 1 8 0 −2 −2 −10 0 0 0 0
x = 1
x +y +z = 6
⇒ ⇔ y = 5−α
−2y −2z = −10
z = α (α ∈ R)
7.
1 −1 −1 −2 1 −1 −1 −2 1 −1 −1 −2
0 2 4 a ∼ 0 2 4 a ∼ 0 2 4 a
1 1 a 1 0 2 a+1 3 0 0 a−3 3−a
For a = 3
x = − 12 − α
y = 32 − 2α
z = α (α ∈ R)
For a ̸= 3
x = 21 a − 1
y = 12 a + 2
z = −1
8.
1 a −3 1 1 a −3 1 1 a −3 1
0 1 −3 a ∼ 0 1 −3 a ∼* 0 1 −3 a
2
1 1 2 a 0 1−a 5 a−1 0 0 8 − 3a a −1
*
a ̸= 1
8
For a = 3 the system has no solution.
8
For a ̸= 3
−5a2 +5
x
= 8−3a
8a−3
y = 8−3a
a2 −1
z = 8−3a
Check: for a = 1
x +y −3z = 1 x = 0
⇒ y −3z = 1 ⇔ y = 1
5z = 0 z = 0
12