MA1513 Chapter 2 Lecture Note
Note that the red arrow above is not the only arrow that represents u. Any other arrow with the same
direction and length as the red one also represents u. There are infinitely many of them.
We can construct an arrow to represent a 3-vector in the xyz-space in a similar way. For a general
n-vector, we cannot explicitly draw it as an arrow, but sometimes it still helps to think of n-vectors as
arrows in order to understand certain concepts regarding vectors.
and call an n x 1 matrix

[ u1 ]
[ u2 ]
[ ⋮  ]
[ un ]

a column vector.
We are free to write vectors in row or column form, but under certain circumstances a specific form
has to be used.
The special n-vector (0, 0, …, 0) is called the zero vector, and it corresponds to either the 1 x n or n x 1
zero matrix 0.
In the case of 2- and 3-vectors, we can also give geometrical interpretation to these operations in
terms of the arrows as shown in the last column of the table above.
Vector addition of two vectors: we can move the arrows of the two vectors (without changing
directions) so that the end point of one arrow (orange) joins with the starting point of the other
(blue). Then the sum of the two vectors (green) is obtained by joining the third side of the triangle as
shown.
Vector subtraction of two vectors: we move the arrows so that the starting points of the two vectors
(orange and blue) coincide. Then we join the third side of the triangle to get the difference (green).
Negative of a vector: It is just represented by the arrow pointing in exactly the opposite direction as
the original vector. Both arrows have the same length.
Scalar multiplication of a vector by a scalar (real number): this has the effect of changing the length
of the arrow while preserving the direction. For example, scalar multiplication by 2 will double the
length of the arrow, while the scalar ½ will shorten the length by half; a negative scalar will also
reverse the direction of the arrow.
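For readers who want to experiment with these operations numerically, here is a minimal sketch using Python's numpy library (not part of the original notes; the vectors chosen are arbitrary examples):

import numpy as np

u = np.array([2.0, 1.0])      # a 2-vector
v = np.array([-3.0, 0.5])     # another 2-vector

print(u + v)     # vector addition: [-1.   1.5]
print(u - v)     # vector subtraction: [ 5.   0.5]
print(-u)        # negative of a vector: [-2. -1.]
print(2 * u)     # scalar multiplication by 2 doubles the length
print(0.5 * u)   # scalar multiplication by 1/2 halves the length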
n-Space (slide 6)
For a fixed number n, the set or collection of all the n-vectors is called the n-space, and is represented
by the notation Rn. For example, if we use a Venn diagram to represent the 4-space, then its
elements are all the 4-vectors such as (1, 2, 3, 1), (0, 2, 1, 5) and so on.
For the 2-space R2, we can visualize it as the set of all 2-vectors in the xy-plane. For the 3-space R3,
we can view it as the set of all 3-vectors in the xyz-space.
The following are different ways to say the same thing:
• u ∈ Rn
• u is an n-vector
• u = (u1, u2 , …, un) for some real numbers u1, u2 , …, un.
where t is a parameter.
Once again, this can be put in matrix form
In other words, the set of all the solutions of the above system is a subset of R3 and contains all
3-vectors given by the expression on the right.
In general, if a linear system has n variables,
then its solutions are n-vectors, and the solution set is a subset of Rn.
Dot Product (slide 8)
Let u = (u1, u2, …, un) and v = (v1, v2, …, vn) be vectors in Rn. The dot product of u and v is defined to
be the number (scalar):
u ∙ v = u1v1 + u2v2 + ⋯ + unvn .
The left hand side is the notation of the dot product operation. The right hand side is the sum of the
product of the corresponding components of the two vectors, and note that the outcome is a
number, not a vector.
We can perform the dot product of a vector with itself, and this will result in the sum of squares of all
the components of the vector:
u ∙ u = u1² + u2² + ⋯ + un² .
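As a quick numerical check, the dot product can be computed with numpy (a sketch; the example vectors are the ones used in the worked example later in this section):

import numpy as np

u = np.array([1, 2, 0, 1])
v = np.array([1, -1, 1, 1])

print(np.dot(u, v))   # u . v = 1*1 + 2*(-1) + 0*1 + 1*1 = 0
print(np.dot(u, u))   # u . u = 1 + 4 + 0 + 1 = 6 (sum of squares)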
Length, Distance and Angles in R2 (slide 9)
We will look at a few geometrical concepts associated with n-vectors, and illustrate how we can
apply the dot product in the process, beginning with 2-vectors in R2.
The notions of length, distance and angle are defined in the intuitive way.
The length of a vector u = (u1, u2) simply means the length of the arrow representing the vector.
Using Pythagoras' theorem, we easily obtain the length ‖u‖ = √(u1² + u2²).
The distance between the two vectors u = (u1, u2) and v = (v1, v2) refers to the distance between the
end points of the arrows, which is given by the length of the vector u − v:
‖u − v‖ = √((u1 − v1)² + (u2 − v2)²).
The angle between u = (u1, u2) and v = (v1, v2) refers to the angle θ between the two arrows
representing the vectors. If we use the cosine rule (a² = b² + c² − 2bc cos θ), we can derive
cos θ = (u ∙ v) / (‖u‖ ‖v‖).
The same notions apply to 3-vectors in R3. In this case, the algebraic expressions for length, distance
and angle will involve the three components of the vectors instead of two components for
2-vectors.
Length, Distance and Angles in Rn (slide 10)
We can extend the algebraic expression for length, distance and angles to any n-vectors in Rn in the
obvious way.
The length (norm) of u = (u1, u2, …, un): ‖u‖ = √(u ∙ u).
The distance between u = (u1, u2, …, un) and v = (v1, v2, …, vn): ‖u − v‖ = √((u − v) ∙ (u − v)).
The angle θ between u and v: θ = cos⁻¹( (u ∙ v) / (‖u‖ ‖v‖) ).
So a vector of norm 1 is a vector with length 1. Such a vector is called a unit vector.
For example, let u = (1, 2, 0, 1) and v = (1, −1, 1, 1) in R4.
The norm (length) of u is given by: ‖u‖ = √(1² + 2² + 0² + 1²) = √6.
The distance between u and v is given by: ‖u − v‖ = √((1−1)² + (2+1)² + (0−1)² + (1−1)²) = √10.
The angle between u and v is given by: θ = cos⁻¹( (1×1 + 2×(−1) + 0×1 + 1×1) / (√6 × √4) ) = cos⁻¹(0) = π/2.
We observe that the quotient turns out to be zero, and hence the angle in this case is 90°.
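These computations are easy to verify numerically; a sketch with numpy:

import numpy as np

u = np.array([1, 2, 0, 1])
v = np.array([1, -1, 1, 1])

length = np.linalg.norm(u)        # ||u|| = sqrt(6) ~ 2.449
dist = np.linalg.norm(u - v)      # ||u - v|| = sqrt(10) ~ 3.162
cosine = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
angle = np.arccos(cosine)         # pi/2 ~ 1.571, since u . v = 0

print(length, dist, angle)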
Although we cannot visualize general n-vectors, they still represent certain quantitative objects. The
abstract notions of length, distance and angles allow us to make comparison between them. For
example, whether one n-vector gives a good approximation for another n-vector. We shall see this in
a later section.
Orthogonal Vectors (slide 12)
Let u and v be two vectors in Rn. If u · v = 0, we say u and v are orthogonal.
2.2 Linear Combination and Linear Span
The topics of linear combinations and linear spans are fundamental in the theory of vector spaces.
Many important concepts about n-vectors require the notion of linear combination, which is a
combination of scalar multiplication and vector addition applied to a given set of vectors. For a fixed
set of n-vectors, we can use linear combinations to generate many more n-vectors. This will lead to
the concept of linear span.
Linear Combinations (slide 2)
To talk about linear combination, we must always be given a set of n-vectors to begin with:
u1, u2, …, uk : a fixed set of vectors in Rn
For each of these vectors, we are going to multiply by a corresponding scalar: c1, c2, …, ck :
c1u1 , c2u2 , ··· , ckuk: scalar multiples of the vectors.
Then we add all these scalar multiples of the vectors together to get:
c1u1 + c2u2 + ··· + ckuk .
This is called a linear combination of u1, u2, …, uk. It is also an n-vector in Rn.
For example, take u1 = (2, 1, 0), u2 = (−3, 0, 1) and c1 = 1, c2 = 1. Then we get a specific linear
combination:
1(2, 1, 0) + 1(−3, 0, 1) = (−1, 1, 1).
For the same vectors, if we use general scalars c1 = s, c2 = t, then we will have a general linear
combination with parameters s and t:
s(2, 1, 0) + t(−3, 0, 1) = (2s − 3t, s, t).
Example (slide 3 - 6)
Let u1 = (2, 1, 3), u2 = (1, –1, 2) and u3 = (3, 0, 5) be three fixed 3-vectors in R3.
(a) v = (3, 3, 4) is a linear combination of u1, u2, u3
as (3, 3, 4) can be expressed as a(2, 1, 3) + b(1, –1, 2) + c(3, 0, 5). (Why?)
For part (a), we start by writing the given vector v as a general linear combination
v = au1 + bu2 + cu3   (*)
where a, b, c are unknown scalars to be determined. Equating the components on the left and right
hand sides, we get a linear system:
2a + b + 3c = 3
a − b = 3
3a + 2b + 5c = 4
The first equation comes from equating the first components on both sides of (*). Similarly, the
second and third equations come from the second and third components of the vectors in (*).
So the original equation (*) is equivalent to a linear system, and we can solve this system for the
values of a, b, c.
Before we solve the system, we remark that by expressing the vectors in equation (*) in column
form, we get:
[3]    [2]    [ 1]    [3]
[3] = a[1] + b[−1] + c[0]
[4]    [3]    [ 2]    [5]
This is called the vector equation form of the above linear system.
We now proceed to solve the linear system by Gaussian elimination. Starting from the augmented
matrix, we reduce it to this row echelon form:

[ 2    1     3   |  3  ]
[ 0  −3/2  −3/2  | 3/2 ]
[ 0    0     0   |  0  ]
Since the last column is not a pivot column, the system is consistent. This means a solution for a, b, c
exists. At this point, without finding the solution yet, we can already conclude that the vector (3, 3, 4)
is a linear combination of u1, u2, u3.
If we want to explicitly write this vector as a specific linear combination, we need to go one step
further to solve the system. From the row echelon form above, we have two equations:
2a + b + 3c = 3
−(3/2)b − (3/2)c = 3/2
By setting the variable c as a free parameter t and using back substitution, we get the general solution
for the system:
a = 2 − t, b = −1 − t, c = t, where t is a parameter.
This means there are many possible values for a, b and c. For example, letting t = 0, we get a = 2,
b = −1, c = 0, and hence we have explicitly
(3, 3, 4) = 2u1 – u2
If we take t = 1 instead, we will get another set of values a = 1, b = −2, c = 1 and hence a different
linear combination for (3, 3, 4):
(3, 3, 4) = u1 − 2u2 + u3
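Both combinations can be checked by direct computation; a quick sketch with numpy:

import numpy as np

u1 = np.array([2, 1, 3])
u2 = np.array([1, -1, 2])
u3 = np.array([3, 0, 5])

print(2*u1 - u2)        # t = 0: gives [3 3 4]
print(u1 - 2*u2 + u3)   # t = 1: also gives [3 3 4]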
For part (b), we similarly write (1, 2, 4) = au1 + bu2 + cu3, which is equivalent to a linear system that
can be established by comparing components on both sides. We again apply Gaussian elimination to
the augmented matrix and get the row echelon form.
This time we see that the last column is a pivot column, which leads to a row highlighted in pink in
the row echelon form. Hence we know the system has no solution and we conclude that the vector
(1, 2, 4) is not a linear combination of u1, u2, u3.
The standard basis vectors for R3 are e1 = (1, 0, 0), e2 = (0, 1, 0) and e3 = (0, 0, 1). Observe that we
can write every vector in R3 as a linear combination of these standard basis vectors.
Take a general 3-vector (x, y, z) in R3, where x, y, z represent some real numbers.
We note that (x, y, z) can be broken up as the sum (x, 0, 0) + (0, y, 0) + (0, 0, z). Then, for these
three vectors, we can write them as scalar multiples by pulling out the numbers x, y, z from the
components. This gives us a linear combination of the standard basis vectors, with scalars x, y, z
respectively:
(x, y, z) = xe1 + ye2 + ze3.
Taking the more specific 3-vector (1, 2, 5), the linear combination in terms of the standard basis
vectors is (1, 2, 5) = 1e1 + 2e2 + 5e3.
In general, for any positive integer n, we have the standard basis vectors for Rn:
e1 = (1, 0, …, 0), e2 = (0, 1, …, 0), …, en = (0, 0, …, 1),
consisting of n vectors, each with zero components everywhere except for a single 1 in one position as shown.
The Set of Linear Combinations (slide 8)
We now look at the set of all possible linear combinations of a fixed set of vectors. Again let’s start
with a simple example of two vectors (2,1,0) and (-3,0,1).
How many possible linear combinations s(2, 1, 0) + t(-3, 0, 1) are there? Since there are infinitely
many possible scalars s and t, there will be infinitely many linear combinations. In other words, the
set of all linear combinations of these two vectors contains infinitely many elements.
We shall call this set the linear span of (2, 1, 0) and (−3, 0, 1) and we write it as
span{(2, 1, 0), (−3, 0, 1)} = { s(2, 1, 0) + t(−3, 0, 1) : s, t ∈ R }.
This linear span notation is a concise way to describe this infinite set. It should not be mistaken for a
finite set with only the two vectors (2, 1, 0) and (−3, 0, 1). Rather, it contains all possible linear
combinations s(2, 1, 0) + t(−3, 0, 1) of the two given vectors.
Linear Span (slide 9)
We now generalize this to a set of k vectors u1, u2, …, uk in Rn and consider the set of all their linear
combinations:
{ c1u1 + c2u2 + ⋯ + ckuk : c1, c2, …, ck ∈ R }.
We call this set the linear span of u1, u2, …, uk and denote it as span{u1, u2, …, uk}.
From the earlier example with u1 = (2, 1, 3), u2 = (1, –1, 2), u3 = (3, 0, 5):
(a) v = (3, 3, 4) is a linear combination of u1, u2, u3. So v ∈ span{u1, u2, u3}.
(b) w = (1, 2, 4) is not a linear combination of u1, u2, u3. So w ∉ span{u1, u2, u3}.
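Membership in a linear span can also be tested programmatically: v belongs to span{u1, …, uk} exactly when the corresponding system is consistent, i.e. when appending v as an extra column does not increase the rank. A sketch with numpy (the helper in_span is our own, not a library function):

import numpy as np

def in_span(vectors, v):
    # v is in the span iff the augmented matrix has the same rank
    A = np.column_stack(vectors)
    Av = np.column_stack(vectors + [v])
    return np.linalg.matrix_rank(A) == np.linalg.matrix_rank(Av)

u1, u2, u3 = np.array([2, 1, 3]), np.array([1, -1, 2]), np.array([3, 0, 5])
print(in_span([u1, u2, u3], np.array([3, 3, 4])))   # True:  v is in the span
print(in_span([u1, u2, u3], np.array([1, 2, 4])))   # False: w is not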
Since there is only one vector, the linear combinations are just the scalar multiples of u. Hence
span{u} = { cu : c ∈ R }.
These are represented by arrows parallel to u, of any length, pointing in either the same or the
opposite direction as u. Together, these arrows extend across the entire line parallel to u.
In other words,
span{u} is represented by the line parallel to u that contains the origin.
Linear Span of Two Vectors (slide 13)
Let u, v be two non-parallel vectors in R2 or R3.
For the arrow u, we can extend it across the entire line parallel to u, and likewise for v. But do these
two lines represent all the linear combinations su + tv? The answer is no. The two lines only contain
all the scalar multiples of u and v separately; they do not contain the vector additions between su
and tv.
The effect of addition will give arrows that are not on these two lines, but on the plane that contains
these two lines as shown.
As s and t can be any real numbers, su + tv will give all the arrows that lie on the plane. So
span{u, v} is represented by the plane that contains the two vectors u and v and the origin.
2.3 Subspaces
Sometimes we are interested in a specific collection of n-vectors that satisfy certain properties. Such
a collection forms a subset of the n-space. Some of these subsets are classified as subspaces while
some are not.
Subspaces (slide 2)
Note that every subspace of Rn is a subset of Rn but not every subset of Rn is a subspace of Rn.
Let V be the subset of R2 containing all vectors of the form (a, 0), i.e. V contains 2-vectors (1, 0),
(2.4, 0), (−8.31, 0), etc.
We can express every vector in V as
(a, 0) = a(1, 0),
a general linear combination (scalar multiple) of (1, 0). So V is the set of all linear combinations of
(1, 0), i.e.
V = span{(1, 0)}
This subspace is represented by a straight line that passes through the origin in the xy-plane. In fact,
any such straight line in the xy-plane represents a subspace of the 2-space, as it can be expressed as
the linear span of some 2-vector.
Let V be the subset of R2 containing all vectors of the form (1, a), i.e. V contains 2-vectors (1, 3),
(1, 4.5), (1, −9/7), etc.
For the general form (1, a), if we try doing something similar to the previous example:
(1, a) = a(1/a, 1)  (for a ≠ 0),
we notice that unlike the previous case, the vector (1/a, 1) is not a fixed vector, but depends on "a".
In fact, vectors of the form (1, a) cannot be written as a general linear combination of fixed vectors.
A better approach to see that V is not a subspace is to look at the closure property. Let's take two
specific vectors in V, say (1, 2) and (1, 3). Their sum (1, 2) + (1, 3) = (2, 5) is not of the form (1, a),
so it is not in V.
So the closure property under vector addition is not satisfied. This means V is not a subspace of the
2-space.
Let V be the subset of 3-space containing all vectors (a, b, c) such that a < b < c, for example, (1, 2, 3),
(−1, 4, 7), (3, 9, 11), etc.
We can write
(a, b, c) = a(1, 0, 0) + b(0, 1, 0) + c(0, 0, 1).
However, this is NOT a general linear combination of the 3 vectors, because the values of a, b, c are
not unrestricted. They are bounded by the inequality condition a < b < c.
Again, a better way to show that V is not a subspace of 3-space is to consider the closure property.
This time the vectors in V satisfy the closure property under vector addition. However, V fails to
satisfy the closure property under scalar multiplication. Take a specific vector in V and a specific
scalar:
(1, 2, 3) ∈ V, −1 ∈ R.
Then the scalar multiple (−1)(1, 2, 3) = (−1, −2, −3) is not in V, as (−1, −2, −3) does not satisfy the
condition a < b < c. Since the closure property under scalar multiplication is not satisfied, we
conclude that this subset is not a subspace of R3.
Note that we can rewrite the expression in the general solution by separating it into two vectors in
terms of the two parameters s and t. For the rightmost term above, we have pulled out the
parameters s and t respectively to get two fixed vectors (2, 1, 0) and (−3, 0, 1). In other words, we
have a general linear combination of two fixed vectors. So
the solution set of the linear system = span{ (2, 1, 0), (-3, 0, 1) }.
Note that this linear span represents a plane in the xyz-space containing the origin. In fact, every
plane in the xyz-space containing the origin represents a subspace of R3, as it can always be
expressed as the linear span of two non-parallel vectors.
Let Ax = 0 be a homogeneous system with n variables. Then its solution set is a subspace of Rn.
We will also call this solution set a solution space of the homogeneous system.
Like what we have seen in the example, the general solution of a homogeneous system can always
be written as a general linear combination of some fixed set of vectors.
Note that this only applies to homogeneous systems. The solution set of a non-homogeneous system
never contains the zero vector (and may even be empty), so it cannot be expressed as a linear span
and hence is not a subspace of the n-space. In this case, we cannot call the solution set of a
non-homogeneous system a solution space.
Example: consider the 4 x 3 matrix

A = [  2  −1  0 ]
    [  1  −1  3 ]
    [ −5   1  0 ]
    [  1   0  1 ]

and take the linear span of its 4 rows:
span{ (2, –1, 0), (1, –1, 3), (–5, 1, 0), (1, 0, 1) }.
This means we are taking the collection of all linear combinations of the 4 rows r1, r2, r3, r4 of matrix
A. This is automatically a subspace of R3 by construction, since it is already in linear span form. We
call this subspace the row space of the matrix A.
Students tend to have the misconception that the row space consists of just the 4 row vectors r1, r2,
r3, r4, which is not true:
row space of A = span{r1, r2, r3, r4} ≠ {r1, r2, r3, r4}.
The row space contains infinitely many vectors, while the right-hand side contains only four vectors.
In general, given any matrix, we can define the row space in a similar way by taking the linear span of
all its rows.
Column space of a matrix is defined similarly as the linear span of all the columns of the matrix.
Now we take the three columns c1, c2, c3 of the same matrix A (written in column form) and take
their linear span: span{c1, c2, c3},
i.e. we are taking all linear combinations of the 3 columns c1, c2, c3 of matrix A. This is a subspace of
R4 and we call it the column space of the matrix A. We usually write the vectors in a column space in
column form.
Again, do not mistake the column space for the set containing only the 3 column vectors c1, c2, c3:
column space of A = span{c1, c2, c3} ≠ {c1, c2, c3}.
The third type of subspace associated with a matrix A is the collection of all vectors u such that,
when we pre-multiply by A, the product Au = 0. This is called the nullspace of the matrix A.
Note that the nullspace of A is precisely the solution set of the homogeneous system Ax = 0, so it is a
subspace by the earlier discussion.
Example: let A be the 3 x 3 matrix with rows (1, −2, 3), (2, −4, 6), (3, −6, 9). To find the nullspace of A,
we set up the homogeneous system Ax = 0, which in equation form reads:
x − 2y + 3z = 0
2x − 4y + 6z = 0
3x − 6y + 9z = 0
We can proceed to solve this system using Gaussian elimination as usual. (Note that we have already
found the general solution of this system in example 4 above.) So we have
the nullspace of A = span{ (2, 1, 0), (−3, 0, 1) }.
In general, if A is an m x n matrix, the nullspace of A is a subspace of Rn. Note that we normally write
the vectors in the nullspace in column form, so that they are compatible with the matrix form of the
system.
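A computer algebra system can produce a basis for a nullspace directly; a sketch using sympy (exact arithmetic) with the matrix from the example above:

from sympy import Matrix

A = Matrix([[1, -2, 3],
            [2, -4, 6],
            [3, -6, 9]])

for vec in A.nullspace():   # a basis for the nullspace, in column form
    print(vec.T)            # the basis vectors (2, 1, 0) and (-3, 0, 1)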
Vector Spaces (slide 11)
In our discussion, a vector space is a general term referring to Rn (for any n) or a subspace of Rn. The
following are examples of vector spaces:
- R3 is a vector space;
- The solution space of a homogeneous system is a vector space;
- The row space of a matrix is a vector space.
However, there is a more general definition of vector space that may include sets of objects that are
not n-vectors as long as addition and scalar multiplication can be defined on these objects and that
they satisfy all the properties below. Examples of such objects include: functions, polynomials and
matrices.
V is a general vector space, with two operations (vector addition and scalar multiplication), if it
satisfies the closure properties mentioned earlier and the following properties. There are four
properties (1 to 4) involving the addition operation, and another four properties (5 to 8) on the scalar
multiplication. For all u, v, w in V and all scalars c, d:
1. u + v = v + u
2. u + (v + w) = (u + v) + w
3. there is a zero vector 0 in V such that u + 0 = u
4. each u in V has a negative −u in V such that u + (−u) = 0
5. c(u + v) = cu + cv
6. (c + d)u = cu + du
7. c(du) = (cd)u
8. 1u = u
The general version of vector space is just for your information. In this module, we will only focus on
the n-spaces and their subspaces.
2.4 Linear Independence
We introduce another important concept in vector spaces. Given a set of n-vectors, sometimes it is
essential to know whether there is any linear relationship among the vectors. When such a
relationship exists, we say the set of vectors is linearly dependent; otherwise, we say the set is
linearly independent.
Let's look at two separate linear spans: span{(1, 1, 1)} and span{(1, 1, 1), (–1, –1, –1)}. Note that the
second linear span has one additional "spanning" vector (–1, –1, –1) compared with the first.
Recall that:
span{(1, 1, 1), (–1, –1, –1)} contains all linear combinations a(1, 1, 1) + b(–1, –1, –1) of the two
vectors (1, 1, 1) and (–1, –1, –1).
It is easy to see that, regardless of what values we substitute for a and b, the linear combination
a(1, 1, 1) + b(−1, −1, −1) can be reduced to a scalar multiple c(1, 1, 1), with c = a − b.
In other words, having the additional vector (−1, −1, −1) in the span of (1, 1, 1) does not change the
set of vectors. So we say there is a redundant vector in the span of (1, 1, 1) and (−1, −1, −1).
The linear span of the single vector (1, 1, 1) extends across the entire line parallel to its arrow.
Now (−1, −1, −1) is the negative of (1, 1, 1) and hence is parallel to (1, 1, 1) in the opposite direction.
So the linear span of the two vectors will just extend across the same line, and hence represents the
same set.
Let's look at another example: span{(1, 1, 1), (1, 0, –2)} and span{(1, 1, 1), (1, 0, –2), (2, 3, 5)}. Again
the second linear span has one additional "spanning" vector (2, 3, 5) compared with the first.
span{(1, 1, 1), (1, 0, –2)} contains all linear combinations p(1, 1, 1) + q(1, 0, –2);
span{(1, 1, 1), (1, 0, –2), (2, 3, 5)} contains all linear combinations a(1, 1, 1) + b(1, 0, –2) + c(2, 3, 5).
The two linear spans are the same because (2, 3, 5) is a linear combination of (1, 1, 1) and (1, 0, −2):
(2, 3, 5) = 3(1, 1, 1) − (1, 0, −2).
So a linear combination a(1, 1, 1) + b(1, 0, -2) + c(2, 3, 5) can be reduced to a linear combination
p(1, 1, 1) + q(1, 0, -2). In other words, having the additional vector (2, 3, 5) in the linear span of (1, 1,
1) and (1, 0, -2) does not change the set of vectors. So (2, 3, 5) can be regarded as a redundant vector
in span {(1, 1, 1), (1, 0, –2), (2, 3, 5)}.
Let's look at the geometrical interpretation. The two vectors (1, 1, 1) and (1, 0, −2) are represented
by two non-parallel arrows. The linear span of these two vectors extends across the entire plane
that contains the two arrows.
As we have seen, (2, 3, 5) is a linear combination of these two vectors, so it must lie on the same
plane as the other two vectors. So the linear span of the three vectors will also extend across the
same plane.
To test for linear dependence, we consider the vector equation c1u1 + c2u2 + ⋯ + ckuk = 0. This is
equivalent to a homogeneous linear system with variables c1, c2, …, ck. (See examples below on the
vector equation form of a linear system.)
Recall that a homogeneous system has either only the trivial solution (case I) or infinitely many
solutions including non-trivial solutions (case II).
In case I, we say the set of vectors u1, u2, …, uk is linearly independent.
In case II, we say the set u1, u2, …, uk is linearly dependent.
Determine whether the set of vectors (1, 0, 0, 1), (0, 2, 1, 0), (1, –1, 1, 1) is linearly independent.
We set up the vector equation c1(1, 0, 0, 1) + c2(0, 2, 1, 0) + c3(1, –1, 1, 1) = (0, 0, 0, 0). This is a
linear system (in vector equation form) whose augmented matrix can be written down directly:

[ 1  0   1 | 0 ]
[ 0  2  −1 | 0 ]
[ 0  1   1 | 0 ]
[ 1  0   1 | 0 ]
Performing Gaussian elimination gives us a row echelon form. Since there is no non-pivot column
among the coefficient columns of this row echelon form, we know there is exactly one solution,
which must be the trivial solution c1 = 0, c2 = 0, c3 = 0. So (1, 0, 0, 1), (0, 2, 1, 0), (1, –1, 1, 1) are
linearly independent.
Determine whether the set of vectors (1, –2, 3), (5, 6, –1), (3, 2, 1) is linearly independent.
Again we write down the augmented matrix of the corresponding linear system directly and perform
Gaussian elimination to get a row echelon form. Since there is one non-pivot column (the third
column) in this row echelon form, there are infinitely many solutions for the system, i.e. there exist
non-trivial solutions for c1, c2, c3. So (1, –2, 3), (5, 6, –1), (3, 2, 1) are linearly dependent.
There is an alternative way to determine linear independence of a set of vectors when the linear
system involved has the same number of equations and variables.
For example, we have just seen that the vector equation c1(1, –2, 3) + c2(5, 6, –1) + c3(3, 2, 1) = 0
has non-trivial solutions, and hence the three vectors are linearly dependent. Since there are three
vectors, and these vectors come from the 3-space R3, when we rewrite the system in matrix
equation form, the coefficient matrix is a 3 x 3 square matrix, and we can look at the determinant:

det [  1   5   3 ]
    [ −2   6   2 ]  =  0
    [  3  −1   1 ]
Since the determinant turns out to be 0, from chapter 1 section 1.6, a homogeneous system Ax = 0
has non-trivial solutions when the coefficient matrix A is singular, which in turn means det(A) = 0.
This coincides with our observation.
In general, when there are n vectors u1, u2, …, un in Rn, we can form an n x n matrix A using u1, u2, …,
un (written in column form) as the n columns of the matrix. Then u1, u2, …, un are linearly
independent if and only if det(A) ≠ 0.
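This determinant test is easy to carry out numerically; a sketch with numpy using the three vectors from the example above:

import numpy as np

# Put the vectors in as the columns of a square matrix.
A = np.column_stack([(1, -2, 3), (5, 6, -1), (3, 2, 1)])

d = np.linalg.det(A)
print(d)              # 0.0 (up to rounding error)
print(abs(d) > 1e-9)  # False: det(A) = 0, so the vectors are linearly dependent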
If u1, u2, …, uk are linearly dependent, then at least one of the vectors ui can be written as a
linear combination of the other vectors. The converse is also true.
In this case, we know ui is a “redundant” vector in the linear span of u1, u2, …, uk
Examples:
The set of vectors (1, 1, 1), (1, 0, –2), (2, 3, 5) are linearly dependent as (2, 3, 5) is a “redundant”
vector in the span of the three vectors.
The set of vectors (1, 1, 1), (1, 0, –2) are linearly independent as there is no “redundant” vector in
the span of the two vectors.
Suppose u, v are linearly dependent. Then the vector equation cu + dv = 0 has a non-trivial solution
(so at least one of c and d is non-zero).
- If c ≠ 0, u = (-d/c)v
- If d ≠ 0, v = (-c/d)u
In either case, u and v are scalar multiples of each other. So either one of them is a redundant vector
between the two.
This gives us a very simple way to decide whether two vectors are linearly independent by inspecting
their components:
- If u and v are scalar multiples of each other, they are linearly dependent;
- If u and v are not scalar multiples of each other, they are linearly independent.
1. In R2, a set of three or more vectors must be linearly dependent.
So we know immediately that (1, 2), (3, 4), (5, 6) are linearly dependent.
2. More generally, in Rn, any set of k vectors with k > n must be linearly dependent.
Note that the converse of this result does not hold. In other words, if k < n, it does not mean the k
vectors are linearly independent. For example, in the 3-space, the two vectors (1, 1, 1) and (−1, −1, −1)
are linearly dependent, as they are scalar multiples of each other.
In R2 (or R3), two vectors u and v are linearly dependent if they lie on the same line.
The two arrows representing the vectors either point in the same direction or in exactly opposite
directions. In both cases, the two arrows lie on the same line as shown below.
Two vectors u and v are linearly independent if they do not lie on the same line.
For the last case, the linear span of the two vectors is the plane that contains both vectors. Note
that, in R2, there is only one plane, which is the xy-plane itself, representing the entire R2. This
means:
when u and v are linearly independent vectors in R2, then span{u, v} = R2.
Case 1: When all three arrows are parallel and starting from the origin, either pointing in the same or
opposite direction, then they lie on the same line, and hence the three vectors are linearly
dependent.
In this case the linear span of the three vectors is represented by the line L that the three vectors lie
on. In fact, there are two redundant vectors among u, v, and w, since we know the linear span of a
single vector will give the same line.
So we can remove any two of the three vectors from the linear span, and the resulting linear spans
will all be equal to each other.
Case 2: When the three arrows are not all parallel to each other but they all lie on the same plane.
The three vectors are again linearly dependent. In this case the linear span of the three vectors is
represented by the plane P that the three vectors lie on. This time, there is only one redundant
vector among u, v, and w, since we know the linear span of two non-parallel vectors will give the
same plane.
So, if the three vectors are pairwise non-parallel, we can remove any one of the three vectors from
the linear span, and the resulting linear spans will all be equal to each other.
Case 3: This case only applies to R3, when the arrows representing the three vectors do not all lie on
a single plane. (Note that in R2, every vector must lie on the xy-plane, so this scenario cannot
happen.)
In other words, whenever we try to find a plane that contains two of the three arrows, the remaining
one will always stick out of the plane, as shown in the diagram. In this case the three vectors are
linearly independent, and their linear span extends across the whole of R3:
when u, v and w are linearly independent vectors in R3, then span{u, v, w} = R3.
This means there is no redundant vector among the three. If we try to remove any one of the vectors
among u, v and w, the linear span of the remaining two vectors will be different.
So none of span{u, v} , span{u, w}, span{v, w} is equal to span{u, v, w}.
2.5 Basis and Dimension
A vector space consists of infinitely many vectors. However, we can find a set with only a finite
number of vectors that generates all the vectors in a given vector space. Such a set with the smallest
possible number of vectors is called a basis for the vector space, and the dimension of a vector space
is defined to be the number of vectors in a basis of the space.
Consider the standard basis vectors e1, e2, e3 of R3, which span the entire R3, and denote this set by
S = {e1, e2, e3}. Since there are no redundant vectors in S, if we try to remove any one of the vectors
from S, the linear span will no longer be the entire R3. So we say S is a smallest possible set of vectors
that generates every vector in R3. Such a smallest possible set is called a basis for R3.
The number of vectors in a basis for V is called the dimension of V and is denoted by dim V. In other
words, if {u1, u2, …, uk} is a basis for V, then dim V = k. It is the smallest possible number of vectors
that can generate V.
We have seen that the standard basis vectors {(1, 0, 0), (0, 1, 0), (0, 0, 1)} is a basis for R3.
So dim R3 = 3.
A vector space can have many different bases, but they always have the same dimension.
In other words, there are many different ways to form building blocks (with the same number) for
the vector space.
For example, in R3, it is not difficult to see that:
{ (2, 0, 0), (0, 2, 0), (0, 0, 2) } is a basis for R3 (multiplying the standard basis vectors by the scalar 2
still gives a building block for R3).
{ (1, 2, 1), (2, 9, 0), (3, 3, 4) } is another basis for R3. This one is not so obvious. We shall now verify it
by checking the two conditions for being a basis:
(i) (1, 2, 1), (2, 9, 0), (3, 3, 4) are linearly independent:
We set up the homogeneous system a(1, 2, 1) + b(2, 9, 0) + c(3, 3, 4) = (0, 0, 0) and reduce its
augmented matrix. Since there is no non-pivot column in the row echelon form, the system has only
the trivial solution. So (1, 2, 1), (2, 9, 0), (3, 3, 4) are linearly independent.
(ii) span{ (1, 2, 1), (2, 9, 0), (3, 3, 4) } = R3:
Let (x, y, z) be any (general) vector in R3. We set up the vector equation:
The system has a solution for any values of x, y, z as constant terms. This means every vector in R3
can be written as a linear combination of (1, 2, 1), (2, 9, 0), (3, 3, 4). So the three vectors will span
(generate) the entire R3.
By (i) and (ii), we conclude { (1, 2, 1), (2, 9, 0), (3, 3, 4) } is a basis for R3.
If {u1, u2, …, uk} is linearly dependent, then there are redundant vectors among them. We need to
remove the redundant vectors to get a basis for V.
Example 2: Basis for Linear Span (slide 8-9)
Let V = span{ (1, 1, 1), (1, −1, −1), (1, 0, 0) }. We shall see that { (1, 1, 1), (1, −1, −1) } is a basis for this
subspace V by checking the two conditions:
Since there are only two vectors here, we just observe that (1, 1, 1) and (1, −1, −1) are not scalar
multiples of each other. Hence the two vectors are linearly independent.
We shall show (1, 0, 0) is a redundant vector in span{ (1, 1, 1), (1, −1, −1), (1, 0, 0) } by showing it is a
linear combination of (1, 1, 1) and (1, −1, −1):
(1, 0, 0) = ½(1, 1, 1) + ½(1, −1, −1).
Hence we can conclude that { (1, 1, 1), (1, -1, -1) } is a basis for V.
Find a basis for and determine the dimension of the solution space of the homogeneous system
2v + 2w − x + z = 0
−v − w + 2x − 3y + z = 0
x + y + z = 0
v + w − 2x − z = 0
This is a system with 5 variables, and hence its solution space is a subspace of R5. After performing
Gaussian elimination, the general solution is given by
(v, w, x, y, z) = s(−1, 1, 0, 0, 0) + t(−1, 0, −1, 0, 1), where s and t are parameters.
This expression is a general linear combination of two fixed vectors. Let's denote the two vectors in
the linear combination by:

     [ −1 ]        [ −1 ]
     [  1 ]        [  0 ]
u1 = [  0 ] , u2 = [ −1 ]
     [  0 ]        [  0 ]
     [  0 ]        [  1 ]
So the solution space = span{u1, u2}. This is one of the two conditions for {u1, u2} to be a basis for the
solution space.
To show the other condition, that u1 and u2 are linearly independent, we just check whether these
two vectors are scalar multiples of each other. Since they are not, we conclude that u1, u2 are linearly
independent.
So we can conclude that {u1, u2} is a basis for the solution space of the given system, and
dim(solution space) = 2.
We observe the following. Given any homogeneous system, to find a basis and the dimension of the
solution space, the standard approach is to perform Gaussian elimination to get the row echelon
form R. The number of non-pivot columns in the row echelon form corresponds to the number of
parameters in the general solution. This number in turn corresponds to the number of vectors in a
basis for the solution space, which by definition is the dimension of the solution space. Hence these
four numbers are all the same:
number of non-pivot columns of R = number of parameters = number of basis vectors = dim(solution space).
Recall that the nullspace of a matrix A is the solution space of the homogeneous system Ax = 0. So
- a basis for the nullspace of A is the same as a basis for the solution space of Ax = 0.
- the dimension of the nullspace of A is the same as the dimension of the solution space of Ax = 0.
The dimension of the nullspace of A is also called the nullity of A, and it is denoted by nullity(A).
Example: consider the coefficient matrix A of the homogeneous system
2v + 2w − x + z = 0
−v − w + 2x − 3y + z = 0
x + y + z = 0
v + w − 2x − z = 0
Since we know from the earlier discussion that a basis for its solution space is given by {u1, u2}, we
conclude that nullity(A) = 2.
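Because rank(A) + nullity(A) equals the number of columns (the Dimension Theorem discussed later in this chapter), the nullity can also be computed from the rank without writing out the general solution; a sketch with numpy using the coefficient matrix of the system above:

import numpy as np

A = np.array([[ 2,  2, -1,  0,  1],
              [-1, -1,  2, -3,  1],
              [ 0,  0,  1,  1,  1],
              [ 1,  1, -2,  0, -1]])

rank = np.linalg.matrix_rank(A)
nullity = A.shape[1] - rank   # nullity(A) = n - rank(A)
print(rank, nullity)          # 3 2, matching dim(solution space) = 2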
Basis for Row Space and Column Space (slide 13)
Example: consider the 6 x 4 matrix

A = [  1   2   2   1 ]
    [  3   6   6   3 ]
    [  4   9   9   5 ]
    [ −2  −1  −1   1 ]
    [  5   8   9   4 ]
    [  4   2   7   3 ]
To find a basis for the row space and column space, we need to look at a row echelon form R of A.
Row space
Reducing A gives a row echelon form R with three non-zero rows, v1 = (1, 2, 2, 1), v2 = (0, 1, 1, 1),
v3 = (0, 0, 1, 1). These three non-zero rows are obtained by performing row operations on the
original rows of A, and hence they are linear combinations of the rows of A. This means they belong
to the row space of A. In fact, v1, v2, v3 form a basis for the row space of A.
In general,
the non-zero rows in the row echelon form R of a matrix A form a basis for the row space of A.
The dimension of the row space is therefore given by the number of non-zero rows of the row
echelon form. If you recall from the earlier chapter, this is the rank of the matrix:
dim(row space of A) = rank(A).
Column space
We look at the pivot columns of the row echelon form.
However, the pivot columns of R are in general no longer columns of the original matrix A (row
operations change the columns), and hence they may not belong to the column space of A.
Nevertheless,
the columns of A corresponding to the pivot columns of R form a basis for the column space of A.
In our example, since the pivot columns of the row echelon form R are the first three columns, we
will look at the corresponding first three columns of the original matrix A.
These three columns c1, c2, c3 form a basis for the column space of A.
The dimension of the column space of A is given by the number of pivot columns of the row
echelon form. This number is again the rank of A:
dim(column space of A) = rank(A).
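Since the row space of A is the column space of A-transpose, the equality of the two dimensions can be spot-checked numerically; a sketch with numpy using the 6 x 4 matrix above:

import numpy as np

A = np.array([[ 1,  2,  2,  1],
              [ 3,  6,  6,  3],
              [ 4,  9,  9,  5],
              [-2, -1, -1,  1],
              [ 5,  8,  9,  4],
              [ 4,  2,  7,  3]])

print(np.linalg.matrix_rank(A))    # 3 = dim(column space)
print(np.linalg.matrix_rank(A.T))  # 3 = dim(row space), always the same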
In other words, the dimension of the row space and the dimension of the column space of a matrix
are the same; both are equal to rank(A).
Recall that rank(A) counts the pivot columns of R, while nullity(A) counts the non-pivot columns.
Since every column of an m x n matrix is either a pivot column or a non-pivot column, we get
rank(A) + nullity(A) = n.
That explains why this result is called the Dimension Theorem for Matrices, and the diagram on the
slide illustrates why it works.
2.6 Coordinate Vectors
In the previous section, we introduced the basis for a vector space, which forms the building blocks
for the space. In this section, we shall see how a basis provides a coordinate system and serves as a
"unit of measurement" for the vectors in the vector space.
(a)
In this case, the coordinates of the vector (in red) are given by (2, 3), with respect to the standard
coordinate system.
In some situation, we may need to re-orientate the coordinate system such as the following:
(b) (c)
Then the same red arrow will have different coordinates with respect to the new coordinate
systems.
How do we represent the different coordinate systems algebraically? We can do this using the
various bases for R2.
For the first coordinate system (a), it is given by the standard basis vectors, namely (1, 0) and (0, 1),
represented by the two orange unit vectors along the x- and y-axes. In other words, the two unit
vectors define the grid for the coordinate system.
The other two coordinate systems correspond to some non-standard bases for R2. The coordinate
system (b) is given by the basis (1, -1), (1, 1) as indicated by the two orange arrows, which gives us
the new grid for the coordinate system. The coordinate system (c) is given by another basis (1, 0),
(1, 1) and hence a different shape for the grid.
Standard Basis S1 (slide 4)
S1 = {(1, 0), (0, 1)}
The standard basis gives the unit block in pink, which in turn determines the unit of measurement
along the x- and y-axes.
Non-standard basis S2 = {(1, −1), (1, 1)}
This time we see that the two coordinates of our vector become −0.5 and 2.5 respectively.
Algebraically, we can write v = (2, 3) = −(1/2)(1, −1) + (5/2)(1, 1).
The scalars of this linear combination are −1/2 and 5/2 respectively. Putting these two numbers
together, we get the new coordinates of v with respect to the non-standard basis S2:
(v)S2 = (−1/2, 5/2).
Example: let S = {(1, 2, 1), (2, 9, 0), (3, 3, 4)}, the basis for R3 verified earlier.
For (a), we are asking: given a vector v = (5, –1, 9), how do we find (v)S?
For (b), we are asking: given the coordinate vector (w)S = (–1, 3, 2), how do we recover w?
(a) We solve the equation a(1, 2, 1) + b(2, 9, 0) + c(3, 3, 4) = (5, –1, 9).
By using Gaussian elimination, we get a = 1, b = -1, c = 2. i.e. v = 1(1, 2, 1) – (2, 9, 0) + 2(3, 3, 4).
So (v)S = (1, –1, 2).
(b) We have (w)S = (–1, 3, 2). So we substitute a = -1, b = 3 and c = 2 into
w = a(1, 2, 1) + b(2, 9, 0) + c(3, 3, 4)
to get w = -(1, 2, 1) + 3(2, 9, 0) + 2(3, 3, 4) = (11, 31, 7).
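Computationally, finding (v)S amounts to solving a square linear system whose coefficient matrix has the basis vectors as its columns, and recovering w is just a matrix-vector product; a sketch with numpy:

import numpy as np

# Basis vectors of S as the columns of P.
P = np.column_stack([(1, 2, 1), (2, 9, 0), (3, 3, 4)])

v = np.array([5, -1, 9])
print(np.linalg.solve(P, v))   # (v)_S = [ 1. -1.  2.]

w_S = np.array([-1, 3, 2])     # the given coordinate vector (w)_S
print(P @ w_S)                 # w = [11 31  7]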
Meaning of Basis (slide 10)
Here’s a summary of what we have discussed about bases for a vector space V.
1. A basis for V is a building block of V
2. A basis for V is a “unit of measurement” for vectors in V.
3. A basis for V gives a “coordinate system” for V.
Note that a basis for “Rn” is not necessarily a basis for a “subspace of Rn”
For example, in the 3-space, we know that the standard basis vectors give a basis for R3.
But for a subspace V of R3 that is represented by a plane, the standard basis for R3 is not a basis for V.
As shown in the diagram below, the vectors e1, e2, e3 may not be on the plane itself.
Instead, a basis for V should come from the subspace itself, and in this case, two vectors will be
enough to generate V.
2.7 Projection and Linear Approximation
In this section, we discuss the concept of projection and linear approximation, which are
related to dot product and the notion of orthogonality. In particular, we shall see how to
find approximate solutions for systems that are inconsistent.
Projection of a Vector onto a Plane in R3 (slide 2)
For a 3-vector (x, y, z) in R3, its projection onto the xy-plane is given geometrically as follows.
We drop the vector v (in red) perpendicularly onto the xy-plane to get the vector p (in green). The
coordinates of p are (x, y, 0). (The x and y coordinates are the same as those of v, but the z coordinate
is replaced by 0.)
If we complete the triangle as shown, the third side of the triangle is given by the vector n (in blue),
which is equal to v − p. So its coordinates are (0, 0, z). Note that this vector n is perpendicular to the
xy-plane. Recall we also refer to perpendicular as orthogonal.
In fact, this holds not just for the xy-plane, but any plane in R3 that contains the origin.
We will further generalize the notion of projection to Rn, with the plane replaced by a more general
subspace of Rn.
Note that we always refer to a projection with respect to certain subspace of Rn. Furthermore, every
vector has exactly one projection onto a given subspace.
Suppose we are given a line and a point not on the line. Intuitively, the point on the line that is
closest to the given point is the foot of the perpendicular dropped from the point onto the line. This
can be expressed in terms of vectors:
p is the projection of u onto the line, and the distance between the vector u and its projection will
give the shortest distance from u to the line.
Similarly, suppose we are given a plane and a point not on the plane. Like before, the point on the
plane that is closest to the given point is the foot of the perpendicular dropped onto the plane.
p is the projection of u onto the plane, and the distance between the vector u and its projection will
give the shortest distance from u to the plane.
Let V be a subspace of Rn and u ∈ Rn. The subspace is not necessarily a line or a plane. Let p be the
projection of u onto V; then ‖u − p‖ ≤ ‖u − v‖ for every vector v in V.
What this inequality says is that, among all the vectors v in the subspace V, the projection p is the
one that is nearest to u, i.e. p is the best approximation of u in V.
We have seen how the notions of projection and orthogonality can be applied to the study of best
approximation. We shall now relate this to linear systems.
We know there are two types of systems: those that are consistent (with solutions), and those that
are inconsistent (without solutions).
For consistent systems, we have a systematic way to find the exact solutions. However, in real life,
many linear systems that arise from mathematical modelling may not have exact solutions. For such
inconsistent systems, we will look for approximate solutions instead.
Suppose there are three physical quantities r, s, t that are related by a quadratic equation:
t = cr² + ds + e (*)
for some constant values c, d, e. We conduct an experiment to determine the exact values of c, d, e.
To do this, we get the measurement for r, s, t from our experiments, substitute these readings into
the quadratic equation (*), and set up a linear system with c, d, e as variables.
Now suppose we repeat the experiment 6 times and obtain 6 sets of values as shown in the table.
The top row indicates the experiment number, the second and third rows are our input
measurements for r and s, while the bottom row gives the corresponding output measurement for t.
Like most experiments, while carrying out the measurements, there are bound to be some
experimental errors, which may lead to an inconsistent system.
Since there are 6 sets of readings, we will have 6 equations, and this will form a 6 x 3 linear system
Ax = b, as shown in matrix equation form on the slide.
Note that the three columns of the coefficient matrix A correspond to the coefficients of c, d, and e
respectively, and the constant matrix consists of the measurements of t.
Performing Gaussian elimination on this system will convince you that it is inconsistent. In other
words, for any value we substitute for the column vector x, Ax is not equal to b, and hence
Ax − b ≠ 0.
We shall see in a while how to find the "best approximate" solution for Ax = b. But for now, let us try
to understand what we mean by a "best approximate" solution.
Since Ax – b is not equal to zero, we want to find some value for x so that the difference can be as
close to 0 as possible.
In other words, we will try to find some vector u such that the norm ‖Au − b‖ is as small as possible.
A least squares solution of a linear system Ax = b is a vector u in Rn that minimizes ‖Ax − b‖.
From earlier discussion, this best approximation Au of b is given by the projection of b onto a certain
subspace:
In other words,
a least squares solution u of Ax = b is a solution of Ax = p
* The projection is on the column space of A because the product Au can be expanded as a linear
combination of the columns of A (see below), which is in the column space of A.
A. If we know the projection p of b onto the column space of A, we can find the least squares
solutions u of Ax = b by solving Ax = p.
B. If we know a least squares solution u of Ax = b, we can find the projection p of b onto the
column space of A simply by matrix multiplication Au.
In principle, statement (A) gives us a way to find least squares solutions. The problem is that we may
not know the projection p in the first place, so in practice this is not an effective way to find least
squares solutions. Instead, we introduce an alternative approach in the next segment.
Example: Finding Least Squares Solutions (slide 12)
In our earlier experiment example, we form the matrix ATA and the vector ATb from the data.
The new system ATAx = ATb is consistent and its solution u can be found (by Gaussian elimination):
This vector u is the least squares solution of Ax = b, giving the best approximate solution.
Find the projection of (1,1,1,1) onto the subspace V = span{(1,-1,1,-1), (1,2,0,1), (2,1,1,0)} of R4.
First, we need to set up a linear system and find its least squares solutions.
We use the three vectors in the linear span defining V to form a 4 x 3 matrix A, and take the vector
(1, 1, 1, 1) in column form as b:

A = [  1  1  2 ]        [ 1 ]
    [ −1  2  1 ] ,  b = [ 1 ]
    [  1  0  1 ]        [ 1 ]
    [ −1  1  0 ]        [ 1 ]
Now we have a linear system Ax = b. We find its least squares solution by solving ATAx = ATb.
The general solution of the new system is given by
x = (2/5 − t, 4/5 − t, t)   (written in column form), where t is a parameter.
Note that there are infinitely many solutions for this system. All of them are least squares solutions
of Ax = b. We just need one particular solution, say let t = 0, which gives
u = (2/5, 4/5, 0).
Then

Au = [  1  1  2 ] [ 2/5 ]   [ 6/5 ]
     [ −1  2  1 ] [ 4/5 ] = [ 6/5 ]
     [  1  0  1 ] [  0  ]   [ 2/5 ]
     [ −1  1  0 ]           [ 2/5 ]

will give the required projection onto the subspace V.
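The whole computation can be verified numerically. numpy's least squares routine returns one least squares solution of Ax = b (the minimum-norm one, which may differ from our t = 0 choice), but the projection Au is the same either way; a sketch:

import numpy as np

A = np.array([[ 1, 1, 2],
              [-1, 2, 1],
              [ 1, 0, 1],
              [-1, 1, 0]], dtype=float)
b = np.array([1.0, 1.0, 1.0, 1.0])

u, *_ = np.linalg.lstsq(A, b, rcond=None)   # one least squares solution
print(A @ u)                                # [1.2 1.2 0.4 0.4] = (6/5, 6/5, 2/5, 2/5)

u2 = np.array([2/5, 4/5, 0])                # our t = 0 solution
print(A @ u2)                               # the same projection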
2.8 Vectors and Matrices with Function Entries
Up till now, the vectors and matrices that we have been discussing have constant entries. In other
words, the entries are numbers. Instead of constant entries, we can also talk about vectors and
matrices with their entries or components made up of functions. These objects are important as they
will help us deal with differential equations and their solutions, which we will introduce in later
chapters.
Vectors with Function Entries (slide 2)
Let v1(t), v2(t), …, vn(t) be real-valued functions in the variable t. Then we can form the n-vector
function v(t) with these functions as entries:
v(t) = (v1(t), v2(t), …, vn(t)).
Note that in the example u(t) on the slide, the last component is regarded as a constant function 2,
independent of the variable t.
Substituting a real value for the variable t in a vector function will give us a regular vector. For
example:
Since the vector function v(t) puts all the components together, the values we can input for t must
be those common values at which every component function is defined. If Di denotes the domain of
vi(t), then
the domain of v(t) = D1 ∩ D2 ∩ ⋯ ∩ Dn.
In the event that there is no common intersection among the domains of the component functions,
then the vector function v(t) is undefined.
Examples: Domain of Vector Functions (slide 4)
Since both u(t) and v(t) are 3-vector functions, we can perform vector addition in the usual way, by
adding the corresponding component functions.
Scalar multiplication is also carried out in a similar manner. Here the scalar can be a constant
number, or a real-valued function in the variable t. In the example below, we scalar multiply the
vector function v(t) by the "scalar" et, the exponential function:
The third example is the multiplication of A(t) with u(t). Since the sizes are compatible, the
multiplication can be carried out in the usual way:
Addition and Multiplication of Functions (slide 7)
One thing to take note of when carrying out these operations is to ensure the "compatibility" of the
domains of the functions involved. In other words, the functions that we are adding or multiplying
must have some common intersection in their domains.
Example
Though we can conveniently add the component functions, we need to pay attention to whether the
resulting function is defined on any domain.
As we have seen before, the domain of v(t) is R, and the domain of u(t) is 0 ≤ t ≤ 1.
So the domain of v(t) + u(t) is 0 ≤ t ≤ 1.
Derivatives of Vectors and Matrices (slide 8)
We can differentiate a vector or matrix function if it is differentiable.
More precisely, given a vector function v(t) with component functions v1(t), …, vn(t), if every
component function is differentiable, then we say the entire vector function is differentiable too.
When this condition is met, we can differentiate v(t) with respect to t to get v'(t), which is another
vector function. The component functions of v'(t) are the corresponding derivatives of the
component functions of v(t).
Naturally, we call v'(t) the derivative of v(t).
We can define the derivative of a matrix function in a similar way.
Differentiation Rules (slide 9)
The usual differentiation rules also apply to matrix functions.
Let A(t) and B(t) be differentiable matrices, f(t) a differentiable function, and c a constant.
1. (A(t) + B(t))' = A'(t) + B'(t) (addition rule)
2. (cA(t))' = cA'(t) (scalar multiplication rule with constant scalar)
3. (f(t)A(t))' = f(t)A'(t) + f’(t)A(t) (scalar multiplication rule with function scalar)
4. (A(t)B(t))' = A(t)B’(t) + A’(t)B(t) (product rule)
The first three rules also apply to vector functions.
We end this chapter with an example illustrating the product rule on two matrix functions. Let A(t)
and B(t) be the 2 x 2 matrix functions shown on the slide.
The same result can be obtained by multiplying A(t) and B(t) first, then differentiating.
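The particular 2 x 2 matrices from the slide are not reproduced in these notes, so here is a sketch with sympy that checks the product rule symbolically on two hypothetical matrix functions of our own choosing:

import sympy as sp

t = sp.symbols('t')

# Hypothetical matrix functions (not the ones from the slide).
A = sp.Matrix([[t, 1],
               [sp.sin(t), t**2]])
B = sp.Matrix([[sp.exp(t), 0],
               [t, 1]])

lhs = (A * B).diff(t)                  # differentiate the product directly
rhs = A * B.diff(t) + A.diff(t) * B    # apply the product rule
print(sp.simplify(lhs - rhs))          # zero matrix: both sides agree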