MA1513 Chapter 2 Lecture Note

This document provides an overview of vectors and vector spaces in linear algebra. It defines n-vectors as ordered n-tuples of real numbers and introduces basic vector operations like addition, subtraction and scalar multiplication. It describes the n-dimensional vector space Rn as the set of all n-vectors and discusses concepts like linear independence and basis. It also introduces the dot product and uses it to define length, distance and angles between vectors in Rn.

MA1513

Chapter 2: (Linear Algebra) Vector Space


2.1 Vectors in n-Space
2.2 Linear Combination and Linear Span
2.3 Subspaces
2.4 Linear Independence
2.5 Basis and Dimension
2.6 Coordinate Vectors
2.7 Projection and Linear Approximation
2.8 Vectors and Matrices with Function Entries

2.1 Vectors in n-Space


Vectors are fundamental objects in Linear Algebra. In this section, we introduce vectors in the n-
dimensional space, and some standard vector operations.
n-Vector (slide 2)
An n-vector is an ordered n-tuple of real numbers u1, u2 , …, un denoted by
(u1, u2 , …, un)
For example, a 2-vector is given by an ordered pair (u1, u2) while a 3-vector is given by an ordered
triplet (u1, u2, u3).

ui is called the ith component (or ith coordinate) of the n-vector.


When the underlying dimension is clear, we may refer to an n-vector simply as a vector. We should
always view an n-vector as a single object. In other words, the n numbers in the vector should be
treated collectively. We will use “boldface” letters u, v, w, … to represent vectors:

u = (u1, u2 , …, un), v = (v1, v2 , …, vn).


2-vector and 3-vector (slide 3)
Geometrically, a 2-vector u = (u1, u2) is represented by an arrow in the xy-plane.

Note that the red arrow above is not the only arrow that represents u. Any other arrow with the same
direction and length as the red one also represents u. There are infinitely many of them.

We can construct an arrow to represent 3-vectors in the xyz-space in a similar way. For a general
n-vector, we cannot explicitly draw it as an arrow, but sometimes it still helps to think of n-vectors as
arrows in order to understand certain concepts regarding vectors.

n-Vectors as Matrices (slide 4)


We can regard an n-vector (u1, u2 , …, un) as a matrix with only one row or one column.

More precisely, we call a 1 x n matrix (u1 u2 … un) a row vector,

and call an n x 1 matrix

[u1]
[u2]
[ ⋮ ]
[un]

a column vector.

We are free to write vectors in row or column forms, but under certain circumstances, specific form
has to be used.

The special n-vector (0, 0, …, 0) is called the zero vector, and it corresponds to either the 1 x n or n x 1
zero matrix 0.

Vector Operations (slide 5)


Here are some basic operations we can perform on n-vectors. They are similar to the matrix
operations.
Let u = (u1, u2, …, un) and v = (v1, v2, …, vn) be n-vectors, and c a real number.

In the case of 2- and 3-vectors, we can also give geometrical interpretation to these operations in
terms of the arrows as shown in the last column of the table above.
Vector addition of two vectors: we can move the arrows of the two vectors (without changing their
directions) so that the end point of one arrow (orange) joins the starting point of the other
(blue). Then the sum of the two vectors (green) is obtained by joining the third side of the triangle as
shown.
Vector subtraction of two vectors: we move the arrows so that the starting points of the two vectors
(orange and blue) coincide. Then we join the third side of the triangle to get the difference (green).
Negative of a vector: It is just represented by the arrow pointing in exactly the opposite direction as
the original vector. Both arrows have the same length.
Scalar multiplication of a vector by a scalar (real number): this has the effect of changing the length of
the arrow while preserving the direction. For example, scalar multiplication by 2 will double the
length of the arrow, while the scalar ½ will shorten the length by half; a negative scalar will also
reverse the direction of the arrow.

n-Space (slide 6)
For a fixed number n, the set or collection of all the n-vectors is called the n-space, and is represented
by the notation Rn. For example, if we use the Venn diagram to represent the 4-space, then its
elements are all the 4-vectors such as (1, 2, 3, 1), (0, 2, 1, 5) and so on.

For the 2-space R2, we can visualize it as the set of all 2-vectors in the xy-plane. For the 3-space R3,
we can view it as the set of all 3-vectors in the xyz-space.
The following are different ways to say the same thing:

• u ∈ Rn
• u is an n-vector
• u = (u1, u2 , …, un) for some real numbers u1, u2 , …, un.

Solution Set as a Subset of Rn (slide 7)


We shall see how the solutions of a linear system are related to the n-space.

For this linear system, check that a particular solution is given by


x = 2, y = -1 and z = -1.
Writing in matrix form, this solution can be given by the 3 x 1 column vector

[ 2]
[−1]
[−1].
Therefore, we can regard a solution of this system as a 3-vector in R3.
In fact, this system has infinitely many solutions, with general solution

where t is a parameter.
Once again, this can be put in matrix form

In other words, the set of all the solutions of the above system is a subset of R3 and contains all
3-vectors given by this expression.
In general, if a linear system has n variables,
then its solutions are n-vectors, and the solution set is a subset of Rn.

Dot Product (slide 8)
Let u = (u1, u2, …, un) and v = (v1, v2, …, vn) be vectors in Rn. The dot product of u and v is defined to be the
number (scalar):
u ∙ v = u1v1 + u2v2 + … + unvn .
The left hand side is the notation of the dot product operation. The right hand side is the sum of the
product of the corresponding components of the two vectors, and note that the outcome is a
number, not a vector.
We can perform dot product of a vector with itself, and this will result in the sum of squares of all
the components of the vector:
u ∙ u = u1² + u2² + … + un² .
Length, Distance and Angles in R2 (slide 9)
We will look at a few geometrical concepts associated with n-vectors, and illustrate how we can
apply the dot product in the process, beginning with 2-vectors in R2.

The notions of length, distance and angle are defined in the intuitive way.

The length of a vector u = (u1, u2) simply means the length of the arrow representing the vector.
Using Pythagoras theorem, we easily obtain the length

‖u‖ = √(u1² + u2²).

The distance between the two vectors u = (u1, u2) and v = (v1, v2) refers to the distance between the
end points of the arrows, which is given by the length of the vector u – v:

‖u − v‖ = √((u1 − v1)² + (u2 − v2)²).

The angle between u = (u1, u2) and v = (v1, v2) refers to the angle θ between the two arrows
representing the vectors. If we use the cosine rule (a² = b² + c² − 2bc cos θ), we can derive

cos θ = (u1v1 + u2v2) / (‖u‖ ‖v‖)

and hence the angle θ (between 0 and π) can be obtained:

θ = cos⁻¹( (u1v1 + u2v2) / (‖u‖ ‖v‖) ).

The same definitions apply to 3-vectors in R3. In that case, the algebraic expressions for length,
distance and angle involve the three components of the vectors instead of two.

Length, Distance and Angles in Rn (slide 10)
We can extend the algebraic expression for length, distance and angles to any n-vectors in Rn in the
obvious way.

The length of an n-vector u = (u1, u2, …, un):

‖u‖ = √(u1² + u2² + ⋯ + un²).

The distance between u = (u1, u2, …, un) and v = (v1, v2, …, vn):

‖u − v‖ = √((u1 − v1)² + (u2 − v2)² + ⋯ + (un − vn)²).

The angle θ between u = (u1, u2, …, un) and v = (v1, v2, …, vn):

θ = cos⁻¹( (u1v1 + u2v2 + ⋯ + unvn) / (‖u‖ ‖v‖) ).

We can represent these expressions more concisely in terms of dot product.

‖u‖ = √(u ∙ u),    ‖u − v‖ = √((u − v) ∙ (u − v)),    θ = cos⁻¹( (u ∙ v) / (‖u‖ ‖v‖) ).

For a general n-vector, the length is also known as the norm.

So a vector of norm 1 is a vector with length 1. Such a vector is called a unit vector:

u is a unit vector ⇔ u has norm 1 ⇔ ||u|| = 1


Example (slide 11)
Let u = (1, 2, 0, 1), v = (1, -1, 1, 1) be vectors in R4.

The norm (length) of u is given by: ‖u‖ = √(1² + 2² + 0² + 1²) = √6.

The distance between u and v is given by: ‖u − v‖ = √((1−1)² + (2+1)² + (0−1)² + (1−1)²) = √10.

The angle between u and v is given by: θ = cos⁻¹( (1×1 + 2×(−1) + 0×1 + 1×1) / (√6 × √4) ) = cos⁻¹(0) = π/2.

We observe that the quotient turns out to be zero, and hence the angle in this case is 90°.
Although we cannot visualize general n-vectors, they still represent certain quantitative objects. The
abstract notions of length, distance and angles allow us to make comparison between them. For
example, whether one n-vector gives a good approximation for another n-vector. We shall see this in
a later section.
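As a quick numerical check on this example, here is a minimal sketch (assuming the NumPy library is available; the variable names are ours) that reproduces the norm, distance and angle computed above:

import numpy as np

u = np.array([1, 2, 0, 1])
v = np.array([1, -1, 1, 1])

print(np.sqrt(u @ u))          # norm of u: sqrt(6)
print(np.linalg.norm(u - v))   # distance between u and v: sqrt(10)
cos_theta = (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))
print(np.arccos(cos_theta))    # angle: pi/2 (about 1.5708)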
Orthogonal Vectors (slide 12)
Let u and v be two vectors in Rn. If u · v = 0, we say u and v are orthogonal.

In this case, the angle between u and v :


θ = cos⁻¹( (u ∙ v) / (‖u‖ ‖v‖) ) = cos⁻¹(0) = π/2.
So u and v are perpendicular. For example, we can easily check that u = (1, 2, 0, 1) and v = (1, -1, 1, 1)
have dot product u · v = 0, and hence are orthogonal to each other in R4.

2.2 Linear Combination and Linear Span
The topics of linear combinations and linear spans are fundamental in the theory of vector spaces.
Many important concepts about n-vectors require the notion of linear combination, which is a
combination of scalar multiplication and vector addition applied on a given set of vectors. For a fixed
set of n-vectors, we can use linear combinations to generate many more n-vectors. This will lead to
the concept of linear span.
Linear Combinations (slide 2)
To talk about linear combination, we must always be given a set of n-vectors to begin with:
u1, u2, …, uk : a fixed set of vectors in Rn
For each of these vectors, we are going to multiply by a corresponding scalar: c1, c2, …, ck :
c1u1 , c2u2 , ··· , ckuk: scalar multiples of the vectors.
Then we add all these scalar multiples of the vectors together to get:
c1u1 + c2u2 + ··· + ckuk .
This is called a linear combination of u1, u2, …, uk. It is also an n-vector in Rn.
For example, take u1 = (2, 1, 0), u2 = (-3, 0, 1) and c1 = 1, c2 = 1. Then we get a specific linear
combination:

1(2, 1, 0) + 1(-3, 0, 1) = (-1, 1, 1).

For the same vectors, if we use general scalars c1 = s, c2 = t , then we will have a general linear
combination with parameters s and t:

s(2, 1, 0) + t(-3, 0, 1).

Example (slide 3 - 6)
Let u1 = (2, 1, 3), u2 = (1, –1, 2) and u3 = (3, 0, 5) be three fixed 3-vectors in R3.
(a) v = (3, 3, 4) is a linear combination of u1, u2, u3
as (3, 3, 4) can be expressed as a(2, 1, 3) + b(1, –1, 2) + c(3, 0, 5) . (Why?)

(b) w = (1, 2, 4) is not a linear combination of u1, u2, u3

as (1, 2, 4) cannot be expressed as a(2, 1, 3) + b(1, –1, 2) + c(3, 0, 5). (Why?)

We shall see below how to get these answers.

For part (a), we start by writing the given vector v as a general linear combination v = au1 + bu2 + cu3

(3, 3, 4) = a(2, 1, 3) + b(1, –1, 2) + c(3, 0, 5) (*)

where a, b, c are unknown scalars to be determined. Equating the components on the left and right
hand sides, we get a linear system:

2a + b + 3c = 3
a − b = 3
3a + 2b + 5c = 4

The first equation comes from equating the first components on both sides of (*). Similarly, the
second and third equations come from the second and third components of the vectors in (*).

So the original equation (*) is equivalent to a linear system, and we can solve this system for the
values of a, b, c.

Before we solve the system, we remark that by expressing the vectors in equation (*) in column
form, we get:
[3]     [2]     [1]     [3]
[3] = a [1] + b [−1] + c [0]
[4]     [3]     [2]     [5]
This is called the vector equation form of the above linear system.

We now proceed to solve the linear system by Gaussian elimination. Starting from the augmented
matrix, we reduce it to a row echelon form:

[2  1  3 | 3]       [2   1     3   | 3  ]
[1 −1  0 | 3]  →    [0  −3/2  −3/2 | 3/2]
[3  2  5 | 4]       [0   0     0   | 0  ]
Since the last column is not a pivot column, the system is consistent. This means solution for a, b, c
exists. At this point, without finding the solution yet, we can conclude the vector (3, 3, 4) is a linear
combination of u1, u2, u3.

If we want to explicitly write this vector as a specific linear combination, we need to go one step
further to solve the system. From the row echelon form above, we have two equations:
2a + b + 3c = 3
−(3/2)b − (3/2)c = 3/2

By setting the variable c as a free parameter t and using back substitution, we get the general solution
for the system:

a = 2 − t,  b = −1 − t,  c = t.

This means there are many possible values for a, b and c. For example, let t = 0, we get a = 2, b = -1,
c = 0 and hence we have explicitly

(3, 3, 4) = 2u1 – u2

(the scalar attached to u3 is 0, so we do not need to write it out).

If we take t = 1 instead, we will get another set of values a = 1, b = -2, c = 1 and hence a different
linear combination for (3,3,4):

(3, 3, 4) = u1 – 2u2 + u3.

For part (b): w = (1, 2, 4).


Like in part (a), we set up the linear combination w = au1 + bu2 + cu3 :

(1, 2, 4) = a(2, 1, 3) + b(1, –1, 2) + c(3, 0, 5)


The vector equation is equivalent to a linear system

2a + b + 3c = 1
a − b = 2
3a + 2b + 5c = 4

which can be established by comparing components on both sides. We again apply Gaussian
elimination to the augmented matrix and get a row echelon form:

[2  1  3 | 1]       [2   1     3   | 1  ]
[1 −1  0 | 2]  →    [0  −3/2  −3/2 | 3/2]
[3  2  5 | 4]       [0   0     0   | 3  ]

This time we see that the last column is a pivot column, which leads to a row highlighted in pink in
the row echelon form. Hence we know the system has no solution and we conclude that the vector
(1, 2, 4) is not a linear combination of u1, u2, u3.
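The consistency test used in both parts can be carried out mechanically. Below is a minimal sketch assuming the SymPy library (the helper name is_linear_combination is our own, not from the notes): the system au1 + bu2 + cu3 = v is consistent exactly when the augmented matrix has the same rank as the coefficient matrix, i.e. the last column is not a pivot column.

from sympy import Matrix

u1, u2, u3 = Matrix([2, 1, 3]), Matrix([1, -1, 2]), Matrix([3, 0, 5])

def is_linear_combination(v, vectors):
    A = Matrix.hstack(*vectors)               # vectors as columns: coefficient matrix
    return A.rank() == A.row_join(v).rank()   # consistent iff ranks agree

print(is_linear_combination(Matrix([3, 3, 4]), [u1, u2, u3]))   # True  (part (a))
print(is_linear_combination(Matrix([1, 2, 4]), [u1, u2, u3]))   # False (part (b))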

Standard Basis Vectors (slide 7)


The standard basis vectors in R3 are given by the three vectors
e1 = (1, 0, 0), e2 = (0, 1, 0), e3 = (0, 0, 1).
Note that these three vectors are the directional unit vectors along the x-, y- and z-axis respectively.

Observe that we can write every vector in R3 as a linear combination of these standard basis vectors.
Take a general 3-vector (x, y, z) in R3, where x, y, z represent some real numbers:

(x, y, z) = (x, 0, 0) + (0, y, 0) + (0, 0, z) = x(1, 0, 0) + y(0, 1, 0) + z(0, 0, 1) = xe1 + ye2 + ze3.

We note that (x, y, z) can be broken up as a sum of three vectors. Then each of these three vectors
can be written as a scalar multiple by pulling out the number x, y or z from its components. This
gives us a linear combination of the standard basis vectors, with scalars x, y, z respectively.
Take a more specific 3-vector (1, 2, 5), the linear combination in terms of the standard basis vectors
is (1, 2, 5) = 1e1 + 2e2 + 5e3

In general, for any positive integer n, we have the standard basis vectors for Rn:
b1 = (1, 0, …, 0), b2 = (0, 1, …, 0), …, bn = (0, 0, …, 1)
consisting of n vectors, each with zero components everywhere except for a single 1 in one position.
The Set of Linear Combinations (slide 8)
We now look at the set of all possible linear combinations of a fixed set of vectors. Again let’s start
with a simple example of two vectors (2,1,0) and (-3,0,1).
How many possible linear combinations s(2, 1, 0) + t(-3, 0, 1) are there? Since there are infinitely
many possible scalars s and t, there will be infinitely many linear combinations. In other words, the
set of all linear combinations of these two vectors contains infinitely many elements.
We shall call this set the linear span of (2,1,0) and (-3,0,1) and we write it as

span{ (2,1,0) , (-3,0,1) }.

This linear span notation is a concise way to describe this infinite set. It should not be mistaken for a
finite set containing only the two vectors (2,1,0) and (-3,0,1). Rather, it contains all possible linear
combinations s(2, 1, 0) + t(-3, 0, 1) of the two given vectors.
Linear Span (slide 9)
We now generalize this to a set of k vectors u1, u2, …, uk in Rn.

The set of all linear combinations of u1, u2, …, uk :

c1u1 + c2u2 + ··· + ckuk where c1, c2, …, ck are in R.

We call this set the linear span of u1, u2, …, uk and denote it as span{u1, u2, …, uk}.

It contains all linear combinations of u1, u2, …, uk.


You may always translate the term "linear span" into the phrase "the set of all linear combinations",
since they mean exactly the same thing.
Note that when we use the term linear span, we must always apply it to a set of vectors, as the term
has no specific meaning when used on its own.
Vectors in a Linear Span (slide 10)
Let’s revisit our earlier example: u1 = (2, 1, 3), u2 = (1, –1, 2) and u3 = (3, 0, 5).
(a) v = (3, 3, 4) is a linear combination of u1, u2, u3. So v ∈ span{u1, u2, u3}.

(b) w = (1, 2, 4) is not a linear combination of u1, u2, u3. So w ∉ span{u1, u2, u3}.

Linear Span of Standard Basis Vectors (slide 11)


Recall that every vector in R3 is a linear combination of the standard basis vectors
e1 = (1, 0, 0), e2 = (0, 1, 0), e3 = (0, 0, 1).
So span{e1, e2, e3} contains every vector in R3 and hence

span{e1, e2, e3} = R3.

Similarly for the standard basis vectors in Rn


b1 = (1, 0, …, 0), b2 = (0, 1, …, 0), …, bn = (0, 0, …, 1),
we have span{b1, b2, …, bn} = Rn.

Linear Span of One Vector (slide 12)


We give some geometrical representations of linear spans in 2- and 3-space. In layman's terms, the
word span means "extend across".
Let u be a non-zero vector in R2 or R3

Since there is only one vector, the linear combinations are just the scalar multiples of u. Hence

span{u} = all scalar multiples cu.

These scalar multiples are represented by arrows parallel to u; they can be of any length and can
point in the same or opposite direction as u. Hence, they eventually extend across the entire
line parallel to u.

In other words,
span{u} is represented by the line parallel to u and contains the origin.
Linear Span of Two Vectors (slide 13)
Let u, v be two non-parallel vectors in R2 or R3

span{u, v} = all linear combination su + tv.

For the arrow u, we can extend it across the entire line parallel to u, and likewise for v. But do these
two lines represent all the linear combinations su + tv? The answer is no. The two lines only contain
the scalar multiples of u and v separately; they do not contain the vector sums of su and tv.
The effect of addition gives arrows that are not on these two lines, but on the plane that contains
these two lines as shown.

As s and t can be any real numbers, su + tv will give all the arrows that lie on the plane. So

span{u, v} is represented by the plane that contains the two vectors u and v and the origin.

2.3 Subspaces
Sometimes we are interested only in a specific collection of n-vectors that satisfy certain properties. This
forms a subset of the n-space. Some of these subsets are classified as subspaces while some are not.

Subspaces (slide 2)

A subspace V of Rn is a subset of Rn that satisfies the following conditions:

A. V can be expressed in linear span form:


i.e. there is a set of vectors u1, u2, …, uk in V such that V = span{u1, u2, …, uk }

B. V satisfies the closure properties:

(i) for all u, v ∈ V, we must have u + v ∈ V.

(ii) for all u ∈ V and c ∈ R, we must have cu ∈ V.

To check that a subset is a subspace,

we just need to check that one of the two conditions A and B is met.

Note that every subspace of Rn is a subset of Rn but not every subset of Rn is a subspace of Rn.

Example 1: Subspace (slide 3)

Let V be the subset of R2 containing all vectors of the form (a, 0), i.e. V
contains 2-vectors (1, 0), (2.4, 0), (-8.31, 0), etc.

Note that we can write

(a, 0) = a(1, 0)

as a general linear combination of (1, 0). So V is the set of all linear combinations of (1, 0). i.e.

V = span{(1, 0)}

Since V can be expressed in linear span form above, it is a subspace of R2.

This subspace is represented by a straight line that passes through the origin in the xy-plane. In fact,
any straight line in the xy-plane that passes through the origin represents a subspace of the 2-space,
as it can be expressed as the linear span of some 2-vector.

Example 2: Non-subspace (slide 4)

This next example is a subset of the 2-space which is not a subspace.

Let V be the subset of R2 containing all vectors of the form (1, a). i.e. V
contains 2-vectors (1, 3), (1, 4.5), (1, -9/7) etc.

For the general form (1, a), if we try doing something similar to the previous example:

(1, a) = a (1/a, 1)

we notice that, unlike the previous case, the vector (1/a, 1) is not a fixed vector, but depends on "a"
(and the expression is not even defined when a = 0). In fact, vectors of the form (1, a) cannot be
written as a general linear combination of fixed vectors.

A better approach to see that V is not a subspace is to look at the closure property. Let’s take two
specific vectors in V:

(1, 3) ∈ V and (1, 5) ∈ V

but (1, 3) + (1, 5) = (2, 8) ∉ V.

So the closure property under vector addition is not satisfied. This means V is not a subspace of the
2-space.

Example 3: Non-subspace (slide 5)

Let V be the subset of 3-space containing all vectors (a, b, c) such that a < b < c, for example, (1, 2, 3),
(-1, 4, 7), (3, 9, 11) etc.

We can write

(a, b, c) = a(1, 0, 0) + b(0, 1, 0) + c(0, 0, 1).

However, this is NOT a general linear combination of the 3 vectors, because the values of a, b, c are
not unrestricted. They are bounded by this inequality condition: a < b < c.

Again, a better way to show that V is not a subspace of 3-space is to consider the closure property.
This time the vectors in V do satisfy the closure property under vector addition. However, V fails to
satisfy the closure property under scalar multiplication. Take a specific vector in V and a specific scalar:

(1, 2, 3) ∈ V, -1 ∈ R

but (-1)(1, 2, 3) = (-1, -2, -3) ∉ V

as (-1, -2, -3) does not satisfy the condition a < b < c. Since the closure property under scalar
multiplication is not satisfied, we conclude that this subset is not a subspace of R3.

Example 4: Subspace (slide 6)

Let's consider the solutions of a homogeneous linear system:

x − 2y + 3z = 0
2x − 4y + 6z = 0
3x − 6y + 9z = 0

Is the solution set a subspace of R3?

The general solution (we omit the Gaussian elimination here) of the system is given by

x = 2s − 3t,  y = s,  z = t,  i.e.  (x, y, z) = s(2, 1, 0) + t(−3, 0, 1),

where s and t are parameters. Note that we can rewrite the expression in the general solution by
separating it into two vectors in terms of the two parameters s and t: we pull out the parameters s
and t respectively to get two fixed vectors (2, 1, 0) and (−3, 0, 1). In other words, we have
a general linear combination of two fixed vectors. So

the solution set of the linear system = span{ (2, 1, 0), (−3, 0, 1) }.

So the solution set is a subspace of R3

Note that this linear span represents a plane in the xyz-space containing the origin. In fact, every plane in
the xyz-space containing the origin represents a subspace of R3, as it can always be expressed as the
linear span of two non-parallel vectors.
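The spanning vectors of a solution space can also be obtained programmatically. A minimal sketch assuming the SymPy library, applied to the coefficient matrix of the system in this example:

from sympy import Matrix

A = Matrix([[1, -2, 3],
            [2, -4, 6],
            [3, -6, 9]])   # coefficient matrix of the homogeneous system
print(A.nullspace())       # the column vectors (2, 1, 0) and (-3, 0, 1)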

Solution Space of Homogeneous System (slide 7)

This is a generalization of the previous example.

Let Ax = 0 be a homogeneous system with n variables. Then its solution set is a subspace of Rn.
We will also call this solution set the solution space of the homogeneous system.

As we saw in the example, the general solution of a homogeneous system can always
be written as a general linear combination of some fixed set of vectors.

Note that this only applies to homogeneous systems. For a non-homogeneous system, the solution
set cannot be expressed as a linear span, and hence is not a subspace of n-space. In this case, we
cannot call the solution set of a non-homogeneous system a solution space.

Row Space of a Matrix (slide 8)

We shall look at a few examples of subspaces that are associated to matrices.

 2 − 1 0
 
 1 − 1 3
A=
− 5 1 0
 
 1 0 1 

The 4 rows of A can be regarded as row-vectors:

r1 = (2, −1, 0), r2 = (1, −1, 3), r3 = (−5, 1, 0), r4 = (1, 0, 1).

Now we take the linear span of these row vectors:

span{ (2, –1, 0), (1, –1, 3), (–5, 1, 0), (1, 0, 1) }.

This means we are taking the collection of all linear combinations of the 4 rows r1, r2, r3, r4 of matrix
A. This is automatically a subspace of R3 by construction, since it is already in linear span form. We
call this subspace the row space of the matrix A.

Students tend to have the misconception that the row space consists of just the 4 row-vectors r1, r2,
r3, r4, which is not true:

row space of A = span{ r1, r2, r3, r4 } ≠ { r1, r2, r3, r4 }.

The row space contains infinitely many vectors, while the right hand side contains only four vectors.

In general, given any matrix, we can define the row space in a similar way by taking the linear span of
all its rows.

Column Space of a Matrix (slide 9)

Column space of a matrix is defined similarly as the linear span of all the columns of the matrix.

Let's use the same matrix A above. Its 3 columns can be regarded as column vectors:

c1 = (2, 1, −5, 1), c2 = (−1, −1, 1, 0), c3 = (0, 3, 0, 1) (written in column form).

Now we take the linear span of these column vectors:

span{ c1, c2, c3 }

i.e. we are taking all linear combinations of the 3 columns c1, c2, c3 of matrix A. This is a subspace of
R4 and we call it the column space of the matrix A. We usually write the vectors in a column space in
column form.
Again, do not mistake the column space for the set with only the 3 column vectors c1, c2, c3:

column space of A = span{ c1, c2, c3 } ≠ { c1, c2, c3 }

Nullspace of a Matrix (slide 10)

The third type of subspace associated with a matrix A is the collection of vectors u such that
when we pre-multiply by A, the product Au = 0. This is called the nullspace of the matrix A.

Note that:

the nullspace of A = the solution space of Ax = 0

Example:

To find the nullspace of

A = [1 −2 3]
    [2 −4 6]
    [3 −6 9]

we set up the homogeneous system Ax = 0, which in standard equation form is:

x − 2y + 3z = 0
2x − 4y + 6z = 0
3x − 6y + 9z = 0

We can proceed to solve this using Gaussian elimination as usual. (Note that we have already found
the general solution of this system in example 4 above.) So we have

the nullspace of A = span{ (2, 1, 0), (−3, 0, 1) }.

In this case, the nullspace is a subspace of R3.

In general, if A is an m x n matrix, the nullspace of A is a subspace of Rn. Note that we normally write
the vectors in the nullspace in column form, so that they are compatible with the matrix form of the
system.
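As a sanity check on the definition, pre-multiplying each vector of the nullspace basis by A should give the zero vector. A minimal sketch assuming NumPy:

import numpy as np

A = np.array([[1, -2, 3],
              [2, -4, 6],
              [3, -6, 9]])
u1 = np.array([2, 1, 0])
u2 = np.array([-3, 0, 1])
print(A @ u1)   # [0 0 0]
print(A @ u2)   # [0 0 0], so both vectors lie in the nullspace of A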

Vector Spaces (slide 11)

In our discussion, a vector space is a general term referring to Rn (for any n) or a subspace of Rn. The
following are examples of vector spaces:

- R3 is a vector space;
- The solution space of a homogeneous system is a vector space;
- The row space of a matrix is a vector space.

However, there is a more general definition of vector space that may include sets of objects that are
not n-vectors as long as addition and scalar multiplication can be defined on these objects and that
they satisfy all the properties below. Examples of such objects include: functions, polynomials and
matrices.

General Vector Space (slide 12)

V is a general vector space, with two operations (vector addition and scalar multiplication), if it
satisfies the closure properties mentioned earlier together with the following properties. There are
four properties (1 to 4) involving the addition operation, and another four (5 to 8) involving scalar
multiplication:

1. For any two vectors a and b of V, a + b = b + a.
2. For any three vectors a, b, c of V, (a + b) + c = a + (b + c).
3. There is a zero vector in V, denoted by 0, such that for every a in V, a + 0 = a.
4. For every a in V there is a unique vector in V, denoted by -a, such that a + (-a) = 0.
5. For any vectors a, b of V and any scalar c, c(a + b) = ca + cb.
6. For any vector a of V and any scalars c and d, (c + d)a = ca + da.
7. For any vector a of V and any scalars c and d, c(da) = (cd)a.
8. For every a in V, 1a = a.

The general version of vector space is just for your information. In this module, we will only focus on
the n-spaces and their subspaces.

2.4 Linear Independence
We introduce another important concept in vector spaces. Given a set of n-vectors, sometimes it is
essential to know whether there is any linear relationship among the vectors. When such a
relationship exists, we say the set of vectors is linearly dependent; otherwise, we say the set is
linearly independent.

Example 1: Redundant Vector (slide 2)

Let's look at two separate linear spans: span{(1, 1, 1)} and span{(1, 1, 1), (–1, –1, –1)}. Note that the
second linear span has one additional "spanning" vector (–1, –1, –1) compared with the first linear span.

Recall that:

span{(1, 1, 1)} contains all scalar multiples c(1, 1, 1) of (1, 1, 1);

span{(1, 1, 1), (–1, –1, –1)} contains all linear combinations a(1, 1, 1) + b(-1, -1, -1) of the two vectors
(1, 1, 1) and (-1, -1, -1).

It is easy to see that, regardless of what values we substitute for a and b, the linear combination
a(1, 1, 1) + b(-1, -1, -1) can be reduced to a scalar multiple c(1, 1, 1).

Hence, the two linear spans are actually equal:

span{(1, 1, 1)} = span{(1, 1, 1), (–1, –1, –1)}.

In other words, having the additional vector (-1, -1, -1) in the span of (1, 1, 1) does not change the set
of vectors. So we say there is a redundant vector in the span of (1, 1, 1) and (-1, -1, -1).

We can visualize this geometrically. The vector (1, 1, 1) is represented by an arrow.

The linear span of this vector extends across the entire line parallel to this arrow.
Now (-1, -1, -1) is the negative of (1, 1, 1) and hence is parallel to (1, 1, 1) in the opposite direction.
So the linear span of the two vectors extends across the same line, and hence represents the
same set.

Example 2: Redundant Vector (slide 3)

Let's look at another example: span{(1, 1, 1), (1, 0, -2)} and span{(1, 1, 1), (1, 0, –2), (2, 3, 5)}. Again,
the second linear span has one additional "spanning" vector (2, 3, 5) compared with the first linear span.

span{(1, 1, 1), (1, 0, -2)} contains all linear combinations p(1, 1, 1) + q(1, 0, -2);

span{(1, 1, 1), (1, 0, –2), (2, 3, 5)} contains all linear combinations a(1, 1, 1) + b(1, 0, -2) + c(2, 3, 5).

The two linear spans are the same because (2, 3, 5) is a linear combination of (1, 1, 1) and (1, 0, -2):

(2, 3, 5) = 3(1,1,1) +(-1)(1,0,-2)

So a linear combination a(1, 1, 1) + b(1, 0, -2) + c(2, 3, 5) can be reduced to a linear combination
p(1, 1, 1) + q(1, 0, -2). In other words, having the additional vector (2, 3, 5) in the linear span of (1, 1,
1) and (1, 0, -2) does not change the set of vectors. So (2, 3, 5) can be regarded as a redundant vector
in span {(1, 1, 1), (1, 0, –2), (2, 3, 5)}.

Let's look at the geometrical interpretation. The two vectors (1, 1, 1) and (1, 0, -2) are represented
by two non-parallel arrows. The linear span of these two vectors extends across the entire plane
that contains the two arrows.

As we have seen, (2, 3, 5) is a linear combination of these two vectors, so it must lie on the same plane
as the other two vectors. Hence the linear span of the three vectors also extends across the same
plane.

Linear Independence and Dependence (slide 4)

Let u1, u2, …, uk be a set of vectors in Rn.

Set up the vector equation

c1u1 + c2u2 + ··· + ckuk = 0.

This is equivalent to a homogeneous linear system with variables c1, c2, …, ck . (See examples below
on vector equation form of linear system.)

Recall that a homogeneous system has either only the trivial solution (case I) or infinitely many
solutions, including non-trivial ones (case II).

Case I: The system has only the trivial solution,


i.e. the only possible scalars are: c1 = 0, c2 = 0, …, ck = 0

In this case, we say the set of vectors u1, u2, …, uk are linearly independent.

Case II: The system has non-trivial solutions,


i.e. there are scalars c1, c2, …, ck that are not all zero.

In this case, we say the set u1, u2, …, uk are linearly dependent.

Example 3: Linear Independence (slide 5)

Determine whether the set of vectors (1, 0, 0, 1), (0, 2, 1, 0), (1, –1, 1, 1) are linearly independent.

Set up the vector equation:


1 0  1  0
       
0 2 −1 0
c1   + c2   + c3   =  
0 1  1  0
       
1 0  1  0

This is a linear system (in vector equation form) whose augmented matrix can be written down
directly; performing Gaussian elimination gives a row echelon form:

[1 0  1 | 0]       [1 0  1   | 0]
[0 2 −1 | 0]  →    [0 2 −1   | 0]
[0 1  1 | 0]       [0 0  3/2 | 0]
[1 0  1 | 0]       [0 0  0   | 0]

Since there is no non-pivot column among the coefficient columns of this row echelon form, we know
there is exactly one solution, which must be the trivial solution: c1 = 0, c2 = 0, c3 = 0. So (1, 0, 0, 1),
(0, 2, 1, 0), (1, –1, 1, 1) are linearly independent.

Example 4: Linear dependence (slide 6)

Determine whether the set of vectors (1, –2, 3), (5, 6, –1), (3, 2, 1) are linearly independent.

Set up the vector equation:

1 5 3 0


       
c1  −2  + c2  6  + c3  2  =
0
3  −1  1 0
       

Again we write down the augmented matrix of this linear system directly and perform Gaussian
elimination to get a row echelon form:

[ 1  5 3 | 0]       [1  5 3 | 0]
[−2  6 2 | 0]  →    [0 16 8 | 0]
[ 3 −1 1 | 0]       [0  0 0 | 0]

Since there is one non-pivot column (the third column) in this row echelon form, there are infinitely
many solutions for the system, i.e. there exist non-trivial solutions for c1, c2, c3. So (1, –2, 3),
(5, 6, –1), (3, 2, 1) are linearly dependent.
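Both examples can also be decided by a rank computation: the homogeneous system has only the trivial solution exactly when the matrix whose columns are the given vectors has rank equal to the number of vectors. A minimal sketch assuming NumPy (the helper name independent is ours):

import numpy as np

def independent(vectors):
    A = np.column_stack(vectors)                     # vectors as columns
    return np.linalg.matrix_rank(A) == len(vectors)  # full column rank?

print(independent([(1, 0, 0, 1), (0, 2, 1, 0), (1, -1, 1, 1)]))  # True  (Example 3)
print(independent([(1, -2, 3), (5, 6, -1), (3, 2, 1)]))          # False (Example 4)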

Linear independence and Determinant (slide 7)

There is an alternative way to determine the linear independence of a set of vectors when the linear
system involved has the same number of equations and variables.

For our previous example, recall that the system


1 5 3 0
       
c1  −2  + c2  6  + c3  2  =
0
3  −1     
    1 0

has non-trivial solutions, and hence the three vectors are linearly dependent. Since there are three
vectors, and these vectors come from the 3-space R3, when we rewrite the system in matrix
equation form,

[ 1  5 3] [c1]   [0]
[−2  6 2] [c2] = [0]
[ 3 −1 1] [c3]   [0]

the coefficient matrix A is a 3 x 3 square matrix, and we can look at its determinant:

det(A) = 1(6·1 − 2·(−1)) − 5((−2)·1 − 2·3) + 3((−2)·(−1) − 6·3) = 8 + 40 − 48 = 0.

Recall from Chapter 1 Section 1.6 that a homogeneous system Ax = 0 has non-trivial solutions exactly
when the coefficient matrix A is singular, which in turn means det(A) = 0. Since the determinant here
turns out to be 0, this coincides with our observation.

In general, when there are n vectors u1, u2, …, un in Rn, we can form an n x n matrix A using u1, u2, …,
un (written in column form) as the n columns of the matrix.

- If det(A) = 0, then u1, u2, …, un are linearly dependent.

- If det(A) ≠ 0, then u1, u2, …, un are linearly independent.
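For the square case, the test above is a one-line determinant computation. A minimal sketch assuming NumPy; since the determinant is computed in floating point, compare it against a small tolerance rather than exact zero:

import numpy as np

A = np.column_stack([(1, -2, 3), (5, 6, -1), (3, 2, 1)])
print(abs(np.linalg.det(A)) < 1e-10)   # True: det(A) = 0, so linearly dependent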

Linear Dependence and Redundancy (slide 8)

Let u1, u2, …, uk be a set with more than 1 vector.

If u1, u2, …, uk are linearly dependent, then at least one of the vectors ui can be written as a
linear combination of the other vectors. The converse is also true.

In this case, we know ui is a “redundant” vector in the linear span of u1, u2, …, uk

Examples:

The set of vectors (1, 1, 1), (1, 0, –2), (2, 3, 5) are linearly dependent as (2, 3, 5) is a “redundant”
vector in the span of the three vectors.

The set of vectors (1, 1, 1), (1, 0, –2) are linearly independent as there is no “redundant” vector in
the span of the two vectors.

Linear Independence of Two Vectors (slide 9)

Let u, v be a set with two vectors in Rn.

Suppose u, v are linearly dependent. The vector equation cu + dv = 0 has non-trivial solution (so at
least one of c and d is non-zero).

- If c ≠ 0, u = (-d/c)v

- If d ≠ 0, v = (-c/d)u

In either case, u and v are scalar multiples of each other. So either one of them is a
redundant vector between the two.

This gives us a very simple way to decide whether two vectors are linearly independent by inspecting
their components:

- If u and v are scalar multiples of each other, they are linearly dependent;
- If u and v are not scalar multiples of each other, they are linearly independent.

A Condition for Linear Dependence (slide 10)

Let u1, u2, …, uk be a set of k vectors in Rn.

If k > n, then the set of vectors are linearly dependent.

Let’s look at some examples.

1. In R2, a set of three or more vectors must be linearly dependent.
So we know immediately that (1,2), (3,4), (5,6) are linearly dependent.

2. In R3, a set of four or more vectors must be linearly dependent.


We have (1,2,3), (3,4,5), (5,6,7), (7,8,9) are linearly dependent.

Note that the converse of this result does not hold. In other words, if k ≤ n, it does not mean the k
vectors are linearly independent. For example, in the 3-space, the two vectors (1, 1, 1) and (-1, -1, -1)
are linearly dependent, as they are scalar multiples of each other.

Geometrical Meaning: Two Vectors (slide 11)

In R2 (or R3), two vectors u and v are linearly dependent if they lie on the same line.

The two arrows representing the vectors either point in the same direction or in exact opposite
direction. In both cases, the two arrows lie on the same line as shown below.

Two vectors u and v are linearly independent if they do not lie on the same line.

For the last case, the linear span of the two vectors is the plane that contains both vectors. Note
that, in R2, there is only one plane, which is the xy-plane itself, representing the entire R2. This
means:

when u and v are linearly independent vectors in R2, then span{u, v} = R2.

Geometrical Meaning: Three Vectors (slide 12-14)

In R2 (or R3), three vectors u, v and w are linearly dependent if


they lie on the same line or same plane.

Case 1: When all three arrows are parallel and starting from the origin, either pointing in the same or
opposite direction, then they lie on the same line, and hence the three vectors are linearly
dependent.

In this case the linear span of the three vectors is represented by the line L that the three vectors lie
on. In fact, there are two redundant vectors among u, v, and w, since we know the linear span of a
single vector will give the same line.

span{u, v, w} = span{u} = span{v} = span{w} = L

So we can remove any two of the three vectors from the linear span, and the resulting spans will all
be equal to each other.

Case 2: When the three arrows are not all parallel to each other but they all lie on the same plane.

The three vectors are again linearly dependent. In this case the linear span of the three vectors is
represented by the plane P that the three vectors lie on. This time, there is only one redundant
vector among u, v, and w, since we know the linear span of two non-parallel vectors will give the
same plane.

span{u, v, w} = span{ u, v} = span{ u, w } = span{ v, w} = P.

So, if the three vectors are not all parallel to each other, we can remove any one of the three vectors
from the linear span, and the resulting spans will all be equal to each other.

Case 3: This case only applies to R3, when the arrows representing the three vectors do not all lie on
a single plane. (Note that in R2, every vector must lie on the xy-plane, so this scenario cannot happen.)

In other words, whenever we find a plane that contains two of the three arrows, the remaining arrow
will always stick out from the plane, as shown in the diagram.

In this case, the three vectors are linearly independent.


The linear combinations of u, v and w will generate all the 3-vectors in the 3-space. Hence

when u, v and w are linearly independent vectors in R3, then span{u, v, w} = R3.

This means there is no redundant vector among the three. If we try to remove any one of the vectors
among u, v and w, the linear span of the remaining two vectors will be different.
So none of span{u, v} , span{u, w}, span{v, w} is equal to span{u, v, w}.

2.5 Basis and Dimension
A vector space consists of infinitely many vectors. However, we can find a set with only a finite number
of vectors that generates all the vectors in the given vector space. Such a set with the smallest possible
number of vectors is called a basis for the vector space, and the dimension of a vector space is
defined to be the number of vectors in a basis of the space.

Standard Basis Vectors in R3 (slide 2)


We begin with the standard basis vectors in R3:
e1 = (1, 0, 0), e2 = (0, 1, 0), e3 = (0, 0, 1).
We have seen earlier that
• span{e1, e2, e3} = R3
as any 3-vector in R3 can be written as a linear combination of these standard basis vectors. So we
can regard these three vectors as the building block that generate all the vectors in R3.
On the other hand, we also have
• e1, e2, e3 are linearly independent

This means there is no redundant vector among e1, e2, e3.

Let us denote this set by S = {e1, e2, e3}. Since there is no redundant vector in S, if we try to remove
any one of the vectors from S, the linear span will no longer be the entire R3. So S is a smallest
possible set of vectors that generates every vector in R3. Such a smallest possible set is called a basis
for R3.

Basis and Dimension (slide 3)


Let S = {u1, u2, …, uk} be a subset of a vector space V. (Recall that a vector space
can either be Rn or any subspace of Rn.) Then

S is called a basis for V if

1. span{u1, u2, …, uk} = V; and

2. u1, u2, …, uk are linearly independent.


In other words, S is a smallest possible set of vectors in V that generate every vector in V.

The number of vectors in a basis for V is called the dimension of V and is denoted by dim V. In other
words, if {u1, u2, …, uk} is a basis for V, then dim V = k. It is the smallest possible number of vectors
that can generate V.

Example 1: Basis for R3 (slide 4-6)

We have seen that the standard basis vectors {(1, 0, 0), (0, 1, 0), (0, 0, 1)} form a basis for R3.

So dim R3 = 3.

A vector space can have many different bases, but they always have the same dimension.
In other words, there are many different ways to form building blocks (with the same number) for
the vector space.

For example, in R3, it is not difficult to see that:

{ (2, 0, 0), (0, 2, 0), (0, 0, 2) } is a basis for R3 (multiplying the standard basis vectors by the scalar 2
still gives a building block for R3).

{ (1, 2, 1), (2, 9, 0), (3, 3, 4) } is another basis for R3. This one is not so obvious. We shall now verify it
by checking the two conditions for being a basis:

(i) (1, 2, 1), (2, 9, 0), (3, 3, 4) are linearly independent:

We set up the vector equation (linear system)

a(1, 2, 1) + b(2, 9, 0) + c(3, 3, 4) = (0, 0, 0)

and perform Gaussian elimination on its augmented matrix:

[1 2 3 | 0]       [1 2  3   | 0]
[2 9 3 | 0]  →    [0 5 −3   | 0]
[1 0 4 | 0]       [0 0 −1/5 | 0]

Since there is no non-pivot column in the row echelon form, the system has only the trivial solution.
So (1, 2, 1), (2, 9, 0), (3, 3, 4) are linearly independent.
(ii) span{ (1, 2, 1), (2, 9, 0), (3, 3, 4) } = R3:

Let (x, y, z) be any (general) vector in R3. We set up the vector equation

a(1, 2, 1) + b(2, 9, 0) + c(3, 3, 4) = (x, y, z)

and perform Gaussian elimination. The same row operations as in (i) reduce the coefficient matrix to
a row echelon form with three pivot columns, so the system has a solution for any values of x, y, z as
the constant terms. This means every vector in R3 can be written as a linear combination of (1, 2, 1),
(2, 9, 0), (3, 3, 4). So the three vectors span (generate) the entire R3.

By (i) and (ii), we conclude { (1, 2, 1), (2, 9, 0), (3, 3, 4) } is a basis for R3.
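Since three vectors in R3 form a basis exactly when the 3 x 3 matrix having them as columns is invertible, the verification above collapses to a single determinant check. A minimal sketch assuming NumPy:

import numpy as np

A = np.column_stack([(1, 2, 1), (2, 9, 0), (3, 3, 4)])
print(np.linalg.det(A))   # about -1.0 (nonzero), so the three vectors form a basis for R3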

Basis for Linear Span (slide 7)


Let V = span{ u1, u2, …, uk } where u1, u2, …, uk are some vectors in Rn. So V is a subspace of Rn.

If {u1, u2, …, uk} is linearly independent, then it is a basis for V.

If {u1, u2, …, uk} is linearly dependent, then there are redundant vectors among them. We need to
remove the redundant vectors to get a basis for V.

Note: Suppose uk is a redundant vector among u1, u2, …, uk. Then

span{u1, u2,…, uk} = span{u1, u2, …, uk-1}

Example 2: Basis for Linear Span (slide 8-9)

Let V = span{ (1,1,1), (1,-1,-1), (1,0,0) } , which is a subspace of R3.

We shall see that { (1, 1, 1), (1, -1, -1) } is a basis for this subspace V by checking the two conditions:

(i) (1, 1, 1), (1, -1, -1) are linearly independent:

Since there are only two vectors here, we just observe that (1, 1, 1) and (1, -1, -1) are not scalar
multiples of each other. Hence the two vectors are linearly independent.

(ii) span{ (1, 1, 1), (1, -1, -1) } = V:

We shall show (1,0,0) is a redundant vector in span{ (1,1,1), (1,-1,-1), (1,0,0) } by showing it is a linear
combination of (1, 1, 1) and (1, -1, -1).

By performing Gaussian elimination or simply by inspection, we get

(1,0,0) = ½ (1,1,1) + ½ (1,-1,-1).

So (1,1,1) and (1,-1,-1) are sufficient to generate V, which is the same as saying

span{ (1, 1, 1), (1, -1, -1) } = V.

Hence we can conclude that { (1, 1, 1), (1, -1, -1) } is a basis for V.

Basis for Solution Space (slide 10)

Find a basis for and determine the dimension of the solution space of the homogeneous system

2v + 2w − x + z = 0
−v − w + 2x − 3y + z = 0
x + y + z = 0
v + w − 2x − z = 0

This is a system with 5 variables, and hence its solution space is a subspace of R5. After performing
Gaussian elimination, the general solution is given by

v = −s − t,  w = s,  x = −t,  y = 0,  z = t,

where s and t are parameters. Like before, we can rewrite this expression as a general linear
combination of some fixed vectors by pulling out the parameters s and t. Let's denote the two
vectors in the linear combination by:

     [−1]        [−1]
     [ 1]        [ 0]
u1 = [ 0] , u2 = [−1]
     [ 0]        [ 0]
     [ 0]        [ 1]
So the solution space = span{u1, u2}. This is one of the two conditions for {u1, u2} to be a basis for the
solution space.

To show the other condition, that u1 and u2 are linearly independent, we can just check whether these
two vectors are scalar multiples of each other. Since they are not, we conclude that u1, u2 are linearly
independent.

So we can conclude that {u1, u2} is a basis for the solution space of the given system, and

dim(solution space) = 2.

We observe that:

dim(solution space) = number of parameters in the general solution.

This is true in general.
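The basis and dimension found above can be reproduced with a nullspace computation; the number of basis vectors returned equals the number of parameters in the general solution. A minimal sketch assuming SymPy:

from sympy import Matrix

A = Matrix([[ 2,  2, -1,  0,  1],
            [-1, -1,  2, -3,  1],
            [ 0,  0,  1,  1,  1],
            [ 1,  1, -2,  0, -1]])
basis = A.nullspace()   # basis vectors of the solution space of Ax = 0
print(len(basis))       # 2 = dim(solution space)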

Dimension of Solution Space (slide 11)

Given any homogeneous system, to find a basis and dimension for the solution space, the standard
approach is to perform Gaussian elimination to get the row echelon form R.

The number of non-pivot columns in the row echelon form corresponds to the number of
parameters in the general solution. This number in turn corresponds to the number of vectors in a
basis for the solution space, which by definition is the dimension of the solution space. Hence these
numbers are all the same.

Basis for Nullspace (slide 12)

Recall that the nullspace of a matrix A is the solution space of the homogeneous system Ax = 0. So

- a basis for the nullspace of A is same as a basis for the solution space of Ax = 0.

- the dimension for the nullspace of A is same as the dimension for the solution space of Ax = 0.

The dimension of the nullspace of A is also called the nullity of A, and it is denoted by nullity(A).

Example: To find the nullspace and nullity of

A = [ 2  2 −1  0  1]
    [−1 −1  2 −3  1]
    [ 0  0  1  1  1]
    [ 1  1 −2  0 −1]

we look at the homogeneous system Ax = 0:

2v + 2w − x + z = 0
−v − w + 2x − 3y + z = 0
x + y + z = 0
v + w − 2x − z = 0

Since we know from the earlier discussion that a basis for the solution space of this system is given by
{u1, u2} above, this is also a basis for the nullspace of A, and nullity(A) = 2.

Basis for Row Space and Column Space (slide 13)

Let’s start with an example:

 1 2 2 1
 
 3 6 6 3
 4 9 9 5
A= 
− 2 − 1 − 1 1
 
 5 8 9 4
 4
 2 7 3 

To find a basis for the row space and column space, we need to look at the row echelon form R of A;
one possible row echelon form is

R = [1 2 2 1]
    [0 1 1 1]
    [0 0 1 1]
    [0 0 0 0]
    [0 0 0 0]
    [0 0 0 0]

Row space
There are three non-zero rows v1, v2, v3. These three nonzero rows are obtained by performing row
operations on the original rows of A, and hence they are linear combinations of the rows of A. This
means they belong to the row space of A. In fact, v1, v2, v3 form a basis for the row space of A.

In general,

the non-zero rows in the row echelon form R of a matrix A form a basis for the row space of A.

The dimension of the row space will therefore be given by the number of nonzero rows of the row
echelon form. If you recall from the earlier chapter, this is the rank of the matrix:

dim(row space of A) = rank(A).

Column space
We look at the pivot columns of the row echelon form.

However, these pivot columns are not linear combinations of the columns in the original matrix A,
and hence they may not belong to the column space of A. Nevertheless,

the columns of A corresponding to the pivot columns of R form a basis for the column space of A.

In our example, since the pivot columns of the row echelon form R are the first three columns, we
will look at the corresponding first three columns of the original matrix A.

These three columns c1, c2, c3 form a basis for the column space of A.

The dimension of the column space of A will be given by the number of pivot columns of the row
echelon form. This number is again the rank of A.

dim(column space of A) = rank(A).

In other words, the dimension of the row space and the column space of a matrix are the same,

dim(row space of A) = dim(column space of A)

and they are both equal to the rank of the matrix.
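The bookkeeping above can be automated: the reduced row echelon form reports the pivot columns, and the rank gives both dimensions at once. A minimal sketch assuming SymPy:

from sympy import Matrix

A = Matrix([[ 1,  2,  2, 1],
            [ 3,  6,  6, 3],
            [ 4,  9,  9, 5],
            [-2, -1, -1, 1],
            [ 5,  8,  9, 4],
            [ 4,  2,  7, 3]])
R, pivots = A.rref()
print(pivots)                        # (0, 1, 2): the first three columns are pivot columns
print([A.col(j).T for j in pivots])  # the corresponding columns of A: a basis for the column space
print(A.rank())                      # 3 = dim(row space) = dim(column space)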

Dimension Theorem for Matrices (slide 14)

If A is a matrix with n columns, then rank(A) + nullity(A) = n.

Recall that

rank(A) = dim(column space of A) and nullity(A) = dim(nullspace of A).

That explains why this result is called the Dimension Theorem for Matrices.

We look at the row echelon form R of the matrix A.

The pivot columns correspond to a basis for the column space of A, so


the number of pivot columns = rank(A).
On the other hand, the non-pivot columns correspond to the parameters in the general solution of the
homogeneous system Ax = 0, which in turn correspond to a basis for the nullspace of A. So
the number of non-pivot columns = nullity(A).

Since each of the n columns of R is either a pivot column or a non-pivot column, these two numbers
add up to n. This explains why the dimension theorem for matrices works.
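A quick confirmation of the theorem for the matrix A of the previous slides, again as a minimal sketch assuming SymPy:

from sympy import Matrix

A = Matrix([[ 1,  2,  2, 1],
            [ 3,  6,  6, 3],
            [ 4,  9,  9, 5],
            [-2, -1, -1, 1],
            [ 5,  8,  9, 4],
            [ 4,  2,  7, 3]])
rank, nullity = A.rank(), len(A.nullspace())
print(rank + nullity == A.cols)   # True: 3 + 1 = 4 columns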

2.6 Coordinate Vectors
In the previous section, we introduced the basis for a vector space, which forms the building blocks
for the space. In this section, we shall see how a basis provides a coordinate system and serves as a
"unit of measurement" for the vectors in the vector space.

Uniqueness of Expression in terms of Basis (slide 2)


Let S = {u1, u2, …, uk} be a basis for a vector space V. Then
every vector v in V can be expressed in the form v = c1u1 + c2u2 + ··· + ckuk in exactly one way.
i.e. there is only one possible set of values for c1, c2, …, ck.
The first part of the property “can be expressed in the form v = c1u1 + c2u2 + ··· + ckuk” follows from
the fact that the basis spans the entire V.
The second part, "exactly one way", is a consequence of the basis being linearly independent. What this
means is that different linear combinations of the basis vectors will give different vectors in the
space V.
For example, suppose {u1, u2, u3} is a basis for R3. If the linear combination 3u1 + 5u2 + 2u3 = v and the
linear combination 2u1 + 4u2 + 6u3 = w, then v ≠ w, as the two linear combinations have different
coefficients. (Recall that, it is possible that two different linear combinations of a set of vectors can
be equal to the same vector. This will happen when the set of vectors are linearly dependent.)
Standard VS Non-Standard Basis (slide 3)
Given an arrow that represents a certain vector in R2, how do we tell the coordinates of the vector?
We need to set up the coordinate system as a reference for the units on both the x- and y-axis.

(a)

In this case, the coordinates of the vector (in red) are given by (2,3), with respect to the standard
coordinate system.
In some situation, we may need to re-orientate the coordinate system such as the following:

(b) (c)

Then the same red arrow will have different coordinates with respect to the new coordinate
systems.
How do we represent the different coordinate systems algebraically? We can do this using the
various bases for R2.
For the first coordinate system (a), it is given by the standard basis vectors, namely (1, 0) and (0, 1),
represented by the two orange unit vectors along the x- and y-axes. In other words, the two unit
vectors define the grid for the coordinate system.

The other two coordinate systems correspond to some non-standard bases for R2. The coordinate
system (b) is given by the basis (1, -1), (1, 1) as indicated by the two orange arrows, which gives us
the new grid for the coordinate system. The coordinate system (c) is given by another basis (1, 0),
(1, 1) and hence a different shape for the grid.
Standard Basis S1 (slide 4)
S1 = {(1, 0), (0, 1)}
The standard basis will give the unit block in pink, which in turn determines the unit of measurement
along the x and y axes.

This will then give the x and y coordinates of our vector.


Algebraically, we can write v = (2, 3) = 2(1, 0) + 3(0, 1).
The scalars (coefficients) of this linear combination are 2 and 3. Putting these two numbers together,
we get the coordinates of our vector v with respect to the standard basis S1:
(v)S1 = (2, 3).
As expected, if we use the standard basis, the coordinates will be the same as the original one.
This is not the case if we use the non-standard basis.
Non-Standard Basis S2 (slide 5)
S2 = {(1, –1), (1, 1)}
The grid determined by the new basis vectors is given below, and we extend the coordinate
axes, which we refer to as the 1st axis and 2nd axis to differentiate them from the x- and y-axes. The
basis gives the unit block in pink, which in turn determines the unit of measurement along the 1st
and 2nd axes.

This time we see that the two coordinates of our vector become -0.5 and 2.5 respectively.
Algebraically, we can write v = (2, 3) = −(1/2)(1, -1) + (5/2)(1, 1).
The scalars of this linear combination are -1/2 and 5/2 respectively. Putting these two numbers
together, we get the new coordinates of v with respect to the non-standard basis S2:

(v)S2 = (−1/2, 5/2).

We call this the coordinate vector of v relative to basis S2.


Non-Standard Basis S3 (slide 6)
S3 = {(1, 0), (1, 1)}
This time the grid, given by this basis, has a distorted shape, and the coordinate axes are not
perpendicular to each other. The basis gives the unit block in pink, which in turn determines the unit
of measurement along the two axes.

The two coordinates of our vector are -1 and 3 respectively.


Algebraically, we can write v = (2, 3) = –(1, 0) + 3(1, 1).
The scalars of this linear combination are -1 and 3 respectively. Once again, putting these two
numbers together, we get the new coordinates of our vector v with respect to the non-standard
basis S3:
(v)S3 = (-1, 3).
This gives us the coordinate vector of v relative to basis S3.
Coordinate Vectors (slide 7)
We will now extend the notion of coordinate vectors to a general vector space.
Let S = {u1, u2, …, uk} be a basis for a vector space V and v a fixed vector in V.
We know that v can be written as a linear combination of this basis and in a unique way, as discussed
earlier.
v = c1u1 + c2u2 + ··· + ckuk
Consider this unique set of scalars in the linear combination. We call them the coordinates of the
vector v relative to the basis S. If we use these coordinates to form a k-vector, we will call this the
coordinate vector of v relative to the basis S:
(v)S = (c1, c2, …, ck)
This coordinate vector (v)S is uniquely determined by the vector v and the basis S. It will change if we
use a different basis for the same vector v.
We should also take note that the order of the vectors in the basis has to be fixed. If we change the
order of some ui and uj in the basis, then the order of the coordinates will also change accordingly.
Example: Coordinate Vectors (slide 8-9)
Let S = { (1, 2, 1), (2, 9, 0), (3, 3, 4)}. Check that this is a basis for R3.
(a) Find the coordinate vector of v = (5, –1, 9) relative to S.
(b) Find a vector w in R3 such that (w)S = (–1, 3, 2).
The two questions are the reverse of each other:

For (a), we are asking: given a vector v, how do we find (v)S?
For (b), we are asking: given coordinate vector (w)S, how do we recover w?
(a) We solve the equation a(1, 2, 1) + b(2, 9, 0) + c(3, 3, 4) = (5, –1, 9).
By using Gaussian elimination, we get a = 1, b = -1, c = 2. i.e. v = 1(1, 2, 1) – (2, 9, 0) + 2(3, 3, 4).
So (v)S = (1, –1, 2).
(b) We have (w)S = (–1, 3, 2). So we substitute a = -1, b = 3 and c = 2 into
w = a(1, 2, 1) + b(2, 9, 0) + c(3, 3, 4)
to get w = -(1, 2, 1) + 3(2, 9, 0) + 2(3, 3, 4) = (11, 31, 7).
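Both directions of this example are matrix computations: finding (v)S means solving a linear system whose coefficient matrix has the basis vectors as columns, and recovering w is just a matrix-vector product. A minimal sketch assuming NumPy:

import numpy as np

B = np.column_stack([(1, 2, 1), (2, 9, 0), (3, 3, 4)])   # basis vectors as columns

v = np.array([5, -1, 9])
print(np.linalg.solve(B, v))   # (a): [ 1. -1.  2.], so (v)S = (1, -1, 2)

w = B @ np.array([-1, 3, 2])   # (b): recover w from its coordinate vector
print(w)                       # [11 31  7]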
Meaning of Basis (slide 10)
Here’s a summary of what we have discussed about bases for a vector space V.
1. A basis for V is a building block of V
2. A basis for V is a “unit of measurement” for vectors in V.
3. A basis for V gives a “coordinate system” for V.
Note that a basis for “Rn” is not necessarily a basis for a “subspace of Rn”
For example, in the 3-space, we know that the standard basis vectors give a basis for R3.

But for a subspace V of R3 that is represented by a plane, the standard basis for R3 is not a basis for V.
As shown in the diagram below, the vectors e1, e2, e3 may not be on the plane itself.

Instead, a basis for V should come from the subspace itself, and in this case, two vectors will be enough to generate V.

2.7 Projection and Linear Approximation
In this section, we discuss the concept of projection and linear approximation, which are
related to dot product and the notion of orthogonality. In particular, we shall see how to
find approximate solutions for systems that are inconsistent.
Projection of a Vector onto a Plane in R3 (slide 2)

For a 3-vector (x, y, z) in R3, its projection onto the xy-plane is given geometrically as follows.

We drop the vector v (in red) perpendicularly onto the xy-plane to get the vector p (in green). The coordinates of p are (x, y, 0). (The x and y coordinates are the same as those of v, but the z coordinate is replaced by 0.)

If we complete the triangle as shown, the third side of the triangle is given by the vector n (in blue), which is equal to v − p. So its coordinates are (0, 0, z). Note that this vector n is perpendicular to the xy-plane. Recall we also refer to perpendicular as orthogonal.

From this observation, we have the decomposition v = p + n, where p lies on the xy-plane and n = v − p is orthogonal to it.

In fact, this holds not just for the xy-plane, but for any plane in R3 that contains the origin.
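As a small numerical illustration (the numbers here are made up, not from the slides), the decomposition v = p + n can be checked directly:

import numpy as np

v = np.array([3.0, 4.0, 5.0])
p = np.array([v[0], v[1], 0.0])   # projection of v onto the xy-plane
n = v - p                         # (0, 0, 5), orthogonal to the xy-plane
print(np.dot(n, [1, 0, 0]), np.dot(n, [0, 1, 0]))   # 0.0 0.0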

We will further generalize the notion of projection to Rn, with the plane replaced by a more general
subspace of Rn.

Projection of a Vector onto a Subspace of Rn (slide 3)


Let V be a subspace of Rn and u a vector in Rn.
If p is a vector in V such that u – p is orthogonal to V, then p is called the projection of u onto V.

What do we mean by the vector u – p is orthogonal to a subspace V? This is defined to be:

u − p is orthogonal to every vector v in V, i.e. the dot product (u − p) ∙ v = 0.

Note that we always refer to a projection with respect to a certain subspace of Rn. Furthermore, every vector has exactly one projection onto a given subspace.

Projection and Best Approximation (slide 4)

Suppose we are given a line and a point not on the line. Intuitively, the point on the line that is closest to the given point is the foot of the perpendicular dropped from the point onto the line. This can be expressed in terms of vectors:

p is the projection of u onto the line, and the distance between the vector u and its projection will
give the shortest distance from u to the line.

So we say: p is the best approximation of u in the line.

Similarly, suppose we are given a plane and a point not on the plane. Like before, the point on the plane that is closest to the given point is the foot of the perpendicular dropped onto the plane.

p is the projection of u onto the plane, and the distance between the vector u and its projection will
give the shortest distance from u to the plane.

Again we say: p is the best approximation of u in the plane.

Best approximation of a Vector in a Subspace (slide 5)

The above observation can be generalized to any n-space.

Let V be a subspace of Rn and u ∈ Rn. The subspace is not necessarily a line or a plane.

Before we continue, recall that for an n-vector u, the norm of u is given by

‖u‖ = √(u1² + u2² + ⋯ + un²)

Now let p be the projection of u onto V. Then


||u - p|| ≤ ||u - v|| for any vector v in V.

What the inequality says is that, among all the vectors v in the subspace V, the projection p is the one that is nearest to u, i.e. p is the best approximation of u in V.
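A quick numerical sanity check of this inequality, taking V to be the xy-plane in R3 and using made-up numbers (a sketch, not from the notes):

import numpy as np

u = np.array([3.0, 4.0, 5.0])
p = np.array([3.0, 4.0, 0.0])     # projection of u onto the xy-plane
rng = np.random.default_rng(0)
for _ in range(3):
    v = np.array([rng.normal(), rng.normal(), 0.0])  # a vector in the plane
    print(np.linalg.norm(u - p) <= np.linalg.norm(u - v))   # True every time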

Approximate Solutions of Linear Systems (slide 6)

We have seen how the notion of projection and orthogonality can be applied to the study of best approximation. We shall now relate this to linear systems.

We all know there are two types of linear systems: those that are consistent (having solutions), and those that are inconsistent (having no solutions).

For consistent systems, we have a systematic way to find the exact solutions. However, in real life, many linear systems that arise from mathematical modelling may not have exact solutions. For such inconsistent systems, we will find approximate solutions.

Example: Inconsistent System (slide 7-8)

Suppose there are three physical quantities r, s, t that are related by a quadratic equation:

t = cr² + ds + e (*)

for some constant values c, d, e. We conduct an experiment to determine the exact values of c, d, e.
To do this, we get the measurements for r, s, t from our experiments, substitute these readings into
the quadratic equation (*), and set up a linear system with c, d, e as variables.

Now suppose we repeat the experiment 6 times and obtain 6 sets of values as shown in the table.

The top row indicates the experiment number, the second and third rows are our input
measurements for r and s, while the bottom row gives the corresponding output measurement for t.

As with most experiments, the measurements are bound to contain some experimental errors, which may lead to an inconsistent system.

Since there are 6 sets of readings, we will have 6 equations, and these will form a 6 x 3 linear system as shown in the matrix equation form here.

Note that the three columns of the coefficient matrix A correspond to the coefficients of c, d, and e
respectively, and the constant matrix consists of the measurements of t.

Perform Gaussian elimination on this system to convince yourself that it is inconsistent. In other words, for any value we substitute for the column vector x, Ax is not equal to b, and hence

Ax − b ≠ 0.

We shall see in a while how to find the “best approximate” solution for Ax = b. But for now, let us try to understand what we mean by a “best approximate” solution.

Since Ax – b is not equal to zero, we want to find some value for x so that the difference can be as
close to 0 as possible.

In other words, we will try to find some vector u such that the norm ||Au − b|| is the smallest.

Such a u is called a least squares solution to the system Ax = b.

Least Squares Solutions (slide 9)

A least squares solution of a linear system Ax = b is a vector u in Rn that minimizes ||Ax − b||.

i.e. ||Au − b|| ≤ ||Av − b|| for all v in Rn

In other words, Au gives the best approximation for the vector b ( Au ≈ b ).

From earlier discussion, this best approximation Au of b is given by the projection of b onto a certain
subspace:

Au = p where p is the projection of b onto the column space* of A.

In other words,
a least squares solution u of Ax = b is a solution of Ax = p

* The projection is onto the column space of A because the product Au can be expanded as a linear combination of the columns of A: if A has columns a1, …, am and u = (u1, …, um), then Au = u1a1 + u2a2 + ⋯ + umam, which lies in the column space of A.

Since p = Au, it must belong to the column space of A.


Least Squares Solutions and Projection (slide 10)

A. If we know the projection p of b onto the column space of A, we can find the least squares
solutions u of Ax = b by solving Ax = p.

B. If we know a least squares solution u of Ax = b, we can find the projection p of b onto the
column space of A simply by matrix multiplication Au.

In principle, statement (A) gives us a way to find least squares solutions. The problem is that we may not know the projection p in the first place, so in practice this is not an effective way to find least squares solutions. Instead, we introduce an alternative approach in the next segment.

Finding Least Squares Solutions (slide 11)

• Starting with the system Ax = b (1)


• Form the new linear system AᵀAx = Aᵀb (2)
• Solve the system (2)
• A solution of (2) gives a least squares solution of (1)
System (2) AᵀAx = Aᵀb is always consistent. It can have exactly one solution, or infinitely many solutions.
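In NumPy, this procedure can be sketched as follows (a minimal illustration with hypothetical data; np.linalg.lstsq solves the same least squares problem directly and serves as a cross-check):

import numpy as np

def least_squares(A, b):
    # Form and solve the normal equations A^T A x = A^T b.
    # np.linalg.solve needs A^T A to be invertible (unique solution);
    # if it is singular, there are infinitely many least squares
    # solutions and np.linalg.lstsq should be used instead.
    return np.linalg.solve(A.T @ A, A.T @ b)

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
b = np.array([6.0, 0.0, 0.0])
print(least_squares(A, b))                    # [ 5. -3.]
print(np.linalg.lstsq(A, b, rcond=None)[0])   # [ 5. -3.]  same answer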

Example: Finding Least Squares Solutions (slide 12)
In our earlier experiment example:

Denote this system by Ax = b and recall that the system is inconsistent.

Now we consider the transpose of the coefficient matrix

and pre-multiply both A and b by it to get

and

The new system AᵀAx = Aᵀb is consistent and its solution u can be found (by Gaussian elimination):

This vector u is the least squares solution of Ax = b, giving the best approximate solution.

Example: Least Squares Solutions and Projection (slide 13)

Find the projection of (1,1,1,1) onto the subspace V = span{(1,-1,1,-1), (1,2,0,1), (2,1,1,0)} of R4.

First, we need to set up a linear system and find its least squares solutions.

We use the three vectors spanning V as the columns of a 4 x 3 matrix A, and write the vector (1, 1, 1, 1) in column form as b:

A = [  1  1  2 ]        b = [ 1 ]
    [ -1  2  1 ]            [ 1 ]
    [  1  0  1 ]            [ 1 ]
    [ -1  1  0 ]            [ 1 ]

Note that V is the column space of A.

Now we have a linear system Ax = b. We find its least squares solution by solving AᵀAx = Aᵀb.
The general solution of the new system is given by

x = [ −t + 2/5 ]
    [ −t + 4/5 ]
    [     t    ]
Note that there are infinitely many solutions for this system. All of them are least squares solutions of Ax = b. We just need one particular solution, say t = 0, which gives

u = [ 2/5 ]
    [ 4/5 ]
    [  0  ]

Then

Au = [  1  1  2 ] [ 2/5 ]   [ 6/5 ]
     [ -1  2  1 ] [ 4/5 ] = [ 6/5 ]
     [  1  0  1 ] [  0  ]   [ 2/5 ]
     [ -1  1  0 ]           [ 2/5 ]

will give the required projection onto the subspace V.
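The whole computation can be verified numerically; a minimal sketch follows. (np.linalg.lstsq returns the minimum-norm least squares solution, which corresponds to a different value of t than our choice t = 0, but the projection Au is the same.)

import numpy as np

A = np.array([[ 1, 1, 2],
              [-1, 2, 1],
              [ 1, 0, 1],
              [-1, 1, 0]], dtype=float)
b = np.ones(4)

u = np.linalg.lstsq(A, b, rcond=None)[0]  # one least squares solution
p = A @ u                                 # projection of b onto the column space
print(p)                                  # [1.2 1.2 0.4 0.4] = (6/5, 6/5, 2/5, 2/5)
print(A.T @ (b - p))                      # ~[0 0 0]: b - p is orthogonal to V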

2.8 Vector and Matrices with Function Entries
Up till now, the vectors and matrices that we have been discussing have constant entries. In other
words, the entries are numbers. Instead of constant entries, we can also talk about vectors and
matrices with their entries or components made up of functions. These objects are important as they
will help us deal with differential equations and their solutions, which we will introduce in later
chapters.
Vectors with Function Entries (slide 2)
Let v1(t), v2(t), …, vn(t) be real-valued functions in the variable t. Then we can form the n-vector v(t) with these functions as entries:

We call this a vector function in variable t.


Examples:

Note that the last component of u(t) is regarded as the constant function 2, independent of the variable t.
Substituting a real value for the variable t in a vector function gives us a regular vector. For example:

Domain of Vector Functions (slide 3)


The component functions v1(t), v2(t), …, vn(t) of a vector function v(t) may not be defined for all real numbers. There may be restrictions on what values we can substitute into each individual function. For example, the square root function √t cannot take negative input values.
Let's denote the domains of the component functions by D1, D2, …, Dn respectively. These are the sets of values on which the respective functions are defined.

Since the vector function v(t) puts all the components together, the values we can substitute for t must be those at which every component function is defined:
the domain of v(t) = D1 ∩ D2 ∩ ⋯ ∩ Dn
If the domains of the component functions have empty intersection, then the vector function v(t) is undefined.

Examples: Domain of Vector Functions (slide 4)

All three components are defined for all real numbers.


D1 = D2 = D3 = R ⟹ D1 ∩ D2 ∩ D3 = R
So the domain of v(t) is the set of all real numbers.

The domain of √t is t ≥ 0, the domain of √(1 − t) is t ≤ 1, and the domain of 1/(t + 1) is t < −1 or t > −1 (i.e. t ≠ −1).

So the domain of u(t) is 0 ≤ t ≤ 1.

The domain of √t is t ≥ 0, and the domain of √(−1 − t) is t ≤ −1.


There is no common intersection, so w(t) is undefined.
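These domain restrictions can also be observed numerically. In the sketch below (an illustration, not from the slides), NumPy returns nan when a component function is evaluated outside its domain:

import numpy as np

def u(t):
    # Components: sqrt(t), sqrt(1 - t), 1/(t + 1).
    # u(t) is defined only on the intersection of the domains: 0 <= t <= 1.
    with np.errstate(invalid='ignore'):
        return np.array([np.sqrt(t), np.sqrt(1.0 - t), 1.0 / (t + 1.0)])

print(u(0.5))   # [0.70710678 0.70710678 0.66666667]  -- defined
print(u(2.0))   # second entry is nan: t = 2 lies outside the domain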

Matrices with Function Entries (slide 5)


Let a11(t), a12(t), …, anm(t) be real-valued functions in the variable t.
Then we can form the n x m matrix A(t) with these functions as entries:

Examples

Vector and Matrix Operations with Function Entries (slide 6)


Operations that we perform on regular vectors and matrices can be carried over to vector and matrix functions in a similar way:
Vector/matrix addition, scalar multiplication, matrix multiplication, (vector) dot product, matrix
transpose, determinant etc.
Examples

Since both u(t) and v(t) are 3-vector functions, we can perform vector addition in the usual way, by
adding the corresponding component functions.

 
Scalar multiplication is also carried out in a similar manner. Here the scalar can be a constant number or a real-valued function in the variable t. In the example below, we scalar multiply the vector function v(t) by the "scalar" which is the exponential function eᵗ:

The third example is the multiplication of A(t) with u(t). Since the sizes are compatible, the 
multiplication can be carried out in the usual way: 
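Since the slides' examples are not reproduced in these notes, the SymPy sketch below uses hypothetical vector and matrix functions to illustrate the same three operations (the last entry of u(t) is the constant 2, echoing the earlier example):

import sympy as sp

t = sp.symbols('t')
u = sp.Matrix([t, sp.sin(t), 2])      # hypothetical vector functions
v = sp.Matrix([1, t, t**2])
A = sp.Matrix([[1, 0, t],
               [t, 1, 0]])            # hypothetical 2 x 3 matrix function

print(u + v)           # componentwise vector addition
print(sp.exp(t) * v)   # scalar multiplication by the function e^t
print(A * u)           # matrix-vector product: a 2-vector function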

Addition and Multiplication of Functions (slide 7) 
One thing to take note of when carrying out these operations is to ensure the "compatibility" of the domains of the functions involved. In other words, the functions that we are adding or multiplying must have some common intersection in their domains.
Example 
 
 
Though we can conveniently add the component functions, we need to pay attention to whether the resulting function is defined on any domain.
As we have seen before, the domain of v(t) is R, and the domain of u(t) is 0 ≤ t ≤ 1.
So the domain of v(t) + u(t) is 0 ≤ t ≤ 1.

Derivatives of Vectors and Matrices (slide 8) 
We can differentiate a vector or matrix function if it is differentiable.
More precisely, given a vector function v(t) with component functions v1(t), …, vn(t), if every component function is differentiable, then we say the entire vector function is differentiable too.

 
When this condition is met, we can differentiate v(t) w.r.t. t to get v’(t), which is another vector 
function. The component functions of v’(t) are the corresponding derivatives of the component 
functions of v(t). 

 
Naturally, we call v’(t) the derivative of v(t). 
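A small SymPy sketch (with hypothetical component functions) showing that differentiation is carried out componentwise:

import sympy as sp

t = sp.symbols('t')
v = sp.Matrix([t**2, sp.sin(t), sp.exp(t)])  # hypothetical vector function
print(v.diff(t))   # Matrix([[2*t], [cos(t)], [exp(t)]])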
We can define derivative for matrix function in a similar way. 

Differentiation Rules (slide 9) 
The usual differentiation rules also apply to matrix functions. 
Let A(t) and B(t) be differentiable matrices, f(t) a differentiable function, and c a constant. 

 
1. (A(t) + B(t))' = A'(t) + B'(t) (addition rule)
2. (cA(t))' = cA'(t) (scalar multiplication rule with constant scalar)
3. (f(t)A(t))' = f(t)A'(t) + f'(t)A(t) (scalar multiplication rule with function scalar)
4. (A(t)B(t))' = A(t)B'(t) + A'(t)B(t) (product rule)
The first three rules also apply to vector functions.

Example: Matrix Differentiation (slide 10)

We end this chapter with an example to illustrate the product rule on two matrix functions.
Let A(t) and B(t) be the following 2 x 2 matrix functions,

Their derivatives are given by:

We compute the derivative of the product A(t)B(t) using product rule:


(A(t)B(t))' = A(t)B'(t) + A'(t)B(t)

The same result can be obtained by multiplying A(t) and B(t) first, then differentiating.
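Since the slides' matrices are not reproduced here, the following SymPy sketch verifies the product rule on a pair of hypothetical 2 x 2 matrix functions:

import sympy as sp

t = sp.symbols('t')
A = sp.Matrix([[t, 1], [t**2, sp.exp(t)]])   # hypothetical A(t)
B = sp.Matrix([[sp.sin(t), t], [1, t**3]])   # hypothetical B(t)

lhs = (A * B).diff(t)                 # differentiate the product directly
rhs = A * B.diff(t) + A.diff(t) * B   # product rule: (AB)' = AB' + A'B
print(sp.simplify(lhs - rhs))         # zero matrix: the two sides agree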

