Quantum Information (WiSe 22/23) Last updated: 2022-10-18
Chapter 0: Linear Algebra for Quantum Information
Instructor: Dr. Christian Schilling Scriber: Ignacio Cirac and Sirui Lu
1 Introduction: motivation and historical overview
Last century, there were many scientific and technological revolutions but there were two that
changed very much not only our society but also the way we look at nature, the way that we
do science, and so on. One of them was the discovery of quantum physics. The second one was
the discovery of information theory. Quantum Information Theory combines the two revolutions
and analyzes how to process, transmit, and obtain information by harnessing the laws of quantum
physics.
Quantum information is a highly interdisciplinary field combining physics, math, computer sci-
ence, and electrical engineering. This course offers an introduction to the theoretical foundations
of Quantum Science and Technology. We start with a brief motivation and an introduction to
fundamental concepts and the basic formalism of quantum information (pure/mixed states, evo-
lution, completely positive maps, measurements, Schmidt decomposition, tomography, quantum
estimation, hypothesis testing). Then the concept of entanglement is discussed in detail, including
the distinction between pure and mixed-state entanglement, entanglement entropy, quantification
and conversion. Subsequently, some of the revolutionary promises of exploiting entanglement
are presented, including dense coding, quantum teleportation and quantum cryptography. Next
the Bell inequalities, characterizing the quantum weirdness of entanglement and non-locality, are
introduced and discussed in detail. Subsequent chapters cover central applications of quantum
information theory: quantum computation, quantum algorithms such as those of Deutsch, Shor
and Grover, and quantum simulation. Final core topics are decoherence, Lindbladian descriptions
thereof, and error correction schemes to counteract the consequences of decoherence and protect
fragile quantum information. The module will typically also include one or more optional top-
ics, such as many-body entanglement, topological quantum computation, quantum complexity, or
tensor networks, which link quantum information theory to many-body physics.
Quantum information theory relies heavily on linear algebra in finite-dimensional spaces, a topic
that we will introduce in this chapter. Here, we will quickly review key concepts of linear algebra
that will be relevant for quantum information. We expect that the readers already have some basic
knowledge of linear algebra and quantum mechanics. The review is not meant to be mathematically
rigorous and complete. For references that include more details and proofs, please consult standard
books on linear algebra and textbooks on quantum information such as Nielsen and Chuang (2011).
2 Vectors
2.1 Linear space
We consider (complex) linear spaces L = Cd whose elements are called vectors. They can be
written in terms of their components: v = v1 , . . . , vd , w = w1 , . . . , wd where vi , wi ∈ C are
complex numbers, and d is called the dimension of the linear space. We will write them as column
vectors, e.g.
1 1−i
v = 1 , w = −1 . (2.1)
1+i −1
The vector space inherits many nice properties from linear space. In particular, a linear combina-
tion of vectors in the vector space L also lie in L, i.e., we can add vectors, v + w, multiply them
by scalars, λv, or both λ1 v + λ2 w, where λi ∈ C.
Pd
Linear independence A set of vectors {vi }di=1 is linearly independent if i=1 αi · vi = 0 implies
all αi = 0. Equivalently, if none of them can be written as a linear combination of the rest.
1
Chapter 0 Quantum Information (WiSe 22/23)
Basis A set of vectors {vi }di=1 is a basis for Cd if they are linearly
Pnindependent and for any vector
w ∈ Cd , it can be written as a linear combination of them w = i=1 αi · vi . The dimension of the
linear space, d, coincides with the number of elements of any basis.
2.2 Hilbert space
In the case of finite dimensions, a Hilbert space is a linear space equipped with an inner product
operation. We write H = Cd for a d-dimensional one.
Inner product The inner product (v, w) of two vectors v, w ∈ Cd is defined as
d
X
(v, w) = v i wi ∈ C. (2.2)
i=1
We also use v for conjugate. The inner product has a conjugate symmetry (v, w) = (w, v).
Adjoint This operation transforms column vectors to row vectors and vice versa, and conjugates
their components. We write v † as the adjoint of the vector v. In vector form, we write
v1
.. †
v = . , v = v1 , · · · , vd . (2.3)
vd
With this notation, we can write the inner product as
w1
(v, w) = v † w = v 1 , · · · , v d ... (2.4)
wd
where the product of a row and a column vector follows from (2.2).
Bra–ket/Dirac notation For convenience, quantum physicists often denote vectors v as |vi and
their adjoint v † as hv|. With this notation, the inner product is written as (v, w) = hv| wi. From
now on, we will use the vector notation and the Dirac notation interchangeably.
Norm The Euclidean norm of a vector v ∈ Cd is defined as
v
u d
uX
kvk2 = kvk = t |ci |2 = hv| vi. (2.5)
p
i=1
We say a vector is normalized if kvk=1.
Distance Whenever we have a norm, we can also define a distance between two vectors
d(v, w) = kv − wk. (2.6)
Topology A distance defines a topology and thus we can define limits. Consider a Hilbert
space H and an infinite sequence v1 , v2 , . . . , vn . This sequence is said to converge to v ∈ H if
limn→∞ kvn − vk = 0.
Orthogonality Vectors v and w are orthogonal if their inner product is zero
v ⊥ w ↔ (v, w) = 0. (2.7)
2
Chapter 0 Quantum Information (WiSe 22/23)
Orthonormal basis (ONB) A basis of vectors {vi }di=1 is an ONB if (vi , vj ) = δi,j , i.e. they are
mutually orthogonal and normalized.
Example 2.1. For a linear space of dimension d = 3, two different ONB are given by
1 0 0 1 1 0
0 , 1 , 0 1 1
√ 1 , √ −1 , 0 . (2.8)
2 2
0 0 1 0 0 1
Subspace A subspace S of H is a linear subspace of H if ∀ v1 , v2 ∈ S and λ1 , λ2 ∈ C, λ1 v1 +λ2 v2 ∈
S. That is, linear superpositions do not take us out of the subspace.
3 Matrices
Matrices are representations of linear maps on a linear space in a given basis. We call Mn×m (C),
the set of n × m matrices with complex entries, where n is the number of rows, and m the number
of columns. We write
M1,1 M1,2 · · · M1,m
.. ..
. .
M2,1
M = . .. (3.1)
..
.. . .
Mn,1 Mn,m
with Mi,j ∈ C being the matrix element corresponding to the i-th row and j-th column. Depending
on n and m, we have different kinds of matrices: (i) m = 1: column vectors; (ii) n = 1: row vectors;
(iii) m = n: square matrices.
Special square matrices The identity matrix 1 is the one which has zero entries except in the
diagonal, which are ones. A zero matrix is a matrix all of whose entries are zero.
Linear space The set of linear maps acting on a Hilbert space, H, is denoted L(H). Matrices
can be viewed as the operators written in an ONB. If we denote by {|ii}di=1 ∈ Cd an orthonormal
basis, and M ∈ L(H), we can write the matrix with elements Mi,j = hi| M |ji. For the moment,
we just talk about matrices, but one has to keep in mind for the lecture that they are associated to
linear operators. The set L(H) as well as Mn×m (C) span themselves a linear space, i.e., addition
and scalar multiplication does not take us out of the set L(H), λM , M1 + M2 ∈ Mn×m (C).
3.1 Operations with matrices
Multiplication Matrices can be multiplied: If A is a n × m matrix, and B a m × q matrix, then
M = AB is a n × q matrix with elements
m
X
Mi,j = Ai,k Bk,j . (3.2)
k=1
For any matrix A ∈ Mn×m (C), we have A1 = 1A = A and 0A = A0 = 0.
We can use the Dirac notation as well. Let us take |ii ∈ Cn and |ji ∈ Cm and X = |iihj|. This
can be interpreted by using the multiplication of the n × 1 and 1 × m matrices corresponding to
the ket and the bra. In this way, X isPan nP × m matrix with zeros everywhere except Xi,j = 1.
n m
With that, we can write (3.1) as M = i=1 i=1 Mi,j |ii hj|.
We can multiply matrices and vectors, if q = 1, so we can write, for instance, M v = w or,
using Dirac’s notation, M |vi = |wi. Multiplication is a linear operation, meaning that M (λ1 |v1 i +
λ2 |v2 i) = λ1 M |v1 i + λ2 M |v2 i).
Example 3.1. (Completeness relation) It is clear that for any ONB, {vi }di=1 ,
d d
vi vi† =
X X
|vi ihvi | = 1 (3.3)
i=1 i=1
Pd
This can be seen by applying the left hand side to any vector w = i=1 ci vi and check that it leaves
it invariant.
3
Chapter 0 Quantum Information (WiSe 22/23)
Inverse The inverse of a matrix M , if it exists, is called M −1 , and it fulfills M −1 M = 1. If the
inverse does not exist, the matrix M is called singular. For instance, the inverse of the identity
matrix is itself, and the zero matrix is singular.
Adjoint Given a n × m matrix, M , its adjoint M † is the one that fulfills (w, M v) = (M † w, v), or
in the Dirac notation hw| |M |vi = hv| M † |wi, for all v ∈ Cm and w ∈ Cn . It is easy to check that
the adjoint operation is the one in which the elements are transposed and complex conjugated.
That is, (M † )i,j = M j,i .
Taking the adjoint of linear combination of matrices fulfills (λ1 A1 + λ2 A2 )† = λ1 A†1 + λ2 A†2 . The
product of matrices also has a very simple rule, (AB)† = B † A† .
Example 3.2. Given |vi , |wi ∈ Cd , (|vi hw|)† = |wi hv| are d × d matrices.
Commutator The commutator between two matrices A and B is defined to be [A, B] = AB −BA.
If [A, B] = 0, then we say A, B commute. Similarly, the anti-commutator between two matrices A
and B is defined to be {A, B} = AB + BA. If it vanishes, we say that they anticommute.
3.2 Normal matrices
We consider now a class of matrices that play an important role in Quantum Information Theory.
They are square, i.e., d × d matrices.
Normal matrix A matrix N is normal if it commutes with its adjoint [N, N † ] = 0. In this course,
we will be mostly interested in the following subclasses of normal matrices: Hermitian, unitary,
and projectors.
Hermitian matrix A matrix H is Hermitian (equivalently, self-adjoint) if H = H † .
1
i
Example 3.3. Check that the matrix is Hermitian.
−i −1
Unitary matrix A matrix U is unitary if U −1 = U † . Equivalently, U U † = U † U = 1. Applying
an unitary matrix on to a vector leaves the norm invariant, kU vk = kvk.
cos θ sin θ
Example 3.4. Check that the matrix is unitary.
− sin θ cos θ
Projector A matrix P is a projector if P 2 = P . If, additionally, P = P † , then it is called an
orthogonal projector. The set of vectors fulfilling P v = v spans a subspace, S. One says that P
projects a vector onto that subspace: for any w, P w ∈ S.
1 1 1
Example 3.5. Check that 2 is an orthogonal projector and S has dimension 1 with ONB
1 1
1
given by the single vector .
1
3.3 Spectral decomposition of normal matrices
The properties reviewed here are for normal matrices. From now on we will use Dirac’s notation.
Eigenvalues and eigenvectors If M |vi = λ |vi we say that λ is an eigenvalue of M and |vi is
its corresponding eigenvector. The set of eigenvalues is called spectrum and written as σ(M ). For
normal matrices, one can always choose d eigenvectors that form an orthonormal basis. That is,
we can always find an ONB {|vi i}di=1 , with M |vi i = λi |vi i with λi ∈ σ(M ).
Example 3.6. The eigenvalues of the three subclasses of normal matrices we have seen are special.
1. Hermitian: all eigenvalues λi ∈ R.
2. Unitary: all eigenvalues λi = eiφi for φi ∈ [0, 2π].
3. Projector: all eigenvalues λi ∈ {0, 1}.
4
Chapter 0 Quantum Information (WiSe 22/23)
Spectral decomposition For a normal matrix, M , we can always write
n
X
M= λi |vi i hvi | (3.4)
i=1
which is called the spectral decomposition of M .
Degeneracy We say that an eigenvalue λi is degenerate if it has more than one mutually or-
thogonal eigenvectors. The degeneracy of λi , gi , P is the maximal number of such eigenvectors, ie
M |vi,α i = λi |vi,α i, with hvi,α |vi,β i = δα,β . Thus, i gi = d. We can define
gi
X
Pλi = |vi,α i hvi,α | . (3.5)
α=1
It is clear that Pλi are orthogonal projectors. The subspace Si they project on is called eigenspace
corresponding to the eigenvalue λi and they have dimension gi . For any |vi ∈ Si , M |vi = λi |vi.
We can write the spectral decomposition as
X
M= λPλ . (3.6)
λ∈σ(M )
Example 3.7. Given any |wi, and the eigenvalues and eigenvectors of a matrix, M , we can write
Pd
|wi = i=1 ci |vi i where ci = hvi |wi. Therefore
d
X d
X
M |wi = λi |vi i hvi | wi = λi ci |vi i . (3.7)
i=1 i=1
Diagonalzation Given a normal matrix M , we can always write M = U DU † , where U is a unitary
matrix and D is a diagonal matrix. Indeed, using (3.4), we can write Di,i = λi and Ui,j = hi| vj i,
where |ii is the vector with zero entries except for a one at the i-th position. It is clear that U
defined in this way is unitary, since
d
X X
Ui,j U k,j = hi|vj ihvj |ki = δi,k (3.8)
j=1 j=1
and thus U U † = 1.
Example 3.8 (Spectral decomposition of a degenerate matrix). For the matrix M given by
2 1 0
M = 1 2 0 , (3.9)
0 0 1
the eigenvalues and corresponding eigenvectors are:
0 1
1
λ1 = 1, |v1,1 i = 0 , |v1,2 i = √ −1 ;
1 2 0
(3.10)
1
1
λ2 = 3, |v2,1 i = √ 1 .
2 0
The eigenvectors |v1,1 i and |v1,2 i are said to be degenerate because they are orthogonal eigenvectors
of M with the same eigenvalue λ1 = 1. We can henceforce write down the spectral decomposition
of M in terms of projectors onto each eigenspaces
M = |v1,1 i hv1,2 | + |v1,2 i hv1,2 | + 3 |v2,1 i hv2,1 | = Pλ1 + 3Pλ2 . (3.11)
5
Chapter 0 Quantum Information (WiSe 22/23)
Simultaneous diagonalization If [A, B] = 0, then we can always choose the same set of eigenvec-
tors for A and B and thus write
X X
A= ai |vi i hvi | , B = bi |vi i hvi | (3.12)
i i
where ai , bi are the eigenvalues of A and B.
Positive semidefinite matrices A matrix A is positive if it is Hermitian and all eigenvalues are
positive. Equivalently, if for all |wi, hw|A|wi > 0. We write A > 0. A matrix is called positive
semidefinite if the eigenvalues are non-negative. We write A ≥ 0. If we write A ≥ B we mean
A − B ≥ 0.
Example 3.9. An orthogonal projector is positive semidefinite. Indeed, it is Hermitian and the
eigenvalues are 0 and 1 and thus non-negative.
Rank The rank of a normal matrix is a number that counts how many eigenvalues are different
than zero. More precisely, r = d − g0 , where g0 is the degeneracy of the zero eigenvalue. Equiva-
lently, it is the sum of the dimension of all eigenspaces corresponding to non-zero eigenvalues. For
instance, for unitary operators, r = d. One says that they have full rank. For general matrices
(not necessarily normal), the rank is the dimension of their image space.
Example 3.10. (Rank-1 projectors) Given |vi , |wi ∈ Cd , normalized, |vi hw| is a projector and
has rank one. Note that it is not an orthogonal projector in general. |vi hv| is rank-one orthogonal
projector. Check for yourself that a rank-one Hermitian matrix has one unit eigenvalue and the
rest of the eigenvalues are zero and thus can always be written as |vi hv|.
Trace The trace of a matrix M is defined as the sum of the diagonal elements
d
X
tr(M ) = Mii . (3.13)
i=1
Pn
For any ONB, {|vi i}di=1 , we have tr(M ) = i=1 hvi | M |vi i. Thus, the trace is independent of the
ONB we take. This can be easily shown by using the completeness relation (example 3.1). Some
other useful properties of the trace are:
• Circular property: tr(AB) = tr(BA), tr(ABCD) = tr(DABC), and tr(|v1 i hv2 | A)P= hv2 | A |v1 i.
n
• Trace is equal to the sum of eigenvalues weighted by the degeneracies: tr(M ) = i=1 λi · gi ,
where we have assumed that we have n different eigenvalues λi with degeneracies gi .
Determinant We will not give here the definition, but just list several useful properties:
• det(AB) = det(A) · det(B) = det(BA).
• det(A) 6= Q
0 if and only if (iff) the matrix is not singular, i.e., there exists A−1 .
• det(A) = i λgi i , i.e., the determinant is the product of the eigenvalues, counted with multi-
plicity.
3.4 Functions of normal matrices
Every function of the form f : C → C may be extended to the set of normal matrices as follows
d
X
f (M ) = f (λi ) |vi i hvi | . (3.14)
i=1
If the function is analytical in a region including all eigenvalues, then you can Taylor expand
∞
X f (n) (0) n
f (M ) = M . (3.15)
n=0
n!
Example 3.11. Given a Hermitian matrix, H, we can define U = eiH fulfilling eiH = cos(H) +
i sin(H). This matrix is unitary and U † =
√e
−iH
. If we have a positive matrix, Q > 0, we can
define log(Q) and if Q ≥ 0, we can define Q.
6
Chapter 0 Quantum Information (WiSe 22/23)
3.5 Hilbert space of matrices
The linear space of matrices can be endowed with an inner product so that it acquires the structure
of a Hilbert space.
Hilbert-Schmidt inner product Given two matrices, A and B, we define the inner product
hA, Bi = tr(A† B). We can thus also define norms of matrices, distances, and topology. Thus,
we can compute limits of sequences of matrices.
Norm There are other norms that are useful for pthe set of matrices. The 2-norm of a matrix is
defined in terms
√ of the inner product as kM k 2 = tr(M † M ). The 1-norm or trace norm is defined
as kM k1 = tr M M . Note that this is well defined since M † M ≥ 0. For Hermitian matrices,
†
Pd
H, we have kM k1 = i=1 |λi |. Finally, we can define the operator norm, which for Hermitian
matrices is the eigenvalue with maximal magnitude, i.e. kHk =maxi |λi |.
Singular value decompositions For any matrix, M (not necessarily normal), there always exist
unitary matrix U , V , and diagonal matrix D with non-negative entries, such that M = U DV . The
diagonal elements of D are called the singular values of M .
4 Classical and Quantum Entropy
For classical probability distributions, Shannon introduced a measure of information called entropy.
This can also be defined for positive semidefinite matrices with unit trace.
4.1 Shannon entropy
Pd
Consider a probability distribution, P : {1, . . . , d} → R, fulfilling P (b) ≥ 0 and b=1 P (b) = 1.
The Shannon entropy associated to that probability distribution is defined by
d
X
H(P ) = − P (b) log2 P (b) (4.1)
b=1
with the convention that 0 log2 (0) = 0. One can think of H as a measure of surprise. It has the
following properties:
1. The Shannon entropy is non-negative H(p) ≥ 0. H(p) = 0 if and only if P (b) = 0 for all
entries but one, for which it is 1. That is, for the probability distribution where there is
perfect certainty.
2. The Shannon entropy is upper-bounded H(p) ≤ log2 (d). The equality is attained only for a
flat distributions, i.e. P (b) = 1/d for all b.
3. The Shannon entropy is concave: for any p ∈ [0, 1] and two probability distributions, P1 and
P2 ,
H(p · P1 + (1 − p)P2 ) ≥ pH(P1 ) + (1 − p)H(P2 ). (4.2)
4.2 Von Neumann entropy
Now we introduce the von Neumann entropy of operators ρ fulfilling ρ ≥ 0 and tr(ρ) = 1. In
fact, this immediately implies that if we take any ONB, {|ii}di=1 , the values qi = hi|ρ|ii form a
probability distribution. For the distribution qi , we could compute the Shannon entropy H(q).
The von Neumann entropy is such
P a quantity if we take the eigenbasis of ρ.
Given a density matrix ρ = λi |vi i hvi |, the von Neumann entropy of ρ is defined as
n
X
S(ρ) = − tr(ρ log ρ) = −λi log λi = H({λi }). (4.3)
i=1
Comparing and contrasting with the Shannon entropy, we mention a couple of properties of von
Neumann entropy.
7
Chapter 0 Quantum Information (WiSe 22/23)
1. The von Neumann entropy S(ρ) is non-negative, because of the connection to the Shannon
entropy (Eq. (4.3)). The entropy is zero if and only if the matrix ρ is a rank-1 orthogonal
projector (all eigenvalues are 0 except for one).
2. The maximal value of the von Neumann entropy S(ρ) is equal to the logarithm of your
Hilbert space dimension d. The maximal value log d is attained when ρ = 1/d.
3. The von Neumann entropy S(ρ) is concave,
X X
S( pi ρi ) ≥ pi S(ρi ). (4.4)
i i
Later in the course, we’ll discuss entropy in more detail. It is okay if you don’t understand them
right now.
♣♣♣
Acknowledgement.— We thank Bennet Windt, Jordi Arnau Montana-Lopez, and Adrian O. Paulus
for providing their handwritten notes. We thank Franz Silva-Tarouca, Adrian O. Paulus, and Jan
Geiger for proofreading.
These lecture notes were kindly provided by Ignacio Cirac and Sirui Lu.
References
M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information, 10th ed.
(Cambridge University Press, Cambridge, 2011).