KEMBAR78
En Enumeration Notes | PDF | Set (Mathematics) | Function (Mathematics)
0% found this document useful (0 votes)
246 views115 pages

En Enumeration Notes

This document is the introduction to a set of course notes on combinatorics. It outlines the topics that will be covered in the notes, including: 1) The basic principles of enumeration like choices, permutations, subsets and generating functions. Examples of applications are also provided. 2) Generating series as an algebraic approach to enumeration problems. The binomial theorem and series are introduced. 3) Enumeration of binary strings using regular expressions and recurrence relations. 4) Homogeneous and inhomogeneous linear recurrence relations, and their application to sequences arising in enumeration. Quadratic recurrence relations are also briefly discussed. The notes are a work in progress, with some additional topics planned to be added over

Uploaded by

qixin xiao
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
246 views115 pages

En Enumeration Notes

This document is the introduction to a set of course notes on combinatorics. It outlines the topics that will be covered in the notes, including: 1) The basic principles of enumeration like choices, permutations, subsets and generating functions. Examples of applications are also provided. 2) Generating series as an algebraic approach to enumeration problems. The binomial theorem and series are introduced. 3) Enumeration of binary strings using regular expressions and recurrence relations. 4) Homogeneous and inhomogeneous linear recurrence relations, and their application to sequences arising in enumeration. Quadratic recurrence relations are also briefly discussed. The notes are a work in progress, with some additional topics planned to be added over

Uploaded by

qixin xiao
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 115

MATH 239/249

Introduction to Combinatorics
Hello World, and Thanks!

The following parts of these notes are still UNDER CONSTRUCTION.


(Estimated completion date: Fall 2021.) Most of these are additional topics
not covered during the semester. The old notes are also available on the
course website.
• Proof of Theorem 4.18 (inhomogeneous linear recurrence relations)
• Section 8.4 (planar duality)
• Part III: Chapters 12, 13, and 14 will appear in three installments over
the course of the next year.

c 2020 David G. Wagner


Department of Combinatorics and Optimization
Faculty of Mathematics
University of Waterloo
Contents

I Introduction to Enumeration 11

1 Basic Principles of Enumeration. 15


1.1 The Essential Ideas. . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.1.1 Choices – “AND” versus “OR”. . . . . . . . . . . . . . . 15
1.1.2 Lists, permutations, and subsets. . . . . . . . . . . . . . 17
1.1.3 Think of what the numbers mean. . . . . . . . . . . . . 20
1.1.4 Multisets. . . . . . . . . . . . . . . . . . . . . . . . . . . 22
1.1.5 Bijective proofs. . . . . . . . . . . . . . . . . . . . . . . . 23
1.1.6 Inclusion/Exclusion. . . . . . . . . . . . . . . . . . . . . 26
1.1.7 Combinatorial probabilities. . . . . . . . . . . . . . . . . 29
1.2 Examples and Applications. . . . . . . . . . . . . . . . . . . . . 30
1.2.1 The Vandermonde convolution formula. . . . . . . . . 30
1.2.2 Common birthdays. . . . . . . . . . . . . . . . . . . . . 31
1.2.3 An example with multisets. . . . . . . . . . . . . . . . . 33
1.2.4 Poker hands. . . . . . . . . . . . . . . . . . . . . . . . . 36
1.2.5 Derangements. . . . . . . . . . . . . . . . . . . . . . . . 38
1.3 Exercises. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

2 The Idea of Generating Series. 47


2.1 The Binomial Theorem and Binomial Series. . . . . . . . . . . . 48

3
2.2 The Theory in General. . . . . . . . . . . . . . . . . . . . . . . . 51
2.2.1 Generating series. . . . . . . . . . . . . . . . . . . . . . . 52
2.2.2 The Sum, Product, and String Lemmas. . . . . . . . . . 54
2.3 Compositions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
2.4 Subsets with Restrictions. . . . . . . . . . . . . . . . . . . . . . 62
2.5 Proof of Inclusion/Exclusion. . . . . . . . . . . . . . . . . . . . 65
2.6 Exercises. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67

3 Binary Strings. 73
3.1 Regular Expressions and Rational Languages. . . . . . . . . . 74
3.2 Unambiguous Expressions. . . . . . . . . . . . . . . . . . . . . 77
3.2.1 Translation into generating series. . . . . . . . . . . . . 78
3.2.2 Block decompositions. . . . . . . . . . . . . . . . . . . . 79
3.2.3 Prefix decompositions. . . . . . . . . . . . . . . . . . . . 82
3.3 Recursive Decompositions. . . . . . . . . . . . . . . . . . . . . 83
3.3.1 Excluded substrings. . . . . . . . . . . . . . . . . . . . . 84
3.4 Exercises. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87

4 Recurrence Relations. 93
4.1 Fibonacci Numbers. . . . . . . . . . . . . . . . . . . . . . . . . . 93
4.2 Homogeneous Linear Recurrence Relations. . . . . . . . . . . . 96
4.3 Partial Fractions. . . . . . . . . . . . . . . . . . . . . . . . . . . 101
4.3.1 The Main Theorem. . . . . . . . . . . . . . . . . . . . . . 105
4.3.2 Inhomogeneous Linear Recurrence Relations. . . . . . 107
4.4 Quadratic Recurrence Relations. . . . . . . . . . . . . . . . . . 110
4.4.1 The general binomial series. . . . . . . . . . . . . . . . . 111
4.4.2 Catalan numbers. . . . . . . . . . . . . . . . . . . . . . . 112
4.5 Exercises. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
Preliminaries.

MATH 239 is an introduction to two of the main areas in combinatorics –


enumeration and graph theory. MATH 249 is an advanced version of MATH
239 intended for very strong students. These courses are designed for stu-
dents in the second year of an undergraduate program in mathematics or
computer science.
The prerequisites required from first-year mathematics are as follows.
• From MATH 135. Abstract algebra I: sets and propositional logic, proofs,
mathematical induction, modular arithmetic, complex numbers, the
Fundamental Theorem of Algebra.
• From MATH 136. Linear algebra I: systems of linear equations, Gaus-
sian elimination, matrix algebra, vector spaces.
• From MATH 137. Calculus I: algebra with power series, open/closed
sets, continuous functions, differentiation (but not integration).
We use the following standard notation for various number systems.

natural numbers N = {0, 1, 2, 3, ...} including zero 0


integers Z = {..., 2, 1, 0, 1, 2, ...}
rational numbers Q
real numbers R
complex numbers C
integers (modulo n) Zn = {[0], [1], [2], ..., [n 1]}
finite field of prime size Fp = Z p

The cardinality (size) of a set S is denoted by |S|.


For convenience, we use LHS and RHS as shorthand for “left-hand side”
and “right-hand side”, respectively.

9
Part I

Introduction to Enumeration

11
Overview.

Suppose I pay $5 for a lottery ticket – what is the chance that I win a share
of the top prize? It depends on the details, of course. There are a certain
number of ways to win, and a certain number of ways to lose. Enumeration
is the art and science of figuring out this kind of thing. This is the subject of
the first part of these course notes.
There are two broad principles of the subject. The combinatorial ap-
proach is to construct explicit one-to-one correspondences between sets to
show that they have the same size. The algebraic approach is to translate the
information about the problem from combinatorics into algebra, and then to
use algebraic techniques to determine the sizes of the sets. We will see many
examples of both approaches.
In Chapter 1 we begin by introducing the basic building blocks of the the-
ory: subsets, lists and permutations, multisets, binomial coefficients, and so
on. In Section 1.2 the use of these objects is illustrated by analyzing various
applications and examples.
In Chapter 2 we introduce the idea of generating series. This begins with
the Binomial Theorem and Binomial Series, which are of fundamental im-
portance for later results. The general theory of generating series is devel-
oped in Section 2.2, and its use is illustrated by analyzing “compositions”
in Section 2.3.
In Chapter 3 we consider the enumeration of various sets of binary strings,
namely those which can be described by regular expressions – the “rational
languages”. This provides an interesting and varied class of examples to
which the results of Chapters 2 and 4 apply.
In Chapter 4 we consider sequences which satisfy a homogeneous linear
recurrence relation with initial conditions, the sequences arising in Chapters

13
2 and 3 being examples. This technique allows us to calculate the numbers
which answer the various counting problems we have been considering.
By using Partial Fractions we can derive an even better solution to such
problems, although the calculations involved are also more complicated.
(We include a proof of Partial Fractions for completeness.) In Section 4.4 we
briefly discuss recurrence relations which are quadratic rather than linear.
Two additional topics are discussed in Chapters 11 and 12.
Chapter 1

Basic Principles of Enumeration.

1.1 The Essential Ideas.

1.1.1 Choices – “AND” versus “OR”.

In the next few pages we will often be constructing an object of some kind
by repeatedly making a sequence of choices. In order to count the total
number of objects we could construct we must know how many choices are
available at each step, but we must know more: we also need to know how to
combine these numbers correctly. A generally good guideline is to look for
the words “AND” and “OR” in the description of the sequence of choices
available. Here are a few simple examples.

Example 1.1. On a table before you are 7 apples, 8 oranges, and 5 ba-
nanas.
• Choose an apple and a banana.
There are 7 choices for an apple AND 5 choices for a banana: 7⇥5 =
35 choices in all.
• Choose an apple or an orange.
There are 7 choices for an apple OR 8 choices for an orange: 7 + 8 =
15 choices in all.
• Choose an apple and either an orange or a banana.
There are 7 ⇥ (8 + 5) = 91 possible choices.

15
16 Basic Principles. Section 1.1

• Choose either an apple and an orange, or a banana.


There are (7 ⇥ 8) + 5 = 61 possible choices.
Generally, “AND” corresponds to multiplication and “OR” corresponds
to addition. The last two of the above examples show that it is important
to determine exactly how the words “AND” and “OR” combine in the de-
scription of the problem.
From a mathematical point of view, “AND” corresponds to the Cartesian
product of sets. If you choose one element of the set A AND you choose one
element of the set B, then this is equivalent to choosing one element of the
Cartesian product of A and B:

A ⇥ B = {(a, b) : a 2 A and b 2 B},

which is the set of all ordered pairs of elements (a, b) with a 2 A and b 2 B.
In general, the cardinalities of these sets are related by the formula

|A ⇥ B| = |A| · |B|.

Similarly, from a mathematical point of view, “OR” corresponds to the


union of sets. If you choose one element of the set A OR you choose one
element of the set B, then this is equivalent to choosing one element of the
union of A and B:

A [ B = {c : c 2 A or c 2 B},

which is the set of all elements c which are either in A or in B.


It is not always true that |A [ B| = |A| + |B|, because any elements in both
A and B would be counted twice by |A| + |B|. The intersection of A and B
is the set
A \ B = {c : c 2 A and c 2 B},
which is the set of all elements c which are both in A and in B. What is
generally true is that

|A [ B| = |A| + |B| |A \ B|.

(This is the first instance of the Principle of Inclusion/Exclusion, which will


be discussed in general in Subsection 1.1.6.) In particular, if A \ B = ?
then |A [ B| = |A| + |B|. Thus, in order to interpret “OR” as addition, it
Section 1.1 Basic Principles. 17

is important to check that the sets of choices A and B have no elements in


common. Such a union of sets A and B for which A \ B = ? is called a
disjoint union of sets.
When solving enumeration problems it is usually very useful to describe
a choice sequence for constructing the set of objects of interest, paying close
attention to the words “AND” and “OR”.

1.1.2 Lists, permutations, and subsets.

A list of a set S is a list of the elements of S exactly once each, in some


order. For example, the lists of the set {1, a, X, g} are:
1aXg a1Xg X1ag g1aX
1agX a1gX X1ga g1Xa
1Xag aX1g Xa1g ga1X
1Xga aXg1 Xag1 gaX1
1gaX ag1X Xg1a gX1a
1gXa agX1 Xga1 gXa1
A permutation is a list of the set {1, 2, ..., n} for some n 2 N. A per-
mutation : a1 a2 ...an can be interpreted as a function : {1, 2, ..., n} !
{1, 2, ..., n} by putting (i) = ai for all 1  i  n.
To construct a list of S we can choose any element v of S to be the first
element in the list and follow this with any list of the set S r {v}. That is
how the table above is arranged – each of the four columns corresponds
to one choice of an element of {1, a, X, g} to be the first element of the list.
Within each column, all the lists of the remaining elements appear after the
first element.
Let pn denote the number of lists of an n-element set S. The first sentence
of the previous paragraph is translated into the equation

pn = n · pn 1 ,

provided that n is positive. (In this equation there are n choices for the first
element v of the list, AND pn 1 choices for the list of S r {v} which follows
it.) It is important to note here that each list of S will be produced exactly
once by this construction.
18 Basic Principles. Section 1.1

Since it is easy to see that p1 = 1 (and p2 = 2), a simple proof by induction


on n shows the following:

Theorem 1.2. For every n 1, the number of lists of an n-element set S is

n(n 1)(n 2) · · · 3 · 2 · 1.

In particular, taking S = {1, 2, ..., n}, this is the number of permutations of


size n. The term n factorial is used for the number n(n 1) · · · 3 · 2 · 1, and it
is denoted by n! for convenience. We also define 0! to be the number of lists
of the 0-element (empty) set ?. Since we want the equation pn = n · pn 1 to
hold when n = 1, and since p1 = 1! = 1, we conclude that 0! = p0 = 1 as
well.
A subset of a set S is a collection of some (perhaps none or all) of the
elements of S, at most once each and in no particular order.
To specify a particular subset A of S, one has to decide for each element v
of S whether v is in A or v is not in A. Thus we have two choices – v 2 A OR
v 62 A – for each element v of S. If S = {v1 , v2 , ..., vn } has n elements then the
total number of choices is 2n since we have 2 choices for v1 AND 2 choices
for v2 AND . . . AND 2 choices for vn .

Theorem 1.3. For every n 0, the number of subsets of an n-element set


is 2n .

A partial list of a set S is a list of a subset of S. That is, it is a list of some


(perhaps none or all) of the elements of S, at most once each and listed in
some particular order. We are going to count partial lists of length k of an
n-element set.
First think about the particular case n = 6 and k = 3, and the set S =
{a, b, c, d, e, f }. A partial list of S of length 3 is a list xyz of elements of S,
which must all be different. There are:
6 choices for x (since x is in S), AND
5 choices for y (since y 2 S but y 6= x), AND
4 choices for z (since z 2 S but z 6= x and z 6= y).
Altogether there are 6 · 5 · 4 = 120 partial lists of {a, b, c, d, e, f } of length 3.
Section 1.1 Basic Principles. 19

This kind of reasoning works just as well in the general case. If S is an


n-element set and we want to construct a partial list v1 v2 ...vk of elements of
S of length k, then there are:
n choices for v1 , AND
n 1 choices for v2 , AND
....
n (k 2) choices for vk 1 , AND
n (k 1) choices for vk .
This proves the following result.

Theorem 1.4. For n, k 0, the number of partial lists of length k of an


n-element set is n(n 1) · · · (n k + 2)(n k + 1).

Notice that if k > n then the number 0 will appear as one of the factors in
the product n(n 1) · · · (n k + 2)(n k + 1). This makes sense, because if
k > n then there are no partial lists of length k of an n-element set. On the
other hand, if 0  k  n then we could also write this product as

n!
n(n 1) · · · (n k + 2)(n k + 1) = .
(n k)!

We next count subsets of an n-element set S which have a particular size


k. So for n, k 0 let nk denote the number of k-element subsets of an n-
element set S. Notice that if k < 0 or k > n then nk = 0 because in these
cases it is impossible for S to have a k-element subset. Thus we need only
consider k in the range 0  k  n.
To count k-element subsets of S we consider another way of constructing
a partial list of length k of S. Specifically, we can choose a k-element subset
A of S AND a list of A. The result will be a list of a subset of S of length
k. Since every partial list of length k of S is constructed exactly once in this
way, this translates into the equation
✓ ◆
n n!
· k! = .
k (n k)!
20 Basic Principles. Section 1.1

In summary, we have proved the following result.

Theorem 1.5. For 0  k  n, the number of k-element subsets of an n-


element set is ✓ ◆
n n!
= .
k k!(n k)!

The numbers n
k
are read as “n choose k” and are called binomial coefficients .

1.1.3 Think of what the numbers mean.

Usually, when faced with a formula to prove, one’s first thought is to prove
it by algebraic calculations, or perhaps with an induction argument, or maybe
with a combination of the two. But often that is not the easiest way, nor is
it the most informative. A much better strategy is one which gives some in-
sight into the meaning of all of the parts of the formula. If we can interpret
all the numbers as counting things, addition as “OR”, and multiplication
as “AND”, then we can hope to find an explanation of the formula by con-
structing some objects in the correct way.

Example 1.6. Consider the equation, for any n 0:


✓ ◆ ✓ ◆ ✓ ◆ ✓ ◆
n n n n
+ + + ··· + = 2n .
0 1 2 n

This could be proved by induction on n, but many more details would


have to be given and the proof would not address the true “meaning” of
the formula. Instead, let’s interpret everything combinatorially:
• the number of subsets of the n-element set {1, 2, ..., n} is 2n ;
• for each 0  k  n, the number of k-element subsets of {1, 2, ..., n}
is nk ;
• addition corresponds to “OR” (that is, disjoint union of sets).
So, this formula is saying that choosing a subset of {1, 2, ..., n} (in one of
2n ways) is equivalent to choosing a k-element subset of {1, 2, ..., n} (in
one of nk ways) for exactly one value of k in the range 0  k  n. Said
that way the formula becomes self–evident, and there is nothing more
to prove.
Section 1.1 Basic Principles. 21

Example 1.7. Consider the equation


✓ ◆ ✓ ◆ ✓ ◆
n n 1 n 1
= +
k k 1 k

where we are using the fact that m


j
= 0 if j < 0 or j > m.
This equation can be proven algebraically from the formula of Theo-
rem 1.5, and that is a good exercise which I encourage you to try. But
a more informative proof interprets these numbers combinatorially as
follows:
• nk is the number of k-element subsets of {1, 2, ..., n};
• nk 11 is the number of (k 1)-element subsets of {1, 2, ..., n 1};
• n k 1 is the number of k-element subsets of {1, 2, ..., n 1};
• addition corresponds to disjoint union of sets.
So, this equation is saying that choosing a k-element subset A of {1, 2, ..., n}
is equivalent to either choosing a (k 1)-element subset of {1, 2, ..., n 1}
or a k-element subset of {1, 2, ..., n 1}. This is perhaps not as clear as
the previous example, but the two cases depend upon whether the cho-
sen k-element subset A of {1, 2, ..., n} is such that n 2 A OR n 62 A. If
n 2 A then A r {n} is a (k 1)-element subset of {1, 2, ..., n 1}, while if
n 62 A then A is a k-element subset of {1, 2, ..., n 1}. This construction
explains the correspondence, proving the formula.

This principle – interpreting equations combinatorially and proving the


formulas by describing explicit correspondences between sets of objects –
is one of the most important and powerful ideas in enumeration. We will
apply this way of thinking throughout the first part of these notes.
Incidentally, the equation in Example 1.7 is a very useful recurrence rela-
tion for computing binomial coefficients quickly. Together with the facts

✓ ◆ ✓ ◆
n n! n! n
= = =
k k!(n k)! (n k)!(n (n k))! n k

and n
0
= n
n
= 1 it can be used to compute any number of binomial coeffi-
22 Basic Principles. Section 1.1

cients without difficulty. The resulting table is known as Pascal’s Triangle :

n\k 0 1 2 3 4 5 6 7 8
0 1
1 1 1
2 1 2 1
3 1 3 3 1
4 1 4 6 4 1
5 1 5 10 10 5 1
6 1 6 15 20 15 6 1
7 1 7 21 35 35 21 7 1
8 1 8 28 56 70 56 28 8 1

1.1.4 Multisets.

Imagine a bag which contains a large number of marbles of three colours


– red, green, and blue, say. The marbles are all indistinguishable from one
another except for their colours. There are N marbles of each colour, where
N is very, very large (more precisely we should be considering the limit
as N ! 1). If I reach into the bag and pull out a handful of 11 marbles,
I will have r red marbles, g green marbles, and b blue marbles, for some
nonnegative integers (r, g, b) such that r + g + b = 11. How many possibile
outcomes are there?
The word “multiset” is meant to suggest a set in which the objects can oc-
cur more than once. For example, the outcome (4, 5, 2) in the above situation
corresponds to the “set” {R, R, R, R, G, G, G, G, G, B, B} in which R is a red
marble, G is a green marble, and B is a blue marble. This is an 11-element
multiset with elements of three types. The number of these multisets is the
solution to the above problem.

Definition 1.8. Let n 0 and t 1 be integers. A multiset of size n with


elements of t types is a sequence of nonnegative integers (m1 , ..., mt )
such that
m1 + m2 + · · · + mt = n.

The interpretation is that mi is the number of elements of the multiset which


are of the i-th type, for each 1  i  t.
Section 1.1 Basic Principles. 23

Theorem 1.9. For any n 0 and t 1, the number of n-element multisets


with elements of t types is
✓ ◆
n+t 1
.
t 1

Proof. Think of what that number means! By Theorem 1.5, n+t


t 1
1
is the
number of (t 1)-element subsets of an (n + t 1)-element set. So, let’s
write down a row of (n + t 1) circles from left to right:

O O O O O O O O O O O O O
and cross out some t 1 of these circles to choose a (t 1)-element subset:

O O O O X O O O O O X O O

Now the t 1 crosses chop the remaining sequence of n circles into t seg-
ments of consecutive circles. (Some of these segments might be empty,
which is to say of length zero.) Let mi be the length of the i-th segment
of consecutive O-s in this construction. Then m1 + m2 + · · · + mt = n, so
that (m1 , m2 , ..., mt ) is an n-element multiset with t types. Conversely, if
(m1 , m2 , ..., mt ) is an n-element multiset with t types then write down a se-
quence of m1 O-s, then an X, then m2 O-s, then an X, and so on, finishing
with an X and then mt O-s. The positions of the X-s will indicate a (t 1)-
element subset of the positions {1, 2, ..., n + t 1}.
The construction of the above paragraph shows how to translate between
(t 1)-element subsets of {1, 2, ..., n + t 1} and n-element multisets with
t types of element. This one–to–one correspondence completes the proof of
the theorem.

To answer the original question of this section, the number of 11-element


multisets with elements of 3 types is 11+3
3 1
1
= 132
= 78.

1.1.5 Bijective proofs.

The arguments above, counting lists, permutations, subsets, multisets, and


so on, can be phrased more formally using the idea of bijections between
24 Basic Principles. Section 1.1

finite sets. In simple cases as we have seen so far this is not always neces-
sary, but it is good style. In more complicated situations, as we will see in
Chapters 2 to 4, it is a very useful way to organize one’s thoughts.

Definition 1.10. Let f : A ! B be a function from a set A to a set B.


• The function f is surjective if for every b 2 B there exists an a 2 A
such that f (a) = b.
• The function f is injective if for every a, a0 2 A, if f (a) = f (a0 ),
then a = a0 .
• The function f is bijective if it is both surjective and injective.
• The notation A ⌦ B indicates that there is a bijection between
the sets A and B.

Functions with these properties are called surjections, injections, or bijec-


tions, respectively. An older terminology – now out of fashion – is that
surjections are “onto” functions, injections are “one-to-one” functions, and
bijections are “one-to-one and onto”. By Exercise 1.4(a), the relation ⌦ is
an equivalence relation.
The point of Definition 1.10 is the following. Consider a bijection f : A !
B. Then every b 2 B is the image of at least one a 2 A, since f is surjective.
On the other hand, every b 2 B is the image of at most one a 2 A, since f
is injective. Therefore, every b 2 B is the image of exactly one a 2 A. In
other words, the relation f (a) = b pairs off all the elements of A with all the
elements of B. It follows that A and B have the same number of elements.
That is, if A ⌦ B then |A| = |B|. The converse implication holds, and for
infinite sets the relation A ⌦ B is taken as the definition of two sets “having
the same size”.

Proposition 1.11. Let f : A ! B and g : B ! A be functions between


two sets A and B. Assume the following.
• For all a 2 A, g(f (a)) = a.
• For all b 2 B, f (g(b)) = b.
Then both f and g are bijections. Moreover, for a 2 A and b 2 B, we have
f (a) = b if and only if g(b) = a.

Proof. Exercise 1.4(b).


Section 1.1 Basic Principles. 25

A pair of functions as in Proposition 1.11 are called mutually inverse bijections .


The notation g = f 1 and f = g 1 is used to denote this relation. Notice that
for a bijection f , we have (f 1 ) 1 = f .
Here are two examples of this way of thinking.

Example 1.12 (Subsets and indicator vectors.). Let P(n) be the set of all
subsets of {1, 2, ..., n}, and let {0, 1}n be the set of all indicator vectors
↵ = (a1 , a2 , ..., an ) in which each coordinate is either 0 or 1. There is a
bijection between these two sets. For a subset S ✓ {1, 2, ..., n}, define the
vector ↵(S) = (a1 (S), a2 (S), ..., an (S)) by saying that for each 1  i  n,

1 if i 2 S,
ai (S) =
0 if i 62 S.

Conversely, for an indicator vector ↵ = (a1 , ..., an ) define a subset S(↵)


by saying that
S(↵) = {i 2 {1, 2, ..., n} : ai = 1}.
For example, when n = 8 the subset {2, 3, 5, 7} corresponds to the indica-
tor vector (0, 1, 1, 0, 1, 0, 1, 0). The constructions S 7! ↵(S) and ↵ 7! S(↵)
are mutually inverse bijections between the sets P(n) and {0, 1}n as in
Proposition 1.11. It follows that |P(n)| = |{0, 1}n | = 2n . This is a formal-
ization of the proof of Theorem 1.3.

Example 1.13 (Subsets and multisets.). The proof of Theorem 1.9 can be
phrased in terms of bijections, as follows.
Let M(n, t) be the set of all multisets of size n 2 N with elements of t
1 types. Let B(a, k) be the set of all k-element subsets of {1, 2, ..., a}. We
establish a bijection between M(n, t) and B(n+t 1, t 1) in what follows.
Theorem 1.5 implies that |B(n + t 1, t 1)| = n+t t 1
1
, completing the
proof of Theorem 1.9. Here is a precise description of this bijection.
Let S be any (t 1)-element subset of {1, 2, ..., n + t 1}. We can sort
the elements of S in increasing order: S = {s1 , s2 , ..., st 1 } in which s1 <
s2 < · · · < st 1 . For notational convenience, let s0 = 0 and let st = n + t.
Now define a sequence µ = (m1 , m2 , ..., mt ) by letting mi = si si 1 1
for all 1  i  t.
26 Basic Principles. Section 1.1

For example, with n = 10 and t = 4, consider the 3-element subset


S = {2, 7, 11} of {1, 2, ..., 13}. Then s0 < s1 < · · · < st is 0 < 2 < 7 < 11 <
14, the sequence of differences is (2, 5, 4, 3), and subtracting 1 from each
of these yields µ = (1, 4, 3, 2). Notice that µ is a multiset of size 10 with 4
types of elements.
In general, since si 1 < si for all 1  i  t, it follows that mi =
si si 1 1 is a nonnegative integer. Also, since st = n + t, it follows
that m1 + m2 + · · · + mt = st t = n. That is, µ is a multiset of size
n with elements of t types. This describes a function S 7! µ from the
set B(n + t 1, t 1) to the set M(n, t). We claim that this function is a
bijection between these two sets.
To show that our construction S 7! µ is a bijection, we will describe its
inverse function. Begin with a multiset µ = (m1 , m2 , ..., mt ) of size n with
t types of elements. For each 1  i  t 1, let si = m1 + m2 + · · · + mi + i.
Notice that
1  s1 < s2 < · · · < st 1  n + t 1.
Therefore, S = {s1 , s2 , ..., st 1 } is a member of the set B(n + t 1, t 1).
To finish this example, one must check that these constructions, S 7! µ
and µ 7! S, are mutually inverse bijections as in Proposition 1.11. The
details are left as Exercise 1.6.

1.1.6 Inclusion/Exclusion.

In a vase is a bouquet of flowers. Each flower is (at least one of) fresh,
fragrant, or colourful:
(a) 11 flowers are fresh;
(b) 7 flowers are fragrant;
(c) 8 flowers are colourful;
(d) 6 flowers are fresh and fragrant;
(e) 5 flowers are fresh and colourful;
(f) 2 flowers are fragrant and colourful;
(g) 2 flowers are fresh, fragrant, and colourful.
How many flowers are in the bouquet?
The Principle of Inclusion/Exclusion is a systematic method for answer-
ing such questions, which involve overlapping conditions which can be sat-
Section 1.1 Basic Principles. 27

colourful

fresh fragrant

Figure 1.1: A Venn diagram for three sets.

isfied (or not) in various combinations.

Example 1.14. For a small problem as above we can reason backwards


as follows:
(g): there are 2 flowers with all three properties (fresh, fragrant, and
colourful);
(h): from (g) and (f) there are 0 flowers which are fragrant and colourful
but not fresh;
(i): from (g) and (e) there are 3 flowers which are fresh and colourful but
not fragrant;
(j): from (g) and (d) there are 4 flowers which are fresh and fragrant but
not colourful;
(k): from (c)(g)(h)(i) there are 3 flowers which are colourful but neither
fresh not fragrant;
(`): from (b)(g)(h)(j) there is 1 flower which is fragrant but neither fresh
nor colourful;
(m): from (a)(g)(i)(j) there are 2 flowers which are fresh but neither fra-
grant nor colourful.
The total number of flowers is counted by the disjoint union of the
cases (g) through (m); that is 2 + 0 + 3 + 4 + 3 + 1 + 2 = 15.

A Venn diagram is extremely useful for organizing this calculation. Figure


1.1 is a Venn diagram for the three sets involved in this question. Item (g) in
28 Basic Principles. Section 1.1

Figure 1.2: A Venn diagram for four sets.

the original data gives the number of flowers counted in the central trian-
gle. The subsequent steps (h) to (m) calculate the rest of the numbers in the
diagram, moving outwards from the center.
The above works very well for three properties (fresh, fragrant, colour-
ful) but becomes increasingly difficult to apply as the number of properties
increases. Figure 1.2 shows a Venn diagram for four sets, for instance. In-
stead, consider this alternative to the calculation in Example 1.14:

(a) + (b) + (c) (d) (e) (f ) + (g) = 11 + 7 + 8 6 5 2 + 2 = 15.

This looks much easier to apply, and it gives the right answer. Why? That
is the Principle of Inclusion/Exclusion, which we now explain in general.
Let A1 , A2 , ..., Am be finite sets. We want a formula for the cardinality of
the union of these sets A1 [ A2 [ · · · [ Am . First a bit of notation: if S is a
nonempty subset of {1, 2, ..., m} then let AS denote the intersection of the
sets Ai for all i 2 S. So, for example, with this notation we have A{2,3,5} =
A2 \ A3 \ A5 .

Theorem 1.15 (Inclusion/Exclusion). Let A1 , A2 ,..., Am be finite sets.


Then X
|A1 [ A2 [ · · · [ Am | = ( 1)|S| 1 |AS |.
?6=S✓{1,...,m}
Section 1.1 Basic Principles. 29

We prove Theorem 1.15 in Section 2.5, but all that is required is the Binomial
Theorem 2.2.

1.1.7 Combinatorial probabilities.

We can interpret counting problems in terms of probabilities by making one


additional hypothesis. That hypothesis is that every possible outcome is
equally likely. The exact definition of what is an “outcome” depends on
the particular problem. If ⌦ denotes a finite set of all possible outcomes,
then any subset E of ⌦ is what a probabilist calls an “event”. The probabil-
ity that a randomly chosen outcome from ⌦ is in the set E is |E|/|⌦| exactly
because every outcome has probability 1/|⌦| of being chosen, and there are
|E| elements in E. Here are a few examples to illustrate these ideas.

Example 1.16. What is the probability that a random subset of {1, 2, ..., 8}
has at most 3 elements?
Here an outcome is a subset of {1, 2, ..., 8}, and there are 28 = 256 such
subsets. The number of subsets of {1, 2, ..., 8} with at most 3 elements is
✓ ◆ ✓ ◆ ✓ ◆ ✓ ◆
8 8 8 8
+ + + = 1 + 8 + 28 + 56 = 93.
0 1 2 3

So the probability in question is

93
= 0.363281...
256
to six decimal places.

Example 1.17. What is the probability that a random list of {a, b, c, d, e, f }


contains the letters f ad as a consecutive subsequence?
Here an outcome is a list of {a, b, c, d, e, f }, and there are 6! = 720 such
lists. Those lists of this set which contain f ad as a consecutive subse-
quence can be constructed uniquely as the lists of the set {b, c, e, f ad}, so
30 Basic Principles. Section 1.2

there are 4! = 24 of these. Thus, the probability in question is

24 1
= = 0.03333...
720 30

Example 1.18. What is the probability that a randomly chosen 2-element


multiset with t types of element has both elements of the same type?
The outcomes are the 2-element multisets with t types, numbering
✓ ◆ ✓ ◆ ✓ ◆
2+t 1 t+1 t+1 (t + 1)t
= = =
t 1 t 1 2 2

in total. Of these, exactly t of them have both elements of the same type
– choose one of the t types and take two elements of that type. Thus, the
probability in question is

2t 2
= .
(t + 1)t t+1

The values for the first few t are given in the following table to four
decimal places:

t 1 2 3 4 5 6 7
1.0000 0.6667 0.5000 0.4000 0.3333 0.2857 0.2500

1.2 Examples and Applications.

1.2.1 The Vandermonde convolution formula.

Example 1.19 (Vandermonde convolution formula). For m, n, k 2 N,


✓ ◆ Xk ✓ ◆✓ ◆
m+n m n
= .
k j=0
j k j

For instance, with m = 4 and n = 2 and k = 3 this says that


✓ ◆ ✓ ◆✓ ◆ ✓ ◆✓ ◆ ✓ ◆✓ ◆ ✓ ◆✓ ◆
6 4 2 4 2 4 2 4 2
= + + + .
3 0 3 1 2 2 1 3 0
Section 1.2 Basic Principles. 31

(Of course, 2
3
= 0, but that doesn’t matter.)
The Vandermonde convolution formula can be proven algebraically by
induction on m + n, but the proof is finicky and doesn’t give much insight
into what the formula “means”. (The formula can also be deduced easily
from the Binomial Theorem ??, as we will see in Example 2.3.)
Here is a direct combinatorial proof, illustrating the strategy of thinking
about what the numbers mean. On the LHS, m+n k
is the number of k-
element subsets S of the set {1, 2, ..., m + n}. On the RHS, the number can
be produced as follows:
• choose a value of j in the range 0  j  k, and
• choose a j-element subset A of {1, 2, ..., m}, and
• choose a (k j)-element subset B of {m + 1, ..., m + n}.
(Notice that the set {m + 1, ..., m + n} has n elements, so it has k n j subsets
of size k j.) Now the formula is proved by describing a bijection between
the k-element subsets S of {1, 2, ..., m+n} counted on the LHS, and the pairs
(A, B) of subsets counted on the RHS. To describe this correspondence, let
M = {1, 2, ..., m} and N = {m + 1, ..., m + n}. Notice that M \ N = ? and
M [ N = {1, 2, ..., m + n} and |M | = m and |N | = n. Now, given a k-element
subset S of {1, 2, ..., m + n} we let

A = S \ M and B = S \ N.

Conversely, given a pair of subsets (A, B) satisfying the conditions in the


points above, we let S = A [ B. After some thought, you will see that
these constructions S 7! (S \ M, S \ N ) and (A, B) 7! A [ B are mutually
inverse bijections between the sets in question. Therefore, there are the same
number of objects on each side, and the formula is proved.

1.2.2 Common birthdays.

Example 1.20. Let p(n) denote the probability that in a randomly chosen
group of n people, at least two of them are born on the same day of the
year. What does the function p(n) look like?

To simplify the analysis, we will ignore the existence of leap years and as-
sume that every year has exactly 365 days. Moreover, we will also assume
32 Basic Principles. Section 1.2

that people’s birthdays are independently and uniformly distributed over


the 365 days of the year, so that we can use the ideas of combinatorial prob-
ability theory. These are reasonable approximations – although they intro-
duce tiny errors they do not change the qualitative “shape” of the answer.
To begin with, p(1) = 0 since there is only n = 1 person in the group.
Also, if n > 365 then p(n) = 1 since there are more people in the group
than days in a year, so at least two people in the group must have the same
birthday.
For n in the range 2  n  365 it is quite complicated to analyze the
probability p(n) directly. However, the complementary probability 1 p(n)
is relatively easy to compute. From the definition of p(n) we see that 1 p(n)
is the probability that in a randomly chosen group of n people, no two of
them are born on the same day of the year. This model is equivalent to
rolling “no pair” when throwing n independent dice each with 365 sides. If
we list the people in the group as P1 , P2 , ..., Pn in any order, then their birth-
days must form a partial list of the 365 days of the year, of length n. There
are 365!/(365 n)! such partial lists. Since the total number of outcomes is
365n , we have derived the formula
365!
1 p(n) = .
(365 n)!365n
Therefore
365!
p(n) = 1 .
(365 n)!365n
To give some feeling for what this looks like, here is a table of p(n) (rounded
to six decimal places) for selected values of 2  n  365.

n p(n) n p(n) n p(n)


2 0.002740 25 0.568700 70 0.999160
3 0.008204 30 0.706316 80 0.999914
4 0.016356 35 0.814383 100 1.
5 0.027136 40 0.891232 150 1.
10 0.116948 45 0.940976 200 1.
15 0.252901 50 0.970374 250 1.
20 0.411438 60 0.994123 300 1.

Figure 1.3 gives a graph of this function. It is a rather surprising fact that
p(23) = 0.507297, so that if you randomly choose a set of 23 people on earth
Section 1.2 Basic Principles. 33

Figure 1.3: The probability of a common birthday among n people.

then there is a slightly better than 50% chance that at least two of them will
have the same birthday. (Approximately – we have ignored leap years and
twins.)

1.2.3 An example with multisets.

Example 1.21. A packet of Maynard’s Wine Gums consists of a roll or


packet of 10 candies, each of which has one of five “flavours” – Green,
Yellow, Orange, Red, or Purple. I especially like the purple ones. What is
the chance that when I buy some Wine Gums there are exactly k purple
candies (for each 0  k  10)?

This example is designed to illustrate the fact that the probabilities depend
on which model is used to analyze the situation. There are two reasonable
possibilities for this problem, which I will call the dice model and the mul-
tiset model.
In the “dice model” we keep track of the fact that the candies are stacked
up in the roll from bottom to top, so there is a natural sequence (c1 , c2 , ...., c10 )
of flavours one sees when the roll is opened. For example, the sequences

(G, P, R, Y, Y, G, O, R, Y, O)
34 Basic Principles. Section 1.2

and
(Y, G, O, P, R, R, Y, G, O, Y )

count as different outcomes in this model. We have a sequence of 10 candies,


and a choice of one of 5 flavours for each candy, giving a total of 510 =
9765625 outcomes. (This is equivalent to rolling a sequence of ten 5-sided
dice, hence the name for the model.)
In the “multiset model” we disregard the order in which the candies oc-
cur in the packet as being an inessential detail. The only important informa-
tion about the packet is the number of candies of each type that it contains.
For example, both of the outcomes in the previous paragraph reduce to the
same multiset
{G, G, Y, Y, Y, O, O, R, R, P }

or (2, 3, 2, 2, 1) in this model. Thus we are regarding the packet as a multiset


of size 10 with 5 types of element, giving a total of 10+5 5 1
1
= 144
= 1001
outcomes.
Notice that the number of outcomes in the dice model is vastly larger
than in the multiset model. It should come as no surprise, then, that the
probabilities we compute depend strongly on which of these two models
we consider. (The true values for the probabilities depend on the details of
the manufacturing process by which the rolls or packets are made. These
cannot be calculated, but must be measured instead.)
Consider the dice model first, and let d(k) denote the probability of get-
ting exactly k purple candies in a roll. There are 10
k
choices for the positions
of these k purple candies, and (5 1) 10 k
choices for the sequence of (non–
purple) flavours of the other 10 k candies. This gives a total of 10 k
410 k
outcomes with exactly k purple candies in this model. Therefore,
✓ ◆ 10 k
10 4
d(k) =
k 510

for each 0  k  10.


Next let’s consider the multiset model, and let m(k) denote the proba-
bility of getting exactly k purple candies in a packet. If we have k purple
candies then the rest of the candies form a multiset of size 10 k with el-
ements of 4 types, so there are 10 4k+41
1
= 133 k such outcomes in this
Section 1.2 Basic Principles. 35

model. Therefore,
13 k
3
m(k) = 14
4

for each 0  k  10.


Here is a table of these probabilities (rounded to six decimal places).

k d(k) m(k)
0 0.107374 0.285714
1 0.268435 0.219780
2 0.301990 0.164835
3 0.201327 0.119880
4 0.088080 0.083916
5 0.026424 0.055944
6 0.005505 0.034965
7 0.000786 0.019980
8 0.000074 0.009990
9 0.000004 0.003996
10 0.000000 0.000999

The differences between the two models are clearly seen.


In closing, here are two more points about these models.
First, given a multiset (m1 , ..., mt ) of size n with elements of t types, the
number of outcomes in the dice model which “reduce” to this multiset is
✓ ◆
n n!
= ,
m1 , ..., mt m1 ! · m2 ! · · · mt !

called a multinomial coefficient . This can be seen intuitively by arranging the


n elements of the multiset in a line in one of n! ways, and noticing that since
we can’t tell the mi elements of type i apart we can freely rearrange them
in mi ! ways without changing the arrangement. One could prove this more
carefully by induction on t, using the case t = 2 of binomial coefficients
as part of the induction step. Or, one could give a combinatorial proof by
constructing a bijection which makes the informal argument above more
precise.
36 Basic Principles. Section 1.2

For the second point, the above analysis of the multiset model can be
generalized to prove Exercise 1.11: for any integers n 0 and t 2,
✓ ◆ n ✓
X ◆
n+t 1 n k+t 2
= .
t 1 k=0
t 2

1.2.4 Poker hands.

Poker is played with a standard deck of 52 cards, divided into four suits:
spades , hearts ~, diamonds }, and clubs |.
Within each suit are 13 cards of different values:
A (Ace), 2, 3, 4, 5, 6, 7, 8, 9, 10, J (Jack), Q (Queen), K (King).
An Ace can be high (above K) or low (below 2) at the player’s choice.
Many variations on the game exist, but the common theme is to make
the best five-card hand according to the ranking of poker hands. This order
is determined by how unlikely it is to be dealt such a hand. From best to
worst, the types of poker hand are as follows:
• Straight Flush: five cards of the same suit with consecutive values.
For example, 8~ 9~ 10~ J~ Q~.
• Four of a Kind (or Quads): four cards of the same value, with any fifth
card. For example, 7 7~ 7} 7| 4}.
• Full House (or Tight, or Boat): three cards of the same value, and a
pair of cards of another value. For example, 9 9~ 9} A} A|.
• Flush: five cards of the same suit, but not with consecutive values.
For example, 3~ 7~ 10~ J~ K~.
• Straight: five cards with consecutive values, but not of the same suit.
For example, 8~ 9| 10 J~ Q}.
• Three of a Kind (or Trips): three cards of the same value, and two
more cards not of the same value. For example, 8 8~ 8} K} 5|.
• Two Pair: this is self-explanatory.
For example, J~ J| 6} 6| A.
• One Pair: this is also self-explanatory.
For example, Q Q} 8} 7| 2.
• Busted Hand: anything not covered above.
For example, K Q} 8} 7| 2.
Section 1.2 Basic Principles. 37

There are 525


= 2598960 possible 5-element subsets of a standard deck
of 52 cards, so this is the total number of possible poker hands. How many
of these hands are of each of the above types? The answers can be found
easily on the WWWeb, so there’s no sense trying to keep them secret. Here
they are: N is the number of outcomes of each type, and p = N/ 52 5
is the
probability of each type of outcome (rounded to six decimal places).

Hand N p
Straight Flush 40 0.000015
Quads 624 0.000240
Full House 3744 0.001441
Flush 5108 0.001965
Straight 10200 0.003925
Trips 54912 0.021128
Two Pair 123552 0.047539
One Pair 1098240 0.422569
Busted 1302540 0.501177

The derivation of these numbers is excellent practice (see Exercise 1.14), so


we will do only two of the cases – Straight, and Busted – as illustrations.

Example 1.22.
• To construct a Straight hand there are 10 choices for the consecutive
values of the cards (A2345, 23456, ... up to 10JQKA), and 45 choices
for the suits on the cards. However, four of these choices for suits
give all five cards the same suit – these lead to straight flushes and
must be discounted. Hence the total number of straights is
10 · (45 4) = 10200.
• To construct a Busted hand there are 13 5
10 choices for 5 values
of cards which are not consecutive (no straight) and have no pairs.
Having chosen these values there are 45 4 choices for the suits on
the cards which do not give all five cards the same suit (no flush).
Hence the total number of busted hands is
13
5
10 · (45 4) = 1302540.
38 Basic Principles. Section 1.2

1.2.5 Derangements.

Here is a “classical” example. (In this context, the word “derangement”


makes more sense in French than in English.)

Example 1.23 (The Derangement Problem). A group of eight people


meet for dinner at a fancy restaurant and check their coats at the door.
After a delicious gourmet meal the group leaves, and on the way out the
eight coats are returned to the eight people completely at random by an
incompetent clerk. What is the probability that no-one gets the correct
coat?

Of course, we want to solve the derangement problem for any number


of people, not just for eight. To state the problem mathematically, list the
people P1 , P2 , ..., Pn in any order. We can record who gets whose coat by a
sequence of numbers (c1 , c2 , ..., cn ) in which ci = j means that Pi was given
the coat belonging to Pj . The sequence (c1 , c2 , ..., cn ) will thus contain each of
the numbers 1, 2, ..., n exactly once in some order. In other words, (c1 , ..., cn )
is a permutation of the set {1, 2, ..., n}, and we assume that this permutation
is chosen randomly by the incompetent clerk. Person i gets the correct coat
exactly when ci = i. Thus, in general the derangement problem is to de-
termine, for a random permutation (c1 , ..., cn ) of {1, 2, ..., n}, the probability
that ci 6= i for all 1  i  n.
For small values of n the derangement problem can be analyzed directly,
but complications arise as n gets larger. In fact, this example is perfectly
designed to illustrate the principle of Inclusion/Exclusion. To see how this
applies, for each 1  i  n let Ai be the set of permutations of {1, ..., n} such
that ci = i. That is, Ai is the set of ways in which the coats are returned and
person Pi gets the correct coat. From that interpretation, the union of sets
A1 [ A2 [ · · · [ An is the set of ways in which the coats are returned and
at least one person gets the correct coat. Therefore, the complementary set
of permutations gives those ways of returning the coats so that no-one gets
the correct coat. The number of these derangements of n objects is thus

n! |A1 [ A2 [ · · · [ An |.

It remains to apply Inclusion/Exclusion to determine |A1 [ · · · [ An |. To


do this we need to determine |AS | for every nonempty subset ? 6= S ✓
Section 1.2 Basic Principles. 39

{1, 2, ..., n}. Consider the example n = 8 and S = {2, 3, 6}. In this case
A{2,3,6} = A2 \ A3 \ A6 is the set of those permutations of {1, ..., 8} such that
c2 = 2 and c3 = 3 and c6 = 6. Such a permutation looks like ⇤ 2 3 ⇤ ⇤ 6 ⇤ ⇤
in which the boxes are filled with the numbers {1, 4, 5, 7, 8} in some order.
Since there are 5! lists of the set{1, 4, 5, 7, 8} it follows that |A{2,3,6} | = 5! =
120 in this case. The general case is similar. If ? 6= S ✓ {1, 2, ..., n} is a
k-element subset then the permutations of {1, 2, ..., n} in AS are obtained by
fixing ci = i for all i 2 S, and then listing the remaining n k elements of
{1, ..., n} r S in the remaining spaces. Since there are (n k)! such lists we
see that |AS | = (n k)!.
Since |AS | = (n k)! for every k-element subset of {1, 2, ..., n}, and there
are nk such k-element subsets, Inclusion/Exclusion implies that

Xn ✓ ◆ Xn
n ( 1)k 1
|A1 [ · · · [ An | = ( 1)k 1 (n k)! = n! .
k=1
k k=1
k!

It follows that the number of derangements of n objects is

Xn Xn
( 1)k 1
( 1)k
n! n! = n! .
k=1
k! k=0
k!

Since the total number of permutations of n objects is n!, the probability that
a randomly chosen permutation of {1, 2, ..., n} is a derangement is

Xn
( 1)k
Dn = .
k=0
k!

The following table lists the first several values of the function Dn (with
the decimals rounded to six places). Notice that for n 7 the value of Dn
changes very little. If you recall the Taylor series expansion of the exponen-
tial function
X1
x xk
e =
k=0
k!

then it is easy to see that as n tends to infinity the probability Dn approaches


40 Basic Principles. Section 1.3

the limiting value of e 1


= 0.3678794412....

n Dn Dn
0 1/1 1.
1 0/1 0.000000
2 1/2 0.500000
3 1/3 0.333333
4 3/8 0.375000
5 11/30 0.366667
6 53/144 0.368056
7 103/280 0.367857
8 2119/5760 0.367882
9 16687/45630 0.367879
10 16481/44800 0.367879

In summary, for the original Example 1.23, the probability that no-one gets
their own coat is very close to 36.79%.

1.3 Exercises.

Exercise 1.1. Fix integers n 0 and t 1. Consider a randomly


chosen multiset of size n with elements of t types. For each part below,
calculate the probability that the multiset has the stated property, and
give a brief explanation.
(a) Every type of element occurs at most once.
(b) Every type of element occurs at least once.
(c) Every type of element occurs an even number of times.
(d) Every type of element occurs an odd number of times.
(e) For k 2 N, exactly k types of element occur with multiplicity at
least one.
(f) For k 2 N, exactly k types of element occur with multiplicity at
least two.
Section 1.3 Basic Principles. 41

Exercise 1.2. Consider rolling six fair 6-sided dice, which are distin-
guishable, so that there are 66 = 46656 equally likely outcomes. Count
how many outcomes are of each of the following types. (The answers
add up to 46656.)
(a) Six-of-a-kind.
(b) Five-of-a-kind and a single.
(c) Four-of-a-kind and a pair.
(d) Four-of-a-kind and two singles.
(e) Two triples.
(f) A triple, a pair, and a single.
(g) A triple and three singles.
(h) Three pairs.
(i) Two pairs and two singles.
(j) One pair and four singles.
(k) Six singles.

Exercise 1.3. Let m 1, d 2, and k 0 be integers. When rolling m


fair dice, each of which has d sides, what is the probability of rolling
exactly k pairs and m 2k singles?

Exercise 1.4.
(a) Prove that ⌦ is an equivalence relation.
(b) Prove Proposition 1.11.

Exercise 1.5. Define a function f : Z ! N as follows: for a 2 Z,



2a if a 0,
f (a) =
1 2a if a < 0.

Show that f : Z ! N is a bijection as follows.


(a) Define a function g : N ! Z.
(b) Show that for all a 2 Z, g(f (a)) = a.
(c) Show that for all b 2 N, f (g(b)) = b.
42 Basic Principles. Section 1.3

Exercise 1.6. Complete the proof in Example 1.13.

Exercise 1.7. Give bijective proofs of the following identities.


P
(a) For all n 2 N, nk=0 nk k = n2n 1 .
P
(b) For all n 2 N, nk=0 nk k(k 1) = n(n 1)2N 2 .

Exercise 1.8. For an integer n 1, give a bijective proof that


X ✓ n ◆ X ✓n ◆
=
even k
k odd k
k

Exercise 1.9. Let n be a positive integer. Let Sn be the set of all ordered
pairs of sets (A, B) in which A ✓ B ✓ {1, 2, ..., n}. Let Tn be the set of
all functions f : {1, 2, ..., n} ! {1, 2, 3}.
(a) What is |Tn |? (Explain.)
(b) Define a bijection g : Sn ! Tn . Explain why g((A, B)) 2 Tn
for any (A, B) 2 Sn . (You do not need to explain why g is a
bijection.)
(c) Define the inverse function g 1 : Tn ! Sn of your bijection g
from part (b). (You do not need to explain why g and g 1 are
mutually inverse bijections.)
(d) By counting Sn and Tn in two different ways, deduce that
n ✓ ◆
X
n n
3 = 2k .
k=0
k

Exercise 1.10. Fix integers n 0 and k 1. Let A(n, k) be the set of


sequences (a1 , ..., ak ) 2 N such that a1 + · · · + ak = n, and j divides
k

aj for all 1  j  k. Let B(n, k) be the set of sequences (b1 , ..., bk ) 2 Nk


such that b1 + · · · + bk = n, and b1 b2 · · · bk . For example, here
Section 1.3 Basic Principles. 43

are the sets for n = 7 and k = 3:

A(7, 3) B(7, 3)
(7, 0, 0) (7, 0, 0)
(5, 2, 0) (6, 1, 0)
(4, 0, 3) (5, 2, 0)
(3, 4, 0) (4, 3, 0)
(2, 2, 3) (5, 1, 1)
(1, 6, 0) (4, 2, 1)
(1, 0, 6) (3, 3, 1)
(0, 4, 3) (3, 2, 2)

Construct a pair of mutually inverse bijections between the sets A(n, k)


and B(n, k).

Exercise 1.11. For integers n 0 and t 2, give a bijective proof that


✓ ◆ n ✓
X ◆
n+t 1 n k+t 2
= .
t 1 k=0
t 2

Exercise 1.12. For integers n 1 and t 1, give a bijective proof that


✓ ◆ t ✓ ◆✓
X ◆
n+t 1 t n 1
= .
t 1 k=0
k k 1

Exercise 1.13. Choose a permutation of {1, 2, ..., 7} at random, so


that each of the 7! = 5040 permutations are equally likely. What are
the probabilities of the following events?
1. Numbers 1 and 2 are consecutive (i.e. 3672154 or 3671254).
2. Number 1 is to the left of 2.
3. No two odd numbers are consecutive.
4. Any other condition you can think of.
5. Any similar questions for permutations of {1, 2, ..., n}.
44 Basic Principles. Section 1.3

Exercise 1.14. (The r = 13 and s = 4 case of this exercise completes the


table in Subsection 1.2.4.) Let r 2 and s 2 be integers. Consider a
(non-standard) deck of rs cards, divided into s suits each with cards
of r different values. The cards in each suit are numbered A, 2, 3, ..., r,
and A can be either below 2 or above r. Choose five cards from such a
deck in one of rs 5
ways. How many ways are there to produce each
kind of hand for this “poker in an alternate universe”?
(a) Count “quints” (five-of-a-kinds).
(b) Count straight flushes.
(c) Count quads.
(d) Count full houses.
(e) Count flushes.
(f) Count straights.
(g) Count trips.
(h) Count two-pairs.
(i) Count one-pairs.
(j) Count busted hands.
Exercise 1.15. The game called “Crowns and Anchors” or “Birdcage”
was popular on circus midways early in the 20th century. It is a game
between a Player and the House, played as follows. First, the Player
wagers w dollars on an integer p from one to six. Next, the House
rolls three six-sided dice. For every die that shows p dots on top, the
House pays the Player w dollars, but if no dice show p dots on top
then the Player’s wager is forfeited, and goes to the House. (Assume
that the dice are fair, so that every outcome is equally likely.)
For example, if I wager two dollars on the number five, and the
dice show five, five, and three dots, respectively, then the House pays
me four dollars for a total of six (a profit of four dollars). However, if
in this case the dice show four, three, and two dots, respectively, then
the House takes my wager for a total of zero (a loss of two dollars).
(a) For every dollar that the Player wagers, how much money should
the Player expect to win back in the long run? Would you play
this game?
(b) In a parallel universe there is a game of Crowns and Anchors
being played with m 1 dice, each of which has d 2 sides.
(Assume that the dice are fair, so that every outcome is equally
likely.) In which universes does the Player win in the long run?
In which universes does the House win in the long run? In
which universes is the game completely fair?
Chapter 2

The Idea of Generating Series.

P
We will be dealing algebraically with infinite power series G(x) = 1 n=0 gn x
n

in which the coefficients g = (g0 , g1 , g2 , ...) form a sequence of integers.


These can, for the most part, be handled just like polynomials. Problems
of convergence can arise if one tries to substitute a particular real or com-
plex value for the indeterminate x. We usually don’t do this, and finiteness
of all the coefficients of G(x) is all that we require.

Example 2.1 (The Geometric Series). The simplest infinite case of power
series is if all the coefficients equal one. Then

G = 1 + x + x2 + x3 + x4 + · · · .

Multiply this by x:

xG = x + x2 + x3 + x4 + · · · .

It follows that G xG = (1 x)G = 1. In conclusion


1
= 1 + x + x2 + x3 + · · · .
1 x

Don’t worry about convergence – this is just algebra!

47
48 Generating Series. Section 2.1

2.1 The Binomial Theorem and Binomial Series.

We develop two of the most useful facts that we will need in what follows.
The proofs are also good illustrations of calculating with generating series.

Theorem 2.2 (The Binomial Theorem). For any natural number n 2 N,


n ✓ ◆
X
n n
(1 + x) = xk .
k=0
k

This formula is an identity between two polynomials in the variable x. You


probably have seen a proof of it by induction on n, but we are going to prove
it here using the bijection between subsets and indicator vectors discussed
in Example 1.12.

Proof. Recall that P(n) is the set of all subsets of {1, 2, ..., n}, and that {0, 1}n
is the set of all indicator vectors ↵ = (a1 , a2 , ..., an ) in which each coordi-
nate is either 0 or 1. Example 1.12 gives a bijection between these two sets,
which you should recall. For example, when n = 8 the subset {2, 3, 5, 7}
corresponds to the indicator vector (0, 1, 1, 0, 1, 0, 1, 0). The constructions
S 7! ↵(S) and ↵ 7! S(↵) are mutually inverse bijections between the sets
P(n) and {0, 1}n . From this, we concluded that |P(n)| = |{0, 1}n | = 2n , but
we can deduce more. Notice that if S is a subset with k elements then it
corresponds to an indicator vector ↵ that sums to k. It is sometimes helpful
to record this information in a little table, like this:

P(n) ⌦ {0, 1}n


S $ ↵ = (a1 , a2 , ..., an )
|S| = a1 + a2 + · · · + an .

Because of this bijection, if we introduce an “indeterminate” x, and if S


corresponds to ↵, then
x|S| = xa1 +a2 +···+an .
Moreover, also because of this bijection, summing over all subsets is equiv-
alent to summing over all indicator functions. That is,
X X
x|S| = xa1 +a2 +···+an .
S2P(n) ↵2{0,1}n
Section 2.1 Generating Series. 49

Now we can simplify both sides separately. On the LHS, we know from
Theorem 1.5 that there are nk k-element subsets of an n-element set, for
each 0  k  n. Therefore,
X Xn ✓ ◆
|S| n k
x = x .
k=0
k
S2P(n)

On the RHS, summing over all the indicator vectors ↵ 2 {0, 1}n is equivalent
to summing over all a1 2 {0, 1} and all a2 2 {0, 1} and so on,... until all
an 2 {0, 1}. This gives

X 1 X
X 1 1
X
a1 +a2 +···+an
x = ··· xa1 +a2 +···+an
↵2{0,1}n a1 =0 a2 =0 an =0
1 1 1 1
!n
X X X X
= x a1 x a2 · · · x an = xa = (1 + x)n .
a1 =0 a2 =0 an =0 a=0

This proves the Binomial Theorem. With practice and familiarity, it be-
comes a one-line proof:
n ✓ ◆
X X X
n
xk = x|S| = xa1 +a2 +···+an = (1 + x)n .
k
k=0 S2P(n) ↵2{0,1}n

Example 2.3 (Vandermonde Convolution.). As mentioned in Subsection


1.2.1, the Binomial Theorem easily implies the Vandermonde Convolu-
tion formula. To see this, begin with the obvious identity of polynomials

(1 + x)m+n = (1 + x)m · (1 + x)n

and use the Binomial Theorem to expand each of the factors.


! n ✓ ◆ !
X ✓m + n ◆
m+n Xm ✓ ◆
m j X n
k
x = x xi
k=0
k j=0
j i=0
i
Xm X n ✓ ◆✓ ◆ m+n k ✓ ◆✓ ◆
m n j+i X X m n
= x = xk .
j=0 i=0
j i k=0 j=0
j k j
50 Generating Series. Section 2.1

(The last step is accomplished by re-indexing the double summation.)


Since the polynomials on the LHS and on the RHS are equal, they must
have the same coefficients. By comparing the coefficients of xk on both
sides we see that
✓ ◆ Xk ✓ ◆✓ ◆
m+n m n
= ,
k j=0
j k j

giving the result.

Consider the set M(t) of all multisets with t 1 types of elements, re-
gardless of the size of the multiset. That is, an element of M(t) is a sequence
µ = (m1 , m2 , ..., mt ) of t natural numbers, and the size of the multiset is
|µ| = m1 + m2 + · · · + mt . By Theorem 1.9, for each n 2 N there are n+t t 1
1

elements of M(t) of size n. By analogy with the Binomial Theorem 2.2, we


could collect these numbers as the coefficients of a power series:
X1 ✓ ◆
n+t 1 n
x .
n=0
t 1
The Binomial Series is an algebraic formula for this summation.

Theorem 2.4 (The Binomial Series). For any positive integer t 1,


1 ✓
X ◆
1 n+t 1
= xn .
(1 x)t n=0
t 1

(Properly, this is the binomial series with negative integer exponent. The
general binomial series is discussed in Subsection 4.4.1).

Proof. The key observation is that the set of all multisets with t 1 types
of elements is M(t) = N , the Cartesian product of t copies of the natural
t

numbers N. This leads to a calculation similar to the proof of the Binomial


Theorem above, based on this structure:
M(t) = Nt
µ = (m1 , .., mt )
|µ| = m1 + · · · + mt
Section 2.2 Generating Series. 51

We use this to calculate as follows. The first equality is by Theorem 1.9.


1 ✓
X ◆ 1
X X
n+t 1 n
x = |M(n, t)| xn = x|µ|
n=0
t 1 n=0 µ2M(t)
X 1 X
X 1 X1
= xm1 +m2 +···mt = ··· xm1 +m2 +···+mt
(m1 ,m2 ,...,mt )2Nt m1 =0 m2 =0 mt =0
1 1 1 1
!t
X X X X 1
= xm 1 xm 2 · · · xmt = xm = .
m1 =0 m2 =0 mt =0 m=0
(1 x)t

The last equality is by the geometric series 1 + x + x2 + · · · = 1/(1 x).

2.2 The Theory in General.

In general terms we have a sequence of numbers g = (g0 , g1 , g2 , ...) which


we would like to determine. To do this we introduce an indeterminate x
and encode these numbers as the coefficients of a power series
1
X
2 3
G(x) = g0 + g1 x + g2 x + g3 x + · · · = g n xn ,
n=0

called the generating series for the sequence g.


By “indeterminate” we mean that x behaves algebraically as if it were a
number, but that it does not have any particular value. It should be thought
of as a punctuation mark that is there to keep the different coefficients gn
separated from each other. Sometimes, people call x a “variable” – but that
word is meant to convey the idea that x is a number, but that we don’t know
specifically what the value of that number is. The word “indeterminate”
is meant to convey the idea that x does not have any value at all – it is just a
punctuation mark. As will be seen, it is the coefficients of these power series
that carry information about our counting problems.
In this chapter and the next we will see how to use this strategy to encode
the answers to various counting problems as generating series. In Chapter 4
we will see how to get numbers out of these power series in order to answer
the counting problems explicitly.
52 Generating Series. Section 2.2

2.2.1 Generating series.

Let A be a set of “objects” which we want to count. For example, A might be


the set of subsets of the set {1, 2, ..., n}. Or, A might be the set of all multisets
with t types of elements. The set A can be quite arbitrary, but we assume
that each element of A has a “size” or “weight” attached to it. The weight
of ↵ 2 A is a nonnegative integer !(↵) 2 N. We just require that there are
only finitely many elements of A of any given weight.

Definition 2.5 (Weight Function). Let A be a set. A function ! : A !


N from A to the set N of natural numbers is a weight function provided
that for all n 2 N, the set

An = ! 1 (n) = {↵ 2 A : !(↵) = n}

is finite. That is, for every n 2 N there are only finitely many elements
↵ 2 A of weight n.

Notice that if A is a set with a weight function ! : A ! N, then


1
[
A= An
n=0

is a (disjoint) union of countably many finite sets, and so A is itself either


finite or countably infinite.

Definition 2.6 (Generating series). Let A be a set with a weight func-


tion ! : A ! N as in Definition 2.5. The generating series of A with
respect to ! is X
A(x) = !A (x) = x!(↵) .
↵2A

(We usually suppress the superscript from the notation.) Remember – the
indeterminate x does not have a value. It is just used to keep track of the
weight of each object ↵ 2 A in the exponent.
Section 2.2 Generating Series. 53

Proposition 2.7. Let A be a set with a weight function ! : A ! N, and let


1
X
2
A (x) = a0 + a1 x + a2 x + · · · = an x n .
n=0

For every n 2 N, the number of elements of A of weight n is an = |An |.

Proof.

X 1
X X 1
X X 1
X
!(↵) !(↵) n
A (x) = x = x = x 1= |An | xn .
↵2A n=0 ↵2A: !(↵)=n n=0 ↵2An n=0

Thus, for each n 2 N, the coefficient of xn in A (x) is the number of elements


in A that have weight n.

Since we will be doing a lot of long calculations with power series, and
because of Proposition 2.7, it is useful to have a handy notation for extract-
ing coefficients from them.

P1
Definition 2.8. Let G(x) = g0 + g1 x + g2 x2 + · · · = n=0 gn xn be any
power series. Then for any k 2 N,

[xk ] G(x) = gk

is the coefficient of xk in the power series G(x).

Example 2.9. For example, for any natural numbers a, b 2 N,

X1 ✓ ◆ ✓ ◆
a 1 a n+b n a+b
[x ] = [x ] x = .
(1 x)1+b n=0
b b
54 Generating Series. Section 2.2

2.2.2 The Sum, Product, and String Lemmas.

Lemma 2.10 (The Sum Lemma.). Let A and B be disjoint sets, so that
A \ B = ?. Assume that ! : (A [ B) ! N is a weight function on the
union of A and B. We may regard ! as a weight function on each of A or B
separately (by restriction). Under these conditions,

A[B (x) = A (x) + B (x).

Proof. From the definition of generating series,


X X X
A[B (x) = x!( ) = x!( ) + x!( )
= A (x) + B (x).
2A[B 2A 2B

(The condition that A \ B = ? is needed for the second equality.)

In fact, the above proof can be generalized slightly to give more.

Lemma 2.11 (The Infinite Sum Lemma.). Let A0 , A1 , A2 ,... be pairwise


S
disjoint sets (so that Ai \ Aj = ? if i 6= j), and let B = 1 j=0 Aj . Assume
that ! : B ! N is a weight function. We may regard ! as a weight function
on each of the sets Aj separately (by restriction). Under these conditions,
1
X
B (x) = Aj (x).
j=0

Proof. Exercise 2.6.

Lemma 2.12 (The Product Lemma.). Let A and B be sets with weight
functions ! : A ! N and ⌫ : B ! N, respectively. Define ⌘ : A ⇥ B ! N
by putting ⌘(↵, ) = !(↵) + ⌫( ) for all (↵, ) 2 A ⇥ B. Then ⌘ is a weight
function on A ⇥ B, and
⌘ ! ⌫
A⇥B (x) = A (x) · B (x).

(The superscripts !, ⌫, and ⌘ indicate which weight function is being used


for each set.)
Section 2.2 Generating Series. 55

Proof. To see that ⌘ is a weight function, consider any n 2 N. There are n + 1


choices for an integer 0  k  n. For each such k, there are only finitely
many pairs (↵, ) 2 A ⇥ B with !(↵) = k and !( ) = n k. That is, the set
of elements of A ⇥ B of weight n is
n
[
(A ⇥ B)n = Ak ⇥ Bn k ,
k=0

a finite (disjoint) union of finite sets. It follows that there are only finitely
many elements of A ⇥ B of weight n. Now,

X XX
⌘(↵, )
A⇥B (x) = x = x!(↵)+⌫( )
(↵, )2A⇥B ↵2A 2B
X X
= x!(↵) · x⌫( )
= A (x) · B (x).
↵2A 2B

The Product Lemma 2.12 can be extended to the Cartesian product of any
finite number of sets, by induction on the number of factors. (Exercise 2.7.)
Finally, the String Lemma combines both disjoint union and Cartesian
product, as follows. Let A be a set with a weight function ! : A ! N.
For any k 2 N, the Cartesian product of k copies of A is denoted by Ak .
The entries of Ak are k-tuples (↵1 , ↵2 , ..., ↵k ) with each ↵i 2 A. Notice that
A0 = {"} is the one-element set whose only element is the empty string
" = () of length zero. We can define a weight function !k on Ak by putting
!k (↵1 , ..., ↵k ) = !(↵1 ) + · · · + !(↵k ).
It is a good exercise to check that this is a weight function. Note that the
weight of the empty string is zero. Repeated application of the Product
Lemma 2.12 shows that for all k 2 N,
k
Ak (x) =( A (x)) .
We can take the union of the sets Ak for all k 2 N:
1
[

A = Ak .
k=0

Notice that the sets in this union are pairwise disjoint, since each Ak consists
of strings with exactly k coordinates. We define a function ! ⇤ : A⇤ ! N by
saying that ! ⇤ = !k when restricted to Ak .
56 Generating Series. Section 2.2

Lemma 2.13. Let A be a set with weight function ! : A ! N, and define


A⇤ and ! ⇤ : A⇤ ! N as above. Then ! ⇤ is a weight function on A⇤ if and
only if there are no elements in A of weight zero (that is, A0 = ?).

Proof. If 2 A has weight zero, !( ) = 0, then for any natural number


k 2 N, a sequence of k -s in Ak also has weight zero: !k ( , , ..., ) = 0. So,
by the way ! ⇤ : A⇤ ! N is defined, there are infinitely many elements of
weight zero in A⇤ , so that ! ⇤ is not a weight function.
Conversely, assume that every element of A has weight at least 1. Then,
for each k 2 N, every element of Ak has weight at least k. Now consider
any n 2 N and all the strings (↵1 , ..., ↵k ) 2 A⇤ of weight n. By the previous
sentence, if there are any such strings of length k then 0  k  n. For each
0  k  n, Ak has only finitely many elements of weight n. It follows that
A⇤ has only finitely many elements of weight n. Therefore, ! ⇤ is a weight
function on A⇤ .

Lemma 2.14 (The String Lemma.). Let A be a set with a weight function
! : A ! N such that there are no elements of A of weight zero. Then

1
A⇤ (x) = .
1 A (x)

Proof. By the Infinite Sum and Product Lemmas 2.11 and 2.12,
1
X 1
X k 1
A⇤ (x) = Ak (x) = ( A (x)) = .
k=0 k=0
1 A (x)
Section 2.3 Generating Series. 57

2.3 Compositions.

Definition 2.15. A composition is a finite sequence of positive integers:

= (c1 , c2 , ..., ck ),

in which k 2 N is a natural number, and each ci 1 is a positive


integer. The entries ci are called the parts of the composition. The
length of the composition is `( ) = k, the number of parts. The size of
the composition is
| | = c1 + c2 + · · · + ck ,
the sum of the parts.

Notice that there is exactly one composition of length zero: this is " = (),
the empty string with no entries. Compositions are related to multisets,
but there are two important differences: the parts of a composition must
be positive integers, not just nonnegative, and the length of a composition
might not be specified, while the number of types of element in a multiset
must be fixed.

Example 2.16. Here are all the compositions of size five:

(5) (2,3) (2,2,1) (1,2,1,1)


(4,1) (3,1,1) (2,1,2) (1,1,2,1)
(1,4) (1,3,1) (1,2,2) (1,1,1,2)
(3,2) (1,1,3) (2,1,1,1) (1,1,1,1,1)

In this section we will apply the results of Subsection 2.2.2 to obtain for-
mulas for the generating series of various sets of compositions defined by
imposing some extra conditions. In Chapter 4 we will see how to use this
information to actually count such things.
58 Generating Series. Section 2.3

Theorem 2.17. Let P = {1, 2, 3, ...} be the set of positive integers.


(a) The set C of all compositions is C = P ⇤ .
(b) The generating series for C with respect to size is
x
C (x) =1+ .
1 2x
(c) For each n 2 N, the number of compositions of size n is

1 if n = 0,
|Cn | = n 1
2 if n 1.

Proof. A single part is a positive integer c 2 P = {1, 2, 3, ...}. A composition


of length k is a sequence = (c1 , c2 , ..., ck ) of k positive integers, so is an
element of P k . Since the length k can be any natural number, the set C of all
compositions is
1
[
C= P k = P ⇤.
k=0

This proves part (a).


The generating series for one-part compositions with respect to size is
1
X x
P (x) = xc = x + x2 + x3 + · · · = ,
c=1
1 x

by the geometric series. From the String Lemma 2.14 it follows that
1 1 x x
C (x) = P ⇤ (x) = = = 1+ .
1 x/(1 x) 1 2x 1 2x
This proves part (b).
Expanding C(x) = C (x) using the geometric series, we obtain
1
X 1
X
C(x) = 1 + 2j xj+1 = 1 + 2n 1
xn .
j=0 n=1

Since |Cn | = [xn ]C(x) is the coefficient of xn in C(x), this proves part (c).

Proposition 2.23 is a bijection which gives a combinatorial proof of Theorem


2.17(c).
Section 2.3 Generating Series. 59

Many variations on the proof of Theorem 2.17 are possible. We will do a


few as examples, and present many more as exercises. The general approach
consists of three steps:
• Identify the allowed values for each part. This might depend on the
position of the part within the composition.
• Identify the allowed lengths for the compositions.
• Apply the Sum, Product, and String Lemmas to obtain a formula for
the generating series.

Example 2.18. Let F be the set of all compositions in which each part is
either one or two.
• The allowed sizes for a part are 1 or 2, so P = {1, 2} is the set of
allowed parts. The generating series for a single part is x + x2 .
• The length can be any natural number k 2 N. By the Product
Lemma, the generating series for compositions in F of length k is
(x + x2 )k .
• Since F = {1, 2}⇤ , the String Lemma implies that
1
X 1
X 1
F (x) = F (x) = f n xn = (x + x2 )k = .
n=0 k=0
1 x x2

Here, fn = [xn ]F (x) = |Fn | is the number of compositions in F of


size n. In Section 4.1 we will see how to use this information to get
a formula for the numbers fn .

Example 2.19. Let H be the set of all compositions in which each part is
at least two.
• The allowed sizes for a part are P = {2, 3, 4, ...}. The generating
series for a single part is
1
X
c 2 3 4 x2
P (x) = x = x + x + x + ··· = .
c=2
1 x

• The length can be any natural number k 2 N. By the Product


Lemma, the generating series for compositions in H of length k
60 Generating Series. Section 2.3

is ✓ ◆k
x2
.
1 x
• Since H = P ⇤ , the String Lemma implies that
1
X X1 ✓ ◆k
n x2
H(x) = H (x) = hn x =
n=0 k=0
1 x
1 1 x x2
= = = 1+ .
1 x2 /(1 x) 1 x x2 1 x x2

Here, hn = [xn ]H(x) is the number of compositions in H of size n.

Example 2.20. Let J be the set of all compositions in which each part is
odd.
• The allowed sizes for a part are P = {1, 3, 5, ...}. The generating
series for a single part is
1
X x
P (x) = x2i+1 = x1 + x3 + x5 + · · · = .
i=0
1 x2

• The length can be any natural number k 2 N. By the Product


Lemma, the generating series for compositions in J of length k is
✓ ◆k
x
.
1 x2

• Since J = P ⇤ , the String Lemma implies that


1
X 1
J(x) = J (x) = jn x n =
n=0
1 x/(1 x2 )
1 x2 x
= = 1+ .
1 x x2 1 x x2
Here, jn = [xn ]J(x) is the number of compositions in J of size n.

Example 2.21. The sets F, H, and J in Examples 2.18 to 2.20 have very
similar generating series. In fact, after a little thought one sees that for
Section 2.3 Generating Series. 61

all n 2,
1
[xn ] H(x) = [xn 1 ] J(x) = [xn 2 ] F (x) = [xn 2 ] .
1 x x2
This means that for all n 2, we have hn = jn 1 = fn 2 , so for the sizes
of sets we have |Hn | = |Jn 1 | = |Fn 2 |. We have proven these equalities
even though we don’t yet know what those numbers actually are! This
seems slightly magical, but it works.
Since these sets have the same sizes there must be bijections between
them to explain this fact. Constructing such bijections is left to Exercise
2.17. As a starting point, here are the sets for n = 7:

H7 J6 F5
(7) (5, 1) (2, 2, 1)
(5, 2) (1, 5) (2, 1, 2)
(2, 5) (3, 3) (1, 2, 2)
(4, 3) (3, 1, 1, 1) (2, 1, 1, 1)
(3, 4) (1, 3, 1, 1) (1, 2, 1, 1)
(3, 2, 2) (1, 1, 3, 1) (1, 1, 2, 1)
(2, 3, 2) (1, 1, 1, 3) (1, 1, 1, 2)
(2, 2, 3) (1, 1, 1, 1, 1, 1) (1, 1, 1, 1, 1)

(It need not be the case that the bijections match up these sets of com-
positions line by line in this table.) In Section 4.1 we will determine the
coefficients of the power series 1/(1 x x2 ), answering the counting
problem for these sets F, H, and J.

Example 2.22. Let Q be the set of all compositions in which each part is
at least two, and the number of parts is even.
• The allowed sizes for a part are P = {2, 3, 4, ...}. The generating
series for a single part is
1
X x2
P (x) = xc = x2 + x3 + x4 + · · · = .
c=2
1 x

• The length is an even natural number k = 2j for some j 2 N. By


the Product Lemma, the generating series for a composition in Q
62 Generating Series. Section 2.4

of length 2j is
✓ ◆2j
x2
.
1 x
• Since Q = (P 2 )⇤ , the String Lemma implies that
1
X X1 ✓ ◆2j
n x2
Q(x) = Q (x) = qn x =
n=0 j=0
1 x
1 (1 x)2
= =
1 x4 /(1 x)2 (1 x)2 x4
1 2x + x2 x4
= = 1 + .
1 2x + x2 x4 1 2x + x2 x4
Here qn = [xn ]Q(x) = |Qn | is the number of compositions in Q of
size n. In Example 4.11 we will see how to calculate the first several
values of |Qn |.

2.4 Subsets with Restrictions.

The theory above for compositions can be used to obtain generating series
for subsets of natural numbers subject to some restrictions on the “gaps”
between consecutive elements of the subset. This is because of the following
correspondence between such subsets and nonempty compositions.

Proposition 2.23. Let U be the set of pairs (n, A) in which n 2 N is a


natural number and A ✓ {1, 2, ..., n} is a subset. Let C r {"} be the set of
nonempty compositions. There is a bijection U ⌦ C r {"} between these
two sets.

Proof. We define mutually inverse bijections U ⌦ C r {"} as follows.


First, given (n, A) in U, sort the elements of A = {a1 , a2 , ..., ak } into in-
creasing order:
1  a1 < a2 < · · · < ak  n.

For convenience, put a0 = 0 and ak+1 = n + 1. Now define ci = ai ai 1 for


all 1  i  k + 1, and let = (c1 , c2 , ..., ck+1 ). Notice that each ci is a positive
integer, so that is a composition of length k + 1. Since k + 1 is at least one,
Section 2.4 Generating Series. 63

this composition is not empty, so that is in the set C r {"}. The size of
this composition is
k+1
X k+1
X
| |= ci = (ai ai 1 ) = ak+1 a0 = n + 1,
i=1 i=1

and its length is `( ) = |A| + 1.


Conversely, given a nonempty composition = (c1 , c2 , ..., c` ) in C r {"},
notice that ` 1. We define aj = c1 + c2 + · · · + cj for each 1  j  ` 1, and
let A = {a1 , a2 , ..., a` 1 } and n = | | 1. This defines a pair (n, A) in the set
U.
One can check that these constructions give a pair of mutually inverse
bijections U ⌦ C r {"} by using Proposition 1.11. This is left as Exercise
2.18

The bijection of Proposition 2.23 can be neatly summarized in a little ta-


ble, as follows:

U ⌦ C r {"}
(n, A) $
n=| | 1
|A| = `( ) 1.

Many variations on this bijection are possible, by putting some restrictions


on the allowed “gaps” between elements of the subset A and then analyzing
the corresponding set of (nonempty) compositions. When deriving these
generating series, we have to be careful that the n in the pair (n, A) corre-
sponds to | | 1 for the corresponding composition .

Example 2.24. For each n 2 N, let rn be the number of subsets of {1, ..., n}
that do not contain two consecutive numbers (like a and a+1). We obtain
P
a formula for the generating series R(x) = 1 n=0 rn x using the ideas of
n

Proposition 2.23.
For n 2 N, let Rn be the set of pairs (n, A) with A as in the statement
S
of the problem, and let R = 1 n=0 Rn . Then |Rn | = rn for all n 2 N, and
we want to determine the generating series for the set R with respect to
the weight function !(n, A) = n.
64 Generating Series. Section 2.4

The first question to ask is: which nonempty compositions correspond


to pairs in the set R? Notice that (n, A) is in R if and only if the corre-
sponding composition = (c1 , c2 , ..., c` ) of size n + 1 has ci 2 for all
2  i  ` 1. It is possible that maybe c1 = 1 or c` = 1, though. Let M
be the set of compositions corresponding to pairs in R. Notice that
X X
M (x) = M (x) = x| | = xn+1 = x R(x).
2M (n,A)2R

The compositions in M can be described as follows.


• The first and last parts are positive integers.
• Parts other than the first and last parts are integers greater than or
equal to 2.
• The length is at least one (since it is one more than the size of the
corresponding subset).
A part that is a positive integer has generating series x/(1 x), and a
part that is at least 2 has generating series x2 /(1 x), as we have seen.
Now we do a case analysis by the number of parts, using the Product
Lemma.
• For ` = 1 part the generating series is x/(1 x).
• For ` 2 parts the generating series is
✓ ◆` 2
x x2 x x2` 2
= .
1 x 1 x 1 x (1 x)`

Combining the contributions for all lengths ` 1 using the Sum Lemma,
we have
1
X
x2` 2x
xR(x) = M (x) = +
1 x `=2
(1 x)`
1 ✓
X ◆j
x x2 x2
= +
1 x (1 x)2 j=0 1 x
Section 2.5 Generating Series. 65

x x2 1
= + 2
· 2
1 x (1 x) 1 x /(1 x)
2 3
x x x x2
= +
(1 x)(1 x x2 ) (1 x)(1 x x2 )
2
x+x
= .
1 x x2
It follows that
1+x
R(x) = .
1 x x2

2.5 Proof of Inclusion/Exclusion.

In this section we prove Theorem 1.15, the Principle of Inclusion/Exclusion.

Lemma 2.25. For any nonempty set T ,


X
( 1)|S| 1
= 1.
?6=S✓T

Proof. Consider the identity


X
x|S| = (1 + x)n
S✓{1,2,...,n}

which was part of the proof of the Binomial Theorem. If T is any n-element
set then X
x|S| = (1 + x)n
S✓T

as well (as can be seen by numbering the elements of T arbitrarily). Both


sides are polynomials in x, so we can substitute x = 1. The result is
X
( 1)|S| = (1 1)n = 0n = 0,
S✓T

because n 1. (Note that 00 = 1, since it is an empty product.) On the LHS


we separate the term corresponding to S = ?, and see that
X
1+ ( 1)|S| = 0.
?6=S✓T
66 Generating Series. Section 2.5

Rearranging this gives the desired formula.

Recall the notation from Subsection 1.1.6: for any finite number of sets
A1 , A2 , . . . , Am and ? 6= S ✓ {1, 2, ..., m}, let
\
AS = Ai .
i2S

So, for example, A{2,3,5} = A2 \ A3 \ A5 .

Theorem 2.26 (Inclusion/Exclusion). Let A1 , A2 , . . . , Am be finite sets.


Then X
|A1 [ · · · [ Am | = ( 1)|S| 1 |AS |.
?6=S✓{1,2,...,m}

Proof. Let V = A1 [ · · · [ Am , and let Nm = {1, 2, .., m}. For each v 2 V let
T (v) = {i 2 Nm : v 2 Ai }. Notice that T (v) 6= ?, for all v 2 V . Also notice
that for ? 6= S ✓ Nm we have v 2 AS if and only if ? 6= S ✓ T (v). Therefore,
using Lemma 2.25 above, we have
X X X
( 1)|S| 1 |AS | = ( 1)|S| 1 1
?6=S✓Nm ?6=S✓Nm v2AS
X X X
= ( 1)|S| 1
= 1 = |V |,
v2V ?6=S✓T (v) v2V

as was to be shown.

Example 2.27 (The Euler totient function). For a positive integer n, the
Euler totient of n is the number '(n) of integers b in the range 1  b  n
such that b and n are relatively prime. That is,

'(n) = |{b 2 {1, 2, ..., n} : gcd(b, n) = 1}|.

We can use Inclusion/Exclusion to obtain a formula for '(n), as follows.


Let the prime factorization of n be n = pc11 pc22 · · · pcmm , in which the pi
are pairwise distinct primes and the ci are positive integers. For each
1  i  m, let
Ai := {b 2 Nn : pi divides b}.
Section 2.6 Generating Series. 67

Then

'(n) = |(Nn r (A1 [ · · · [ Am ))| = n |A1 [ · · · [ Am |.

Since the factors pi are pairwise coprime, for any ? 6= S ✓ Nm and


Q
b 2 Nn we have b 2 AS if and only if i2S pi divides b. Therefore,
n
|AS | = Q .
i2S pi

By Inclusion/Exclusion, it follows that


X Y 1
|A1 [ · · · [ Am | = n ( 1)|S| 1
.
?6=S✓Nm i2S
pi

Therefore
X Y 1
'(n) = n n ( 1)|S| 1

?6=S✓Nm i2S
pi
X Y 1 Ym ✓ ◆
|S| 1
= n ( 1) = n 1 .
S✓Nm i2S
pi i=1
pi

2.6 Exercises.

Exercise 2.1. Calculate the following coefficients.


(a) [x8 ](1 x) 7 .
(b) [x10 ]x6 (1 2x) 5 .
(c) [x8 ](x3 + 5x4 )(1 + 3x)6 .
(d) [x9 ]((1 4x)5 + (1 3x) 2 ).
(e) [xn ](1 2tx) k .
(f) [xn+1 ]xk (1 4x) 2k .
(g) [xn ]xk (1 x2 ) m .
(h) [xn ]((1 x2 ) k + (1 7x3) k ).
68 Generating Series. Section 2.6

Exercise 2.2. In each case, find an instance of a Binomial Series that


begins as shown.
(a) 1 2x + 3x2 4x3 + 5x4 6x5 + · · · .
(b) 1 + 3x + 6x2 + 10x3 + 15x4 + 21x5 + · · · .
(c) 1 x3 + x6 x9 + x12 x15 + · · · .
(d) 1 + 2x2 + 4x4 + 8x6 + 16x8 + 32x10 + · · · .
(e) 1 4x2 + 12x4 32x6 + 80x8 192x10 + · · · .
(f) 1 + 6x + 24x2 + 80x3 + 240x4 + 672x5 + · · · .

Exercise 2.3. Give algebraic proofs of these identities from Exercise


1.7.
P
(a) For all n 2 N, nk=0 nk k = n 2n 1 .
P
(b) For all n 2 N, nk=0 nk k(k 1) = n(n 1)2n 2 .

Exercise 2.4. Calculate [xn ](1 + x) 2 (1 2x) 2 . Give the simplest ex-
pression you can find.

Exercise 2.5.
(a) Let a 1 be an integer. For each n 2 N, extract the coefficient of
xn from both sides of this power series identity:

(1 + x)a 1
2 a
=
(1 x ) (1 x)a

to show that
✓ ◆ bn/2c ✓
X ◆✓ ◆
n+a 1 a k+a 1
= .
a 1 k=0
n 2k a 1

(b) Can you think of a combinatorial proof?

Exercise 2.6. Prove the Infinite Sum Lemma 2.11.


Section 2.6 Generating Series. 69

Exercise 2.7. Extend the Product Lemma 2.12 to the product of finitely
many sets with weight functions.

Exercise 2.8. Show that for m, n, k 2 N,


k
X ✓ ◆✓ ◆ ✓ ◆
j n+j 1 m m n
( 1) = .
j=0
j k j k

Exercise 2.9.
(a) Make a list of all the four-letter “words” that can be formed from
the “alphabet” {a, b}. Define the weight of a word to be the
number of occurrences of ab in it. Determine how many words
there are of weight 0, 1 and 2. Determine the generating series.
(b) Do the same for five-letter words over the same alphabet, but,
preferably, without listing all the words separately.
(c) Do the same for six-letter words.

Exercise 2.10.
(a) Consider throwing two six–sided dice, one red and one green.
The weight of a throw is the total number of pips showing on
the top faces of both dice (that is, the usual score). Make a table
showing the number of throws of each weight, and write down
the generating series.
(b) Do the same as for part (a), but throwing three dice: one red, one
green, and one white.

Exercise 2.11. Construct a table, as in Exercise 2.10(a), if the weight of


a throw is defined to be the absolute value of the difference between
the numberof pips showing on the two dice. Also, write down the
generating series.
70 Generating Series. Section 2.6

Exercise 2.12. Let S be the set of ordered pairs (a, b) of integers with
0  |b|  a. Each part gives a function ! defined on the set S. De-
termine whether or not ! is a weight function on the set S. If it is
not, then explain why not. If it is a weight function, then determine
the generating series S (x) of S with respect to !, and write it as a
polynomial or a quotient of polynomials.
(a) For (a, b) in S, let !((a, b)) = a.
(b) For (a, b) in S, let !((a, b)) = a + b.
(c) For (a, b) in S, let !((a, b)) = 2a + b.

Exercise 2.13. Let S = {1, 2, 3, 4, 5, 6}4 be the set of outcomes when


rolling four six-sided dice. For (a, b, c, d) 2 S, define its weight to be
!(a, b, c, d) = a + b + c + d. Consider the generating series S (x) of S
with respect to !.
✓ ◆4
x x7
(a) Explain why S (x) = .
1 x
(b) How many outcomes in S have weight 19?
(c) Let m, d, k be positive integers. When rolling m dice, each of
which has exactly d sides (numbered with 1, 2, ..., d pips, respec-
tively), how many different ways are there to roll a total of k
pips on the top faces of the dice? (Part (b) is the case m = 4,
d = 6, k = 19.)

Exercise 2.14. Let A be a set with weight function ! : A ! N. Show


that for any n 2 N, the number of ↵ 2 A with !(↵)  n is

1
[xn ] A (x).
1 x
Section 2.6 Generating Series. 71

Exercise 2.15. For each of the following sets of compositions, obtain


a rational function formula for the generating series of that set (with
respect to size).
(a) Let A be the set of compositions of length congruent to 1
(modulo 3).
(b) Let B be the set of compositions of length congruent to 2
(modulo 3).
(c) Let C be the set of compositions of even length, with each part
being at most 3.
(d) Let D be the set of compositions of odd length, with each part
being at least 2.
(e) Let E be the set of compositions = (c1 , c2 , ..., ck ) of any length,
in which each part ci is congruent to i (modulo 2). So c1 is odd, c2
is even, c3 is odd, and so on. (Note that the empty composition
" = () is in the set E.)

Exercise 2.16.
(a) Let An be the set of all compositions of size n in which every
part is at most three. Obtain a formula for the generating series
P1
n=0 |An | x .
n

(b) Let Bn be the set of all compositions of size n in which every


part is a positive integer that is not divisible by three. Obtain a
P
formula for the generating series 1 n=0 |Bn | x .
n

(c) Deduce that for all n 3, |Bn | = |An | |An 3 |.


(d)* Can you find a combinatorial proof of part (c)?

Exercise 2.17. Find bijections to explain the equalities in Example


2.21.
Exercise 2.18. Prove that the constructions in the proof of Proposition
2.23 define mutually inverse bijections U ⌦ C r {"}.

Exercise 2.19. For each part, determine the generating series for the
number of subsets S of {1, 2, ..., n} subject to the stated restriction.
(a) Consecutive elements of S differ by at most 2.
(b) Consecutive elements of S differ by at least 3.
(c) Consecutive elements of S differ by at most 3.
(d) Consecutive elements of S differ by a number congruent to 1
(modulo 3).
(e) Consecutive elements of S differ by a number congruent to 2
(modulo 3).
(f) If S = {a1 < a2 < · · · < ak } then ai ⌘ i (mod 2) for all 1  i  k.
(g) Fix integers 1  g < h. If S = {a1 < a2 < · · · < ak } then
g  ai ai 1  h for 2  i  k.

Exercise 2.20. Fix n, k 2 N. Let R(n, k) be the number of k-element


subsets S = {a1 < a2 < · · · < ak } of {1, 2, ..., n} such that ai ai 1 i
for all 2  i  k. Show that
✓ ◆
n k(k 1)/2
R(n, k) = .
k

Exercise 2.21. Let p(n) be a polynomial function of n.


(a) Prove, by induction on d = deg(p), that p(n) can be written as a
linear combination of n+j
j
for j = 0, 1, 2, ..., d.
(b) Briefly explain why there is a polynomial Ap (x) such that
1
X Ap (x)
p(n)xn = .
n=0
(1 x)d+1

(c) What can you say about the degree of Ap (x)? What can you say
about the value of Ap (1)?
Chapter 3

Binary Strings.

This chapter presents a wide variety of examples to which the generating


series technique of Chapter 2 applies. In Chapter 4 we will see how to
use these generating series to answer the counting problems which arise.
Similar calculations which provide even more information are presented in
Chapter 11.

Definition 3.1. A binary string is a finite sequence = b1 b2 · · · bn in


which each bit bi is either 0 or 1. The number of bits is the length of
the string, denoted `( ) = n. Thus, a binary string of length n is an
element of the Cartesian power {0, 1}n . A binary string of arbitrary
S
length is an element of the set {0, 1}⇤ = 1 n=0 {0, 1} . There is exactly
n

one binary string " = () of length zero, the empty string with no bits.

Clearly, there are 2n binary strings of length n, so that the generating


series for binary strings with respect to length is

X 1
X 1
{0,1}⇤ (x) = x`( )
= 2 n xn = .
1 2x
2{0,1}⇤ n=0

We will see how to describe various subsets of binary strings in a way which
allows us to determine their generating series (with respect to length).

73
74 Binary Strings. Section 3.1

3.1 Regular Expressions and Rational Languages.

Definition 3.2 (Regular Expressions.). A regular expression is defined


recursively, as follows.
• All of ", 0, and 1 are regular expressions.
• If R and S are regular expressions, then so is R ` S.
• If R and S are regular expressions, then so is RS.
For any finite k 2 N we also use Rk for the k-fold concatenation
of R: that is R2 = RR and R3 = RRR, and so on.
• If R is a regular expression, then so is R⇤ .

For example, (" ` 0⇤ 00) (1⇤ 0⇤ 00)⇤ 1⇤ is a regular expression. These regular
expressions are just formal syntactic constructions with no intrinsic mean-
ing. However, we will interpret them in two different ways.
• A regular expression R will produce a subset R ✓ {0, 1}⇤ . Such a subset
is called a rational language. (See Definition 3.5.)
• A regular expression R will lead to a rational function R(x). (See Defi-
nition 3.11.)
In general, the rational function R(x) is quite meaningless. However, under
favourable conditions on the expression R, it turns out that R(x) = R (x) is
the generating series of the set R with respect to length. Then the machinery
of Chapter 4 can be applied.

Definition 3.3 (Concatenation Product). Let ↵, 2 {0, 1}⇤ be binary


strings – so ↵ = a1 a2 · · · am and = b1 b2 · · · bn . The concatenation of ↵
and is the string

↵ = a1 a2 · · · am b 1 b 2 · · · b n .

Let A, B ✓ {0, 1}⇤ be sets of binary strings. The concatenation product


AB is the set
AB = {↵ : ↵ 2 A and 2 B}.
Section 3.1 Binary Strings. 75

Example 3.4. Consider the sets A = {011, 01} and B = {101, 1101}.
There are four ways to concatenate a string in A followed by a string
in B:
011.101, 011.1101, 01.101, 01.1101.
Here, the dot . indicates the point at which the concatenation takes place.
However, this information is not recorded when passing from ↵ 2 A
and 2 B to their concatenation ↵ . Thus the concatenation product
AB consists of the strings

011101, 0111101, 01101, 011101.

The string 011101 is produced twice. The concatenation product AB has


only three elements:

AB = {011101, 0111101, 01101}.

Definition 3.5 (Rational Languages.). A rational language is a set R ✓


{0, 1}⇤ of binary strings that is produced by a regular expression; this
is defined recursively as follows.
• To begin with, " produces {"} and 0 produces {0} and
1 produces {1}.
• If R produces R and S produces S, then R ` S produces R [ S.
• If R produces R and S produces S, then RS produces RS.
S
• If R produces R then R⇤ produces R⇤ = 1 k=0 R .
k

Here Rk is the concatenation product of k copies of R.

It is important to note that a rational language can be produced by many


different regular expressions, as we will see.
In Definition 3.5, when R produces R and S produces S, it might happen
that R [ S is not a disjoint union of sets. Also, the concatenation product RS
is not the same as the Cartesian product R ⇥ S, as Example 3.4 shows. These
facts lead to complications which are addressed in Section 3.2.
76 Binary Strings. Section 3.2

Example 3.6. Here are some easy examples.


• The regular expression 1⇤ produces the rational language

{1}⇤ = {", 1, 11, 111, 1111, ....}

of all finite strings of 1s (including the empty string ").


• The regular expression (1 ` 11)⇤ also produces the set {1}⇤ .
• The regular expression 1(11)⇤ produces the rational language of all
strings of 1s of odd (positive) length:

{1, 111, 11111, ...}.

• The regular expression (0 ` 1)⇤ produces the rational language


{0, 1}⇤ of all binary strings.
• The regular expression 1⇤ (01⇤ )⇤ also produces the rational language
{0, 1}⇤ of all binary strings.

Example 3.7. Not every set of binary strings is a rational language.


• The regular expression (01)⇤ produces the rational language

{", 01, 0101, 010101, 01010101, ...}.

For every even natural number 2j there is exactly one string of


length 2j in this set.
• The set
{", 01, 0011, 000111, 00001111, ...}
is not a rational language, even though for every even natural
number 2j there is exactly one string of length 2j in this set.

The problem with Example 3.7 is that to describe the second set we would
S
need an expression like 1 j=0 0 1 . However, an infinite union like this is
j j

not allowed according to Definition 3.2. The underlying difficulty is that a


regular expression has a “finite memory” and cannot remember arbitrarily
large numbers, like the j 2 N needed in the above expression. There is a
close connection between rational languages and finite state machines, and
this is a central topic in the theory of computation.
Section 3.2 Binary Strings. 77

3.2 Unambiguous Expressions.

Definition 3.8 (Unambiguous Expression). Let R be a regular expres-


sion that produces a rational language R. Then R is unambiguous if
every string in R is produced exactly once by R. If an expression is
not unambiguous then it is ambiguous.

As is usual with regular expressions, whether or not it is unambiguous


can be decided recursively.

Lemma 3.9 (Unambiguous Expression). Let R and S be unambiguous


expressions producing the sets R and S, respectively.
• The expressions " and 0 and 1 are unambiguous.
• The expression R ` S is unambiguous if and only if R \ S = ?, so that
R [ S is a disjoint union of sets.
• The expression RS is unambiguous if and only if there is a bijection
RS ⌦ R⇥S between the concatenation product RS and the Cartesian
product R⇥S. In other words, for every string ↵ 2 RS there is exactly
one way to write ↵ = ⇢ with ⇢ 2 R and 2 S.
• The expression R⇤ is unambiguous if and only if each of the concatena-
S
tion products Rk is unambiguous and the union 1 k=0 R is a disjoint
k

union of sets.

Proof. Exercise 3.1.

Example 3.10. Here are some easy examples.


• The expression 1⇤ is unambiguous.
• The expression (1 ` 11)⇤ is ambiguous: 1.11 = 11.1 = 1.1.1.
• The expression (0 ` 1)⇤ is unambiguous. This expression produces
each string in {0, 1}⇤ one bit at a time, so each string is produced in
exactly one way.
• The expression 1⇤ (01⇤ )⇤ is also unambiguous. First, the expression
01⇤ is unambiguous, since it produces a 0 followed by a (possibly
empty) string of 1s – each such string is produced exactly once.
78 Binary Strings. Section 3.2

Now (01⇤ )k is unambiguous for any k 2 N, since any string it pro-


duces will begin with a 0, it will have k bits equal to 0, and the
strings of 1s following these 0s can only be constructed in one way.
Next, (01⇤ )⇤ is unambiguous since for each k 2 N, the strings pro-
duced by (01⇤ )k have exactly k bits equal to 0 (so the union of sets
corresponding to the outer ⇤ is a disjoint union). Finally, 1⇤ (01⇤ )⇤
is unambiguous since for any string it produces, the length of the
initial string of 1s is also determined uniquely.

3.2.1 Translation into generating series.

We translate regular expressions into rational functions as follows.

Definition 3.11. A regular expression leads to a rational function; this


is defined recursively, as follows. Assume that R and S are regular
expressions that lead to R(x) and S(x), respectively.
• To begin with, " leads to 1 and 0 leads to x and 1 leads to x.
• The expression R ` S leads to R(x) + S(x).
• The expression RS leads to R(x) · S(x).
• The expression R⇤ leads to 1/(1 R(x)).

It is easy to see (again recursively!) that if R leads to R(x), then R(x) is a


rational function.

Example 3.12. Here are some easy examples.


• The unambiguous expression 1⇤ leads to 1/(1 x).
• The ambiguous expression (1 ` 11)⇤ leads to 1/(1 (x + x2 )).
But this expression produces the same rational language as 1⇤ .
• The unambiguous expression (0 ` 1)⇤ leads to

1 1
= .
1 (x + x) 1 2x

• The unambiguous expression 1⇤ (01⇤ )⇤ leads to

1 1 1
· = .
1 x 1 x · 1/(1 x) 1 2x
Section 3.2 Binary Strings. 79

Theorem 3.13. Let R be a regular expression producing the rational lan-


guage R and leading to the rational function R(x). If R is an unambiguous
expression for R then R(x) = R (x), the generating series for R with respect
to length.

Sketch of proof. The proof of this is, as usual, recursive. Or, one could say it
goes by induction on the complexity of the expression R, and uses the fact
that R is unambiguous. Certainly, each of ", 0, and 1 are unambiguous and
lead to the correct generating series for the sets {"}, {0}, and {1}, respec-
tively. The induction step follows from Lemma 3.9 and the Sum, Product,
and String Lemmas of Subsection 2.2.2, because each of the operations is
unambiguous.

Example 3.14. If the regular expression R producing R is ambiguous,


then the rational function R(x) is in general meaningless.
• For example, consider the regular expression (" ` 1)⇤ , which pro-
duces the rational language {1}⇤ . It is an ambiguous expression.
In fact, it produces every string of 1s in infinitely many ways. The
generating series for the set {1}⇤ is 1/(1 x). The expression (" ` 1)⇤
leads to 1/(1 (1 + x)) = x 1 . If this were a generating series it
would say that there are exactly 1 objects of size 1, and nothing
else. This makes no sense.
• Similarly, the ambiguous expression (1 ` 11)⇤ also produces the set
{1}⇤ . However, the expression (1 ` 11)⇤ leads to 1/(1 x x2 ),
which is also incorrect.

3.2.2 Block decompositions.

Definition 3.15 (Blocks of a string). Let = b1 b2 b3 · · · bn be a binary


string of length n. A block of is a nonempty maximal subsequence
of consecutive equal bits. To be clearer, a block is a nonempty subse-
quence bi bi+1 · · · bj of consecutive bits all of which are the same (all are
0, or all are 1), and which cannot be made longer. So, either i = 1 or
bi 1 6= bi , and either j = n or bj+1 6= bj .
80 Binary Strings. Section 3.2

Example 3.16. The blocks of the string 11000101110010001111001011 are


separated by dots here:

11.000.1.0.111.00.1.000.1111.00.1.0.11

Proposition 3.17 (Block Decompositions.). The regular expressions

0⇤ (1⇤ 10⇤ 0)⇤ 1⇤ and 1⇤ (0⇤ 01⇤ 1)⇤ 0⇤

are unambiguous expressions for the set {0, 1}⇤ of all binary strings.
They produce each binary string block by block.

Sketch of proof. By symmetry it is enough to consider the first expression.


The middle part 1⇤ 10⇤ 0 produces a block of 1s followed by a block of 0s.
This concatenation is unambiguous. The repetition of this (1⇤ 10⇤ 0)⇤ is also
unambiguous, since each pass through the repetition starts with a 1 and
ends with a 0. (Try it out on Example 3.16.) But the string we want to build
might start with a block of 0s: the initial 0⇤ allows this but does not require
it, since 0⇤ = " ` 0⇤ 0. The final 1⇤ similarly allows the string to end with a
block of 1s, but does not require it. All the operations are unambiguous, so
the whole expression is unambiguous.

Example 3.18. The block decomposition 0⇤ (1⇤ 10⇤ 0)⇤ 1⇤ is unambiguous,


and produces {0, 1}⇤ . It had better lead to the right generating series!
After a bit of calculation, we see that it leads to
1 1 1 1
· · = ,
1 x 1 x2 /(1 x)2 1 x 1 2x

which is good.

Example 3.19. Let G be the set of binary strings in which every block
of 1s has odd length. What is the generating series for G with respect
to length? We will modify the block decomposition 0⇤ (1⇤ 10⇤ 0)⇤ 1⇤ for all
binary strings. The expression 1⇤ 1 in the middle produces a block of 1s.
The expression 1⇤ = " ` 1⇤ 1 produces either the empty string or a block
Section 3.2 Binary Strings. 81

of 1s. If we want a block of 1s of odd length, then that is produced by


(11)⇤ 1. So the expression

G = 0⇤ ((11)⇤ 1 0⇤ 0)⇤ (" ` (11)⇤ 1)

is a block decomposition for the set G in question. It is therefore an un-


ambiguous expression that produces G. This expression leads to
✓ ◆
1 1 x
G(x) = · · 1+
1 x 1 (x/(1 x2 ))(x/(1 x)) 1 x2
1 + x x2 1 + x x2
= = .
(1 x)(1 x2 ) x2 1 x 2x2 + x3

By Theorem 3.13, G(x) = G (x) is the generating series of the set G. It


follows from Theorem 4.8 that the number gn of strings of length n in G
satisfies the linear recurrence relation with initial conditions given by
8
>
> 1 if n = 0,
>
<
1 if n = 1,
gn gn 1 2gn 2 + gn 3 =
> 1 if n = 2,
>
>
: 0 if n 3,

(with the convention that gn = 0 for all n < 0). This gives the initial
conditions g0 = 1, g1 = 2, g2 = 3 (which can be checked directly), and the
recurrence gn = gn 1 + 2gn 2 gn 3 for all n 3. It is easy to calculate
the first several of these numbers.

n 0 1 2 3 4 5 6 7 8
.
gn 1 2 3 6 10 19 33 61 108

To get an exact formula for gn we would need to factor the denominator


1 x 2x2 + x3 to find its inverse roots. They turn out to be slightly
horrible complex numbers, so we won’t do it.
82 Binary Strings. Section 3.2

Example 3.20. Let H be the set of binary strings in which each block of
0s has length one. It is not hard to see that (" ` 0) (1⇤ 1 0)⇤ 1⇤ is a block
decomposition for H, and is therefore unambiguous. This expression
leads to the formula
1
X 1 1 1+x
H(x) = hn xn = (1 + x) · · = .
n=0
1 x2 /(1 x) 1 x 1 x x2

This resembles Examples 2.21 and 2.24 yet again! These are the Fibonacci
P
numbers – see Example 4.1. With 1/(1 x x2 ) = 1 n=0 fn x , we see
n

that for n 1,
hn = [xn ]H(x) = fn + fn 1 = fn+1 .
Let’s reality check this for n = 5. The strings in H5 are 11111, five strings
with one 0, six strings with two 0s, and 01010. And indeed, f6 = 13.
Neat!

3.2.3 Prefix decompositions.

Example 3.21. Consider the regular expression (0⇤ 1)⇤ 0⇤ . We claim that
this is unambiguous. To see this, let = b1 b2 · · · bn be a binary string
produced by this expression. As an example, take

00111010110001011000.

How can this string be produced? The repetition (0⇤ 1)⇤ produces a bi-
nary string by chopping it into pieces after each occurrence of the bit 1.
But any string produced by this expression is either empty or ends with
a 1. The final “suffix” 0⇤ in the expression allows the possibility that the
string might end with some 0s. The string above is produced as

001.1.1.01.01.1.0001.01.1.000

and this is the only way it is produced. This rule – “chop the string into
pieces after each occurrence of the bit 1” – gives a unique way to produce
each binary string from the expression (0⇤ 1)⇤ 0⇤ . It follows that (0⇤ 1)⇤ 0⇤ is
an unambiguous expression for the set {0, 1}⇤ of all binary strings. And,
Section 3.3 Binary Strings. 83

indeed, (0⇤ 1)⇤ 0⇤ leads to


1 1 1
· = .
1 x/(1 x) 1 x 1 2x

In general, a prefix decomposition for a set of binary strings is a regular


expression of the form A⇤ B. The idea is to chop each string that is produced
into initial segments produced by the expression A. There might be a ter-
minal segment produced by the expression B. (It is possible that B = ".)
One does have to argue that such expressions are unambiguous, but – as in
Example 3.21 – this can usually be done by checking that:
• there is at most one way for a binary string to begin with an initial
segment produced by A, and
• if the string does not begin with an initial segment produced by A then
it is produced by B.
Both of the expressions A and B must be unambiguous, too, of course.
Similarly, a postfix decomposition has the form A(B⇤ ).

3.3 Recursive Decompositions.

Recursive decompositions are more general than regular expressions, and


can produce sets of binary strings that are more general than rational lan-
guages. The added feature is that in addition to the elementary building
blocks “"” and “0” and “1” in a regular expression, we are also allowed to
use letters that stand for sub-expressions on both sides of an equation.

Example 3.22. A simple example is the expression S = " ` (0 ` 1)S. Since


S appears on both the LHS and the RHS, in some sense this “defines S
in terms of itself”. This is the key idea behind recursive expressions.
One reads this expression as saying that a string produced by S is either
empty, or it consists of a single bit (either 0 or 1) followed by another
(shorter) string produced by S. By induction on the length of the string,
one sees that every binary string is produced exactly once in this way.
Translating the above expression into rational functions by analogy
84 Binary Strings. Section 3.3

with Definition 3.11 leads to an equation for S(x):

S(x) = 1 + (x + x)S(x).

This is easily solved to yield S(x) = 1/(1 2x).

Example 3.23. As in Example 3.7, consider the set of strings

B = {", 01, 0011, 000111, ...}.

This is not a rational language, but it can be described by the unam-


biguous recursive expression B = " ` 0 B 1. This leads to the equation
B(x) = 1 + x2 B(x), and thence to B(x) = 1/(1 x2 ).

Subsection 4.4.2 presents a more substantial example of a recursive decom-


position which produces a set of binary strings that is not a rational lan-
guage. The generating series for that example is not even a rational func-
tion.

3.3.1 Excluded substrings.

Let  2 {0, 1}⇤ be a nonempty binary string. We say that 2 {0, 1}⇤ contains
 if there are (possibly empty) binary strings ↵, such that = ↵  . If
does not contain  then avoids or excludes . Let A ⇢ {0, 1}⇤ be set of
strings that avoid . We will develop a general method for calculating the
generating series A (x).

Example 3.24. As an easy first example, consider the case  = 01011. Let
A be the set of strings avoiding 01011, and let B be the set of strings that
have exactly one occurrence of 01011, at the very end (that is, as a suffix).
Consider the strings in A [ B. Such a string is either empty, or it ends
with either a 0 or a 1. If this string is not empty, then removing the last
bit leaves a (possibly empty) string in A (because of the way the sets A
and B are defined). This translates into the relation

A ` B = " ` A (0 ` 1)
Section 3.3 Binary Strings. 85

for expressions A and B producing A and B, respectively. This leads to


the equation
A(x) + B(x) = 1 + 2x A(x)
for the generating series.
We need another equation in order to determine A(x) and B(x). If
= ↵ 01011 is an arbitrary string in B, then ↵ is in A, again from the
way the sets A and B are defined. We claim that the converse also holds:
if ↵ 2 A then = ↵ 01011 is in B. To see this we need to show that the
only occurrence of 01011 as a substring of is the one at the very end.
We know that ↵ 2 A does not contain 01011 as a substring. So if there
is another occurrence of 01011 inside then it must “overlap” the final
01011 in at least one position (but not in all positions). For this particular
excluded substring this is not possible, as the following table shows:

. . . . 0 1 0 1 1
0 1 0 1 1 . . . .
. 0 1 0 1 1 . . .
. . 0 1 0 1 1 . .
. . . 0 1 0 1 1 .

In each row after the first, there is at least one position at which the
shifted 01011 disagrees with the substring in the first row. So 01011 can-
not overlap itself in a nontrivial way. This gives the relation B = A 01011,
yielding the equation B(x) = x5 A(x). Substituting this into the first
equation gives 1 + 2x A(x) = (1 + x5 ) A(x), which is easily solved. We
conclude that A(x) = 1/(1 2x + x5 ).

Example 3.25. As a slightly trickier example, consider the case  = 01101.


Again, let A be the set of strings avoiding 01101, and let B be the set of
strings that have exactly one occurrence of 01101, at the very end (that
is, as a suffix). As in the previous example, a string in A [ B is either
empty, or it ends with either a 0 or a 1. The reasoning of the previous
example also holds in this case, translating into the same relation

A ` B = " ` A (0 ` 1)

and the same equation A(x) + B(x) = 1 + 2x A(x) for the generating
86 Binary Strings. Section 3.3

series.
The second equation is the slightly trickier part, because the string
01101 can overlap itself in a nontrivial way. As in the previous exam-
ple, we still have the set inclusion B ✓ A 01101. But the reverse inclu-
sion does not hold in this case: for example 011.01101 = 01101.101 is in
A 01101 but not in B, because it contains a substring 01101 that is not at
the very end.
. . . . 0 1 1 0 1
0 1 1 0 1 . . . .
. 0 1 1 0 1 . . .
. . 0 1 1 0 1 . .
. . . 0 1 1 0 1 .
Looking at all the possible ways that 01101 can overlap itself, we see that
in this case A 01101 = B [ B 101. This gives A 01101 = B (" ` 101) and
x5 A(x) = (1 + x3 )B(x). Substituting B(x) = x5 A(x)/(1 + x3 ) into the first
equation and solving for A(x) yields

1 + 3x + 2x2
A(x) = .
1 2x 2x2 + x3 + x5
The proof of the general result follows exactly the same pattern.

Theorem 3.26. Let  2 {0, 1}⇤ be a nonempty string of length n, and let
A = A be the set of binary strings that avoid . Let C be the set of all
nonempty suffixes of  such that  = ⌘  for some nonempty prefix ⌘ of
P
. Let C(x) = 2C x
`( )
. Then

1 + C(x)
A(x) = .
(1 2x)(1 + C(x)) + xn

Proof. Let B be the set of strings that contain exactly one occurrence of , at
the very end. Any string in A [ B is either empty or in the set A{0, 1}, so
that
A [ B = {"} [ A {0, 1}.
This yields A(x) + B(x) = 1 + 2x A(x).
The key observation is that if 2 B and ⇢ is a proper prefix of then
⇢ 2 A. Consequently, B ✓ A . Conversely, consider any = ↵  in A .
Section 3.4 Binary Strings. 87

Maybe 2 B. If 62 B, then there is an “early” occurrence of  in ↵  that


is not the occurrence at the very end. All such early occurrences of  must
overlap the final occurrence of  nontrivially because ↵ 2 A. Consider the
first such early occurrence of  in = ↵ . Then there is a nonempty suffix
2 C of , and a nonempty prefix ⌘ of , and a (possibly empty) prefix ⇢ of
↵ such that
= ↵ = ⇢⌘ = ⇢ .
Since we are looking at the first early occurrence of  in , the substring ⇢ 
is in B. Moreover, this argument shows that these substrings ⇢, ⌘, and are
determined uniquely from . This shows that every 2 A  is either in B,
or is in B for exactly one 2 C, and this decomposition is unique. That is,
[
A = B [ B
2C

is a disjoint union of sets. Translating this into generating series yields

xn A(x) = (1 + C(x)) B(x).

Solving this for B(x) and substituting this into the first equation gives
✓ ◆
xn
1 + 2x A(x) = A(x) · 1 + .
1 + C(x)

Solving this for A(x) yields the result.

In Section 12.1 we use a completely different method to solve a vast gen-


eralization of this excluded substring problem

3.4 Exercises.

Exercise 3.1. Prove Lemma 3.9.

Exercise 3.2. Let A = (10 ` 101) and B = (001 ` 100 ` 1001). For each
of AB and BA, is the expression unambiguous? What is the generating
series (by length) of the set it produces?
88 Binary Strings. Section 3.4

Exercise 3.3. Let A = (00 ` 101 ` 11) and B = (00 ` 001 ` 10 ` 110).
Prove that A⇤ is unambiguous, and that B⇤ is ambiguous. Find the
generating series (by length) for the set A⇤ produced by A⇤ .

Exercise 3.4. For each of the following sets of binary strings, write an
unambiguous expression which produces that set.
(a) Binary strings that have no block of 0s of size 3, and no block of
1s of size 2.
(b) Binary strings that have no substring of 0s of length 3, and no
substring of 1s of length 2.
(c) Binary strings in which the substring 011 does not occur.
(d) Binary strings in which the blocks of 0s have even length and
the blocks of 1s have odd length.

Exercise 3.5. Let G = 0⇤ ((11)⇤ 1(00)⇤ 00 ` (11)⇤ 11(00)⇤ 0)⇤ , and let G be
the set of binary strings produced by G.
(a) Give a verbal description of the strings in the set G.
(b) Find the generating series (by length) of G.
(c) For n 2 N, let gn be the number of strings in G of length n. Give
a recurrence relation and enough initial conditions to uniquely
determine gn for all n 2 N.

Exercise 3.6.
(a) Show that the generating series (by length) for binary strings in
which every block of 0s has length at least 2 and every block of
1s has length at least 3 is

(1 x + x3 )(1 x + x2 )
.
1 2x + x2 x5
(b) Give a recurrence relation and enough initial conditions to
determine the coefficients of this power series.
Section 3.4 Binary Strings. 89

Exercise 3.7.
(a) For n 2 N, let hn be the number of binary strings of length n
such that each even-length block of 0s is followed by a block of
exactly one 1 and each odd-length block of 0s is followed by a
block of exactly two 1s. Show that

1+x
hn = [xn ] .
1 x2 2x3
(b) Give a recurrence relation and enough initial conditions to
determine hn for all n 2 N.

Exercise 3.8. Let K be the set of binary strings in which any block of
1s which immediately follows a block of 0s must have length at least
as great as the length of that block of 0s. (Note: this is not a rational
language.)
P
(a) Derive a formula for K(x) = ↵2K x`(↵) .
(b) Give a recurrence relation and enough initial conditions to
determine the coefficients [xn ]K(x) for all n 2 N.

Exercise 3.9.
(a) Fix an integer m 1. Find the generating series (by length) of
the set of binary strings in which no block has length greater
than m.
(b) Fix integers m, k 1. Find the generating series (by length)
of the set of binary strings in which no block of 0s has length
greater than m, and no block of 1s has length greater than k.

Exercise 3.10. Let L be the set of binary strings in which each block of
1s has odd length, and which do not contain the substring 0001. Let
P
Ln be the set of strings in L of length n, and let L(x) = 1 n=0 |Ln |x .
n

(a) Give an expression that produces the set L unambiguously, and


explain briefly why it is unambiguous and produces L.
90 Binary Strings. Section 3.4

(b) Use your expression from part (a) to show that

1 + x x2
L(x) = .
1 x 2x2 + x3 + x4

Exercise 3.11. Let M be the set of binary strings in which each block
of 0s has length at most two, and which do not contain 00111 as a
substring. Let Mn be the set of strings in M of length n, and let M (x) =
P1
n=0 |Mn |x .
n

(a) Give an expression that produces the set M unambiguously, and


explain briefly why it is unambiguous and produces M.
(b) Use your expression from part (a) to show that

1 + x + x2
M (x) = .
1 x x2 x3 + x5

Exercise 3.12. Let N be the set of binary strings in which each block
of 0s has odd length, and each block of 1s has length one or two. Let
P
Nn be the set of strings in N of length n, and let N (x) = 1n=0 |Nn |x .
n

(a) Show that

1 + 2x + x2 x4 3 + x 3x2
N (x) = = 2+x+ .
1 2x2 x3 1 2x2 x3
(b) Derive an exact formula for |Nn | as a function of n.

Exercise 3.13. For n 2 N, let pn be the number of binary strings of


length n in which every block of 0s is followed by a block of 1s with
the same parity of length.
P
(a) Determine the generating series P (x) = 1 n=0 pn x .
n

(b) Show that if n 2, then pn = 2 · 3bn/2c 1 .


Exercise 3.14.
(a) Let Q be the set of binary strings that do not contain 11000 as a
substring. For n 2 N, let Qn be the set of strings in Q of length n.
P
Obtain a formula for the generating series Q(x) = 1 n=0 |Qn |x ,
n

with a brief explanation.


(b) Let R be the set of compositions, of any length, in which each
part is at most 4. For n 2 N, let Rn be the set of compositions in
R of size n. Obtain a formula for the generating series R(x) =
P1
n=0 |Rn |x , with a brief explanation.
n

(c) Deduce that for all integers n 1, |Rn | = |Qn | |Qn 1 |.


(d)* Part (c) implies that for every integer n 1, there is a bijection
Qn ⌦ Rn [ Qn 1 . Can you determine such a bijection precisely?

Exercise 3.15. Let V be the set of binary strings that do not contain
0110 as a substring. Show that the generating series (by length) for V
is
1 + x3
V (x) = V (x) = .
1 2x + x3 x4

Exercise 3.16.
(a) Let W be the set of binary strings that do not contain 0101 as a
substring. Obtain a formula for the generating series (by length)
of W.
(b) Fix a positive integer r 1 and consider the binary string (01)r .
(Part (a) is the case r = 2.) Obtain a formula for the generating
series of the set of binary strings that do not contain (01)r .

Exercise 3.17. Let S = A⇤ B be an unambiguous prefix decomposition


producing some set of strings S ✓ {0, 1}⇤ . Show that the recursion
R = B ` A R defines an expression R that produces the same set of
strings S ✓ {0, 1}⇤ . Also check that both S and R lead to the rational
function B(x)/(1 A(x)).
Chapter 4

Recurrence Relations.

In Chapters 2 and 3 we saw how to encode a sequence of numbers as the


P
coefficients of a power series G(x) = 1 n=0 gn x . We used the Sum, Product,
n

and String Lemmas to obtain algebraic formulas for these generating series.
In this section we will see two techniques for using these algebraic formulas
to compute the coefficients gn , which are the numbers we really want. First
we will do the example of Fibonacci numbers in detail, and then we will
develop the theory in general.

4.1 Fibonacci Numbers.

Example 4.1. The sequence of Fibonacci numbers f = (f0 , f1 , f2 , f3 , ...) is


defined by putting f0 = 1, f1 = 1, and fn = fn 1 + fn 2 for all n 2.
One can use this information to compute fn iteratively for as long as you
want:

n 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
fn 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610

Can we obtain a formula for fn as a function of n?

The answer is “yes”, of course – we obtain such a formula in this section.


Moreover, this is just the first example of a very general technique, which is
the main subject of this chapter.

93
94 Recurrence Relations. Section 4.1

Example 4.2. We start by finding a formula for the generating series


P
F (x) = 1 n=0 fn x . From the information defining the Fibonacci num-
n

bers, we see that


1
X 1
X
n
F (x) = f0 + f1 x + fn x = 1 + x + (fn 1 + fn 2 )xn .
n=2 n=2

The next step is to write the RHS in terms of F (x).


1
X 1
X
n
F (x) = 1 + x + fn 1 x + f n 2 xn
n=2 n=2
X1 1
X
=1+x+x fi xi + x 2
f j xj
i=1 j=0

= 1 + x + x(F (x) f0 ) + x2 F (x)


= 1 + xF (x) + x2 F (x).

This equation can be solved for F (x), yielding

1
F (x) = .
1 x x2

We have seen this generating series before, relating to the sets of composi-
tions F, H, and J in Example 2.21. Obtaining a formula for Fibonacci num-
bers will thus solve the counting problem for each of these sets of composi-
tions.
The key is the denominator of the series, in this case 1 x x2 .

Example 4.3. We factor the denominator in the form

1 x x2 = (1 ↵x)(1 x)

for some complex numbers ↵ and , called the inverse roots of the poly-
nomial. To do this we can use the Quadratic Formula, but since we are
looking for the inverse roots of a polynomial we have to be careful. Sub-
stitute x = 1/t and multiply both sides by t2 to get

t2 t 1 = (t ↵)(t ).
Section 4.1 Recurrence Relations. 95

Now the inverse roots of the denominator 1 x x2 are the (ordinary)


roots of this auxiliary polynomial t2 t 1. By the Quadratic Formula,
they are p p
↵ b ± b2 4ac 1± 5
= = .
2a 2

Example 4.4. Having found the inverse roots of the denominator, the
next step is to apply the Partial Fractions Theorem 4.12, which will be
explained (and proved) in Section 4.3. In this case it implies that there
are complex numbers A and B such that

1 1 A B
= = + .
1 x x2 (1 ↵x)(1 x) 1 ↵x 1 x

There are a few different ways to find the numbers A and B, as we will
see. Here we can multiply by 1 x x2 = (1 ↵x)(1 x) and collect
like powers of x:

1 = A(1 x) + B(1 ↵x) = (A + B) (A + B↵)x.

Comparing coefficients, we see that A + B = 1 and A + B↵ = 0.


From A↵ + B↵ = ↵ and A + B↵ = 0, we see that
p
↵ 5+ 5
A= = .
↵ 10

Similarly, from A + B = and A + B↵ = 0, we see that


p
5 5
B= = .
↵ 10

Now all that remains is to put the pieces of this calculation together.

Example 4.5. We apply the geometric series expansion to the result of


the partial fractions decomposition. (More generally, we would use bi-
96 Recurrence Relations. Section 4.2

nomial series.)
X1 X1 1
X
A B n n n n
+ =A ↵ x +B x = (A↵n + B n
) xn .
1 ↵x 1 x n=0 n=0 n=0

It follows that for all n 2 N, the Fibonacci numbers are given by the
formula
1
fn = [xn ]F (x) = [xn ] = A↵n + B n
1! x x2
p p n p p !n
5+ 5 1+ 5 5 5 1 5
= + .
10 2 10 2

That seems kind of weird, since we know that the Fibonacci numbers are
p
integers. But notice that = (1 5)/2 ⇡ 0.618 so that as n ! 1, n
! 0.
p p !n
5+ 5 1+ 5
In fact, for all n 2 N, fn is the integer closest to .
10 2

4.2 Homogeneous Linear Recurrence Relations.

Definition 4.6 (Homogeneous linear recurrence relation). Let g =


(g0 , g1 , g2 , ...) be an infinite sequence of complex numbers. Let a1 , a2 ,
..., ad be in C, and let N d be an integer. We say that g satisfies a
homogeneous linear recurrence relation provided that

g n + a1 g n 1 + a2 g n 2 + · · · + ad g n d = 0

for all n N . The values g0 , g1 , ..., gN 1 are the initial conditions of the
recurrence. The relation is linear because the LHS is a linear combi-
nation of the entries of the sequence g; it is homogeneous because the
RHS of the equation is zero.

In Definition 4.6, if the RHS of the equation is instead a non-zero function


p : N ! C, then this is an inhomogeneous linear recurrence relation.
The recurrence relation can be rewritten as
gn = (a1 gn 1 + a2 g n 2 + · · · + ad g n d )
for all n N , and this can be used to compute the numbers gn by induction
Section 4.2 Recurrence Relations. 97

on n, using the initial conditions as the basis of induction.


Consider the Fibonacci numbers from Example 4.1: f0 = f1 = 1 are the
initial conditions, and fn fn 1 fn 2 = 0 for all n 2 is a homogeneous
linear recurrence relation. We derived the formula
1
X 1
F (x) = f n xn =
n=0
1 x x2

for the generating series. This is an instance of a general fact about a se-
quence with a homogeneous linear recurrence relation. Here is another ex-
ample before we see the general theory.

Example 4.7. Define a sequence g = (g0 , g1 , ...) by the initial conditions


g0 = 2, g1 = 5, and g2 = 6, and the relation gn 3gn 2 2gn 3 = 0 for all
P
n 3. Obtain a formula for the generating series G(x) = 1 n=0 gn x .
n

The general method is to multiply the recurrence by xn and sum over


all n N . In this case
1
X
(gn 3gn 2 2gn 3 ) xn = 0.
n=3

Now we split the LHS into separate summations, reindex them, and
write everything in terms of the power series G(x)
1
X 1
X 1
X
n n
gn x 3 gn 2 x 2 g n 3 xn = 0
n=3 n=3 n=3
1
X X1
(G(x) g0 g1 x g 2 x2 ) 3x2 g j xj 2x 3
g k xk = 0
j=1 k=0
2 2
(G(x) 2 5x 6x ) 3x (G(x) 2) 2x3 G(x) = 0
G(x) 3x2 G(x) 2x3 G(x) = 2 + 5x.

It follows that
2 + 5x
G(x) = .
1 3x2 2x3

Notice how the polynomial 1 3x2 2x3 in the denominator of this for-
mula is related to the linear recurrence relation gn 3gn 2 2gn 3 = 0 for
n 3. We can explain the numerator, too, if we make the convention that
98 Recurrence Relations. Section 4.2

gn = 0 for all integers n < 0. Then, using the initial conditions, we have
g0 3g 2 2g 3 =2 0 0 = 2 for n = 0,
g1 3g 1 2g 2 =5 0 0 = 5 for n = 1,
g2 3g0 2g 1 =6 3·2 0 = 0 for n = 2,
gn 3gn 2 2gn 3 = 0 for n 3.

Theorem 4.8. Let g = (g0 , g1 , g2 , ...) be a sequence of complex numbers,


P1
and let G(x) = n=0 gn x be the corresponding generating series. The
n

following are equivalent.


(a) The sequence g satisfies a homogeneous linear recurrence relation

g n + a1 g n 1 + · · · + ad g n d = 0 for all n N,

with initial conditions g0 , g1 , ..., gN 1 .


(b) The series G(x) = P (x)/Q(x) is a quotient of two polynomials.
The denominator is

Q(x) = 1 + a1 x + a2 x2 + · · · + ad xd

and the numerator is P (x) = b0 + b1 x + b2 x2 + · · · + bN 1x


N 1
,
in which
bk = gk + a1 gk 1 + · · · + ad gk d
for all 0  k  N 1, with the convention that gn = 0 for all n < 0.

Proof. To prove this theorem, we just copy the calculation in Example 4.7,
but do it in the most general case. For convenience, let a0 = 1. Assume that
part (a) holds, and let
Q(x) = 1 + a1 x + a2 x2 + · · · + ad xd .
Consider the product Q(x)G(x):
d
! 1
!
X X
Q(x)G(x) = aj x j g n xn
j=0 n=0
1
d X 1 d
!
X X X
n+j
= aj gn x = aj g k j xk .
j=0 n=0 k=0 j=0
Section 4.2 Recurrence Relations. 99

In the last step we have re-indexed the double sum using k = n + j, and
used the convention that gn = 0 for all n < 0.
The coefficient of xk in this formula is gk + a1 gk 1 + · · · + ad gk d . This
is the LHS of the recurrence relation for g applied when n = k. Thus, this
coefficient is zero for k N . On the other hand, for 0  k  N 1, we see
P
that it is dj=0 aj gk j = bk by the way the numbers bk are defined. That is,
N
X1
Q(x)G(x) = bk xk = P (x),
k=0

and it follows that G(x) = P (x)/Q(x). This shows that (a) implies (b).
Conversely, assume that (b) holds and that G(x) = P (x)/Q(x) is as given.
We essentially run the argument for the first part of the proof in reverse.
The equations bk = gk + a1 gk 1 + · · · + ad gk d for 0  k  N 1 (with the
convention that gn = 0 for n < 0) determine the initial conditions g0 , g1 ,...
gN 1 inductively. For n N , the coefficient of xn in P (x) = Q(x)G(x) is
zero. This implies that gk + a1 gk 1 + · · · + ad gk d = 0 for all k N , showing
that (b) implies (a).

Theorem 4.8 is useful in both directions, as the following two examples


show.
1
X 1 3x + 4x2
Example 4.9. Let D(x) = d n xn = 3
. Obtain a homoge-
n=0
1 2x + 3x
neous linear recurrence relation and initial conditions satisfied by the
sequence d = (d0 , d1 , d2 , ...).
From Theorem 4.8 we can read from the RHS that for all n 2 N:
8
>
> 1 if n = 0,
>
<
3 if n = 1,
dn 2dn 1 + 3dn 3 =
>
> 4 if n = 2,
>
: 0 if n 3,

with the convention that dn = 0 if n < 0. We can determine the initial


conditions inductively as follows: d0 = 1; d1 2d0 = 3, so d1 = 1;
d2 2d1 = 4, so d2 = 2. The recurrence dn 2dn 1 + 3dn 3 = 0 holds for
all n 3. Using the initial conditions d0 = 1, d1 = 1, d2 = 2, and the
100 Recurrence Relations. Section 4.2

recurrence dn = 2dn 1 3dn 3 for all n 3, we can compute dn for as


long as we want:

n 0 1 2 3 4 5 6 7 8
dn 1 1 1 1 5 4 5 5 22

Example 4.10. A sequence s = (s0 , s1 , s2 , ...) is defined by the initial con-


ditions s0 = 1, s1 = 2, s2 = 1, and the recurrence sn sn 1 2sn 3 = 0 for
P
all n 3. Obtain a formula for the generating series S(x) = 1 n=0 sn x .
n

Since we have the information at hand, we might as well compute a


few more values of sn :

n 0 1 2 3 4 5 6 7
sn 1 2 1 3 7 9 15 29

To get the generating series, Theorem 4.8 implies immediately that the
denominator is 1 x 2x3 . To obtain the numerator we apply the recur-
rence for small values of n, with the convention that sn = 0 if n < 0.
8
< 1 if n = 0,
sn sn 1 2sn 3 = 2 1 = 1 if n = 1,
:
1 2 = 1 if n = 2.

Thus, the numerator is 1 + x x2 . The generating series is

1 + x x2
S(x) = .
1 x 2x3

Example 4.11. Let’s revisit Example 2.22, concerning the set Q of all com-
positions in which each part is at least two, and the number of parts is
even. We derived the generating series
1
X 1 2x + x2
Q(x) = qn x n = .
n=0
1 2x + x2 x4
Section 4.3 Recurrence Relations. 101

From Theorem 4.8 we see immediately that


8
>
> 1 if n = 0,
>
<
2 if n = 1,
qn 2qn 1 + qn 2 qn 4 =
>
> 1 if n = 2,
>
: 0 if n 3,

with the convention that qn = 0 if n < 0. We can inductively calculate


the first several values of qn :

n 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
qn 1 0 0 0 1 2 3 4 6 10 17 28 45 72 116

We have determined that |Q14 | = 116, but we have not listed all these
compositions individually. That is pretty cool, when you think about it.

4.3 Partial Fractions.

A rational function is a quotient of two polynomials P (x)/Q(x). This is anal-


ogous to a rational number being a quotient of two integers.

Theorem 4.12 (Partial Fractions). Let G(x) = P (x)/Q(x) be a rational


function in which deg P < deg Q and the constant term of Q(x) is 1. Factor
the denominator to obtain its inverse roots:
d1 d2 ds
Q(x) = (1 1 x) (1 2 x) · · · (1 s x)

in which 1 , ..., s are distinct nonzero complex numbers and d1 +· · ·+ds =


d = deg Q. Then there are d complex numbers:
(1) (2) (d1 ) (1) (2) (d2 )
C1 , C1 , ..., C1 ; C2 , C2 , ..., C2 ; ... ; Cs(1) , Cs(2) , ..., Cs(ds )

(which are uniquely determined) such that

Xs X ds (j)
P (x) Ci
G(x) = = j
.
Q(x) i=1 j=1
(1 i x)
102 Recurrence Relations. Section 4.3

Proof. Consider the set VQ of all rational functions P (x)/Q(x) in which Q is


a fixed polynomial as in the statement of the theorem, and P is any poly-
nomial of degree strictly less than d = deg Q. It is easily seen that VQ is a
vector space over the complex numbers C, since if P (x) and R(x) both have
degree less than d and ↵ 2 C then P (x) + ↵R(x) has degree less than d. It is
also clear that the vectors

1 x x2 xd 1
, , , ...,
Q(x) Q(x) Q(x) Q(x)

span VQ as a vector space over C. Thus, the dimension of VQ is at most d.


Now we claim that for every 1  i  s and 1  j  di , the quotient
i x) is in VQ . This is because
j
1/(1
di j
Q dh
1 (1 i x) h6=i (1 h x)
j
=
(1 i x) Q(x)

and the numerator has degree d jd 1 < d.


The essential point in the proof is that the set of vectors

1
B = j
: 1  i  s and 1  j  di
(1 i x)

in VQ is linearly independent. From this we can conclude that the dimension


of VQ is at least d1 + · · · + ds = d. It then follows that dim VQ = d and that B
is a basis for VQ . Therefore, every vector in VQ can be written uniquely as a
linear combination of vectors in B. That is exactly what the Partial Fractions
Theorem is claiming.
It remains only to show that B is a linearly independent set. Consider
any linear combination of vectors in B which gives the zero vector:

s X
X ds (j)
Ci
j
= 0. (4.1)
i=1 j=1
(1 i x)

(j)
We must show that Ci = 0 for all 1  i  s and 1  j  di . Suppose not.
Then there is some index 1  p  s for which at least one of the coefficients
(1) (2) (d ) (t)
Cp , Cp , ..., Cp p is not zero. Letting Cp 6= 0 be the one with the largest
(t+1) (d )
superscript, we also have Cp = · · · = Cp p = 0.
Section 4.3 Recurrence Relations. 103

Now multiply equation (4.1) by (1 p x) . Separating out the terms with


t
(t+1) (d )
i = p and using the fact that Cp = · · · = Cp p = 0, we see that
t
X di
XX t
(j) (1 p x)
Cp(j) (1 p x)
t j
+ Ci j
= 0.
j=1 i6=p j=1
(1 i x)

The LHS is a rational function of the variable x which does not have a pole
at the point x = 1/ p , so we can substitute this value for x. But every term
on the LHS has a factor of (1 p x) except for the term with i = p and j = t.
Thus, upon making the substitution x = 1/ p this equation becomes
Cp(t) = 0.
But this contradicts our choice of p and t. This contradiction shows that all
(j)
the coefficients Ci in equation (4.1) must be zero, and it follows that the set
B is linearly independent.
Since B is a set of d linearly independent vectors in a vector space VQ
of dimension at most d, it follows that B is a basis for VQ , and the proof is
complete.

Example 4.13. Let’s re-examine the generating series

2 + 5x
G(x) =
1 3x2 2x3
from Example 4.7. This satisfies the hypotheses of the Partial Fractions
Theorem 4.12. The denominator 1 3x2 2x3 vanishes when x = 1, so
that 1 + x is a factor. Some calculation shows that

1 3x2 2x3 = (1 + x)(1 x 2x2 ) = (1 + x)2 (1 2x).

Thus, there are complex numbers A, B, C such that

2 + 5x A B C
= + + .
1 3x2 2x3 1 + x (1 + x)2 1 2x

Now multiply by the denominator on the LHS.

2 + 5x = A(1 + x)(1 2x) + B(1 2x) + C(1 + x)2 .

This is an equality of polynomials, so it holds for any value of x.


104 Recurrence Relations. Section 4.3

• At x = 1 we find that 2 5 = B(1 + 2), so that B = 1.


• At x = 1/2 we find that 2 + 5/2 = C(3/2)2 , so that 9/2 = C(9/4), so
that C = 2.
• At x = 0 we find that 2 = A + B + C, so that A = 2 B C =
2 + 1 2 = 1.
Therefore
2 + 5x 1 1 2
= + .
1 3x2 2x3 1+x (1 + x)2 1 2x
Now we can expand each of these terms using Binomial Series, and col-
lect the results.
X1 X1 ✓ ◆ X1
n n+1
G(x) = ( x) ( x)n + 2 (2x)n
n=0 n=0
1 n=0
1
X
= (( 1)n (n + 1)( 1)n + 2 · 2n ) xn
n=0
1
X
= 2n+1 n( 1)n xn .
n=0

It follows that gn = [xn ]G(x) = 2n+1 + n( 1)n+1 for all n 2 N. This can
be “reality checked” by comparison with the initial conditions g0 = 2,
g1 = 5, and g2 = 6, and the recurrence relation gn 3gn 2 2gn 3 = 0 for
all n 3 defining this sequence in Example 4.7. The first few values are

n 0 1 2 3 4 5 6
gn 2 5 6 19 28 69 122
Section 4.3 Recurrence Relations. 105

4.3.1 The Main Theorem.

Theorem 4.14. Let g = (g0 , g1 , g2 ) be a sequence of complex numbers, and


P
let G(x) = 1 n=0 gn x be the corresponding generating series. Assume that
n

the equivalent conditions of Theorem 4.8 hold, and that

P (x)
G(x) = R(x) +
Q(x)

for some polynomials P (x), Q(x), and R(x) with deg P (x) < deg Q(x) and
Q(0) = 1. Factor Q(x) to obtain its inverse roots and their multiplicities:
d1 d2 ds
Q(x) = (1 1 x) (1 2 x) · · · (1 s x) .

Then there are polynomials pi (n) for 1  i  s, with deg pi (n) < di , such
that for all n > deg R(x),
n n n
gn = p1 (n) 1 + p2 (n) 2 + · · · + ps (n) s.

Proof. The conclusion of the theorem only concerns terms with n > deg R(x),
so we can basically ignore the polynomial R(x). In truth, all it is doing is
getting in the way, and preventing the formula from holding for smaller
values of n. So we are going to concentrate on the quotient P (x)/Q(x), to
which the Partial Fractions Theorem 4.12 applies.
Consider the factor (1 i x)
di
of the denominator Q(x). In the partial
fractions expansion of P (x)/Q(x), this contributes
(1) (2) (di )
Ci Ci Ci
+ 2
+ ··· + di
.
1 ix (1 i x) (1 i x)

Each term is a binomial series, and can be expanded accordingly:

di
X (j) di
X X1 ✓ ◆
Ci (j) n+j 1 n n
j
= Ci ix
j=1
(1 i x) j=1 n=0
j 1
1
X Xdi ✓ ◆!
(j) n + j 1 n n
= Ci ix .
n=0 j=1
j 1

Notice that n+j 1


j 1
is a polynomial function of n of degree j 1. It follows
106 Recurrence Relations. Section 4.3

that
di
X ✓ ◆
(j) n+j 1
pi (n) = Ci
j=1
j 1
is a polynomial function of n of degree at most di 1. The contribution of
the inverse root i to the coefficient gn = [xn ]G(x) is thus pi (n) ni . By the
form of the partial fractions expansion we see that
n n n
gn = p1 (n) 1 + p2 (n) 2 + · · · + ps (n) s,

completing the proof.

The converse of Theorem 4.14 also holds – see Exercise 4.11.


One can use Theorem 4.14 to go straight from a recurrence relation to a
formula for its entries, without doing Partial Fractions explicitly. Here is an
example of this kind of calculation.

Example 4.15. A sequence h of integers is given by the initial conditions h0 =


1, h1 = 1, h2 = 0, h3 = 2, h4 = 4, h5 = 3, and the recurrence hn 3hn 1 +
4hn 3 = 0 for all n 6. Obtain a formula for hn as a function of n.
From Theorem 4.8 we see that the denominator of the generating se-
ries H(x) = hn xn is 1 3x + 4x3 . This vanishes at x = 1, so 1 + x is a
factor. After some work, we obtain

1 3x + 4x3 = (1 + x)(1 4x + 4x2 ) = (1 2x)2 (1 + x).

Theorem 4.14 implies that there are constants A, B, C such that for suffi-
ciently large n, hn = (A + Bn)2n + C( 1)n . To determine these constants
we need to take data from the sequence h from a point later than the de-
gree of the polynomial R(x) appearing in Theorem 4.14. From Theorem
4.8, in this case the degree of the numerator of the generating series H(x)
is no more than five, since the general case of the recurrence holds for
n 6. Writing

P (x) (1 3x + 4x3 )R(x) + P (x)


H(x) = R(x) + 3
= ,
1 3x + 4x 1 3x + 4x3
it follows that the degree of R(x) is at most two. So we can fit the form
hn = (A + Bn)2n + C( 1)n to the data h3 = 2, h4 = 4, and h5 = 3.
Section 4.3 Recurrence Relations. 107

This gives three equations in three unknowns – a standard linear algebra


problem.

h3 = 2 = (A + 3B)8 C = 8A + 24B C
h4 = 4 = (A + 4B)16 + C = 16A + 64B + C
h5 = 3 = (A + 5B)32 C = 32A + 160B C

In this case, it is a rather unpleasant linear algebra problem. Sparing you


the details, the solution is A = 5/16, B = 1/16, C = 3, and so

hn = (n 5)2n 4
3( 1)n

for all n 3. The values for hn with 0  n  2 are given in the initial
conditions.

4.3.2 Inhomogeneous Linear Recurrence Relations.

Example 4.16. Define a sequence of integers g = (g0 , g1 , g2 , ...) by the


initial conditions g0 = 1 and g1 = 2, and the recurrence relation

gn = gn 1 + 2 gn 2 n+1

for all n 2. This clearly determines the sequence g inductively:

n 0 1 2 3 4 5 6 7 8
gn 1 2 3 5 8 14 25 47 90

What is gn as a function of n 2 N?

We solve Example 4.16 by generalizing the method above just a little bit.
First, write the recurrence in the form

gn gn 1 2 gn 2 = n+1

for all n 2. This is an inhomogeneous linear recurrence relation according


to Definition 4.6. But we can proceed just as before. We begin by obtaining
P
a formula for the generating series G(x) = 1 n=0 gn x . Multiply both sides
n
108 Recurrence Relations. Section 4.3

by xn and sum over all n 2. On the RHS we obtain


1
X 1
X 1
X
( n + 1) xn = ( (j + 2) + 1) xj+2 = x2 (j + 1) xj
n=2 j=0 j=0
2
x
= .
(1 x)2
On the LHS we obtain
X1
(gn gn 1 2 g n 2 ) xn
n=2
X1 1
X 1
X
n n
= gn x gn 1 x 2 gn 2 xn
n=2 n=2 n=2
1
X 1
X
= (G(x) g0 g1 x) x g j xj 2 x2 g k xk
j=1 k=0
2
= (G(x) 1 2x) x (G(x) 1) 2x G(x)
= (1 x 2x2 )G(x) 1 x.
Equating the LHS and the RHS yields
x2 (1 + x)(1 x)2 x2
(1 x 2x2 )G(x) = 1 + x =
(1 x)2 (1 x)2
Noting that 1 x 2x2 = (1 + x)(1 2x), we obtain the formula
1 x 2x2 + x3
G(x) = .
(1 + x)(1 x)2 (1 2x)
This is a rational function, and so the Main Theorem 4.14 applies. There are
complex numbers A, B, C, D such that gn = A( 1)n + (B + Cn) + D2n for
all n 2 N. For n 2 {0, 1, 2, 3} this yields the system of linear equations
A +B +D =1
A +B +C +2D =2
A +B +2C +4D =3
A +B +3C +8D =5
Solving this system yields A = 1/12, B = 3/4, C = 1/2, D = 1/3, so that
1
gn = 2n+2 + (6n + 9) ( 1)n
12
for all n 2 N.
Section 4.3 Recurrence Relations. 109

Example 4.17. The denominator of G(x) in the above example is 1 3x +


x2 + 3x3 2x4 . From Theorem 4.8 we see that the sequence g satisfies the
homogeneous linear recurrence relation and initial conditions given by
8
>
> 1 if n = 0,
>
>
< 1 if n = 1,
>
gn 3gn 1 + gn 2 + 3gn 3 2gn 4 = 2 if n = 2,
>
>
>
> 1 if n = 3,
>
:
0 if n 4.

This agrees with the results above.

Examples 4.16 and 4.17 illustrate a general fact: if the generating series
of the RHS in an inhomogeneous linear recurrence relation is a rational
function, then the generating series for the entries of the sequence is also a
rational function. Thus, the sequence in fact satisfies a homogeneous linear
recurrence relation, so we are actually back in the case we have already
considered. Proving this in general is the main point of this subsection.
The following terminology is not standard but will be convenient. A
function q : N ! C is polyexp if there are polynomial functions qi (n) and
complex numbers i 2 C for 1  i  t such that

q(n) = q1 (n) n
1 + q2 (n) n
2 + · · · + qt (n) n
t (4.2)
p
for all n 2 N. For example, cos(n✓) = (ei✓n + e i✓n )/2 is polyexp, but n is
not. More generally, the function q : N ! C is eventually polyexp if there is
an integer M 2 N such that equation (4.2) holds for all n M . The Main
P1
Theorem 4.14 thus states that if G(x) = n=0 gn x is a rational function
n

then gn is an eventually polyexp function of n. (Exercise 4.11 is the converse


implication.)

Theorem 4.18. Let g = (g0 , g1 , g2 , ...) be a sequence of complex numbers.


The following are equivalent.
(a) The sequence g satisfies a homogeneous linear recurrence relation (with
initial conditions).
(b) The sequence g satisfies a possibly inhomogeneous linear recurrence
relation (with initial conditions) in which the RHS is an eventually
110 Recurrence Relations. Section 4.4

polyexp function.
P
(c) The generating series G(x) = n=0 gn x is a rational function (a
n

quotient of polynomials in x).


(d) The sequence g = (g0 , g1 , g2 , ...) is an eventually polyexp function.

Proof. Theorem 4.8 shows that conditions (a) and (c) are equivalent. Theo-
rem 4.14 shows that (c) implies (d). That (d) implies (c) is left as Exercise
4.11. It is clear that (a) implies (b). All that remains is to show that (b) im-
plies (c).
Thus, assume that g satisfies the linear recurrence relation
gn + a1 gn 1 + · · · + ad gd = q(n)
for all n N , with initial conditions g0 , g1 , ..., gN 1 , in which q : N ! C is an
eventually polyexp function as in equation (4.2) for all n M .
UNDER CONSTRUCTION

4.4 Quadratic Recurrence Relations.

Let g = (g0 , g1 , g2 , ...) be a sequence of numbers with generating series G(x) =


P1
n=0 gn x . In Theorem 4.8 we saw that g satisfies a homogeneous linear re-
n

currence relation (with initial conditions) if and only if G(x) = P (x)/Q(x) is


a rational function. Rewriting this as Q(x)G(x) P (x) = 0, we see that G(x)
is a solution to a linear equation: QG P = 0.

Definition 4.19. The sequence g satisfies a quadratic recurrence if its


generating series G(x) satisfies a quadratic equation:

A(x)G(x)2 + B(x)G(x) + C(x) = 0.

Here, the coefficients A(x), B(x), and C(x) are power series in x.

There are two solutions to the equation in Definition 4.19, and they can
be found using the Quadratic Formula:
p
G+ (x) B(x) ± B(x)2 4 A(x) C(x)
= .
G (x) 2 A(x)
Section 4.4 Recurrence Relations. 111

Rigorous justification for this kind of algebra with power series is discussed
in detail in CO 330. If G(x) is a generating series for some combinatorial ob-
jects then it has only nonnegative coefficients and nonnegative exponents.
This can be used to decide which case of the ± sign to take. In general, only
one of G+ (x) or G (x) is the correct generating series.

4.4.1 The general binomial series.

In Section 2.1 we saw the Binomial Theorem and The Binomial Series with
negative integer exponents. That is, for a natural number n 2 N,
n ✓ ◆
X
n n
(1 + x) = xk
k=0
k

and for a positive integer t 1,


1 ✓
X ◆
1 n+t 1
= xn .
(1 x)t n=0
t 1

These are two special cases of the general binomial series.

Definition 4.20. For any complex number ↵ 2 C and nonnegative


integer k 2 N, the k-th binomial coefficient of ↵ is
✓ ◆
↵ 1
= (↵)(↵ 1) · · · (↵ k + 1).
k k!

This binomial coefficient is a polynomial function of ↵ of degree k.

Theorem 4.21 (The Binomial Series). For any complex number ↵ 2 C,


1 ✓ ◆
X
↵ ↵
(1 + x) = xk .
k=0
k

Sketch of proof. We can think of (1+x)↵ as a function of a complex variable x.


The only possible singularity is that 0↵ might not be well-defined – this can
happen only for x = 1. Therefore, (1 + x)↵ is analytic in the disc |x| < 1,
112 Recurrence Relations. Section 4.4

and so it has a Taylor series expansion. By Taylor’s Theorem, the coefficient


of xk in this Taylor series expansion is
✓ ◆
1 dk ↵ 1 ↵ k ↵
k
(1 + x) = (↵)(↵ 1) · · · (↵ k + 1)(1 + x) = .
k! dx x=0 k! x=0 k
This proves the stated formula.

We will use the following special case.

X1 ✓ ◆
p 1 2k 2 k
Proposition 4.22. 1 4x = 1 2 x .
k=1
k k 1

1
X ✓ ◆
p 1/2 k
Proof. By Theorem 4.21, 1 4x = k k
( 1) 4 x .
k=0
k
For k = 0, the coefficient of x0 is ( 1)0 40 1/2
0
= 1.
For k 1, we can calculate as follows.
✓ ◆
k k 1/2 1
( 1) 4 = ( 1)k 4k (1/2)( 1/2)( 3/2) · · · (1/2 k + 1)
k k!
1
= 4k (1/2)(1/2)(3/2) · · · (k 3/2)
k!
k 1
= 2 (1)(1)(3)(5) · · · (2k 3)
k!
2 (1)(3) · · · (2k 3) (2)(4) · · · (2k 2)
= · ·
k (k 1)! (k 1)!
✓ ◆
2 2k 2
= .
k k 1
(Where did we use the fact that k 1 in this calculation?)

4.4.2 Catalan numbers.

The numbers Cn = n+1


1 2n
n
are called Catalan numbers . The first few Cata-
lan numbers are shown here:
n 0 1 2 3 4 5 6 7 8 9 10
.
Cn 1 1 2 5 14 42 132 429 1430 4862 16796
Catalan numbers occur surprisingly often in answers to counting problems.
Section 4.4 Recurrence Relations. 113

Example 4.23 (Well-Formed Parenthesizations.).


A well-formed parenthesization (WFP) is a sequence of n opening paren-
theses and n closing parentheses which “match together” using the usual
rules for grouping parentheses. The size of a WFP is the number of open-
ing parentheses in it. Here are all the WFPs of size 3:

()()() ()(()) (())() (()()) ((()))

And here are all the WFPs of size 4:

()()()() (())()() (()())() ()()(())


(())(()) (()(())) ()(())()
((()))() ((())()) ()(()()) (()()())
((()())) ()((())) (((())))

We determine the number wn of WFPs of size n, for all n 2 N. Let

1
X
W (x) = w n xn
n=0

be the generating series for WFPs with respect to size.


We can obtain a quadratic recurrence for W (x), as follows. The empty
sequence " contributes x0 = 1 to the generating series W (x). Any other
WFP begins with an opening parenthesis. There is exactly one closing
parenthesis which matches to the beginning parenthesis. That is, = ( ↵ )
for some other sequences ↵ and . Note that ↵ or might be empty. Because
of the way parentheses are matched to each other, both ↵ and are in fact
WFPs themselves. The total number of opening parentheses in is n( ) =
1 + n(↵) + n( ). Conversely, given any WFPs ↵ and we can always form
a new nonempty WFP: ( ↵ ) .
Writing a 0 instead of a (, and a 1 instead of a ), a WFP can be thought of
as a binary string in {0, 1}⇤ . Let W be the set of all binary strings correspond-
ing to WFPs. The previous paragraph justifies the claim that the recursive
decomposition

W = " ` 0W1W
114 Recurrence Relations. Section 4.4

is unambiguous. This allows us to calculate as follows:


X X
W (x) = xn( ) = xn(") + xn( )

2W 2Wr{"}
X X
=1 + x1+n(↵)+n( )

↵2W 2W
! !
X X
n(↵) n( )
=1 + x x x
↵2W 2W

= 1 + xW (x)2 .

Now we can solve this equation xW (x)2 W (x) + 1 = 0 using the Quadratic
Formula: p
W+ (x) 1 ± 1 4x
= .
W (x) 2x
p
Proposition 4.22 gives the power series for 1 4x, so that
p 1 ✓ ◆ !
1 ± 1 4x 1 1 X 1 2n n+1
= ± 1 2 x .
2x 2x 2x n=0
n+1 n

To get nonnegative coefficients, and to cancel the term 1/2x, we need to take
the minus sign from the ±. The result is
1
X ✓ ◆
1 2n n
W (x) = x .
n=0
n+1 n

Thus, the number of WFPs of size n is the n-th Catalan number Cn = n+1 1 2n
n
,
for each n 2 N.
p
Since the generating series for the set W is W (x) = (1 1 4x)/2x,
which is not a rational function, it follows from Theorem 3.13 that the set of
WFPs is not a rational language.
Section 4.5 Recurrence Relations. 115

4.5 Exercises.

Exercise 4.1. For each of the sets of compositions from Exercise 2.15,
do the following.
• Derive a recurrence relation and initial conditions for the coeffi-
P
cients of the corresponding generating series G(x) = 1 n=0 gn x .
n

• Calculate the coefficients g0 , g1 , ... up to g9 .

Exercise 4.2. Let K be the set of compositions = (c1 , c2 , ..., ck ) with


at least one part, and such that the first part is odd. Let K(x) be the
generating series for K with respect to size.
(a) Show that
x
K(x) = .
(1 + x)(1 2x)
(b) Use part (a) to show that, among all 2n 1 compositions of size
n 1, the fraction of these compositions in the set K is
✓ ◆n 1
2 1 1
+ .
3 3 2

Exercise 4.3. Consider the power series


1
X 1 2x2
cn xn = = 1 + 5x + 15x2 + 39x3 + · · ·
n=0
1 5x + 8x2 4x3

(a) Give a linear recurrence relation that (together with the initial
conditions above) determines the sequence of coefficients (cn :
n 0) uniquely.
(b) Derive a formula for cn as a function of n 0.
[Hint: 1 5x + 8x2 4x3 = (1 x)(1 4x + 4x2 ).]

Exercise 4.4. Consider the power series


1
X x + 7x2
A(x) = an x n = .
n=0
1 3x2 2x3
116 Recurrence Relations. Section 4.5

(a) Write down a linear recurrence relation and enough initial con-
ditions to determine the sequence (an : n 2 N) uniquely.
(b) Given that 1 3x2 2x3 = (1 2x)(1 + x)2 , obtain a formula for
an as a function of n 2 N.

Exercise 4.5. Consider the power series


1
X 3 11x + 11x2
cn xn = = 3 + x + 0x2 + x3 + 6x4 + 19x5 + · · ·
n=0
1 4x + 5x2 2x3

(a) Give a linear recurrence relation that (together with the initial
conditions above) determines the sequence of coefficients (cn :
n 0) uniquely.
(b) Derive a formula for cn as a function of n 0.

Exercise 4.6. A sequence of integers is determined by the initial con-


ditions g0 = 1, g1 = 2, g2 = 3, and the recurrence relation gn =
2gn 1 gn 2 + 2gn 3 for all n 3.
(a) Obtain a rational function formula for the generating series
1
X
G(x) = gn xn = 1 + 2x + 3x2 + 6x3 + 13x4 + 26x5 + 51x6 + · · · .
n=0

(b) Obtain a formula for the coefficient gn as a function of n 2 N.

Exercise 4.7. Define a sequence of numbers (cn : n 2 N) by the initial


conditions c0 = 1, c1 = 2, and c2 = 3, and the recurrence relation
cn = cn 1 + 2cn 2 + 2cn 3 for all n 3.
(a) Obtain an algebraic formula for the rational function
1
X
C(x) = cn xn = 1 + 2x + 3x3 + 3x3 + 7x4 + 5x5 + · · · .
n=0

(b) Obtain a formula for cn as a function of n 2 N.


Section 4.5 Recurrence Relations. 117

Exercise 4.8.
(a) Obtain a formula for the coefficients of the rational function
1
X 1 + 3x x2
B(x) = bn x n = .
n=0
1 3x2 2x3

(b) Derive a recurrence relation and use it to check your answer.

Exercise 4.9. Define a sequence (hn : n 2 N) by the initial conditions


h0 = 1, h1 = 2, h2 = 0, h3 = 5, and the recurrence relation hn =
2hn 1 + hn 2 + 4hn 3 + 2hn 4 for all n 4.
(a) Obtain an algebraic formula for the rational function
1
X
H(x) = hn xn = 1 + 2x + 0x2 + 5x3 + 0x4 + 9x5 + · · · .
n=0

(b) Obtain a formula for hn as a function of n 2 N.

Exercise 4.10.
(a) Find rational numbers A, B, C such that for all n 2 N,
✓ ◆
2 n+2
n =A + B(n + 1) + C.
2
P
(b) Write 1 n2 xn as a quotient of polynomials.
Pn=0
(c) Write 1 n=0 n x as a quotient
3 n
of polynomials.
P1 d n
(d) For each d 2 N, let Fd (x) = n=0 n x .
Show that F0 (x) = 1/(1 x) and for all d 1,

d
Fd (x) = x Fd 1 (x).
dx
(e) Let Fd (x) = Pd (x)/(1 x)1+d . Derive a recurrence relation for
the polynomials Pd (x).
Exercise 4.11. Show that the converse of Theorem 4.14 holds. That is,
assume that
n n n
gn = p1 (n) 1 + p2 (n) 2 + · · · + ps (n) s

for all n N , in which pi (n) is a polynomial of degree strictly less than


di and the i are distinct nonzero complex numbers, for 1  i  s. Let
d1 d2 ds
Q(x) = (1 1 x) (1 2 x) · · · (1 s x) .

Then 1
X P (x)
G(x) = gn xn = R(x) +
n=0
Q(x)
in which P (x) and R(x) are polynomials, and deg P (x) < deg Q(x)
and deg R(x) < N .

Exercise 4.12.
1 ✓
X ◆
1 2k
(a) Show that p = xk .
1 4x
k=0
k
Xn ✓ ◆✓ ◆
2j 2n 2j
(b) Deduce that for all n N, = 4n .
j=0
j n j
(c)* Can you think of a combinatorial proof of part (b)?

You might also like