Formal Logic & Discrete Math Guide
Formal Logic & Discrete Math Guide
Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code
as cleverly as possible, you are, by definition, not smart enough to debug it.
--Brian W. Kernighan
Table Of Contents
Chapter 1: Logic and Proofs 1
Chapter 2: Sets, functions, sequences and sums 13
Chapter 3: Algorithms, the integers and matrices 20
Chapter 4: Induction and recursion 35
Chapter 5: Counting 44
Chapter 9: Graphs 50
Chapter 10: Trees 61
Logical Operators
negation:
not p read, not p true when p is false
connectives:
conjunction pq read, p and q true when both p and q are true
disjunction pq read, p or q true when either p or q are true
exclusive or pq read, p x-or q true when p q F but p q T
conditional:
implication pq read, p implies q true when p F or p q T
biconditional:
bi-implication pq read, p if and only if q true when p q T and q p T
Order of operations
1. ( ) 2. 3. 4. 5. 6.
Truth Tables
Truth tables start with a systematic and exhaustive list of the possible truth values for each
proposition, typically followed by the resulting truth value of the compound propositions leading
ultimately to the final truth value.
Example
p q r
There are three variables (simple propositions). There is a compound or, the negation of r and the
complete statement.
p q r pq r p q r
T T T T F F
T T F T T T
T F T T F F
T F F T T T
F T T T F F
F T F T T T
F F T F F T
F F F F T T
Bit strings consist of a sequence of 0's and 1's, each of which is a bit. The length of the string is
the number of bits. By convention 1 denotes true and 0 denotes false. Bitwise operations require
stings of the same length, the propositions are evaluated on bits in corresponding positions in the
strings.
Example
0 1 1 1 String 1
1 0 1 1 String 2
0 0 1 1 And
1 1 1 1 Or
1 1 0 0 Xor
HW
18. first determine p and q
a. Promotion - p, Wash Car - q. If you want to get promoted, then you must wash the boss's car.
b. Winds - p, thaw - q. If the winds are from the south, then there is a spring thaw.
c. Bought - p, warranty - q. If you bought the computer less than a year ago, then the warranty is
good.
d. Cheats - p, caught - q. If Willy cheats, then he gets caught.
e. Access - p, pay - q. If you access the website, then you must pay a subscription fee.
f. Know - p, elected - q. If you know the right people, then you will be elected.
g. Boat - p, sick - q. If Carol is on a boat, then she gets seasick.
46. Let p denote truth teller, and r denote truthful response to question.
a. p r actual r
T F p rF
F T p r F
b. If a truth teller is asked, do you tell the truth, the response is, yes. If a liar is asked the same
question the answer will also be yes. So ask either one, what would you say, if I asked you if you
tell the truth? The truth teller still says yes, but the liar must now say no.
62. Truth table approach. Let each be denoted by their first initial. The first compound
proposition that is false removes that combination.
Logical equivalence occurs when two propositions have the same truth values in all conditions.
The notation is, p q , and a consequence of equivalence is the fact that p q is a tautology.
p q pq p p q
T T T F T
T F F F F
F T T T T
F F T T T
To show that the bi-conditional of these is a tautology we use the equivalence we just showed.
Since either both are false, in which case the negation of one is sufficient to make an or
statement true, or both are true, in which case the negation of one is not sufficient to make an or
statement false.
Example
Show that p q r p q r .
Method 1, truth table.
p q r p q r p q r
T T T T T
T T F F F
T F T T T
T F F T T
F T T T T
F T F F T
F F T T T
F F F F T
TOC
Quantifiers modify predicates by expressing the extent to which the predicate is to be applied to
the domain. Two primary quantifiers are the universal, denoted , and the existential, . The
universal quantifier states that the predicate will yield the same output regardless of the input
(provided the input is valid), while the existential quantifier merely claims that there is some
valid input which will produce the claimed output. The area of logic that treats predicates and
quantifiers is called predicate calculus. Given a universal quantifier, one only needs to find a
single counterexample to show that the statement is false. Given an existential quantifier, a single
example suffices to show that the statement is true. Quantifiers may be employed to give the
exact number of x's which satisfy a propositional function. One such is uniqueness; it is
represented by placing an exclamation mark after the existence symbol. When quantifiers are
applied to a variable, it is said to be bound. When not bound, a variable is free. Statements
involving predicates and quantifiers are logically equivalent iff they have the same truth value no
matter what predicates are employed and regardless of domain.
Negation of quantifiers.
Negation Equivalent
xP x xP x
xP x xP x
HW
12. Q x : x 1 2 x xZ
a. Q(0) T b. Q(-1) T c. Q(1) F d. xQ x T e. xQ x F
f. xQ x T g. xQ x F
TOC
Statement Meaning
xyP x, y
P(x,y) is true for all x, y.
yxP x, y
xyP x, y For every x, there is at least one y which makes P(x,y) true.
xyP x, y There is at least one x which makes P(x,y) true for every y.
xyP x, y There is at least one x, y pair that makes P(x,y) true.
HW
28. Domain of all variables is .
a. xy x 2 y T b. xy x y 2 F
c. xy xy 0 T d . xy x y y x F
e. x x 0 y xy 1 T f . xy y 0 xy 1 F
g . xy x y 1 T h. xy x 2 y 2 2 x 4 y 5 F
i. xy x y 2 2 x y 1 F j. xyz z x y / 2 T
TOC
General Inference:
Rule Tautology Name
p
pq p p q q modus ponens
q
q
pq q p q p modus tollens
p
pq
qr hypothetical
p q q r p r
syllogism
pr
pq
p p q p q disjunctive syllogism
q
p
pq p p q addition
pq
p p q p simplification
p
q p q p q conjunction
pq
pq
p r p q p r q r resolution
q r
Remember that p q p q . This means the implication is true when p is false no matter
what value q takes on.
Assuming that p must be true because q is, is a fallacy known as affirming the conclusion.
By the same token, assuming that q must be false because p is false is equally invalid, a fallacy
known as denying the hypothesis.
Rule Name
xP x
universal instantiation
P c
P c for an arbitrary c
universal generalization
xP x
xP x
existential instantiation
P c for some c
P c for some c existential
xP x generalization
HW
6. Assign variables - r it rains, f it is foggy, s sailing race is held, l life saving demo
is on and t trophy is awarded. Form logical propositions:
r f s l
st
t
Construct a valid argument:
r f s l premise t premise
r f s simplification r f t premise
s t r f r f modus
premise
tollens
r f t hypothetical
syllogism
16.
a. Let P(x) mean x is enrolled in the university. Let Q(x) mean x has lived in a dormitory.
xP x Q x
Q Mia Universal modus tollens, valid.
P Mia
c. Let P(x) mean x is an action movie and Q(x) mean that Quincy likes x.
xP x Q x
Q Eight Men Out Affirming the conclusion, not valid.
P Eight Men Out
TOC
Section 6: Introduction to proofs
A theorem is a statement which can be proven to be true, propositions are theorems considered
to be less important. A proof for a theorem is a valid argument the conclusion of which
establishes the truth of the theorem. Often included in proofs are axioms or postulates statements
assumed to be true without proof. Often these are definitions. Complicated proofs are sometimes
broken up into smaller proofs (modules) called lemmas. Theorems which can be established as
the direct consequence of the truth of another theorem are corollaries and statements which are
thought to be true, but which are not yet proven are called conjectures.
The words obviously and clearly should be avoided in a proof. They bring no real information to
the reader except perhaps some insight into the ego of the author. Generally it is neither clear nor
obvious, but surely we must be in awe of anyone for whom it is.
Brown's Maxim - If you have completed more than three steps in a proof without mentioning
a definition, you are probably doing it wrong.
Statement Justification
Let p and q be odd integers By premise we have two odd integers
p = 2k -1 Definition of odd integer
q = 2j -1 Definition of odd integer
p + q = 2k +2j - 2 Definition of addition
p + q = 2(k + j -1) Distributive property of multiplication over addition
Definition of even integer (also closure of integers
P + q is even since 2(k + j -1) is even
under addition and multiplication)
Contraposition
A direct proof of the contrapositive, that is, since p q q p , assume that "not q" is true,
then show that "not p" must also be true.
Statement Justification
Let n be an odd integer Premise by way of contraposition
n = 2k - 1, k an integer Definition of odd integer
n3 5 8k 3 12k 2 6k 1 5 By definition of exponents
8k 3
12k 2 6k 1 5 2 8k 3 12k 2 6k 2 Distributive property of multiplication over
addition
2 8k 12k 6k 2 an even integer
3 2
Definition of even integer
n3 5 odd n even Contraposition
Contradiction
Assume that what you are trying to prove is false, then show that this leads to a contradiction.
r 3
r 1 0 r
Lemma 1
If k is an integer, then 2k + 1 is odd. 2k + 1 = 2k + 2 - 1 = 2(k + 1) -1. Lemma 1 is used in the
following Lemmas.
Lemma 2
If p is an odd integer, then pn is an odd integer when n is a positive integer.
n n
2k 1 2k n 2k 1 2k 1 ... 2k 1 n 2k 1 1
n n n 1 n2 2 2 n2 n 1 n
2 n 2
n 1
i n
by the Binomial Theorem. But this is equal to 2 k 1 1 , which is odd.
n i 1 n 1 n
2
i 0 i
Lemma 3
If p and q are both odd, then pq is odd.
(2k - 1)(2j - 1) = 2j(2·k - 1) - 2k + 1 = 2(j(2·k - 1) - k) +1 which is odd.
Lemma 4
If p, q, and r are all three odd integers, then the sum of the three is odd.
(2k -1) + (2j - 1) + (2m -1) = 2(k + j + m - 1) - 1.
Lemma 5
If p is even and q is odd, then p + q is odd. (2k) + (2j - 1) = 2(k + j) - 1
Statement Justification
r BWOC
r = p/q, p and q integers with no common divisors Definition of rational
p 3 pq 2 q 3 0 Multiply both sides by q 3
Case 1, p even and q even is not possible p and q integers with no common divisors
Case 2, p odd and q odd Premise
LHS is sum of three odd not 0 Lemmas 2 - 5
Case 3, p even and q odd Premise
LHS is sum of three odd not 0 Lemmas 2 - 5
Case 4, p odd and q even Premise
LHS is sum of three odd not 0 Definition of odd integer
p and q are integers, but neither odd nor even
r
HW
8. If n is a perfect square, then n + 2 is not. BWOC n = b2 and n + 2 = a2, where a and b are
both integers greater than zero (by definition of perfect square). This means a2 - b2 = 2,
equivalently (a - b)(a + b) = 2. Since n + 2 > n it follows that a > b, so a - b is a positive integer
as is a + b. Since 1 and 2 are the only positive integers whose product is 2, a - b = 1 and a + b =
2. But this implies 2a = 3 or a = 3/2 which is not an integer.
TOC
WLOG, without loss of generality is a shortcut that claims when assignment of values is
arbitrary; we may assume whichever assignment we want. For example to prove that a + b is odd
if a and b are not both odd. WLOG we can assume that the two cases, a odd, b even and a even, b
odd are equivalent and so let a be odd and b even.
Existence proofs require that some x exist so that P(x) is true. If such an x can be found, our
proof is constructive. It is sometimes the case that it can be proved that such an x exists, but not
possible (or convenient) to find a specific case.
Uniqueness proofs require that an x exists that meets our requirements, and that everything that
meets our requirements can be shown to be that same x.
When constructing a proof it must be shown, ultimately, that the premises lead to the conclusion.
Quite often it is more convenient to start at the conclusion and work backwards. Provided that
we can simply reverse the path of this train of thought, the proof is valid. Recall the delta-epsilon
proofs in calculus used to show that a limit existed, we needed to show that when x was no
farther from "a" than delta, that this forced f(x) to be no farther from the limit as x approached
"a" than epsilon. Choosing delta is very problematic unless we start at the conclusion. Just
remember to put the line of your proof in the correct order before you present the proof.
Be on the lookout for counterexamples whenever you are asked if a statement is true. For
example, is it true that cos(x2) ≥ cos2(x) for 0 ≤ x ≤ 2π. We know that cos(π) = -1, so if x then
LHS < RHS since the minimum value for RHS is zero. A single counterexample is sufficient, we need not
show that the claim is always false, merely that it is not always true.
HW
5. Proving the triangle inequality can be approached by taking 9 cases (trichotomy requires three choices
for x and three for y). This list can be reduced WLOG to 6 cases. Of these, one is vacuous and two re
trivial.
a a 1 c
33. we pick two arbitrary rational numbers with common denominator, WLOG, .
b b b
2
2 a
Since 1 , it follows a 2 c . By definition, a and b are integers, so proving that
2
b b b
2
a
x 2 is irrational should not be hard. We can assume that we already know 2 is irrational.
b
TOC
Section 1: Sets
A set is an unordered collection of objects called elements. We will use uppercase letters to
denote sets, and lower case letters to denote the elements.
A superscript of "+" or "-" indicates the positives or negatives only. By defining q as an element
of the positive integers we have avoided division by zero, and have not excluded any rational
numbers. Sets are considered equal iff they have the same elements.
The set U denotes the universal set, that is, the set containing all elements of interest in a
particular inquiry.
Subsets and proper subsets refer to sets which are contained in other sets. If all of the elements
in A are also elements of B, then A is a subset of B, written A B . If B has at least one element
not contained in A, and A is a subset of B, then A is a proper subset of B, written A B . The
empty set is considered a subset of every set, and every set is considered a subset of itself.
The cardinality of a set refers to the number of distinct elements in the set, it is written |S|.
The power set of any given set is the set of all subsets of that set. This includes the empty set and
the set itself.
Since sets are not ordered, we use n-tuples when order matters. These look much like n-
dimensional points. Equality of n-tuples occurs when all of the corresponding elements are
equal.
Cartesian Product of two sets is a binary operation on two sets which results in the set of
ordered pairs in which every element in the first set is paired with every element in the second
set.
Example
A = {1, a, cat} and B = {2, dog}. A × B = {(1, 2), (1, dog), (a, 2), (a, dog), (cat, 2), (cat, dog)}
This can be extended to multiple sets by taking the first element of the first set followed by every
possible combination of the remaining sets, each of which follows the same protocol as in the
binary case.
The truth set for P, where P is a predicate is the subset, T, of the domain of P so that
x T P x is true.
HW
15. Use contradiction, suppose x is an element of A, but not an element of C. By definition it
cannot be an element of B, since B is a subset of C. But it must be, since A is a subset of B.
28. In Excel this is easy; by hand use a table to make it easier.
TOC
A B A B A B
Bitstring representation assumes some arbitrary ordering of Un. Strings of length n, are then
formed so that a 1 in the jth position indicates that the jth element of U is included. A zero
indicates that it is not.
Example
Let U = {1, 2, 3, 4, 5, 6, 7, 8}
Let A = {1, 2, 3, 5, 8} 1110 1001
Let B = {1, 4} 1001 0000
HW
14. Use a Venn Diagram.
48. Problems such as this are often made easier by writing the first few sets.
a. A1 1, 2,3,... A2 2,3, 4,... A3 3, 4,5,...
Ai
i 1
A
i 1
i
b. A1 0,1 A2 0, 2 A3 0,3
Ai
i 1
A 0
i 1
i
c. Ai
i 1
A 0,1
i 1
i
d . Ai (1, ) A i
i 1 i 1
TOC
Section 3: Functions
The term, function, assigns elements from one set, the domain, to a single element of another
set, the range. This is stated symbolically as f(a) = b, where a A , the domain, and b B , the
range. Other ways of describing this is a is the preimage, f(a) is the image, f is a mapping or a
transformation of A to B. The set B is the codomain, where the set B b | xf x b is the
*
range. If we take f : where f x x the reals are the codomain, but there are elements
2
When a function, f, is a bijection there exists another function, f -1, called the inverse so that
f a b f 1 b a .
The floor function (round down) assigns the largest integer less than or equal to x, the ceiling
function assigns the smallest integer greater than or equal to x.
Floor: x example 3.9 3 (note that floor and greatest integer are the same).
Ceiling: x example 3.1 4 .
n
Factorial: n 0 n ! i , 0! = 1.
i 1
HW
14. Remember that the domain and codomain are the integers.
a. We can show that 2m - n will give all the odd integers if n is one and all of the even inters if n
is zero, and zero if m and n are both zero.
b. All perfect squares end in either 0, 1, 4, 5, 6, or 9. As a result, no integer ending in a 7 will be
a function value.
c. See a.
d. If either m or n is zero, we have all of the integers.
e. This is a subset of part b.
29. Start with the definitions, then establish that the requisite memberships must exist.
TOC
A sequence is an ordered set of objects, as such it can be thought of as a function from the
integers to the set. When the elements of an infinite set can be so listed, it is said to be countable.
Problems that involve discovering a generating function for a sequence, that is some function
where f n an are equivocal. No matter how many initial values are given, infinitely many
functions may produce those first n values.
Example
Example
4
i
i 1
2
3i The index variable, i, acts as a counter. It takes on the integers starting with the lower
limit, and proceeds to the upper limit. As the index variable takes on these values, they are
substituted into the sequence generating function; these values are treated as terms to be added.
4
i 3i 1 3 1 2 3 2 3 3 3 4 3 4 0
2 2 2 2 2
i 1
n nk
a i a i k
i r i r k
Double sums are a way of allowing two variables to take on predetermined values. Consider the
situation where the rows and columns of a checkerboard are assigned numeric values 1 through
8. A rule is assigned where the loser of a bet is required to place on each square the sum of
3r + 2c dollars. How much is the payoff of this bet worth?
8 8 8
8 8
8 8
6
Countability is a term which implies a one-to-one correspondence with the positive integers. An
infinite set with this property is said to be countably infinite. Such a set has cardinality of 0 ,
pronounced aleph null.
To show that the positive rationals are countable, consider arranging them in rows where the first
row has all rationals with a denominator of one, the second two etc. A zigzag pattern from the
upper left passes through all rationals in a clearly defined sequence. By the same token, the
negative rationals are countable. By straightening the two zigzags, and placing the negatives on
top, we can create another zigzag that combines the two. If we start with zero before
commencing this last zigzag, we have all of the rationals, and motivation for the fact that the
union of two countable sets is also countable.
To show that the reals are not countable, start by assuming that they are, then considering only
the interval (0, 1) list them:
r1 0.d11 0.d12 0.d13 ...
r2 0.d 21 0.d 22 0.d 23 ...
r3 0.d31 0.d32 0.d33 ...
r4 0.d 41 0.d 42 0.d 43 ...
HW
8. To determine a polynomial of degree n, it is sufficient to provide n + 1 points, think of the
sequence as the points (1,3), (2,5) and (3,7) where the x-value is the term number.
14. The notation f j implies that every element of S is used exactly once to create a term in
jS
the sum.
32. Creating a function, f n n , which produces the set in question is sufficient to show
countability.
TOC
Chapter 3: Algorithms, the integers and matrices
Section 1: Algorithms
An algorithm is a finite set of instructions for performing a computation or solving a problem.
While your text uses pseudocode for its examples, I will use TI-83+ programming language
where possible. It is very similar, in that the language is very intuitive, but for those with this
kind of calculator, it will actually run. The indentation is to make it easier to read. Function Y9 is
reserved as int(X), the floor function, and Y0 is reserved as (X>int(X))+int(X), the ceiling
function.
The TI programming language requires an, "End" statement for the for-next loops, if-then-else
and the while constructs. The TI-86's, 89's and 92's have some differences in their syntax and the
way functions are treated. You should consult your manual if these programs do not work on
your machine. The ability to program your calculator is not a course requirement. Many of the
algorithms discussed can be duplicated with built in functions; the object is to get a better feel for
how algorithms work, their structure and their size.
The algorithm below is a very simple algorithm designed to find the largest item in a list. Input is
performed by entering the desired numbers into a list, the output is performed in the display step.
Max
If a list of numbers is stored in L1 this algorithm will find the largest value in the list.
L1(1) sto M
For(I,2,dim(L1))
If L1(I)>M
Then
L1(I) sto M
End
End
Disp M
Stop
if yes, new m
loop
over
this
element
bigger
if no, same m than m
display
m
A flow chart like the one above is very useful in organizing your thoughts and assuring that your
algorithm performs as expected. The parallelograms are data, I use them to indicate data in and
data out, the rectangles are processes and the diamonds are decisions.
Some things that all good algorithms have in common are: Input/Output, clarity, correctness,
finite number of steps, finite amount of time and generality. Using the nomenclature of your
book:
It - Input a specified set of initial values/conditions.
Only - Output a subset of values from a solution set.
Does - Definiteness, each step is precisely defined.
Computations - Correctness, the output values are valid for the given input.
For - Finiteness, the number of steps from input to output are finite.
Eager - Effectiveness, each step is possible to execute and can be done in a finite time frame.
Geeks - Generality, the algorithm should be applicable to other problems of the same form.
Linear Search
If a list of numbers is stored in L1 this algorithm will find the position of a given value, or report
that the item is not found.
dim(L1) sto K
1 sto I
Input "Desired Num",N
While((I<K) and (L1(I)≠N))
I+1 sto I
End
If L1(I)=N
Then
Disp"Num In Pos ",I
Else
Disp"Num Not Found"
End
Stop
If there are no repeated entries, this search displays the position of the number. If there are
repeated entries, this algorithm returns the first position the desired number is in. It also makes
provision for the possibility that the number is not in the list.
The binary search will reduce the number of steps, but it requires that the entries be sorted before
implementing the search. Some sorting algorithms will be considered later.
Binary Search
If a sorted list of numbers is stored in L1 this algorithm will find the position of a given value, or
report that the item is not found.
0 sto P
Input "Desired Num",N
1 sto I
dim(L1) sto J
While I<J
Y9((I+J)/2) sto M
If N>L1(M)
Then
M+1 sto I
Else
M sto J
End
End
If N=L1(I)
Then
Disp"Num At Pos ",I
Else
Disp"Num Not Found"
End
Stop
Sorting routines are common in many applications. These algorithms are often not as efficient as
they could be in order to make them easier to understand. They should be rigorously tested to
ensure that they are correct in all situations before being implemented.
Bubble Sort
If a list of numbers is stored in L1 this algorithm will sort the list in ascending order.
dim(L1) sto N
For (I,1,N-1)
For (J,1,N-I)
If L1(J)> L1(J+1)
Then
L1(J+1) sto L
L1(J) sto L1(J+1)
L sto L1(J)
End
End
End
Stop
Insertion Sort
If a list of numbers is stored in L1 this algorithm will sort the list in ascending order.
dim(L1) sto N
For (I,2,N)
L1(I) sto M
For (J,1,I)
L1(J) sto T
If M<T
Then
M sto L1(J)
T sto M
End
M sto L1(I)
End
End
Stop
Greedy Algorithms
In optimization problems, it is sometimes the case that when given multiple choices the best
choice is always the largest choice (or smallest depending on the situation). An example is the
greedy change algorithm which seeks to minimize the number of coins by always selecting the
largest possible coin without exceeding the appropriate amount of change.
HW
2. a. since the input is n , then n is replaced with 2n this algorithm is not finite
b. eventually we have a step 1/0, so this algorithm is not effective
c. no value assigned to i, lacks definiteness
d. also lacks definiteness, no rule assigned for decision between a and b
23. Input the two lists, let D domain and R range . L1 - domain, L2 - range.
Does the function Y1 map L1 onto L2
0 sto S
dim(L1) sto D
dim(L2) sto R
For (I,1,r)
0 sto P
For(J,1,d)
If Y1(L1(J)) = L2(I)
Then
1 sto P
End
End
S+P sto S
End
If S<R
Then
Disp "Not Onto"
Else
Disp "Onto"
End
Stop
TOC
Since the idea is to get some sense of the magnitude, we want g to be as small as possible and as
simple as possible. When dealing with polynomial functions, Big-O is simply the x to the largest
exponent in the polynomial. For simplicity, let us assume that we are dealing with strictly
positive functions or functions which are ultimately positive.
While this is not rigorous, it suggests that provided C > B/A, there should be a number k to serve
as the required witnesses.
The factorial, n!, is Big-O to n n and log(n!) is Big-O to nlog(n). This is determined by noticing
that the factorial is 1*2*3*4*…*n < n*n*n*n*…*n. Similarly since n 2n , we can determine
that log(n) is Big-O to n.
Given two functions which are Big-O to two other functions, the sum of the two will be Big-O to
the larger of the original Big-O's.
The product of two functions will be Big-O to the product of their respective Big-O's.
Example
f x x3 x 2 log x log x 1 17 log x 19 x 3 2
In the first factor of the first term, x3 is the limiting portion. In the second factor of the first term,
log (x) is the limiting portion. The first term is therefore limited by x log x , the product of
3
these two. The second term of the function is also limited by x log x , so the Big-O is
3
Two other notations are the Big-Omega and Big-Theta. The Big-Omega notation replaces the
less than or equal in the Big-O with a greater than or equal. While the Big-O serves as an upper
bound, the Big-Omega serves as a lower bound. When a function can be found which will work
as both a Big-O and a Big-Omega, it is called Big-Theta. Different constants, C, may be needed
for establishing the upper and lower bounds, but the same constant, k, should suffice.
Example
x x 1 1 2 1
f x x x
2 2 2
1
k 1 C1 1 C2
2
x k f x x 2
1 2
x k f x x
2
The function f is both Big-O and Big-Omega (x2) so we say it is Big-Theta (x2).
HW
2. Use the notion of the largest (fastest growing portion) is the only relevant portion.
a. x ≤ x2, yes b. x2 ≤ x2, yes c. log(x) ≤ x so xlog(x) ≤ x2 d. x4 ≥ x2, no e. eventually x > 2, no
f. the floor and ceiling functions are always within 1 of x so their product is not much different
than x2, yes
TOC
The complexity of an algorithm is usually considered in two dimensions, space - the computer
memory required, and time - the number of calculations required. For our purposes, we will
consider the number of comparisons made.
The algorithm for finding the largest element in a set of n elements will serve as an example.
Max
If a list of numbers is stored in L1 this algorithm will find the largest value in the list.
L1(1) sto M
For(I,2,dim(L1))
If L1(I)>M
Then
L1(I) sto M
End
End
Disp M
Stop
Since we assign the first element to the temporary position of max, we need only deal with n-1
elements. Two comparisons are made for each pass through the loop, the first is to determine if
the end of the loop has been reached, and the second is to see if the current element should
become the new contender for max. There is one comparison for end of loop that returns a yes,
thus no corresponding test for new max. This gives 2(n - 1) + 1 or 2n - 1 comparisons. This
gives us complexity of (n). The highlighted lines indicate a comparison is made.
Algorithms such as the above, always run through the entire sequence, there is no option for
early completion. Other algorithms are able to stop as soon as the required conditions are met.
For these algorithms we sometimes look at worst case and average case. The linear search
algorithm is a good example of an algorithm with variable stopping time. If we restrict our focus
to those cases where the desired number is in the list somewhere, then the worst case is when the
desired number is in the last position.
Linear Search
If a list of numbers is stored in L1 this algorithm will find the position of a given value, or report
that the item is not found.
dim(L1) sto K
1 sto I
Input "Desired Num",N
While((I<K) and (L1(I)≠N))
I+1 sto I
End
If L1(I)=N
Then
Disp"Num In Pos ",I
Else
Disp"Num Not Found"
End
Stop
As shown by the highlighted portions, two comparisons are needed to see if the loop is still in
force, and one additional comparison is needed to determine what to report when the loop is
terminated. If the desired item is in the nth position, 2n +1 comparisons are required.
If we assume that any position is just as likely, then we should take the arithmetic mean to find
the average complexity.
While it is true that n + 2 is usually smaller than 2n + 1, both of these are (n).
Sorting is by nature a more complex task than searching. The sorts that we have looked at are
(n2). The most common complexity levels are:
Complexity Terminology
(1) constant
(log(n)) logarithmic
(n) linear
(nlog(n)) nlog(n)
(nb) polynomial
(bn), b > 1 exponential
(n!) factorial
Problems which are deemed solvable in worst case time are called tractable; those which are not
are called intractable. Some problems are simply not solvable. There is a specific class of
problems which are called NP-complete. These are problems are not determined, but if any of
them can be solved by a worst-case polynomial time algorithm, then all of them can.
Page 198 of your text has a table showing the length of time required to complete an algorithm
for the last six levels of complexity in the previous table for various input sizes. Of interest is
how fast exponential and factorial algorithms exceed reasonable time expectations.
HW
26.
Greedy Change Maker
For a given amount of change (some positive integer less than 100) this algorithm will determine
the fewest number of coins. The list, Coin, is set up as {25, 10, 5, 1}. 0 sto S
4 sto dim(LA)
Fill(0,LA)
Input "Int < 100 ",N
For(I,1,4)
While N≥LCOIN(I)
S+LCOIN(I)sto S
N-LCOIN(I)sto N
LA(I)+1 sto LA(I)
End
End
Disp "Quarters", LA(1)
Pause
Disp "Dimes", LA(2)
Pause
Disp "Nickels", LA(3)
Pause
Disp "Pennies", LA(4)
Stop
There will be a total of five comparisons on the For-Loop (the last of which will terminate the
loop). Since the smallest coin has value of 1, there will be at most n+1 opportunities for the
While-Loop to make a comparison. Worst case scenario, n+6 comparisons, O(n).
TOC
Division Algorithm if a and d are integers, d > 0, then there are unique integers q and r such that
a = dq + r, 0 ≤ r ≤ d.
Examples
Two examples will suffice; although, by trichotomy we could have the following cases:
a d result
8 |14 1 14 div 8 6 14 mod 8
+ + first example
a 14 d 8 q 1 r 6
+ 0 undefined
+ - WLOG second example
0 + zero
0 0 undefined
0 - zero
8 | 14 2 14 div 8 2 14 mod 8
- + second example
a 14 d 8 q 2 r2
- 0 undefined
- - WLOG first example
Note that we have defined the remainder to be positive, so we do not get q = -1, r = -6.
For reasons which I can only assume are aesthetic, it is considered good form to make the
denominator positive. Thus, in modular arithmetic, where we are only interested in the
remainder, we always assume a positive divisor.
The notation for this type of arithmetic is, "a mod b". Where a represents the remainder and b
represent the divisor. By definition, two integers are considered equivalent under modulo m (m a
positive integer) if m|(a-b). This equivalence is called congruence.
Theorem 1
Let a, b, and c be integers.
i. a | b a | c a | b c
ii. c a | b a | bc
iii. a | b b | c a | c
Theorem 3
Let a and b be integers, and m a positive integer.
a b mod m a mod m b mod m
Theorem 4
Let m be a positive integer.
a b mod m k a b km
Theorem 5
Let m be a positive integer.
a b mod m c d mod m a c b d mod m ac bd mod m
Applications
Hashing functions, take some input, perform some manipulation and return an output. The
remainder from integer division is a potential hashing function. If the output is to be a storage
location, the divisor should be equal to the number of locations.
A collision occurs when a hashing function sends more than one input value to the same output
value.
Cryptography is another application, but the functions used must be invertible, or the message is
not decodable.
HW
If the mod function is not built in to your calculator, this might be useful. (Recall the floor
function is stored in Y9
Input "Dividend ",A
Input "Divisor ",B
Y9(A/B) sto Q
A-(B*Q) sto R
Disp "{Q R} ",{Q,R}
Stop
TOC
Theorem 1 FTA
Every positive integer is either prime, or it can be written as the product of primes.
Theorem 2
If n is a composite integer, then n has a prime divisor less than or equal to the square root of n.
Theorem 3
There are infinitely many primes. BWOC P = {p1, p2, p3… pn} is the set of all primes. Let
n
s 1 pi . If s is a composite, then there is a prime that divides s. But this number would also
i 1
n
divide s pi 1.
i 1
Theorem 4 PNT
The ratio of the number of primes not exceeding x and x/(ln(x)) approaches one as x goes to
infinity. Thus the odds of a randomly selected positive integer being prime is 1/ln(n).
The greatest common divisor of two numbers, a and b, is the number g such that if f is any other
number that divides both a and b, g > f.
Given a set of integers, it is pairwise relatively prime if any two elements not equal to each other
are relatively prime.
The least common multiple, m, of two integers, a and b, is the smallest integer such that a|m and
b|m.
HW
14.b. Recall,
n
ar n 1
k 0
ar k =
r 1
r 0,1
The sum of the proper divisors for 2 2 1 will include 20 through 2 p1 and
p 1 p
20 2 p 1 through 2 p 2 2 p 1 as terms. These two sums are both geometric and fit the form
above.
26. Writing the two integers as gcm*j and gcm*k should help.
proof is trivial.
TOC
Section 6: Integers and algorithms
Theorem 1
If b is a positive integer greater than 1, and n is a positive integer, then there is a unique
representation of n using b as the base.
We are most familiar with base ten, but other bases have been used, the French word for 80,
quatre vingt (four twentys), suggests that at one time a base 20 system was used in Europe. The
Babylonians used a base 60 system. In computer science, binary, octal and hexadecimal systems
are all employed for some purpose or another.
This routine will "convert" a base 10 positive integer into its constituent parts in another base.
The values for each place are stored in L1. The function Y9 used in the program is the floor
function defined previously. The first position is the 1's position, the second is the b's position,
etc.
ClrList L1
Input "Int To Cnvrt ",N
Input "Base ",B
N sto Q
1 sto K
While Q≠0
Q-(B*Y9(Q/B)) sto L1(K)
Y9(Q/B) sto Q
K+1 sto K
End
Stop
Converting from binary to Hex is performed by grouping the binary bits into groups of four bits,
nibbles. Insert leading zeros if necessary on the last nibble. Each of these nibbles, when
translated is the appropriate value for that place.
Example
Binary 0001 1101 0000 1111 0101
Hex 1 D 0 F 5
An adding algorithm in binary, A is stored in L1, B in L2 and A + B in L3. As before the place
values increase as the order in the list increases.
0 sto C
dim(L1) sto N
N+1 sto dim(L3)
For(I,1,N)
Y9((L1(I)+L2(I)+C)/2) sto D
L1(I)+L2(I)+C-2*D sto L3(I)
D sto C
End
C sto L3(I)
Stop
For simplicity, add leading zeros as needed to make the two addends the same length. This
algorithm is O(n) whether looking at adds or comparisons.
A multiplication algorithm can be constructed using the pattern for multiplication as we currently
know it in base ten. The actual multiplication is very straightforward, 0* = 0, 1* = *.
1 0 1 1 0
× 1 0 1
1 0 1 1 0 shift 0
0 0 0 0 0 shift 1
1 0 1 1 0 shift 2
1 1 0 1 1 1 0 sum
n 1
The number of shifts required when the multiplier has n digits is, i . This makes the
i 0
complexity O(n2). As seen previously the sums are performed O(n), making the entire algorithm
O(n2).
Your book also provides an algorithm for finding the quotient and remainder for the division of
two integers as well as modular exponentiation. The Euclidean Algorithm for finding the gcd of
two integers is a good example of working smarter, rather than harder. Suppose that we have two
integers, a and b. WLOG, a < b. If we find the quotient of b divided by a to be ad + r, it follows
that any integer which divides both a and b must also divide both a and r. These being smaller
than the original pair of integers, pose a simpler problem. Repeat as needed until r is zero.
Example
Find the gcd of 1529 and 14038. (Using the mod program developed earlier.)
14039 = 9(1529) + 278
1529 = 5(278) + 139
278 = 2(139) + 0
The gcd is 139.
HW
20. 11644 mod 645 the hard way:
i ai x power power expanded power result
2
0 0 1 11 mod645 121mod645 121
1 0 1 1212mod645 14641mod645 451
2 1 451 4512mod645 203401mod645 226
3 0 451 2262mod645 51076mod645 121
4 0 451 1212mod645 14641mod645 451
5 0 451 4512mod645 203401mod645 226
6 0 451 2262mod645 51076mod645 121
7 1 391 1212mod645 14641mod645 451
8 0 391 4512mod645 203401mod645 226
9 1 1 2262mod645 51076mod645 121
The table above, was completed by programming the algorithm for the TI-83+, and inserting a
pause and display step at the end of the loop.
Section 8: Matrices
The dimensions of a matrix are given by number of rows, number of columns. Two matrices
with the same dimension may be added by adding the corresponding elements. Matrix addition is
commutative.
Aj ,k Bk ,l C j ,l to multiply A and B, B must have the same number of rows as A has columns.
The resulting matrix will have as many rows as A, and as many columns as B.
ci , j ai b j the element in the i, j position of the resulting matrix, C, is the sum of the products
of the elements in row i of A and column j of B.
A square matrix, n n, consisting of 1’s when the row and column number are equal and 0’s
elsewhere is called the identity matrix of order n. If A is an m by n matrix, AIn = ImA = A.
When dealing with a square matrix, an exponent is understood to mean repeated multiplication,
or the identity matrix when the exponent is zero.
The matrix operation, transpose, is indicated by a superscript, “t”, and is the result of
interchanging the rows and columns.
Example
a1 a2 a3 a1 b1 c1
A b1 b2 b3 A a2
t
b2 c2
c1 c2 c3 a3 b3 c3
A zero-one matrix is a matrix whose entries consist of 0’s and 1’s. When a matrix of this form is
used to represent a discrete structure, the Boolean operators and are used with a 1 acting as
True, and a 0 as False. A B, called the join, is the result of an element by element or. A B,
called the meet, is the result of an element by element and.
The Boolean product of A and B also requires that B have the same number of rows as A has
columns. Its result also will have the same number of rows as A, and the same number of
columns as B. The Boolean product of A and B is symbolized by, A B , it differs from an
ordinary product in that the multiplication is replaced with the “and” operator, and the addition is
replaced with the “or” operator.
Store zero-one matrices of the appropriate size in A and B. The Boolean product will be stored in
C.
Input “A Rows : M
Input “B Cols : N
Input “B Rows : K
{M,N} sto dim([C])
For (I,1,M)
For(J,1,N)
0 sto [C](I,J)
For(Q,1,K)
[C](I,J) or ([A](I,Q) and [B](Q,J)) sto [C](I,J)
End
End
End
Stop
Finding the product of two n by n matrices, Boolean or otherwise, is typically an algorithm with
complexity O(n3).
HW
14. Let A and B both be n by n matrices. Consider the two cases for element position in the
resultant matrix.
Case 1, off diagonal, that is, row i and column j, i j. WLOG i < j, then cij = ai1b1j + ai2b2j + ai3b3j +
…+ ainbnj. For any given term, if aik 0, then bkj = 0 since i < j. This means all of the off diagonal
entries are zero.
Case 2, on diagonal, that is, row i and column j, i = j. The entry for ckk = ai1b1j + ai2b2j + ai3b3j +…
+ akkbkk +…+ ainbnj. Since i = j, all terms will be zero except for the intersection of row and
column where the term will be the product of the two diagonal entries.
TOC
Chapter 4: Induction and recursion
Mathematical induction is a method of proving that some statement is true for all n, where n is
an integer, by the following:
P 1 nP n P n 1 nP n
In plain language, we know that no matter what value we pick for n, if what we claim is true for
that value, then our claim will also be true for the next value. We also know that our claim is true
when n is 1. What we have accomplished is a one to one mapping of our claim to the positive
integers. Whenever we can perform such a mapping of truth values for the premise we are
interested in, induction is a reasonable method for demonstrating that our premise is always true.
This method uses the terminology, basis step, to describe that our premise is true for n equal to 1,
and inductive step, to describe the demonstration that truth for n implies truth for n + 1.
Example
Show that the sum of the first n, odd positive integers is, n 2 .
1. n 1 S1 1 12 True Basis step.
2. By assumption, the sum of the first n 1 odd integers will be n 2 2 n 1 1 Inductive step.
3. n 2 2 n 1 1 n 2 2n 1 n 1
2
In the example above, we have showed that our premise was valid for n = 1, as our basis step.
What is important is that we show that our formula works for the first case, whether that is n = 1,
or some other number.
Example
n
n!
Prove that
i 3
i
2
. Our basis step will be to show that this is true for n = 3 rather than one.
3
1* 2*3 n !
i 3
i 3
2
2
n
n!
Assume i
i 3 2
n 1
n! n 1 !
i 2 n 1
i 3 2
Of equal importance to proving equality, is proving inequality. Induction is valid for this purpose
as well.
Example
Prove that for n > 6, 3n n ! . Here we are not only considering an inequality, but also a basis step
that starts at a number other than one.
37 2187 5040 7!
Since n 6, n 1 3
3n 1 3n3 n ! n 1 n 1 !
It should be noted that induction is a great way to determine if something is true, it is generally
speaking not a good method for determining what that something might be. For example, we
have formulas for determining the nth partial sums of arithmetic and geometric sequences.
Induction can be used to demonstrate that these formulas are valid, but the use of induction
presumes that we already have a formula at hand, how can we show that when something is true
for n, that it must be true for n + 1, when we do not have that something in the first place?
Example
n
Consider the sum, 6 2 . While it has some of the characteristics of an arithmetic series, a
i 1
3 i
constant plus some value, it is not arithmetic in that we do not add the same value to each term. It
also has some of the characteristics of a geometric series; each term has a multiplication by a
half. But the entire term is not multiplied by 1/2, so it is not geometric. We may believe that we
n n
ar n 1 a
can split this into two sums, one , , and the other ,
k
c nc ar . We can then use
k 1 k 0 r 1
induction to see if we are correct.
i 3
12
n n n
6 2 3 i
6
i 1 i 1 i 1
i 3 i
n n
1 n
1
i 1
6
i 1 2
6n 8
i 1 2
n
1
n 1 i 4 4 i
n
1 1
6n 8 6n 4 6n
2
i 1 2 i0 2
1
2
n
1
4 4
6n
2
6n 8 23 n
1
2
6 2 6n 8 2
i 1
3 i 3 n
i 1
6n 8 2 6 2
3 n 3 n 1
6 n 1 8 2 2 n
Induction proofs about sets, are still proofs about sets. This means that we will most likely be
making claims about membership of the elements. Consider the following two situations.
Prove that if A1 , A2 ,... An and B1 , B2 ,...Bn are sets such that Aj B j for j 1, 2,..., n , then
n n
A B
j 1
j
j 1
j .
The basis step merely requires that A1 be contained in or equal to B1. This is given as true in the
statement of the problem. Assuming that the premise is true for n, leaves us having to show that
it is true for n + 1.
To do this, we consider membership in the left hand side, and show that it must imply
membership in the right hand side.
n 1
Let a A j
j 1
Case 1
n n
a A j a B j by assumption.
j 1 j 1
n n 1
a B j a B j by definition of union.
j 1 j 1
Case 2
a An 1 a Bn 1 by initial premise.
n 1
a Bn 1 a B j by definition of union.
j 1
j A B Aj B
j 1 j 1
Aj An 1 B A j B
j 1 j 1
HW
30.
n 1
n 3 2n 3
3|3
n 1 2 n 1 n3 3n 2 5n 3
3
TOC
In simple induction, we start with a basis step in which we show that our premise is true for n
equal to some starting value, typically one. We then show that whenever our premise is true for
some value k, that it must also be true for k + 1. For strong induction, we must show that when
our premise is true for all numbers less than k, that it must be true for k = 1. Strong induction is
sometimes known as complete induction or the second principle of induction. Two examples
from the book should illustrate.
Example 1
The match game is a game where two piles of matches are laid side by side. Players take turns
removing as many matches as they want from one pile or the other. If both piles initially have the
same number of matches, then the second player has a winning strategy.
The basis step, when n is 1 is true. The first player is forced to take the match in one of the piles
leaving only one match left for the second player.
The induction step requires that we provide a winning strategy for n = 2 through n = k which
guarantees a winning strategy for n = k + 1. Player 2 duplicates player 1’s move. If player 1
leaves k matches in pile 1, then player 2 leaves k matches in pile 2. Ultimately player 1 either
leaves 1 match in pile 1 or no matches. If player 1 leaves 1 match, so does player 2. This brings
us to the basis step, which has been established. If player 1 removes pile 1, then player 2
removes pile 2 winning the game. Now if there are k + 1 matches in each pile, player 2 follows
the same strategy outlined above for the first k matches leading to the basis step.
Example 2
Prove that every amount of postage of 12¢ or more can be formed using only 4¢ and 5¢ stamps.
The basis step is established by the fact that three 4¢ stamps will make 12¢. We can make 13-
15¢ by replacing a 4¢ stamp with a 5¢ stamp.
For the inductive step, we assume that P(j) is true for 12 j k, where k ≥ 15. Since P(k-3) is
true, 12 k – 3 15, we need only add a 4¢ stamp and we have P(k + 1).
The notion of well ordering is that every nonempty set of nonnegative integers has a least
element.
HW
6. a.
3 3 3 10 33 10 10 3
3 3 33 3 3 3
3 33 3 3
33 3
3 3
3
3 6 9 10 12 13 15 16 18
At this point, replace three 3’s with a 10 to add 1. Replace six 3’s with two 10’s to add 2, or add
another 3 to add 3. So, all of the listed positive integers, and all positive integers greater than 18
can be formed with only 3’s and 10’s.
TOC
A recursively defined function provides a function value for the oth term, then defines f(n) in
terms of f(n – 1). Sometimes it requires the first k terms to be provided, then f(k + 1) is given in
terms of f(1) through f(k). A good example of this is the Fibonacci numbers. Some recursive
sequences are given below.
Arithmetic – an = an-1 + d
Geometric – an = an-1(r)
Fibonacci – fn = fn-1 + fn-2, given f0 = 0, f1 = 1, n = 2,3,4,…
Functions of this type are called well defined, which is to say that its values are found with no
ambiguity.
Sets may also be defined recursively. Sometimes it is specifically mentioned that only the basis
element(s) and those elements found by applying the recursion formula. This is known as an
exclusion rule. We will assume that the exclusion rule is operative without specifying it.
Example
Let be the set of symbols, {0, 1}. Let * be a set of, strings formed by concatenation of these
symbols using the following rule. The empty string, “ “, is an element of *. If w is in * and x
is in , then wx is in *.
HW
12. Recall that f0 = 0.
24. c. Let S be a set containing the integers, thus all of the constant polynomials. If p(x) is in S,
then xp(x) + n is in S.
TOC
Chapter 5: Counting
The two most basic counting principals are the product and the sum rules.
The product rule is employed when a task may be represented as a sequence of choices, each
n
with ci possiblities. The total number of possibilities for the completed task is given as c .
i 1
i
When using the product rule, care should be given in determining whether or not the choices are
independent. For instance, if we are assigning ID’s to accounts, with each ID assigned, there is
one less available to assign. If we are picking digits for an ID number, there are ten available for
each position. This notion is useful when counting the number of steps performed in a nested
loop.
The following code has been written to find the sum below.
4 5
i j
i 1 j 1
0 sto S
For (I, 1, 4)
For (J, 1, 10)
S + (I + J) sto S
End
End
Disp S
Stop
In the outer loop, there are four valid values for I, in the inner loop, there are ten valid values for
J. This means there will be forty times that the code, “S + (I + J) sto S,” will be implemented.
When tasks are selected from one of a variety of sets of tasks, each with ni choices, then the sum
N
rule is in play. It states the number of possibilities will be n , where N is the number of sets.
i 1
i
The significant difference is that the product rule applies to a sequence of tasks whereas the sum
rule applies to a single simple task.
Some counting problems will require that both rules be employed, and some will require that you
be aware of multiple counting.
Example
How many number between 1 and 100 inclusive, are divisible by either 2, 3, 5, or 7?
There are 50 numbers in the first 100 positive integers divisible by 2, but some of these are
divisible by 3 and or 5 and or 7. If we just made a list of divisible by 2, divisible by 3, divisible
by 5 and divisible by 7, we would conclude that 117 numbers fit this requirement. This is an over
count. The correct answer is 78.
HW
20.
a. Note, the lowest integer to examine is 1, and the largest is 999.
b-f I used Excel to get the following values
The columns seven and eleven ask if the integer portion of the division is equal to the
division, if it is a 1 is placed in the column, if not a zero. The sums represent the number of
integers divisible by either seven or eleven. The column, prod, is the product. It will be a 1
only if both are a 1. A simple Venn diagram should sort out the various divisibility
categories.
g. Assume no leading zeros, then we have three cases:
Case single digit double digit triple digit
product rule 9 9*9 9*9*8
While there are 10 digits, the first cannot be a zero so we have 9 choices for the first digit.
h. You may want to count the odds, then subtract.
38. For a set with n elements, there are 2n subsets in the power set. One of these is the empty set,
and n of them have a single element.
53. There are two ways that come to mind for arranging these letters,
linear and circular. If the arrangement is circular, we would need to
specify a rotation direction, or that direction did not matter. What we want
in this case is a linear string where the string is to be read from left to
right. Consider the four cases:
i. a is in the first position
ii. a is in the second position
iii. a is in the third position
iv. a is in the fourth position
TOC
The pigeonhole principle simply states that if you have more than k objects, and only k boxes,
then at least one box has more than one object. The generalized pigeonhole principle states that if
N
N objects are placed into k boxes, then there is at least one box with at least objects. A
k
proof by contradiction is provided in your text. Recall that the ceiling function provides an
integer as its output. If all k of the boxes had one fewer objects then we see that:
N N
k 1 k 1 1 N
k k
N N
note that 1 1
k k
The most common type of question requiring the use of the pigeonhole principle is of the form,
what is the minimum number of objects required to guarantee at least r of these objects must be
in one of the k boxes?
Example
How many cards from a standard deck of 52 must be selected to guarantee at least three cards of
the same suit are chosen? There are four suits in a standard deck, these will correspond to the k
N N
boxes, that is k = 4. We want to find N so that 3 2 N 8 . Nine cards suffice.
4 4
While the concept is simple, sometimes the implementation is not.
HW
30. N = 100,000,000; k = 99,999,999
31. N = 677; k = 38
TOC
Combinations are concerned with which elements are to be included, not with what order the
elements will appear. If we were interested in knowing who was in the top ten list, but not in
where they appeared in the list, it might occur to us that if we had all of the permutations, then
we would have in effect every combination represented by all of its permutations. Thus from the
Boston Marathon example above, for any distinct group of ten people, there are 10! permutations
for that group. Dividing by the number of permutations for each group will give the number of
n n! n
groups (combinations). The general formula is . The LHS, , is read, “n
r n r !r ! r
choose r.” This is what Pascal was really up to when he derived the binomial theorem. The
“choose” comes from the fact that this tells us how many distinct ways r objects can be chosen
n
from a set n. The notations , nCr and C(n,r) all refer to combinations.
r
By the way, there are 12,081,513,884,639,814,681,154,670,562,780,037,639,564,800
permutations for a top ten list of the 2007 Boston Marathon, but only
3,329,341,348,280,372,211,517,490,785,598,555,346 combinations.
HW
12. This problem deals with combinations, in part a we are choosing how many ways to pick
three positions out of twelve to contain a one. In part b we add the combinations for no 1’s, one
1, two 1’s and three 1’s. In part c, we could do the same thing, or we could subtract from 212,
total number of bit-strings of length twelve the values we don’t want.
TOC
j 0 j 0 1 n 1 n
Using the formula for combinations as the coefficient makes sense in that we know that the sum
of the exponents in each term will be n. We need to choose how many ways j of these n can be
assigned to y, with the rest assigned to x. This is a combination. This becomes a little less clear
when we expand a binomial where the variables in the binomial have coefficients other than one.
Example
Find the coefficient of the 4th term in the expansion of 2 x 3 y . The first thing to remember is
7
that we start counting with the 0th term, so the 4th term will be,
7
3 2 x 3 y 35 2 3 15120 .
4 3 4 3
Some interesting results can be obtained from the Binomial Theorem when we remember what
the constituent parts are:
n
n n n k nk n
n
1 1 k 1 1 1 1 1 1
k nk n n
Corollary1 2n
k 0 k k 0 k k 0
n
n n
n n
n
k k 1 1 k 1 1 1 1 1 1
k k nk k n k n n
Corollary2 1 0
k 0 k 0 k 0
n
n n n nk k n
n
2 k k 1 2 k 1 2k 1 2 1 2
k nk n n
Corollary3 3n
k 0 k 0 k 0
Despite its name, Pascal’s Triangle was
known to the Chinese as early as the
twelfth century, around five centuries
before the time of Pascal. It is used as a
means of generating binomial coefficients
and as such, once methods for solving
quadratic and cubic equations were known,
was used to generalize these methods to
higher roots.
This theme was taken up by James Bernoulli (c. 1712) and later mathematical writers who turned
their interest to probability, primarily on behalf of the new insurance industry. "Il est tres bon
ésprit," wrote Pascal to Fermat about the Chevalier, "mais quel dommage, il n'est pas geometre."
["He's a fun guy but, alas, no mathemetician."]
Pascal’s Identity
n 1 n n
k k 1 k
A combinatorial proof from your text (a proof in which it is demonstrated that both sides are
counting the same set of objects, only in different ways) is given below.
If T is a set with n + 1 elements and a is one of those elements, and S is the set T with the
n 1
element a removed from it, we will note that there are subsets of T that contain k
k
elements, the LHS. Now a subset of T with k elements can have a in it along with k – 1 other
n
elements, of this, or it did not have a in it and was merely k elements chosen from n. The
k 1
sum of these (sum because these are independent and or) is the RHS.
Some other identities are:
m n r m n
r r k k
k 0
2
2n n n
n k 0 k
n 1 n k
r 1 k r r
The proofs for these are given in your text, and are easy to follow.
HW
19.
n 1 n 1 !
k
n 1 k !k !
n n n! n!
k 1 k n k 1 ! k 1 ! n k !k !
n! n! n !k n 1 k n !
n k 1 ! k 1 ! n k !k ! n 1 k !k !
n !k n 1 k n ! n ! k n 1 k
n 1 k !k ! n 1 k !k !
n ! k n 1 k n ! n 1
n 1 k !k ! n 1 k !k !
n ! n 1 n 1 ! n 1
n 1 k !k ! n 1 k !k ! k
28.
a) Suppose you had a group of n men and another group of n women, and you wished to pick
2n n
two people from this collection of 2n people, . You could choose two men, , or two
2 2
n n n n 2
women, also , or you could choose one of each, 1 1 . This is 2 2 n
2
TOC
The rule for combinations is slightly less obvious. For a set with n elements from which we will
choose r, with repetitions allowed, we will use the formula corresponding to C(n + r – 1, r). Your
text describes the process of making this decision in terms of stars and bars. With n – 1 bars, we
can partition an interval into n bins. In each of these bins we will place the number of stars
equivalent to the number of times we choose to repeat the selection of the item contained in that
bin. In example two we are faced with a bowl containing three types of fruit, from which, we are
to choose four pieces. While we are choosing from a set of fruits and repetitions, we are still
actually choosing four things. The n in this problem is three, the r is four, so the solution is C(3 +
4 – 1, 3) or C(6, 3) which equals 15.
Some counting problems resolve themselves to the equivalent problem of choosing boxes in
which we will place objects. We may consider the case of boxes and objects which can be
distinguished or which are indistinguishable.
The problem of n distinguishable objects and k distinguishable boxes is the same as permutations
of n objects in which the objects are partitioned into k categories with repetitions of the objects
in each category.
The problem of n indistinguishable objects with k distinguishable boxes is the same as the
combinations with repetitions.
The problem of distinguishable objects and indistinguishable boxes is much more complex. The
notion of Stirling numbers is employed in this sort of problem. Let S(n, j) denote the number of
ways to distribute n distinguishable objects into j indistinguishable boxes so that no box is
k
j ! i 0 i
Finally, n indistinguishable objects into k indistinguishable boxes resolves itself into a k-partition
of n, that is finding the integer sums so that we have no more than k integers, and no fewer than
one integer, such that the sum of these integers (non-negative) is n. There is no nice formula for
this number. We will require that the integers as read from left to right be non-increasing.
HW
16.
a) If all the x’s are greater than 1, this means they are all at least 2. Subtracting two from
each integer on the left and 12 from the right gives the problem of non-negative integers
whose sum is 17. This is C(6 + 17 - 1, 17)
b) - d) Look at the restrictions as eliminating possible non-negative integers. n is 6, r is
what is left after eliminating the integers.
54. This asks for the partitions of 5 into at most 3 parts. They are enumerated below:
box 1 2 3
5 0 0
4 1 0
3 2 0
3 1 1
2 2 1
TOC
Chapter 9: Graphs
Section 1: Graphs and graph models
A graph is a set of vertices, sometimes called nodes, and a set of edges (arcs) that connect the
vertices. The use of straight lines or curved lines, overlapping lines or non-overlapping is only
important aesthetically. While some graphs cannot be drawn on paper without overlapping the
edges, what is important is that the connections between vertices be accurate.
A graph is a visual representation of the relationships between a set of objects. Graphs may be
employed as a method of identifying patterns that might not be apparent when the information is
displayed by other means. As an example of this, suppose ten people are at a party, Alice, Betty,
Charles, David, Eric, Frank, Gail, Helen, Irene and James. An observer records whom each
person talks to over the course of the party. Using this information, we want to see if any patterns
emerge. A tabular representation of this data is presented.
Alice Betty Charles David Eric Frank Gail Helen Irene James
Alice X X X
Betty X X X X
Charles X X X
David X X X
Eric X X X X
Frank X X X X X
Gail X X X
Helen X X
Irene X X X
James X X X X X X X X X
Find the name in the row; an X in the column indicates that the person in that row talked to the
person in that column. By looking at the table, we see that James talked to everyone. The rest of
the people talked to two, or more other people.
Letting the people represent nodes, and edges represent conversations, we obtain the graph
below.
A C
E G B D
I
F
J H
The letters are the first letters of the person’s name. The dots represent the person. The lines
indicate conversations between two people. James talking to everyone kind of clutters the
picture, but in the table and the graph, it is easy to see who is hosting the party. Lets take James
out of the graph and see what happens.
A C
E G B D
I
F
H
The graph with James removed allows us to see that there are three cliques at the party:
Alice, Eric and Gail
Betty, Charles and David
Helen and Irene
Frank seems to be the “odd duck”. He talked to one member of each group, but did not appear to
get accepted by any group. He also talked to himself. Maybe Frank was actually providing
security and was really talking into a hidden microphone to his backup. Or perhaps Frank was
muttering to himself after being snubbed by yet another group.
While the visual appeal of graphs is a major draw, we can infer a lot from graphs mathematically
as well. First we need to establish some definitions.
A simple graph is a graph in which there is no more than one edge between the same two
vertices. The edges do not have any implied direction of travel, and will not connect any vertex
to itself.
A multigraph differs from the simple graph in that it does allow for redundant edges between
vertices. It also will not allow loops.
A digraph has directed edges. Often, edges are referred to by the pair of vertices that they
connect. In a digraph, this is an ordered pair, (Start, End).
Some graphs have been used sufficiently often to merit their own specialized names:
HW
12. Since the graph is not directed, uRv vRu, thus the relation is symmetric. The presence of
loops at every vertex guarantees uRu, therefore the relation is reflexive.
13. c.
A1 x | x 0 A2 x | 1 x 0 A3 x | 0 x 1
A4 x | 1 x 1 A5 x | x 1 A6
TOC
Vertices connected by an edge are called adjacent or neighbors. The edges are said to be
incident to or incident from the vertices.
The degree of a vertex, deg(v), is the number of edges incident to it. Note that a loop contributes
to the degree of a vertex twice, once upon leaving, then again upon returning.
A vertex is isolated if it has a degree of zero, and is pendant if it has a degree of one.
The Handshaking Theorem, in an undirected graph with e edges, deg v 2e . Your text
V
presents the following as a theorem, but perhaps it should be a corollary, an undirected graph has
an even number of vertices of odd degree.
In a directed graph we refer to edges as being incident from the initial vertex, and incident to the
terminal vertex. Also, in a directed graph, we distinguish between the in-degree (superscripted
with a minus sign) and the out-degree (superscripted with a plus sign). This prompts the theorem,
deg v deg v E .
V V
When the directions of the edges are ignored, we refer to the resulting graph as the underlying
graph.
Some special simple graphs are the complete, cycle and wheel graphs.
The complete graph is formed by placing edges as needed so that any two
vertices are adjacent. If v is the number of vertices, then the number of
edges is v(v-1)/2. Complete graphs are refered to as Kn, where K is used to
denote complete, and n denotes the number of vertices.
If we remove the interior edges, we have a cycle. Cycles have exactly the same
number of edges as vertices, but need at least three of each for the name to
make any sense. The same notation as above is used, with the
exception that C is used for cycle. C5 is shown on the right.
Another specialized graph is the n-cube. We can think of the n-cube as a progression of graphs,
most of which do not look like what we commonly consider as a cube. Q1 is a single edge
connecting vertices labeled 0 and 1. To increase the order to Q2, copy Q1 so that you now have
two graphs with a single edge connecting vertices labeled 0 and 1. On one of them, insert a
leading 0 to the names of the vertices, on the other a leading 1. Connect the vertices that match
except for the leading digit. See page 602 of your text.
A bipartite graph consists of two distinct sets of vertices. Edges are used to connect vertices from
one set to the other. We sometimes consider complete cases of the bipartite graph, by which we
mean every vertex in set A is connected by an edge to very vertex in set B.
Assigning jobs to employees could be modeled with a bipartite graph. LAN’s may be modeled
with stars, cycles or wheels.
Sub graphs, like subsets are formed by taking part or all of an existing graph. A proper sub graph
is not a simple copy. We say that we have taken the union of two graphs when we have added
any new vertices and edges contained in one to the other.
HW
n
36. Part of this problem is solved by the so-called Handshaking theorem, if d mod 2 0 , then
i 1
i
f) The sum of the degrees is even. Since all vertices have a degree of 1, the graph will not
be connected. Three pairs of vertices with one edge joining them will form the graph.
g) Again we have an even sum, so a graph may be possible. In fact the graph below is one
example.
h) The sum is 20, so a graph may be possible. There is a problem however with the size of
the degrees. With 6 vertices, we cannot have two with degree 5 and still have a simple
graph. The first may neighbor the remaining five, but the second vertex with degree five
will also have to neighbor the other five. But one of the other five has degree 1.
TOC
Other than simply drawing the graph, a graph may be represented by a list of edges, (va,vb),
where the edges are identified by the vertices they connect. Alternatively an adjacency list could
be used. These methods are appropriate for graphs without multiple edges. With minor
modifications they could be adapted to multi-graphs as well.
If the graph is directed, the edges in the edge listing will be ordered pairs (vo,vt), and the
adjacency list will have as the left hand column a list of initial vertices.
The adjacency matrix is an n by n matrix (n being the number of vertices) with one row and one
column reserved for each vertex. The cell in the ith row and jth column will be an integer
indicating the number of edges connecting vertices i and j. If there are no loops in the graph, the
main diagonal will always consist of zeros. The adjacency matrix for the graph above is:
A B C D E F
A 0 1 0 0 1 0
B 1 0 1 1 0 0
C 0 1 0 1 0 0
D 0 1 1 0 1 0
E 1 0 0 0 0 1
F 0 0 0 0 1 0
Considering that the labels for the vertices are arbitrary, an adjacency matrix is not necessarily
unique. Irrespective of the ordering of the vertices, whether loops and or multiple edges are
allowed, all adjacency matrices are symmetric.
Another matrix representation for a graph is the incidence matrix. For this representation, the
edges are listed as columns, the vertices as rows. If an edge is incident to a vertex, a one denotes
it. Every column will thus have two ones, and the rest zeros. The exception to this is when the
edge is a loop. If an edge is a loop, then there will be a single one in that column.
A graph G1 is said to be isomorphic to G2, if there exists a bijection f which maps the vertices in
G1 to the vertices in G2, f(v1) = v2, such that if a and b are adjacent in G1, then f(a) and f(b) are
adjacent in G2. Properties preserved by isomorphism are referred to as graph invariant; these
include adjacency, degree, the number of vertices, connectedness, and the number of edges. If
we can determine the correspondence between the vertices, then the adjacency matrices for the
two graphs will be identical.
The graphs are isomorphic. The function has been stated explicitly and the adjacency matrices
are identical when the RHG is ordered by f(v).
TOC
Section 4: Connectivity
A walk is a sequence of edges such that any two successive edges in the sequence share a vertex
(aka node). The walk is also considered to include all the vertices (nodes) incident to those
edges, making it a subgraph.
A trek is a walk that does not backtrack, i.e. no two successive edges are the same.
A trail is a walk where all edges are distinct and all vertices. By distinct, we mean that no edges
or vertices are repeated.
A path is a walk where all edges are distinct, but vertices may be repeated.
A circuit is a path that ends on the same vertex from which it started.
A graph is connected if all vertices can be joined by a path, and is disconnected if at least two
vertices may not be joined by a path.
Components of a graph are the connected portions of a disconnected graph, as well as any
isolated vertices.
A bridge or cut edge is an edge which, if removed, would cause the graph to become
disconnected.
When dealing with a directed graph, we have two notions of connectedness. A digraph is
strongly connected if for any two vertices in a graph there is a path from one to the other and
vice versa. A digraph is weakly connected if there is a path between any two vertices when the
directions are removed from the edges. Paths and circuits are invariant under isomorphisms.
The number of paths between two vertices in a graph can be obtained using the adjacency
matrix. If you want the number of paths of length r between v1 and v2, then look at the entry in
the cell corresponding to the intersection of the two vertices in Ar. Your text provides a proof by
induction for this result.
HW
You should be able to do these without help.
TOC
The Seven Bridges of Konigsberg - In Konigsberg, Germany, a river ran through the city such
that in its center was an island, and after passing the island, the river broke into two parts. Seven
bridges were built so that the people of the city could get from one part to another.
The people wondered whether or not one could walk around the city in a way that would involve
crossing each bridge exactly once. Answering this question is Leonhard Euler, a Swiss
mathematician tutored by Johann Bernoulli. Euler lost sight in his right eye in 1735, and in his
left eye in 1766. Nevertheless he continued to publish his results by dictating them to his wife.
(There is some controversy concerning some of these dictated works. Some people speculate that
some of these papers may be the work of his wife who put his name on the work because of
prejudice against female mathematicians.) Euler was the most prolific mathematical writer of all
times finding time (even with his 13 children) to publish over 800 papers in his lifetime.
Euler’s second graph theorem: If more than two vertices have an odd degree, then no Euler
Path exists.
If the graph is connected and has exactly two odd degree vertices, then at least one Euler Path
exists.
All such Euler Paths must start at one of the odd vertices, and end at the other.
{FB, BA, AJ, JB, BC, CD, DE, EC, CI, IG, GE, EF, FG, GI, IH, HG, GF}
While Euler focused on the edges, another mathematician turned his attention to the vertices.
In a connected graph, if every edge is traveled, every vertex will be visited. Is it possible to visit
every vertex without traveling every edge? If this can be done in more than one way, is it
possible to find the shortest way?
Sir William Rowan Hamilton focused his attention to this problem. Hamilton’s original work in
this regard was intended as a game, and he sold it as such to a toy and game manufacturer. In
1857 Hamilton described his Icosian game at a meeting of the British Association in Dublin. It
was sold to J. Jacques and Sons, makers of high quality chess sets, for £25 and patented in
London in 1859. The game is related to Euler's Knight's Tour problem since, in today's
terminology, it asks for a Hamiltonian circuit in a certain graph. The game was a failure and sold
very few copies.
We have two theorems that may be of some help in answering this question, Dirac’s and Ore’s.
Dirac’s Theorem: In a connected graph with three or more vertices, if each vertex is adjacent to
at least half of the remaining vertices, then the graph has a Hamilton circuit.
Ore’s Theorem: In a connected graph with three or more vertices, the sum of the degrees of
every pair of non-adjacent vertices is greater than or equal to the number of vertices, then the
graph has a Hamilton circuit.
HW
20.
Each vertex has the same in-degree as out-degree, the underlying undirected graph has all even
degree vertices. An Euler circuit should exist. Start with a, must go to d. From d go to b, then
return to d. Next to e, to b and back. Then e to c, c to b and back to a. {a,d,b,d,e,b,e,c,b,a}
TOC
In the late 1950’s a Dutch mathematician, Edsger Dijkstra proposed an iterative algorithm that
would find the shortest path between two vertices. The gist of his algorithm is to start at the
beginning of the path, proceed to find the shortest path to a next vertex, then to a vertex with one
intervening vertex, etc., until the end of the path is reached. The algorithm he proposed will find
the shortest path between two vertices in a connected, simple, undirected graph. The complexity
of this algorithm is O(n2), where n is the number of vertices in the graph.
When this problem is escalated to finding the shortest circuit, the complexity increases to (n –
1)!/2. Basically the solution is to list all such circuits, then sort by length to find the shortest.
A certain deli delivers sandwiches every day to five businesses; Jones‘, King's, Landry's,
Martin's and Novak's. Since the delivery person must follow the streets, the distances between
points are the number of blocks traveled. What is the shortest path that will start and end at the
deli, visiting all five businesses? (The map is on the next page.)
J
Deli K
Distribution of Lengths
50
Number of Circuits
40
30
20
10
0
24 26 28 30 32 34 36 40
Le ngth of Circuit
In reality, there are only 60. There are two circuits that tie for best, D-J-K-M-N-L-D and
D-K-M-N-L-J-D. It only took me a little over an hour and a half to discover this. Adding just one
more vertex would have made this problem six times as big.
In short, problems of this type (known as the traveling salesman problem) are NP-complete, that
is, they are not solvable in polynomial time. There are a host of algorithms which find
approximations. Two of these are, nearest neighbor, and cheapest link. These are both “greedy”
algorithms.
The nearest neighbor method simply says, of all the available options, take the closest.
To improve on this, try each of the vertices as the starting point. Once you have a circuit, you can
start anywhere.
The cheapest link connects the two closest, then the next etc. until a circuit is formed.
The only requirement is that the final path be a circuit.
The two examples we looked at both gave us the optimum solution. Neither one is guaranteed to
perform that well all of the time. In fact, often, the best you will get is around 80% or so of the
true optimum.
TOC
A rooted tree is called m-ary if every internal vertex has no more than m children, and full m-
ary trees are those in which all internal vertices have exactly m children. A rooted tree is
considered ordered if the children are arranged in order from left to right. When these are binary,
the children are referred to as left and right.
Trees have applications ranging from modeling chemical compounds, Bernoulli Trials,
organizational structures and many others.
Properties of Trees
All trees with n vertices have n – 1 edges.
A full m-ary tree with i internal vertices contains n = mi + 1 vertices.
A full m-ary tree with n vertices has i = (n – 1)/m internal vertices and l = [(m – 1)n +
1]/m leaves
A full m-ary tree with i internal vertices has l = (m – 1)i + 1 leaves
A full m-ary tree with l leaves has n = (ml – 1)/(m-1) vertices and i = (l – 1)/(m – 1)
internal vertices.
The level of a vertex in a rooted tree is the length of the unique path from the root to that vertex.
The height of a rooted tree is the maximum level in that tree. A rooted m-ary tree, of height h, is
called balanced if all leaves are at levels h or h – 1.
TOC
Start with an initial vertex, the root. The first item in the list is assigned to the root.
To add subsequent items, start with the root, if the new item is less than the root, move to the
left, else move to the right. When the item is less and no left child exists, it is inserted as a left
child. Similarly on the right. An example is perhaps the best way to ensure that the algorithm is
understood. The integers 1 through 18 were randomly shuffled. They appear in the order that
they are to be inserted into a binary tree: 9, 17, 12, 7, 16, 6, 2, 13, 4, 10, 8, 5, 14, 11, 15, 1, 3, 18
The first integer, 9, will be assigned as the key value for the root. Since 17 is larger than 9, it will
be assigned as the key to the right child. Next is 12, from the root we move to the right 12 > 9.
The right child has no children,
and 12 < 17 so 12 becomes the 9
left child of 17. The next
7 17
integer is 7. From the root we
move left, 7 < 9. There are no
left children so 7 is assigned to 6 8 12 18
that position. The final tree is
shown below. To retrieve an
2 10 16
item we use similar reasoning
as we do in placing an item.
Suppose we are looking for 11. 4 13
Since 11 is bigger than 9 we go 1 11
to the right. On the right we 5
3 14
find 17, 11 < 17 go left. To the
left, we find 12, 11 < 12 go
left. To the left of 12 is 10. 15
Since 11 is larger than 10, we
look to the right and find the value.
Decision Trees
Another type of m-ary tree, is the decision tree. This tree models a sequence of decisions which
lead ultimately to a decision. This type of decision can be anything from successive weighing to
find a counterfeit coin, to sorting values. A binary sort is of the form, new element is either less
than or greater than reference element. This type of sorting is at least Log(n!) complexity (in
terms of comparisons) and is n log n .
Prefix Codes
In an attempt to save memory and reduce transmittal time, a scheme could be employed which
would assign unique prefixes to characters. More commonly used characters would then be
assigned shorter prefixes. A simple method is to use 1’s as the prefixes and a zero to denote the
end of the prefix. Thus 0 might be a letter, but not 1. For that matter not 11 or any sequence of
1’s other than the last letter in the list.
Huffman Coding
Huffman Coding is the result of a greedy algorithm employed to turn a forest into a tree. Weights
are assigned to nodes (characters) based on the relative frequencies. First, the two smallest are
joined by creating a new root node and assigning the larger to the left, and the smaller to the
right. The sum of these two is now assigned to the root of the treelet. The edges are assigned 0’s
if going to the left, and 1’s if going to the right. The final label is the sequence of 0’s and 1’s
formed by following the path from root to desired character.
Game Trees
A tree that starts with the opening position of a game as its root, and then proceeds to enumerate
all subsequent possible moves as sequential levels is a game tree. Trees of this nature may have
their vertices labeled to show the payoff for the players if they follow the so called minmax
strategy.
TOC
Algorithms designed to systematically visit every vertex in a tree are called traversal algorithms.
One such algorithm is the preorder traversal. Preorder traversal starts at the root, then moves
from left to right exhausting the subtrees as it goes. In the tree above, the preorder traversal
would be 0, 1, 1.1, 1.2, 2, 2.1, 2.1.1, 2.1.2, 3, 3.1, 3.2, 3.2.1.
Inorder transversal starts at the root, but if there are subtrees, it will exhaust them in order by
pursuing the subtree on the left, then coming back to the root, then the next subtree to the right.
For the tree above, 1.1, 1, 1.2, 0, 2.1.1, 2.1, 2.1.2, 2, 3.1, 3, 3.2.1, 3.2.
Postorder transversal moves from left to right, exhausting the vertices from the bottom before
visiting the root. The tree above would yield, 1.1, 1.2, 1, 2.1.1, 2.1.2, 2.1, 2, 3.1, 3.2.1, 3.2, 3, 0.
Your text provides the pseudocode algorithms for these three transversals on pages 716 and 718.
The use of ordered rooted trees for storage leads to an equivalence of order of operations. If we
let internal vertices be operations, and leaves be operands, then we can evaluate the subtrees
from left to right. Trees used for this purpose result in expressions that are referred to as infix,
prefix or postfix depending on the transversal scheme being used. Infix notation will sometimes
require the use of parenthesis to avoid ambiguity. Prefix (Polish notation) and postfix (reverse
Polish notation) do not require parenthesis. The trees themselves are not ambiguous however.
The binary tree representing the right hand root from the quadratic formula,
1 b b 4 a c 2 a , is given below. Start on the lowest level, as you perform the
2 1 2
indicated operations, replace the vertex containing the operator, with the result of the operation.
Reading infix expressions are no problem, since we must include parenthesis to avoid ambiguity.
Reading prefix and postfix expressions simply require starting from the right for prefix, and
starting from the left for postfix.
To help see this, we write the prefix expression for the above tree:
÷ + × -1 b ^ - ^ b 2 × × 4 a c ÷ 1 2 × 2 a
To read this, start at the right and move in until you find an operator. Apply this operator to the
two values just to its right. We find a multiplication which we apply to the values 2 and a. At this
point we have the expression:
÷ + × -1 b ^ - ^ b 2 × × 4 a c ÷ 1 2 2a
We divide 1 by 2.
÷ + × -1 b ^ - ^ b 2 × × 4 a c 0.5 2a
We multiply 4 and a.
÷ + × -1 b ^ - ^ b 2 × 4a c 0.5 2a
We multiply 4a and c.
÷ + × -1 b ^ - ^ b 2 4ac 0.5 2a
We raise b to the power of 2.
÷ + × -1 b ^ - b2 4ac 0.5 2a
We subtract 4ac from b2
÷ + × -1 b ^ (b2-4ac) 0.5 2a Parenthesis inserted to avoid confusion.
We raise the parenthetic quantity to the 1/2 power.
÷ + × -1 b (b2-4ac)0.5 2a
We multiply –1 and b.
÷ + -b (b2-4ac)0.5 2a
We add the radical to –b.
÷ (-b + (b2-4ac)0.5) 2a
We divide by 2a.
When dealing with postfix expressions, start at the left, and move in to the first operator. Apply
that operator to the two values immediately to its left.
TOC
One such method, depth first, picks some arbitrary vertex as the root of the tree, then follows the
longest path possible from this root. Vertices that are not included require backtracking to the
first vertex from which there exists a path that will not produce a circuit. For the graph below, we
can construct a spanning tree by assigning A to the root and generate the path: A, B, D, G, K, I,
H, E, F.
We can see that the vertices C, J and L have been omitted. Backing up the path, we find an edge
from E to J which does not create a circuit, so we add it. Backing up one more vertex to H, we
see that the remaining two vertices may be added with out forming a circuit. The edges selected
are called the tree edges, and the remaining edges are called back edges. Spanning trees formed
in this way are not unique. One reason is that the root is chosen arbitrarily, another is that no
serious effort is used to make the longest possible path from that root. Typically we start with an
adjacency list and progress sequentially to the first vertex adjacent to the current one.
Another method for creating spanning trees is the breadth-first method. As in the previous
method the choice of the root is arbitrary, but once it is assigned, all vertices adjacent are added
along with the edges incident so long as no circuit is formed. These vertices are arbitrarily
ordered, then, in order, the same process is followed. Using the same graph as before, with the
arbitrary ordering of children being alphabetic we find the tree given below.
B C
D E I G H
F J K L
Different trees may be obtained by changing the ordering of the children. If we had reversed the
ordering of the children, there would have been an edge from C to D rather than from B to D.
When the underlying graph is directed, these two methods may well produce only a spanning
forest.
TOC
The table below gives the lengths of the edges in a completely connected graph.
Dist. A B C D E F G H
A 4 5 5 7 10 11 10
B 4 5 3 7 6 11 8
C 5 5 6 4 11 6 11
D 5 3 6 4 5 8 5
E 7 7 4 4 7 4 7
F 10 6 11 5 7 7 4
G 11 11 6 8 4 7 7
H 10 8 11 5 7 4 7
Using Prim’s Algorithm, we would start with edge (B, D) since it is the
shortest. Proceeding with the algorithm we obtain the tree below, since
edges are added only to vertices already in the graph. I have numbered
the edges to indicate the order in which they were added. The final tree
has a weight of 28. While Kruskal’s Algorithm also
guarantees us a minimum spanning tree, it may not
be the same tree as we found with Prim’s algorithm.
We also start with edge (B, D) since it is the
smallest. Again, the edges are labeled in the order
in which they were added. The tree is not the same,
but the overall sum of the edges is the same. With
this particular graph, we get the same spanning tree, the only difference in
the two is the order in which the edges are added.
The two algorithms we have considered will give the minimum spanning
tree when we require our tree to be a subset of a given graph. It may not
provide the minimum network for the given vertices if the vertices are placed geometrically to
represent physical locations. You may have a minimal spanning tree for your graph, but if a
smaller tree can be made by inserting additional vertices, you don’t have a minimal network.
It may seem strange, but sometimes having more can lead to getting less. There are
circumstances where it is possible to get a smaller network by inserting additional vertices.
These vertices are optimally located at what are called Steiner points. Jakob Steiner (1796-1863)
was a geometer who discovered this property of the relationship between location and distance
between a collection of points. He is pictured below.
A good example of the use of Steiner points is the graph
consisting of vertices that form an equilateral triangle with
legs of 500 miles each. The minimal spanning tree is 1000
miles. If we allowed ourselves the ability to create additional
vertices and edges as needed, we could construct a much
smaller network.
What is gained by using a Steiner point, at best the difference between the minimum spanning
tree and the shortest network is 13.4%. Often, it is less than 5%.
What do we lose by looking for these points? In a large network, there are a lot of possible
Steiner points to look for, in fact the number grows at a factorial rate. Add to this the relative
complexity of finding the points algebraically and we have greatly added to the complexity of
specifying our tree.
Oddly enough, we can let soap deal with all of these issues. Suppose you build a model of the
vertices using two pieces of Plexiglas, held about an inch apart with dowels. The dowels are to
be positioned exactly where the vertices should be. Use an appropriate scale for the model, so
that the dowels are no closer than one or two inches from each other. Dip this apparatus in a soap
solution. The bubbles that form between the dowels will form a network that passes through one
set of Steiner points. There is no guarantee that this will be the optimum set, but it will beat the
MST.
TOC