Quick Data Structures
If you want to upgrade your programming skills, the most important thing
you need is a solid understanding of fundamental data structures. The
proper choice of data structures distinguishes excellent programmers from
merely competent ones.
A good choice of data structures will simplify your job, not complicate it.
Your code will be not only faster but also easier to understand and debug.
There is no downside to using the right data structures for the job.
This book
Quick Programming Series
Most programming books are either textbooks aimed at beginners, or tomes
intended to provide everything the programmer needs to know. Books in
this series fill an important niche by offering a bridge for programmers
who need a quick entry into a language, and do not need to know
everything about it yet.
David Matuszek
Designed cover image: 100covers.com
First edition published 2026
by CRC Press
2385 NW Executive Center Drive, Suite 320, Boca Raton FL 33431
and by CRC Press
4 Park Square, Milton Park, Abingdon, Oxon, OX14 4RN
CRC Press is an imprint of Taylor & Francis Group, LLC
© 2026 David Matuszek
Reasonable efforts have been made to publish reliable data and information,
but the author and publisher cannot assume responsibility for the validity of
all materials or the consequences of their use. The authors and publishers
have attempted to trace the copyright holders of all material reproduced in
this publication and apologize to copyright holders if permission to publish
in this form has not been obtained. If any copyright material has not been
acknowledged please write and let us know so we may rectify in any future
reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be
reprinted, reproduced, transmitted, or utilized in any form by any electronic,
mechanical, or other means, now known or hereafter invented, including
photocopying, microfilming, and recording, or in any information storage
or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work,
access www.copyright.com or contact the Copyright Clearance Center, Inc.
(CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. For
works that are not available on CCC please contact
mpkbookspermissions@tandf.co.uk
Trademark notice: Product or corporate names may be trademarks or
registered trademarks and are used only for identification and explanation
without intent to infringe.
ISBN: 978-1-041-03813-9 (hbk)
ISBN: 978-1-041-03810-8 (pbk)
ISBN: 978-1-003-62550-6 (ebk)
DOI: 10.1201/9781003625506
Typeset in Minion
by codeMantra
To all my students past, present, and future
Contents
Preface
Author
Where’s the Code?
CHAPTER 4 ◾ Recursion
4.1 Recursive Data Structures
4.2 Recursive Functions
4.3 Four Rules
4.3.1 Rule 1: Do Base Cases First
4.3.2 Rule 2: Recur Only with Simpler Cases
4.3.3 Rule 3: Don’t Use Non-Local Variables
4.3.4 Rule 4: Don’t Look Down
4.4 Examples of Recursion
CHAPTER 9 ◾ Heaps
9.1 Heap Implementation
9.2 Deallocation Problems
9.3 Garbage Collection
9.3.1 Reference Counting
9.3.2 Mark and Sweep
CHAPTER 10 ◾ Trees
10.1 Applications of Trees
10.1.1 File Systems
10.1.2 Family Trees
10.1.3 Game Trees
10.1.4 Expressions
10.2 Tree Searching
10.2.1 Depth-First Searching
10.2.2 Breadth-First Searching
10.2.3 Depth-First Iterative Deepening
10.2.4 State-Space Searches
10.2.5 Pruning
10.2.6 Alpha–Beta Searching
10.3 Tries
CHAPTER 11 ◾ Graphs
11.1 Graph Applications
11.2 Adjacency Matrix Representations
11.3 Representation by Sets
11.4 Searching a Graph
11.5 Sparse Arrays
11.6 Dijkstra’s Algorithm
11.7 Spanning Trees
11.8 Mazes
11.9 Heuristic Searching
11.9.1 Solving the Fifteen Puzzle
11.9.2 The A* Algorithm
11.9.3 IDA*
CHAPTER 12 ◾ Hypergraphs
12.1 Plexes
Afterword
Index
Preface
Many years ago, a friend came to me with a problem.
Back then, you typed your program on punched cards and put them in a tray
along with other programs to be run. Some hours later, you would get the
printed results.
My friend’s problem was that his program was taking more than 20 minutes
to run on the Control Data 6600, which (in those days) was the world’s
fastest computer. According to the Computation Center rules, any program
that took more than 20 minutes would be stopped and not run again until
the weekend when the computer was less busy. My friend asked me if I
could speed up his program.
The program was doing a lot of table look-ups, using an ordinary array. I
replaced the array with a hash table–a change which reduced the running
time to under 20 seconds. My friend didn’t fully trust my work because, as
he later told me, he spent the entire weekend hand-checking the results.
Times have changed, and today’s computers are orders of magnitude faster
than the “supercomputers” of yesteryear. That same program would, today,
run in a few milliseconds.
Today’s computers are so fast and have so much memory that, for most
programs, it doesn’t make sense to worry about efficiency. However, there
are important exceptions: video games, popular websites, deep learning,
weather modeling, and more. Besides, why struggle with a poor choice of
data structures when there might be one perfectly suited to your needs?
Author
I’ M DAVID MATUSZEK, KNOWN to most of my students as “Dr. Dave.”
I finally left the industry and joined the Villanova faculty full time for a few
years before moving to the University of Pennsylvania, where I directed a
master’s program (MCIT, Master’s in Computer and Information
Technology) for students transitioning into computer science from other
disciplines.
I retired in 2017, but I can’t stop teaching, so I’m writing a series of “quick
start” books on programming and programming languages. I’ve also written
three science fiction novels—Ice Jockey, All True Value, and A Prophet in
Paradise—and I expect to write more. Check them out!
And, hey, if you’re a former student, drop me a note. I’d love to hear from
you!
david.matuszek@gmail.com
Where’s the Code?
This isn’t a recipe book. If you want the code for, say, a merge sort, you
can do a web search and find code for a merge sort in any of the couple
dozen most common languages. Instead, the goal of this book is to explain
(for example) how a merge sort works, and when and why you might want
to use one. Once you understand that, you can write your own or grab one
of the many published versions.
Code isn’t always the best way to explain an algorithm or data structure—
but sometimes it is. In such cases, the code should be as readable as
possible.
It’s generally agreed that Python is the most readable language, but every
language has glitches; Python’s for loop is one example.
Similarly, Python has dictionaries, “lists,” and sets, which we will try to
avoid. Consequently, the code in this book is “Python-like” but, in the
interest of making code as readable as possible for everyone, not
necessarily “real” Python.
CHAPTER 1
Building Blocks
DOI: 10.1201/9781003625506-1
Any data structure can be created from just three basic components: arrays,
nodes, and pointers.
An array is a linear sequence of values, all of the same type and size.
Because all the values are the same size, the array can be efficiently
indexed—the location in memory of the n-th element is simply the starting
location of the array plus n times the element size.
Pointers raise security concerns. Unless handled with extreme care, they
can allow malicious code to be loaded as data and then executed.
A reference can be stored and duplicated like any other data value, it can be
dereferenced to get the item it points at, and two references can be
compared for equality. No other operations are provided. Because the
integer implementation is hidden, references are inherently more secure
than pointers and require less syntax. This is how links are handled in Java,
Python, and most other modern languages.
1.2 ARRAYS
An array is a deceptively simple data structure. It’s just a linear sequence of
values, generally all of the same type. To index into it, the language simply
adds the index value times the element size to the starting location, giving
the desired memory location. Indexing is a very fast operation.
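As a small illustration of that arithmetic, here is a Python sketch (the starting address of 5000 and the 8-byte element size are made up for this example):

def element_address(array_start, n, element_size):
    # Address of element n = start of the array + n times the element size.
    return array_start + n * element_size

element_address(5000, 3, 8)   # 5024: the fourth element (index 3) of an array of 8-byte values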
There are minor variations. Depending on the language, the first location
might have an index of 0, an index of 1, or some integer chosen by the
programmer. Some languages allow discrete data types, such as characters,
to be used as indices.
Usually, the size of the array is defined when it is created and cannot later
be changed. Again, this is language-dependent.
Some languages (Fortran, for example) allow arrays to have two or more
dimensions. It may be important to know the order in which values are
stored (rows first or columns first), because some operations will depend on
this ordering.
In any language based on C, all arrays are one-dimensional, but the values
in the array may themselves be arrays. In these languages, the size of an
array is not a part of its type, so arrays within an array are not all required to
be the same size; if they are of differing sizes, the result is called a ragged
array.
To index efficiently into an array, all the values in the array must be the
same size. Since strings may be of different sizes, there is really no such
thing as an “array of strings.” Such an array actually contains pointers to
strings; the strings themselves are stored elsewhere. See Figure 1.1.
The same is true when a node is depicted as containing a string: The node
actually contains a pointer to the string.
Strings are almost always stored in a data structure called a heap (see
Chapter 9). Heaps have the ability to hold data of varying sizes.
CHAPTER 2
Essential Math
DOI: 10.1201/9781003625506-2
First, you need to understand that some growth rates are better or
worse than others (e.g., exponential growth is much worse than linear
growth).
Second, you need to be able to analyze a program to determine which
kind of growth occurs as the amount of data increases. This is (usually)
much simpler than it sounds.
This chapter covers all the mathematics necessary for a basic understanding
of how to determine these factors. Along the way, several popular sorting
algorithms are described.
It is very unusual for algorithms to require explosive amounts of memory,
but quite common for them to demand excessive amounts of time. For this
reason, our analyses are primarily about running time.
After reading this chapter, you will understand what “Big-O” notation is all
about.
On the one hand, computers are literally millions of times faster than they
were a few decades ago. Back then, computers were expensive, while
human labor was (relatively) cheap. It was important to get value out of
every machine cycle. Today, however, the economics are completely
reversed, and it is—to put it gently—unwise to spend an hour of a
programmer’s time to save a millisecond of computer time. For a large
majority of programs, efficiency simply is not and should not be a concern.
The bottom line is: The effective programmer knows when to spend effort
on making a program more efficient and when not to. Knowledge of data
structures is the key to doing this.
Since the formula depends on the “size” of the input, we need some
measure of the size, and that depends on the nature of the problem.
When searching an array, the “size” of the input could be the size of
the array.
When merging two arrays, the “size” could be the sum of the two array
sizes.
When computing the nth Fibonacci number or the nth factorial, the
“size” is n.
The “size” should be a parameter that can be used to calculate the actual
time (or space) required. It is frequently obvious what this parameter is, but
sometimes some experimentation may be required. Sometimes we need two
or more parameters—for instance, if we are dealing with a two-dimensional
array where the rows and columns are treated very differently.
Here are some things that probably take more than constant time:
A loop.
A call to a complex or recursive method.
An operation that takes constant time k is said to be O(1), that is, “Big-O of
one.” Notice that we don’t care how big the constant k is.
You might notice that some operations, such as adding up all the numbers in
an array, are not exactly doubled if you double the size of the array. That’s
true; it takes some small amount of time to set up the loop. So the actual
time required to sum up an array of n numbers is c+kn, where c is the
(constant) amount of time required to set up the loop and k is the (constant)
amount of time required to add each element. This expression is linear in n.
The Big-O notation for an expression linear in n is O(n). Notice that we
don’t care about the size of c or k; they are just constants, so we ignore
them.
Here we will briefly consider three such sorting algorithms. They are of
interest because they are easy to implement, because they illustrate how to
informally determine the Big-O running time of an algorithm, and because
they can be used to describe the important concept of a loop invariant.
Note: We refer to the “first” and “last” indices in the array because
many programming languages use zero as the smallest index, several
languages use one, and some languages allow an arbitrary starting
index.
Set some variable limit to the index of the last element in the array
and set a flag variable swapped to true.
While limit is greater than the first index and swapped is true:
Set swapped to false.
For index i ranging from the first index in the array up to and
including limit-1:
If array[i] > array[i+1], swap the two values and set
swapped to true.
Decrease limit by 1.
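In real Python, the algorithm just described might look like this (the name bubble_sort and the in-place list interface are choices made for this sketch, not requirements):

def bubble_sort(array):
    limit = len(array) - 1            # index of the last element
    swapped = True
    while limit > 0 and swapped:
        swapped = False
        for i in range(limit):        # first index up to and including limit-1
            if array[i] > array[i + 1]:
                array[i], array[i + 1] = array[i + 1], array[i]
                swapped = True
        limit -= 1                    # the largest remaining value is now in place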
In bubble sort, the while loop is executed up to n times, where n is the size
of the array. (It could be fewer if no swaps occur, but as usual, we’re
interested in the worst case.) The inner for loop is initially executed n-1
times, but each time the number of executions is reduced, so on average it is
executed about n/2 times. Dropping the constants, the outer loop is
executed n times, and for each of those times, the inner loop is executed n
times, so the overall running time is n times n, or O(n²).
The actual running time of bubble sort depends on the initial state of the
array. For a random array, the expected running time is quadratic, but
bubble sorting an array that is already in sorted order takes only O(n) time.
Although bubble sort has the same Big-O running time as insertion sort and
selection sort, it is generally the slowest of the three. There is another
reason to avoid it: Bubble sort is the sorting technique most often invented
by beginners, so it has a bad reputation as amateurish.
Sorting algorithms essentially do two things: compare two values and swap
two values. At each step, swaps may or may not occur, but comparisons will
always occur. Therefore, comparisons make a good characteristic operation;
swaps do not.
Bubble sort proceeds in a series of n passes, where n is the size of the array.
The average length of each pass, L, is n/2. During each of those passes, L
comparisons are made. Therefore, n passes times n/2 comparisons equals
n²/4, or O(n²).
Note: The careful reader will notice that all the values mentioned in
the previous paragraph are approximate. For example, the number of
passes made during a bubble sort is n-1, not n. The important point is
that these simplifications do not affect the conclusion that the running
time is O(n²).
If the algorithm has an innermost loop, we might just look at how many
times that loop is executed. If a single pass through that loop takes constant
time, the loop itself can be considered a characteristic operation.
The invariant for insertion sort differs from that of bubble sort in two
respects. First, we’re moving elements to the beginning of the array, rather
than to the end. This is minor, and either algorithm could be adjusted to
work from the other end. The second difference is important: The elements
are sorted with respect to one another but can still be moved about; they
may not be in their final positions.
To begin, notice that when we consider only the first value, it is trivially
true that it is in sorted order.
Now suppose that the first k values are in sorted order. What about the next
value, at location k+1? If it is at least as large as the immediately preceding
value, it can remain where it is. Otherwise, we can take that value out of
location k+1 and insert it somewhere earlier. We can use a binary search
(see Section 2.11) to find where among the first k values it should be
inserted, and we can move all the values between that location and location
k up one space. Conveniently, the value that was in location k can be moved
to location k+1, which has just been vacated.
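Here is one way this binary-search version of insertion sort might be written in Python (insertion_sort is an invented name, and the shifting of values is done with a slice assignment for brevity):

def insertion_sort(array):
    for k in range(1, len(array)):
        value = array[k]
        # Binary search for where value belongs among the first k (sorted) values.
        low, high = 0, k
        while low < high:
            mid = (low + high) // 2
            if array[mid] <= value:
                low = mid + 1
            else:
                high = mid
        # Shift the larger values up one space and drop value into the gap.
        array[low + 1:k + 1] = array[low:k]
        array[low] = value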
Analysis: We run once through the outer loop, giving a factor of n. Each
time, we perform a binary search (which takes log(n) time) and then move,
on average, n/4 elements, so that each time through the outer loop the work
required is log(n)+n. (It’s n/4 because we move, on average, half of the
already sorted values, which is, on average, half of the total values.) We
ignore constants, so the result is n×(log(n)+n), or n×log(n)+n². Finally,
n×log(n) grows more slowly than n², so our final result is O(n²).
Often, we are sorting not just numbers but objects according to some key
value. A stable sort is one that does not change the order of objects with
equal keys. For example, if we are sorting customers by name and “John
Smith” from New York comes before “John Smith” from Boston, those
objects should remain in the same order after sorting.
Search the entire array to find the smallest value. Swap it with the value in
the first location. Search the array starting from the second location to find
the smallest remaining value and swap it with the value in the second
location. Search the array starting from the third location to find the
smallest remaining value and swap it with the value in the third location.
And so on.
At each step, we search the unsorted part of the array for the smallest value
and swap it with the value just past the sorted part, so the sorted part gets
larger by one value.
The loop invariant is that the first k elements are sorted and in their final
position, as k varies from 1 to the size of the array.
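A Python sketch of selection sort (again, the name and the in-place interface are choices for this example):

def selection_sort(array):
    for k in range(len(array) - 1):
        # Find the smallest value in the unsorted part, array[k:].
        smallest = k
        for i in range(k + 1, len(array)):
            if array[i] < array[smallest]:
                smallest = i
        # Swap it into position k, just past the sorted part.
        array[k], array[smallest] = array[smallest], array[k]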
Any positive number may be used as the base. Here are the three most
commonly used kinds of logarithms:
Common logarithms are those that use 10 as a base; these are often
encountered in engineering.
Natural logarithms use the number e (approximately
2.718281828459045) and are favored by mathematicians.
Binary logarithms use 2 as a base and are favored by computer
scientists.
Starting with the number 64, repeated halving gives 32, 16, 8, 4, 2, and 1.
That’s six halvings to get to exactly 1, so log₂64 is 6.
Things won’t usually work out this exactly. Starting with 60 instead of 64,
we get the sequence 30, 15, 7.5, 3.75, 1.875, and 0.9375. This tells us that
five halvings (1.875) isn’t enough, but six halvings (0.9375) is slightly less
than 1, so the binary logarithm must be between 5 and 6, and closer to 6.
(The true value is about 5.907.)
This idea of “cutting in half” will occur quite often in our discussion of
algorithm timing.
2.11 BINARY SEARCH
Binary search is an algorithm for searching a sorted array for a particular
value.
In a binary search, you only search the array between two indices—we’ll
call them left and right. These indices are initially the lowest and highest
possible indices, and they will gradually move toward each other. If the
item is not in the array, the indices will cross, and left will become greater
than right.
If left is greater than right, return failure; the item is not in the array.
Compute mid as the (integer) average of left and right.
If array[mid] is the sought-after item, return mid.
If array[mid] is too large, recursively search between left and mid-1;
Otherwise, array[mid] is too small, so recursively search between
mid+1 and right.
At each point we either find the item and return, or we recursively search
half the remaining elements. Since we are eliminating half the remaining
elements each time, the required time is actually binary logarithmic time
(logarithms to the base 2), which we could write as O(log₂n). Recall,
however, that it doesn’t matter which base we use, as the results only differ
by a constant. Since we ignore constants, we write logarithmic time simply
as O(log n).
In this example, the façade function will take only two parameters: the item
sought and the array. It will then determine the appropriate initial values for
left and right and make a single call to the recursive function.
In some languages, the façade function and the recursive function can have
the same name. In other languages, different names are required, and the
façade function should have the more user-friendly name. If the
programming language allows functions to be nested, the recursive version
can be “hidden” inside the façade function.
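Putting the pieces together, a façade plus a recursive binary search might look like this in Python (returning -1 for failure is an assumption made for this sketch; the leading underscore marks the helper as internal):

def search(item, array):
    # Facade: supplies the initial left and right indices.
    return _search(item, array, 0, len(array) - 1)

def _search(item, array, left, right):
    if left > right:
        return -1                     # failure: the item is not in the array
    mid = (left + right) // 2
    if array[mid] == item:
        return mid
    if array[mid] > item:
        return _search(item, array, left, mid - 1)
    return _search(item, array, mid + 1, right)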
2.12 QUICKSORT
The quicksort algorithm is one of the fastest sorting algorithms known. It is
a recursive algorithm—that is, the quicksort method calls itself.
The initial test (left < right) checks whether anything more needs to be
done. The initial call will be with the entire array, so left will be zero (in
most languages), and right will be the size of the array minus one. If
quicksort is called with left greater than or equal to right, the partition
size is zero or one, and this branch of the recursion is finished.
The partition method moves smaller numbers to the left, larger numbers
to the right, and returns the index p of the rightmost small number. How this
is done will be explained after the following example.
Start with the array [68, 81, 20, 50, 60, 78, 47, 90].
If we take 68 as the pivot, we can partition the array into two parts:
[47, 60, 20, 50 | 81, 78, 68, 90]. The first part contains the
numbers less than 68, while the second part contains the numbers
greater than or equal to 68. Neither part is sorted.
Quicksort the left part, [47, 60, 20, 50].
Taking 47 as the pivot, we partition this into the two parts [20 |
60, 47, 50].
With this array segment, using 68 as the pivot, the search from the left finds
68 (a number not less than the pivot), while the search from the right finds 47
(a number less than the pivot), so these two values are swapped.
Continuing the searches from where we left off (just to the right of the 47
and just to the left of the 68), we find 81 as the next number greater than 68,
and 60 as the next number less than 68, and swap them as well.
Continuing the searches from where we left off (just to the right of the 60
and just to the left of the 81), we find 81 as the next number not less than 68,
and 50 as the next number less than 68.
At this point, the left index has become greater than (or equal to) the right
index, so the partition operation is finished; the numbers 47 through 50 are all
less than 68, and the numbers 81 through 90 are all greater than or equal to 68.
The index of 50 (the rightmost number in the left partition) is returned as p,
the value of the partition method.
In the above example, we took the first value in each array segment as the
pivot. Other options include picking the value in the center of the array
segment and picking a random value in the array segment. Each approach
may have some minor advantages.
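As a concrete sketch, here is a Python quicksort that, like the example above, uses the first value of each segment as the pivot and returns the index of the rightmost element of the left partition (the function names and the index-based interface are choices for this sketch):

def quicksort(array, left, right):
    if left < right:
        p = partition(array, left, right)
        quicksort(array, left, p)         # sort the left partition
        quicksort(array, p + 1, right)    # sort the right partition

def partition(array, left, right):
    pivot = array[left]
    i, j = left - 1, right + 1
    while True:
        i += 1
        while array[i] < pivot:           # search from the left for a large value
            i += 1
        j -= 1
        while array[j] > pivot:           # search from the right for a small value
            j -= 1
        if i >= j:
            return j                      # the indices have met or crossed
        array[i], array[j] = array[j], array[i]

The initial call is quicksort(array, 0, len(array) - 1).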
Quicksort is faster than insertion sort for large arrays, but for small arrays
(up to 10 or 12 elements), insertion sort is faster. For this reason, a hybrid
sort is sometimes implemented, where small partitions are sorted using
insertion sort. This additional effort is probably worthwhile for a library
routine to be used by the general public.
To partition the entire array, we find a large number from the left end, a
small number from the right end, and swap them. Every element of the
array is compared to the pivot once and possibly swapped with another
element. The comparison and the possible swap each take constant time,
and there are n elements in the array, so to partition the entire array takes
O(n) time.
But that’s just for the first level of partitioning. What about the second
level? In our example, the array was split in half, but each partition
operation had half as much to do, so ½O(n) + ½O(n) = O(n). If the array
were split differently, it would still work out: ⅓O(n) + ⅔O(n) = O(n). We
can conclude that O(n) work is done at each depth of the recursion.
While our initial conclusion is basically correct, there are caveats. Our
example was chosen so that each partition operation split the part being
sorted into equal halves; this is the best case. For a random array, the
expected size of the division is roughly ⅓ and ⅔. This isn’t as neat, but a
careful analysis still results in an expected running time of O(n log n).
The worst case occurs when every partition operation splits an array
segment of size n into a segment of size 1 and a segment of size n-1. The
depth of the recursion is then O(n) instead of O(log n), and O(n) times
O(n) gives quicksort a running time of O(n²).
Unfortunately, the worst case will occur when the pivot is chosen to be the
first element (or the last element) in the array segment, and the array is
already sorted. The amount of work done at each level of the recursion is
still O(n), but the maximum depth of the recursion, instead of being log(n),
is now n. In this case, quicksort takes O(n²) time.
There are a couple of ways to avoid this. You can do a pre-check to make
sure the array isn’t already sorted; or you can choose the middle element
rather than the end element of the array segment; or you can choose a
random element in the array segment. Other approaches are possible and
almost always work, but there is no way to absolutely guarantee that
quicksort won’t take O(n²) time.
Final conclusion: Quicksort almost always has a running time of O(n log
n), where n is the size of the array, but there is no guarantee that it won’t
take O(n²) time. Therefore, quicksort should not be used in critical
applications where an O(n log n) running time is an absolute requirement.
Copy half the numbers into a new array, and the remaining numbers
into a second new array. (If the array size is odd, one array will be
slightly larger than the other.)
Independently merge sort the two smaller arrays. If an array size is 0 or
1, it is already sorted.
Merge the two sorted arrays back together into a single sorted array.
For example, we will start with the array A = [56, 1, 44, 17, 24, 60,
71, 51]. Copy the two halves of this array into new arrays B and C, so B =
[56, 1, 44, 17] and C = [24, 60, 71, 51].
Merge sort B to get [1, 17, 44, 56], and merge sort C to get [24, 51, 60, 71]. Merging B and C then gives [1, 17, 24, 44, 51, 56, 60, 71].
Merge sort is generally somewhat slower than quicksort but has a guaranteed
running time of O(n log n). Like quicksort, it can be sped up by switching
to an insertion sort for smaller arrays.
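A Python sketch of merge sort along these lines (this version returns a new sorted list rather than sorting in place; that choice is just for clarity):

def merge_sort(array):
    if len(array) <= 1:
        return array                      # size 0 or 1: already sorted
    middle = len(array) // 2
    b = merge_sort(array[:middle])        # copy and sort each half
    c = merge_sort(array[middle:])
    # Merge the two sorted halves into one sorted result.
    result, i, j = [], 0, 0
    while i < len(b) and j < len(c):
        if b[i] <= c[j]:
            result.append(b[i])
            i += 1
        else:
            result.append(c[j])
            j += 1
    result.extend(b[i:])                  # one of these two is already empty
    result.extend(c[j:])
    return result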
You will often see that the answer is “no.” The proof is based on the fact
that an array of n values can be arranged in n! ways, and telling those
arrangements apart requires about log₂(n!) ≈ n×log(n) yes-or-no decisions
(comparisons).
But that proof assumes that all values may be distinct. In this section, we
will discuss two sorting algorithms, each of which requires only O(n) time.
Algorithm 1
Suppose you have an array of a thousand scores, where each score is in the
range 0 to 25. All you need to do is set up an array of 26 locations (0 to 25),
zero out the array, and for each score, add 1 to the corresponding location of
your array of counts. After all scores have been tallied, you can put these
scores back into the original array.
This is a special case because there are only a small, finite number of
scores.
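In Python, this counting approach might look like the following (the range 0 to 25 matches the example; max_score is a parameter invented for this sketch):

def counting_sort(scores, max_score=25):
    counts = [0] * (max_score + 1)        # one counter per possible score
    for score in scores:
        counts[score] += 1
    # Write the scores back into the original list, smallest first.
    i = 0
    for value, count in enumerate(counts):
        for _ in range(count):
            scores[i] = value
            i += 1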
Algorithm 2
Back when programs were typed onto punched cards, one line of text per
card, the last 8 columns of each card were reserved for a card number. The
idea was that if your cards were numbered, and you dropped the deck, they
could be put in order again by using a large machine called a card sorter.
To use the machine, you would put cards in a bin, set the machine to sort on
the least significant digit (often column 80), and start it. As you waited, the
machine would place every card with a zero in column 80 into hopper zero,
every card with a one in column 80 into hopper one, and so on. Then you
would take the ten decks out, put them together, and run them through
again, this time on the second least significant digit. Then the third. And so
on.
Each pass through the card sorter would take O(n) time, where n was the
number of cards. To sort on d digits required d passes, so the total running
time was O(d × n). For any given card deck, d was a constant number of
digits, so O(d × n) could be regarded as O(n) running time.
Although punched-card sorters are long obsolete, the ideas behind this
algorithm can still be used in certain specialized problems.
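The card-sorter idea survives as what is usually called radix sort. A Python sketch for non-negative integers, sorting on one decimal digit per pass (the digits parameter, corresponding to the 8-column card number, is a choice for this example):

def radix_sort(numbers, digits=8):
    for d in range(digits):
        hoppers = [[] for _ in range(10)]       # one "hopper" per digit 0-9
        divisor = 10 ** d
        for n in numbers:
            hoppers[(n // divisor) % 10].append(n)
        # Gather the hoppers back together, in order, for the next pass.
        numbers = [n for hopper in hoppers for n in hopper]
    return numbers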
For small problems, yes. With Big-O, we assume that small problems run
fast enough anyway (usually true, but not always), and it’s the large
problems that we need to be concerned about.
There are also some algorithms that require superexponential time, such as
O(2^(n^c)). Any such algorithms are beyond the scope of this book.
Along with Big-O, two other measures are sometimes encountered. They
are Big-Ω (“big omega”) and Big-Θ (“big theta”). Loosely speaking, Big-O
gives an upper limit on the running time, Big-Ω gives a lower limit, and
Big-Θ pins the running time down exactly.
Big-Θ is the actual running time and is what we would really like to have,
but that’s not always well-defined. Quicksort usually runs in n log(n) time,
but it could take n² time, so there isn’t a Big-Θ value for it. In practice,
Big-O is usually the best we can do, which is why you don’t hear much
about Big-Θ.
Big-O is usually quoted as a tight upper limit on how long an algorithm
takes, but strictly speaking it is just an upper limit. If an algorithm
actually takes quadratic time, O(n²), then it can correctly be described as
O(n³), O(n⁴), O(2ⁿ), O(n × 2ⁿ), and in many other ways. A term for bounds
that are strictly larger than necessary is little-o.
An algorithm has o(g(n)) running time if, as n grows, its running time
becomes an arbitrarily small fraction of g(n). Put another way, if an
algorithm takes O(f(n)) running time and also o(g(n)) running time, then
f(n)/g(n) approaches zero as n goes to infinity.
Again, notice that we simply don’t care what happens for small values of n,
but only care about what happens when n is greater than some number N.
The precise value of N doesn’t matter; only that it exists.
Optimizing is finding the best solution. This may require finding and
comparing all solutions, which isn’t always possible.
Here’s an analogy: If you are about to buy a car, there may be one car out
there that is the best choice for you. To find it, you would have to visit
every car dealer, compare all the models, compare all the features of each
model, and compare their prices and warranties. Nobody can do that.
Instead, you satisfice: You shop around, get a general idea of what is
available, and buy the best car you can find with the time and energy you
have available.
For many exponential problems, satisficing is the best you can do.
CHAPTER 3
DOI: 10.1201/9781003625506-3
This “magic function” would have no other purpose. If we look at how the
function transforms its input to its output, it probably won’t make sense (it’s
magic, after all). This function is called a hash function because it “makes
hash” of its inputs.
We can’t actually do magic, but we can come close. We’ll demonstrate with
an extremely small but otherwise reasonable example.
Suppose you are a bird watcher and want to keep a table of all the birds you
have seen. You can look in the table to determine if you’ve seen a particular
kind of bird, and you can add new birds to the table. We’ll use an array of
ten elements (absurdly small, but big enough for an example). For a hash
function, we’ll use the number of characters in the bird’s name.
So far, you’ve seen a wren, a cardinal, a robin, and a chickadee, and your
hash table looks like the table on the right of Figure 3.1.
FIGURE 3.1 Hash table.
With this table, you can tell you have seen a robin by computing
hashCode("robin") = 5 and finding "robin" at location 5 of the table. You
haven’t seen a sparrow, because hashCode("sparrow") is 7, and location 7
of the hash table is empty.
Suppose you next see a hummingbird and want to add it to the table.
Unfortunately, “hummingbird” is 11 characters long, and there are only ten
locations in our table. The solution, regardless of what hash function is
used, is to always take the result mod the table size.
When a collision occurs, one solution is to probe (look at) the next location,
and the next, and the next, until we find an empty location. In our “crow”
example, we see that location 4 is already occupied (and not by “crow”), so
we look at location 5 and see that it is already occupied (and not by
“crow”), so we look at location 6 and it’s empty, so we can put “crow” in
location 6.
If we had found a location with “crow” in it, we would know that “crow”
was already in the table, and we could stop there.
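Here is a small Python sketch of this probing scheme, using the bird-name example (the length-based hash function and the table size of ten come from the example; the function names are invented, and the sketch assumes the table never becomes completely full):

TABLE_SIZE = 10
table = [None] * TABLE_SIZE

def hash_code(name):
    return len(name)                      # the (poor) hash function from the example

def insert(name):
    i = hash_code(name) % TABLE_SIZE
    while table[i] is not None:
        if table[i] == name:
            return                        # already in the table; nothing to do
        i = (i + 1) % TABLE_SIZE          # probe the next location
    table[i] = name

def contains(name):
    i = hash_code(name) % TABLE_SIZE
    while table[i] is not None:
        if table[i] == name:
            return True
        i = (i + 1) % TABLE_SIZE
    return False                          # reached an empty location: not present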
The full math is beyond this book, but we can approach it. Let’s start with
two assumptions. (1) Our hash function is really good, so different inputs
almost always produce different outputs. (2) Our table is really big, so
collisions are highly unlikely. In these circumstances, it should be obvious
that most searches take constant time, at the cost of a great deal of wasted
space.
Let’s look at the hash function in our example: the length of a bird’s name.
It will always produce the same result for the same bird’s name (required);
counting letters is reasonably fast (good); it gives a very small range of
values (even the longest name is probably only a couple of dozen characters
(bad)); and names like “catbird” and “cowbird” have the same hash code
(bad). All in all, not a very good hash function.
The size of the hash table also matters. Obviously, we don’t want it to get
too full. The more the hash table contains, the more collisions will occur.
The more collisions occur, the slower it will be to look something up or to
insert something. A good rule of thumb is to make the table large enough so
that it never gets more than about 70% full. At this size, we can typically
find the correct location with only two or three probes. (Some individual
searches might take quite a bit longer, but on average we can expect
constant time.)
We might create a hash table with 1000 entries—a nice round number.
Surprisingly, a nice round number is a poor choice for the size of a hash
table. Here’s why:
Suppose f is a factor of the table size t, that is, t = f × x for some integer x. We
find a hash code h for the value we want to put in the table and compute h
mod t (h % t) to decide where to insert it. Unfortunately, if h has f as a
factor, then h % t will also have f as a factor. There are only a limited
number of table locations that are multiples of f, therefore collisions are
more likely.
It is not usually possible to delete an item from a hash table. If this must be
done, one approach is to leave the item in the table but somehow mark it as
not available. Another approach is to remove all the values in the
surrounding “clump” and then reinsert all but the unwanted value. A third
approach is to use a linked hash table (see Section 6.8).
There are two simple ways to implement such a hash map. First, it can be
done with an array of objects, where each object has a string field to contain
the name of a bird, and a pointer to information about that bird. See Figure
3.3.
The second implementation is to use two arrays: one for bird names and the
other for bird information. The bird name can be looked up in the first array,
and the same index can be used to access information in the second array.
For example, suppose you define a “lookup table” in which a user can (1)
create a new lookup table, (2) add an item to the table, and (3) test if an
item is in the table. Then you hide the implementation (by making the array
“private,” or however you hide code in your language), so that no other
access to the lookup table is allowed. You have an ADT.
You might implement a lookup table as a sorted array and use a binary
search to insert or look up values. If you didn’t hide the implementation,
someone who wants the smallest value in the lookup table might simply get
the first element of the array. This code will break if you change the
implementation from a sorted array to a hash table.
The primary user of your lookup table is probably yourself. If you need to
get the smallest value in the table, you can add a new operation to do this.
Then, if you change the implementation, you just need to recode this
operation along with all the others.
Errors happen. If some error puts your lookup table in an inconsistent state,
and the only access to the implementation is through methods you provide,
then the error is in your methods. But if the implementation is exposed, then
any code anywhere in the program could be causing the problem (and you
bear the blame).
Every ADT should have a contract (or specification) that tells the users
everything they need to know in order to use the ADT, and does not tell the
user anything they do not need to know. In particular, they do not need to
know your data representation or your algorithms.
What makes a good contract? If you buy a house, you want to pay as little
as possible. If you sell a house, you want to get as much as possible. You
want a contract that is in your favor.
The same principle applies when designing an ADT. You will almost always
be in a position where you know what the ADT will be used for, and you
should provide that much functionality and no more. This keeps down both
the cost and the complexity of your ADT. Of course, it’s nice if the design is
such that more functionality can be added later, when it may be needed, but
it’s generally a mistake to write code before it is actually needed.
If the documentation is too hard to write, this probably means the ADT is
too hard to use. In that case, it’s the ADT that should be fixed.
If you design for generality, it’s easy to add functionality later—but you
can’t remove functionality unless you are absolutely sure that it has never
been used.
CHAPTER 4
Recursion
DOI: 10.1201/9781003625506-4
If you are comfortable with recursion, feel free to skip this chapter. There’s
nothing in it specifically about data structures, so you won’t miss anything.
A natural number is:
1, or
Any natural number plus 1.
So 1 is a natural number because of rule one; 2 is a natural number because
it is the natural number 1 plus 1; 3 is a natural number because it is the
natural number 2 plus 1; and so on.
A list consists of:
An open parenthesis,
Zero or more numbers or lists, and
A close parenthesis.
By this definition, (), (1 2 17), and ((1) (1 2) (1 2 3)) are all lists.
An arithmetic expression is:
A number,
A sum, product, difference, or quotient, or
Parentheses around an arithmetic expression.
A sum is:
An arithmetic expression,
A plus sign, and
An arithmetic expression.
… and similarly for product, difference, and quotient. We’ve left out a few
operations, but the above is sufficient to express (1 + 2) * (3 + 4 + 5).
The factorial of a natural number n is the product of all the natural numbers
up to and including n. For example, the factorial of 5 (written 5!) is the
result of 1 × 2 × 3 × 4 × 5. From this, it’s easy to see that 5! can be
computed by multiplying 4! by 5, that is, (1 × 2 × 3 × 4) × 5.
function factorial(n):
If n == 1, return 1,
else return factorial(n - 1) * n.
Here, the n == 1 case is a base case: a case that can be computed without
recursion. Every recursive function must have at least one base case.
function askYesOrNo(question):
Display the question.
Read in the answer.
If answer starts with “Y” or “y”, return TRUE.
If answer starts with “N” or “n”, return FALSE.
Display “Please answer with ‘yes’ or ‘no’.”.
Call askYesOrNo(question) and return the result.
If the user responds with, say, Maybe, this code just calls itself again, and the
answer returned is then returned from the original call. It works for any
sequence of unacceptable answers, not just one. You can do the same thing
without recursion, but it takes a little extra work.
One or more base cases, where you compute the answer directly,
without recursion; and
One or more recursive cases, where you do part of the work and recur
with a simpler problem.
Every recursive function must have base cases. If your function accidentally
recurs with what should have been a base case, it’s likely to result in an
infinite recursion. Checking for and handling base cases before doing any
recursions, although not absolutely necessary, makes this problem less
likely.
The following definition of the factorial function works equally well for
natural numbers.
function factorial(n):
If n > 1, return factorial(n - 1) * n,
else return 1.
This version does not explicitly list n==1 as a base case and, in fact, it
behaves differently for zero and negative numbers.
The factorial function clearly does this. The simpler problem is finding the
factorial of a smaller number (a number closer to 1), and the extra work is
multiplying by n.
“Simpler” means “more like a base case.” It can involve using a smaller
number, a smaller part of a data structure, or just about anything.
Any time you recur with a case that isn’t closer to a base case, you get the
recursive equivalent of an infinite loop.
function factorial(n):
If n == 1, return 1,
else return factorial(n).
4.3.3 Rule 3: Don’t Use Non-Local Variables
Ideally, a function should use its parameters, and only its parameters, to
compute a result. This makes the function more self-contained and therefore
easier to understand and debug.
If a parameter is passed by reference, this means that there is only one copy
of that value, and what the function receives is a link or pointer to that
value. Therefore, a reference parameter behaves like a global variable.
The problem arises when we try to both modify a global variable and use it
in the recursion. This usually isn’t a problem with simple numerical
calculations, but can get complicated when a data structure is involved.
function factorial(n):
If n == 1, return 1,
else return factorial(n - 1) * n.
function factorial(n):
If n == 1, return 1,
else return factorial(n - 2) * (n - 1) * n.
The first and second Fibonacci numbers are both 1. The definition of the nth
Fibonacci number, for n > 2, is the sum of the two preceding Fibonacci
numbers: fibonacci(n - 1) + fibonacci(n - 2).
function fibonacci(n):
If n < 3, return 1
else return fibonacci(n - 1) + fibonacci(n - 2).
While easy (and correct), this is not efficient. Since there are two recursions
at each level, the number of calls increases exponentially. For n = 10, only
109 calls are required, but for n = 30, 1664079 calls are required. For an
efficient solution, see Section 13.5.
CHAPTER 5
DOI: 10.1201/9781003625506-5
Stacks, queues, and deques are similar data structures. They consist of a
linear sequence of values, to which new values can be added at an end or
removed from an end.
5.1 STACKS
A stack is an abstract data type with operations to push (add) an item, to pop (remove) the most recently added item, and to peek at the most recently added item without removing it.
Items are inserted at one end of a stack and removed from the same end.
The consequence is that items will be removed in the reverse order from
that in which they were added. For this reason, a stack is sometimes called a
LIFO (last in, first out) data structure.
A stack can be implemented with two components: one array and one
integer. See Figure 5.1.
In Figure 5.1, the values a, b, c, and d, in that order, have been pushed onto
the stack. The most recently pushed value, d, is at the top of the stack; the
next item that is pushed will be added after it.
The integer top can be defined in several ways. It can be the index of the
topmost element (as in Figure 5.1), the index of the first available location
(just past d), or as a count of how many items are in the stack.
With this implementation, two errors can occur: overflow, in which the
number of items added exceeds the size of the array, and underflow, when
an attempt is made to peek or pop from an empty stack. In Section 6.2, we
will see an implementation that allows stacks of virtually unlimited size.
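A Python sketch of the array-plus-integer implementation, with top as the index of the topmost element (the class and method names are choices for this example):

class Stack:
    def __init__(self, capacity):
        self.array = [None] * capacity
        self.top = -1                     # -1 means the stack is empty

    def push(self, value):
        if self.top == len(self.array) - 1:
            raise OverflowError("stack overflow")
        self.top += 1
        self.array[self.top] = value

    def pop(self):
        if self.top == -1:
            raise IndexError("stack underflow")
        value = self.array[self.top]
        self.top -= 1
        return value

    def peek(self):
        if self.top == -1:
            raise IndexError("stack underflow")
        return self.array[self.top]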
Simple counting is not enough to check whether brackets are properly balanced,
but you can do it with a stack. Going from left to right: push each opening
bracket onto the stack; when a closing bracket is found, pop the stack and check
that the popped bracket matches it. The text is balanced only if every closing
bracket finds a match and the stack is empty at the end, as in the sketch below.
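A Python sketch of that check (which characters count as brackets is an assumption for this example; Python’s built-in list stands in for the stack):

PAIRS = {')': '(', ']': '[', '}': '{'}

def is_balanced(text):
    stack = []
    for ch in text:
        if ch in "([{":
            stack.append(ch)              # push each opening bracket
        elif ch in PAIRS:
            if not stack or stack.pop() != PAIRS[ch]:
                return False              # wrong kind of bracket, or nothing to match
    return not stack                      # balanced only if nothing is left open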
The operator stack will occasionally become empty. To avoid treating this
as a special case, invent a new “operator” with the lowest possible priority
and initialize the operator stack with this value. To apply this operator, just
quit, because all the work has been done.
Table 5.1 shows the evaluation of 2*(3+4), using '_' as the “quit” operator.
(At the end of the evaluation, the operator stack holds only '_' and the value stack holds the final result, 14.)
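A Python sketch of this two-stack evaluation (this version works on a list of tokens and, instead of the '_' sentinel, simply applies whatever operators remain on the stack at the end; the precedence table is limited to the four usual operators):

import operator

OPS = {'+': (1, operator.add), '-': (1, operator.sub),
       '*': (2, operator.mul), '/': (2, operator.truediv)}

def evaluate(tokens):
    values, ops = [], []                  # value stack and operator stack

    def apply_top():
        op = ops.pop()
        right, left = values.pop(), values.pop()
        values.append(OPS[op][1](left, right))

    for token in tokens:
        if isinstance(token, (int, float)):
            values.append(token)
        elif token == '(':
            ops.append(token)
        elif token == ')':
            while ops[-1] != '(':
                apply_top()
            ops.pop()                     # discard the '('
        else:                             # an operator
            while ops and ops[-1] != '(' and OPS[ops[-1]][0] >= OPS[token][0]:
                apply_top()
            ops.append(token)
    while ops:
        apply_top()
    return values[0]

evaluate([2, '*', '(', 3, '+', 4, ')'])   # 14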
Functions can call other functions and can be recursive (they can call
themselves). At each call, the location of the calling statement (the return
address) is pushed onto a call stack. If no errors interrupt the process, each
return from a function pops a value from the call stack, and execution
returns to that popped location.
Each function has its own set of local variables, including its parameters.
Storage for these variables is allocated when the function is called and
released when the function returns. These local variables are also pushed
onto the call stack when the function is entered and popped off when the
function returns.
If all the local variables can be determined at compile time, they can be put
into a node called a stack frame. This single entity can then, along with the
return address, be pushed onto and popped off from the stack.
5.2 QUEUES
A queue is an abstract data type with operations to enqueue (add) an item at the rear and to dequeue (remove) the item at the front, usually along with a peek operation.
A queue can be implemented with three components: one array and two
integers. Figure 5.2 shows what a queue would look like after the items a
through g have been added to it and items a through c have been removed.
When the queue has exactly one element, front and rear will be equal.
When the queue is empty, rear will be equal to front-1 (modulo the array
size). When every queue location holds a value, rear will also be equal to
front-1! This means an empty queue cannot be distinguished from a full
queue. Obviously, this is a problem.
Make the array large enough so that it never gets full. This works (for
a while) but is a disaster waiting to happen. Please don’t do this.
Declare the queue to be full when rear equals front-2 (modulo the
array size), so that one array location remains unused. Any attempt to
enqueue something more will result in an overflow error.
Keep a count of the number of elements in the queue, and don’t let it
exceed the array size. This also works, but it requires a bit more work
and has no obvious advantages.
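A Python sketch of the circular-array implementation, using the third option (keeping a count) to tell an empty queue from a full one (the names are choices for this example):

class Queue:
    def __init__(self, capacity):
        self.array = [None] * capacity
        self.front = 0                    # index of the oldest item
        self.rear = -1                    # index of the newest item
        self.count = 0

    def enqueue(self, value):
        if self.count == len(self.array):
            raise OverflowError("queue overflow")
        self.rear = (self.rear + 1) % len(self.array)
        self.array[self.rear] = value
        self.count += 1

    def dequeue(self):
        if self.count == 0:
            raise IndexError("queue underflow")
        value = self.array[self.front]
        self.front = (self.front + 1) % len(self.array)
        self.count -= 1
        return value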
5.3 DEQUES
A deque (pronounced “deck”) is an abstract data type. The operations are
the same as those of a queue, except that insertions and deletions (and
peeks) may be performed at either end.
Deques are rarely used and the names of the operations vary considerably
from one implementation to another. We might, for example, have
add_left and add_right to enqueue items, or perhaps add_at_front and
add_at_rear.
A deque, like a queue, can be implemented with an array and two integer
indices. Like a queue, deque operations take O(1) time.
CHAPTER 6
Linked Lists
DOI: 10.1201/9781003625506-6
In many languages, a “list” is a structure built around one of the two types
of basic lists that we will explore in this chapter: the singly linked list
(SLL) and the doubly linked list (DLL).
Figure 6.1 represents a singly linked list containing the values a, b, c, and
d. Links are represented by solid circles and arrows, while the slash
represents a “null link.”
To create a non-empty list, the simplest way to begin is to cons a value onto
the empty list. Then cons a value onto that, and then another, building the
list in reverse order. For example, Figure 6.1 could be created using the
following code.
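A sketch of what that code might look like in Python, assuming a simple Node class and cons/head/tail helpers (these names are choices for this example; None plays the role of the empty list):

class Node:
    def __init__(self, value, next_node):
        self.value = value
        self.next = next_node

def cons(value, a_list):
    return Node(value, a_list)            # new node linked in front of the old list

def head(a_list):
    return a_list.value

def tail(a_list):
    return a_list.next

# Build the list a -> b -> c -> d by consing in reverse order.
my_list = cons("a", cons("b", cons("c", cons("d", None))))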
Recursive functions and singly linked lists are ideally suited for each other.
To write a recursive function on a list, the fundamental recipe is: Do
something with the head and recur with the tail. For example, to find the
length of a list:
function length(L):
if L is empty, return 0
else return 1 + length(tail(L))
function sum(L):
if L is empty, return 0
else return head(L) + sum(tail(L))
function last(L):
if tail(L) is empty, return head(L)
else return last(tail(L))
function largest(L):
if tail(L) is empty, return head(L)
else:
tail_max = largest(tail(L))
if head(L) > tail_max, return head(L)
else return tail_max
To use reverse, the user must remember to create a second, empty list
and pass it in as the L2 parameter. This isn’t ideal. In cases like this, it’s
better to provide a façade function—another function that “stands in front
of” the function that does the work, and whose only purpose is to provide a
nicer interface.
function reverse(L):
return help_reverse(L, empty list)
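The helper itself might look like this, using the cons/head/tail sketch from earlier in the chapter (help_reverse moves one value per call from L onto the front of L2):

def help_reverse(L, L2):
    if L is None:                         # L is exhausted; L2 is the reversed list
        return L2
    return help_reverse(tail(L), cons(head(L), L2))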
The operations is_empty, head, and tail all take O(1) time. The cons
operator allocates memory for a new node, which we can assume also takes
O(1) time.
A stack can be implemented directly with a singly linked list (see Figure
6.2).
The operations on a stack are almost exactly the same as the operations on a
linked list.
The use of a header node makes it easy to implement the is_empty method
and allows the creation of an “empty” list that is distinguishable from a null
value. The disadvantage is that every use of the tail operation involves
creating a new header node.
A “node,” like an “array,” is not a specific type, but rather a general
designation for a class of types. You can have one kind of node for the
elements of singly linked list, another kind for the nodes in a binary tree,
and so on. In languages with strict typing, the type of every field in a node
must be specified in advance. In particular, the type of every link must be
specified in advance so that when the link is followed, the type of the node
it references is known.
A few languages have variant records or tagged unions. These are nodes
containing a tag to specify which type of data is included so that the same
memory may be used for different types.
The operations defined for singly linked lists in Section 6.1 do not support
any changes to the content of lists once they are created, which is perfect
for functional programming languages. If such operations are provided, as
they might be in a non-functional language, the type of structure sharing
shown in Figure 6.4 is highly inadvisable.
In Figure 6.5, the middle node (containing b) has a previous field pointing
to the node containing a and a next field pointing to the node containing c.
To refer to the list as a whole, you could define a header node with just the
fields front and rear. The header points to the first and last nodes in the
list (or contains null links if the list is empty).
Singly linked lists can also be made circular, but doing this loses the
advantage of structure sharing (see Section 6.4).
The values in an array must be all the same type. The values in a Python list
may be of varying types and sizes because the array contains only pointers,
which are all of the same size.
Indexing into an array takes constant time, and dereferencing a pointer also
takes constant time, so accessing an element by its array index takes
constant time.
If the array has unused capacity, appending a new item to the end or
removing one takes constant time.
If appending a new item exceeds the capacity, then space must be allocated
for a new, larger array, and all the values in the array must be copied to the
new array. This, of course, takes O(n) time; however, the new array is
chosen to be large enough that adding enough new values will amortize
(average out) so that the result will still be O(1).
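A Python sketch of this doubling strategy (GrowableArray is an invented name; real Python lists already do something similar internally):

class GrowableArray:
    def __init__(self):
        self.capacity = 4
        self.count = 0
        self.items = [None] * self.capacity

    def append(self, value):
        if self.count == self.capacity:
            # Allocate a larger array and copy everything over. This costs O(n)
            # now, but doubling makes the average (amortized) cost per append O(1).
            self.capacity *= 2
            new_items = [None] * self.capacity
            for i in range(self.count):
                new_items[i] = self.items[i]
            self.items = new_items
        self.items[self.count] = value
        self.count += 1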
This approach means that a hash table never gets “full”—new entries can
always be added (at least until memory is exhausted, but that’s a different
problem).
When adding to a hash table, we must first check that the item isn’t already
present. This requires stepping through all the items with the same hash
code. If the word being added is dove, this would end at the node containing
crow, so it costs about the same to add dove before wren or after crow. In
many cases, more frequent items are seen and added earlier, making it
desirable to add later, less common entries at the end.
To delete a node from a list, change the pointer of the previous node to
point to the node after the one being deleted. That is, if node A points to
node B and node B points to node C, you can delete node B by simply
changing node A to point to node C. The memory used by B can then be
deallocated or garbage collected.
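In Python, with nodes that have a next field (as in the earlier Node sketch), the deletion is a single pointer change:

def delete_after(node_a):
    node_b = node_a.next
    if node_b is not None:
        node_a.next = node_b.next         # A now points past B, straight to C
        node_b.next = None                # detach B so it can be garbage collected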
A hash map is a hash table in which each entry has associated information.
This can be achieved by adding a field to each node. See Figure 6.8 for a
small portion of such a hash map.
CHAPTER 7
Binary Trees
DOI: 10.1201/9781003625506-7
A binary tree is composed of zero or more nodes. Each node contains three
components: a value (some sort of data item), a reference or pointer to a left
child (which may be null), and a reference or pointer to a right child (which
may be null). The children of a node are themselves binary trees.
A binary tree may be empty (contain no nodes). If not empty, a binary tree
has a root node (usually drawn at the top), and every node in the binary tree
is reachable from the root node by a unique path (see Figure 7.1).
FIGURE 7.1 A binary tree.
A node with neither a left child nor a right child is called a leaf.
An abstract data type for a binary tree should have, at a minimum, functions
for creating and navigating the binary tree. For example,
create_leaf(value)
get_left_child(node)
get_right_child(node)
get_value(node)
set_value(node, value)
is_leaf(node)
These functions all take O(1) time, and their names are self-explanatory. Any
additional binary tree functions can be built from this set.
Here is some additional terminology:
Note: One thing binary trees do not usually have or need is a link from
each node “upwards” to its parent. Such a link can be added if needed
by the application.
Here are some important but less obvious definitions (see Figure 7.2):
Since a binary tree has three “parts,” there are three possible ways to traverse
the binary tree in a forward direction. These are named according to when the
root is visited.
One way to visualize these traversals is to attach a “flag” to each node. For
preorder, the flags go on the left of each node; for inorder, on the bottom of
each node; and for postorder, on the right of each node. To traverse the binary
tree, simply collect the flags (see Figure 7.3).
There are three additional ways to traverse a binary tree; these are simply the
reverses of the forward traversals.
Most operations involving binary trees are best done recursively. For
example, here is how to use a preorder traversal to make a copy of a binary
tree.
function copy_tree(node):
If node is null, return null.
root = create_leaf(copy(node)).
left = copy_tree(get_left_child(node)).
right = copy_tree(get_right_child(node)).
return new binary_tree(root, left, right)
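In runnable Python, with a small node class standing in for the binary tree ADT (TreeNode and its field names are choices for this sketch), the preorder copy looks like this:

class TreeNode:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def copy_tree(node):
    if node is None:
        return None
    new_node = TreeNode(node.value)       # visit the root first (preorder)
    new_node.left = copy_tree(node.left)
    new_node.right = copy_tree(node.right)
    return new_node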
To insert a value into a binary search tree, compare its value to the value at
the root node. If the new value is less, insert it into the left subtree; if greater,
insert it into the right subtree. If the chosen subtree is absent, place the new
value in that location.
As the name implies, binary search trees are used primarily for fast lookup. If
the tree is balanced, lookups (and insertions) are O(log n). Other fast
operations include finding the smallest value, finding the largest value, and
performing an inorder traversal to get all the values in ascending order.
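Using the same TreeNode sketch, insertion and lookup in a binary search tree might look like this (sending equal values to the right is a choice made for this example, not a rule):

def insert(node, value):
    if node is None:
        return TreeNode(value)            # the chosen subtree was absent: place it here
    if value < node.value:
        node.left = insert(node.left, value)
    else:
        node.right = insert(node.right, value)
    return node

def contains(node, value):
    if node is None:
        return False
    if value == node.value:
        return True
    if value < node.value:
        return contains(node.left, value)
    return contains(node.right, value)

The caller keeps the returned node as the (possibly new) root, as in root = insert(root, 17).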
If a search tree is not balanced, lookups and insertions will take more than
O(log n) time. In the worst case, if the values are inserted in ascending (or
descending) order, all left subtrees (or all right subtrees) will be absent. This
will result in O(n) insertion and lookup times.
Consider any node in the tree. If its left subtree is deeper than its right
subtree, we can perform a right rotation (Figure 7.5, going from left to
right). If the node’s right subtree is deeper than its left subtree, we can
perform a left rotation (Figure 7.5, going from right to left).
In Figure 7.5, A and B are individual nodes with specific values, while the
triangles marked x, y, and z represent complete subtrees containing any
number of nodes.
Consider the tree on the left, with A at the root. Since we are considering
binary search trees, all the values in x are smaller than any other values in the
tree, and all the values in z are larger than any other values. Of more interest,
the values in y are all less than A but greater than B.
We can perform a right rotation as follows: B keeps its left subtree x, and A
keeps its right subtree z, but A becomes the root of a right subtree of B
(allowable because A > B), and y becomes a new left subtree of A (allowable
because the values in y are all greater than B but less than A). The inverse
operation, left rotation, also maintains the sorted order of the binary search
tree.
As a check, notice that inorder traversals of the two trees in Figure 7.5
produce identical results: x-B-y-A-z.
To rotate right:
To rotate left:
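The step lists for both rotations translate directly into code. Here is a Python sketch that works on any nodes with left and right fields; the variable names a and b correspond to A and B in Figure 7.5.

def rotate_right(a):
    # a's left child b becomes the new root of this subtree.
    b = a.left
    a.left = b.right      # y: greater than B, less than A, so it fits under A
    b.right = a           # A becomes B's right child (allowable because A > B)
    return b

def rotate_left(b):
    # The inverse operation.
    a = b.right
    b.right = a.left      # y moves back under B
    a.left = b
    return a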
7.4 HEAPSORT
Heapsort is a well-known, traditional sorting algorithm that any student of
data structures would be expected to know. It is generally slower than
quicksort but has the advantage that its running time is always O(n log n), so it is
safer in time-critical situations. In addition, it’s a really interesting algorithm.
We’ll explain heapsort in three stages. First, we’ll talk about building a
binary tree. Second, we’ll show how that binary tree can be mapped into an
array. Third, we’ll show how rearranging values in the binary tree is
equivalent to sorting the array.
A node in a binary tree has the heap property if the value in the node is
at least as large as the values in its children. (Leaves, having no
children, automatically have the heap property.)
A binary tree is a heap if every node in it has the heap property.
Note: The word “heap” is also used to denote a large area of memory
from which the programmer can allocate blocks as needed and
deallocate them (or allow them to be garbage collected) when no longer
needed (see Chapter 9). This is a completely unrelated meaning of the
word “heap.”
In a binary tree, a node has links to its children, but (in most
implementations) it has no link to its parent. We will just be talking about a
binary tree, not implementing one, so we can ignore that limitation.
7.4.1 Phase 1: Heapifying a Binary Tree
To begin, consider a binary tree with only one node. This node has no
children; therefore, it has the heap property, and the entire binary tree is a
heap.
Next, consider adding a node to a binary tree (of whatever size) that is a
heap. For reasons that will be apparent later, we will add the node next to the
leftmost node in the bottom level or, if that level is full, as the leftmost node
in a new level. The result will be a binary tree that is balanced and left-
justified.
There are two cases: (1) The value in the new node may be smaller than or
equal to the value in its parent. In this case, the parent node retains the heap
property, and nothing more needs to be done. (2) The value in the new node
may be larger than the value in its parent, in which case the parent no longer
has the heap property, and we need to sift up (see Figure 7.6).
In Figure 7.6, the binary tree is initially a heap, but then a node containing 14
is added to it. Since 14 is greater than 10, these two values must be
exchanged. But then 14 is greater than 12, so these two values must also be
exchanged. This process continues up the tree, possibly as far as the root.
After sifting up, all nodes in the binary tree again have the heap property.
This is because the values in all the affected nodes (except the one in the
leaf) can only increase, so the values in those nodes are still at least as large
as either of their children. The only node to have its value reduced is the
newly added leaf, which has the heap property because it’s a leaf.
When all the values to be sorted have thus been added to the binary tree, the
tree is a heap. It isn’t sorted—values seem to be somewhat randomly
arranged—but the largest value is at the root.
Removing the root node (which holds the largest value) damages the tree. To
repair the damage, we will remove the rightmost node in the bottom level
of the tree and use it to replace the old root node. This gives a binary tree
again, but the root node may not, and probably does not, have the heap
property. We need to reheap the binary tree (see Figure 7.7).
To reheap the binary tree, exchange the value in the root node with the value
in one of its two children, whichever is larger. (If they are equal, choose
either.) This restores the heap property of the node at the root, but the chosen
child may or may not have the heap property. If it does not, exchange its
value with that in the larger child; and so on, down the binary tree, until the
value that was in the root reaches a position where it again has the heap
property.
By repeatedly removing the root and reheaping the binary tree, we get a
series of steadily decreasing (or at least, non-increasing) values, so we have
the conceptual basis of a sorting technique. To turn this into an actual sorting
technique, we need to put the binary tree into an array and work with it there.
This mapping of binary trees to arrays works only when the binary tree is
balanced and left-justified. If it isn’t, there will be “holes” in the array that
don’t correspond to nodes, and these would have to be marked in some way.
There are simple formulas for finding the left child, right child, or parent of a
node. They are slightly different for 0-based arrays and 1-based arrays.
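For the record, here are the usual formulas for a 0-based array, written as one-line Python functions; the 1-based versions appear in the comments.

def left_child(i):   return 2 * i + 1       # 1-based: 2 * i
def right_child(i):  return 2 * i + 2       # 1-based: 2 * i + 1
def parent(i):       return (i - 1) // 2    # 1-based: i // 2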
All the values to be sorted are already present in the array, but we will
pretend otherwise. As nodes are added to or removed from the binary tree,
the values to the right become “invisible” to us. In other words, we will think
of only the initial part of the array as representing the binary tree.
Heapify the array. Initially, the only value “visible” to us is the root, in the
first location, so it is a leaf and has the heap property. As we step forward in
the array, successive values become visible to us, and for each new value, we
need to check whether its parent still has the heap property. If not, we sift up,
exchanging values with the parent, and possibly its parent’s parent, and so
on. When all the nodes have thus become visible, the array is a heap, but it
isn’t yet sorted.
Repeatedly exchange the value in the root (first) location with the value in
the last visible location. This is the array equivalent of removing the root and
replacing it with the rightmost leaf in the lowest level of the tree. This puts
the largest remaining value in the last visible location, which now becomes
“invisible,” that is, no longer part of the binary tree. The root location
probably no longer has the heap property, so it has to be reheaped by
comparing its value with those of its two children, and so on down into the
binary tree. When the visible part of the array is reduced to a single node, the
array has been sorted.
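Putting the two phases together, here is a Python sketch of heapsort on a 0-based list. The sift_up and reheap helpers follow the descriptions above; the names are mine, not the book's.

def heapsort(a):
    # Sorts the list a in place, in ascending order.
    n = len(a)

    def sift_up(child):
        # Phase 1: a newly visible value may be larger than its parent.
        while child > 0:
            parent = (child - 1) // 2
            if a[child] <= a[parent]:
                break
            a[child], a[parent] = a[parent], a[child]
            child = parent

    def reheap(root, size):
        # Phase 2: push the value at root down until it has the heap property.
        while True:
            left, right = 2 * root + 1, 2 * root + 2
            largest = root
            if left < size and a[left] > a[largest]:
                largest = left
            if right < size and a[right] > a[largest]:
                largest = right
            if largest == root:
                break
            a[root], a[largest] = a[largest], a[root]
            root = largest

    for i in range(1, n):               # heapify: make values visible one at a time
        sift_up(i)
    for end in range(n - 1, 0, -1):     # move each largest value to the end
        a[0], a[end] = a[end], a[0]
        reheap(0, end)

values = [12, 3, 10, 14, 7, 1]
heapsort(values)
print(values)    # [1, 3, 7, 10, 12, 14]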
7.4.5 Analysis
In the first phase, we “add” n nodes to the binary tree. Each added node may
have to be sifted up. Since the tree is balanced, the maximum depth is log(n),
so for each node added we may have to do as many as log(n) exchanges;
therefore, the running time of this phase is O(n log n).
In the second phase, we “remove” n nodes (those currently at the root). Since
these are replaced by nodes that probably do not have the heap property, they
may have to be reheaped. The maximum depth of the binary tree is log(n), so
this phase also requires O(n log n) time.
Construction of the tree is quite simple. Make a list of the leaves and their
associated frequencies. At each step, the two nodes with the smallest
frequencies are removed, a new (non-leaf) node is created whose frequency is
their sum and whose children are the two removed nodes, and the new node is
placed back in the list. (A priority queue, described in Chapter 8, is ideal for
this purpose.)
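Here is a Python sketch of the construction, using the standard heapq module as the priority queue; the frequencies in the example are made up, and the tuple-based tree representation is my own shorthand.

import heapq

def huffman_codes(frequencies):
    # frequencies maps each character to its count; the result maps each
    # character to its bit string. Each heap entry is (frequency, tie_breaker,
    # tree); the tie breaker keeps heapq from ever comparing two trees.
    heap = [(f, i, (ch,)) for i, (ch, f) in enumerate(frequencies.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)          # the two smallest frequencies
        f2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, counter, (t1, t2)))   # new non-leaf node
        counter += 1
    codes = {}
    def walk(tree, prefix):
        if len(tree) == 1:                       # a leaf holding one character
            codes[tree[0]] = prefix or "0"
        else:                                    # 0 for left, 1 for right
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
    walk(heap[0][2], "")
    return codes

print(huffman_codes({"e": 12, "d": 13, "c": 16, "a": 45}))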
For a small example using only eight characters, see Figure 7.9.
The encoding for each character is determined by the path from the root to
that character, using 0 for the left child and 1 for the right child. For example,
d is encoded as 010, and the word decade as 0101011000001010.
Text encoded in this way can be decoded because the codes have the unique
prefix property: None of the resultant binary codes is a prefix of any other
code. This property holds because, in a binary tree, a leaf is not on a path to
any other node.
Using the above encoding, the entropy of English text is about 4.7
bits/character. This can be considerably improved by using digraphs (letter
pairs), trigraphs (letter triples), whole words, or even larger units. For whole
words, the entropy measure drops to about 2.62 bits/character. Similar results
can be expected for other languages that use an alphabet.
To decode a Huffman-encoded file, the code table must be included with the
encoded data. This is a minor cost for large files, but for small files, the
encoded file plus the code table may be larger than the unencoded file.
CHAPTER 8
Priority Queues
DOI: 10.1201/9781003625506-8
This isn’t the only way an ADT (abstract data type) could be defined for a
priority queue. For example, the test for whether a priority queue is empty
could be replaced by a function that returns the number of elements it
contains. As another example, instead of the priority being inherent in the
element to be added, it could be assigned when the element is added. And,
of course, many additional operations could be added.
The point is that there is not one universally agreed-upon way to define an
ADT. Rather, what is important is that the programmer defines some fixed
set of operations for a data structure and prohibits others by hiding the
implementation.
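As one concrete possibility, here is a sketch in Python that wraps the standard heapq module; the operation names are illustrative, and the underscore-prefixed fields are the hidden implementation.

import heapq

class PriorityQueue:
    # One possible ADT: the priority is supplied when the element is added.
    def __init__(self):
        self._heap = []          # hidden implementation detail
        self._count = 0          # tie breaker, so equal priorities stay in order
    def add(self, element, priority):
        heapq.heappush(self._heap, (priority, self._count, element))
        self._count += 1
    def remove(self):
        return heapq.heappop(self._heap)[2]   # element with the smallest priority
    def is_empty(self):
        return not self._heap

pq = PriorityQueue()
pq.add("routine", 5)
pq.add("urgent", 1)
print(pq.remove())    # urgent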
CHAPTER 9
Heaps
DOI: 10.1201/9781003625506-9
Larger items, such as arrays, nodes, and strings, are kept in a heap. As an
abstract data type, the heap has only two fundamental actions: (1) allocate a
block of storage of a given size from the heap and return a pointer to it, and
(2) deallocate a block of storage—that is, recycle it by returning it to the
heap.
Initially, the entire heap consists of a single block, and the system has a
pointer (let’s call it free) to that block. We will fill up the heap from the far
end (the end farthest from the header).
Find an unused block in the heap that has at least n+2 storage locations
available.
Use the n+2 locations at the end of the block we just found to create a
new block, and reduce the size of the block it was taken from by n+2.
Set the size of the new block. The pointer field is not used.
For security reasons, zero out the user data area of the block.
Give the user a pointer to the new block.
Figure 9.2 (1) shows the state of the heap after A has been allocated, and then
B, and then C. Note that the blocks may be of different sizes. After each
allocation, the size of the free block is decreased.
FIGURE 9.2 Heap before and after allocations and deallocations.
In Figure 9.2 (2), the block assigned to A has been deallocated. The free
pointer points to this newly deallocated block, while the pointer in the block
is assigned the previous value of free. This begins a linked list of
deallocated blocks.
In Figure 9.2 (3), the block assigned to C has been deallocated. Pointers have
been updated to maintain a list of deallocated blocks.
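The bookkeeping can be simulated in a few lines of Python. This toy sketch keeps the free blocks in an ordinary list instead of threading a pointer through the heap itself, and it omits the two extra header locations and the coalescing of adjacent free blocks.

class ToyHeap:
    # A toy first-fit allocator over a list of storage locations.
    def __init__(self, size):
        self.memory = [0] * size
        self.free_blocks = [(0, size)]          # (start, length) of each free block

    def allocate(self, n):
        for i, (start, length) in enumerate(self.free_blocks):
            if length >= n:                     # an unused block big enough
                self.free_blocks[i] = (start, length - n)
                block = start + length - n      # carve n locations off the far end
                self.memory[block:block + n] = [0] * n    # zero the user data
                return block                    # "pointer" to the new block
        raise MemoryError("out of memory")

    def deallocate(self, block, n):
        # Recycle the block by returning it to the free list.
        self.free_blocks.append((block, n))

heap = ToyHeap(100)
a = heap.allocate(10)
b = heap.allocate(20)
heap.deallocate(a, 10)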
The other kind of error occurs when storage is no longer in use but isn’t
deallocated, causing the heap to gradually fill up with inaccessible data. This
is called a memory leak; if the program runs long enough, it will fail with an
“out of memory” error.
These errors are common in languages that leave allocation and deallocation
up to the programmer; they are much rarer in languages that perform
garbage collection (see Section 9.3).
There are two well-known algorithms (and several less well-known ones) for
performing garbage collection: Reference counting and Mark and sweep.
With reference counting, each allocated block keeps a count of the pointers
that refer to it. When a pointer to the block is duplicated and saved in a new
variable, the reference count is incremented. If a variable is changed so that
it no longer points to the block, the reference count is decremented. If the
reference count reaches zero, no remaining program variables point to the
block, and it can immediately be garbage collected.
Mark and sweep is much more reliable than reference counting, but it takes
substantial time and, unlike reference counting, it must be done all at once—
nothing else can be going on. The program stops responding during garbage
collection. This can be a problem for many real-time applications.
CHAPTER 10
Trees
DOI: 10.1201/9781003625506-10
A TREE IS LIKE A binary tree, except that each node may have any number of
children. To emphasize the distinction, a tree is sometimes called a general tree.
The usual way to implement a tree is with nodes containing three fields: some
data value, a link to a list of children, and a link to the next sibling. See Figure
10.1.
Most of the terminology used to describe binary trees (ancestor, sibling, depth,
etc.) can also be used for general trees, and there are a few additional terms:
Branching factors can be important when determining the Big-O running times of
tree algorithms.
To define a tree as an abstract data type, the following operations must be defined:
If node removal is permitted, that usually means deleting the entire subtree
whose root is that node.
General trees can be traversed in preorder: Visit the root and then traverse each
child. Similarly, general trees can be traversed in postorder: Traverse all the
subtrees, and then visit the root. Inorder traversals are not well defined.
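Here is a sketch of this representation and both traversals in Python; the field names first_child and next_sibling are mine, standing in for the two links described above.

class TreeNode:
    # A general tree node: a value, a link to its first child, and a link to
    # its next sibling (the children of a node form a sibling-linked list).
    def __init__(self, value):
        self.value = value
        self.first_child = None
        self.next_sibling = None

def preorder(node, visit):
    while node is not None:                   # walk along the sibling list
        visit(node.value)                     # visit the root of this subtree
        preorder(node.first_child, visit)     # then traverse each child
        node = node.next_sibling

def postorder(node, visit):
    while node is not None:
        postorder(node.first_child, visit)    # traverse all the subtrees
        visit(node.value)                     # then visit the root
        node = node.next_sibling

root = TreeNode("A")
root.first_child = b = TreeNode("B")
b.next_sibling = TreeNode("C")
preorder(root, print)     # A, then B, then C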
A folder also contains a link to its parent, usually indicated by two dots (..). In
UNIX, the root of the tree is denoted by a forward slash (/); in Windows, the root
is probably denoted by C:.
This approach can easily be extended to multiple individuals. Suppose you have
an extensive family tree with some individual at the root. A related individual
could then have a “mother link” (left child) to a node in that binary tree, a “father
link” (right child) to a node in that tree, or both, or neither. In this way, subtrees
could be shared between individuals. To complete the data structure, you could
add a list of “root” individuals, forming a “forest” of binary trees.
However, such “biological family trees” are probably only useful for
medical/genetic purposes. To properly represent the wide variations in “societal
family trees” (with adoptions, remarriages, same-sex marriages, etc.), some other
data structure must be used, and it won’t be as simple as a tree.
A node represents the state of the game at one point in time. For example, if the
game is chess, the state would include the positions of all the pieces, whose turn it
is, and whether check has been called.
Each possible move represents a single step from the current node. The branches
from a node represent the possible moves; the children represent the new
positions. Planning ahead (in a game) means choosing a path through the tree.
One way to handle a repeated state is to ignore the fact that it has occurred
previously and treat it as a new node in the usual fashion. In theory, this would
result in an infinitely deep tree and could lead to an infinite loop when choosing a
path through the tree. This might or might not be a problem—if a tree is being
built, it is built as needed, so an infinitely large tree would not be built. Similarly,
deciding on a move (i.e., determining which child node to go to next) almost
always involves a limited search, since finding a path all the way through to the
end of the game is infeasible for any game much larger than tic-tac-toe.
Another way to handle a repeated state is to give a node a link back up in the tree
to one of its ancestors. While this isn’t exactly illegal, it violates the definition of
a “tree,” and the result is more properly called a graph (see Chapter 11) and
should be treated as such.
10.1.4 Expressions
When a program is compiled, the first step is almost always parsing the program.
Parsing creates a tree structure that is equivalent in meaning to the text of the
program.
In the resultant parse tree, a node that is a leaf could hold either a value or the
name of a variable whose value could be looked up (e.g., in a hash table). A node
that is not a leaf could hold the name or symbol for an operation to be applied to
the values of its children.
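A sketch of how such a tree might be evaluated in Python, using tuples for non-leaf nodes and a dictionary as the variable lookup table; the small operator set is only illustrative.

import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}

def evaluate(node, variables):
    # node is a number, a variable name, or a tuple (operator, left, right).
    if isinstance(node, (int, float)):
        return node                            # a leaf holding a value
    if isinstance(node, str):
        return variables[node]                 # a leaf naming a variable
    op, left, right = node                     # a non-leaf applies an operation
    return OPS[op](evaluate(left, variables), evaluate(right, variables))

# (m * 2) + 1, with m looked up in a hash table
print(evaluate(("+", ("*", "m", 2), 1), {"m": 5}))    # 11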
Control statements (while, if, etc.) are considered to be just another kind of
operator. For example, the statements “First assign 1 to m; then while m is less than
1000, multiply m by 2” can be represented as shown in Figure 10.2; the sequencing
operation is represented by a semicolon so that A;B means “first do A, then do B.”
FIGURE 10.2 Tree representation of code.
While you may never be called upon to write a compiler, anything that has syntax
—dates, addresses, phone numbers—can be parsed into its components. If you
can’t parse text inputs, you are limited to reading simple things like numbers and
strings. But if you can parse text input, you can make sense of:
fire phasers at 3, 7
jane.doe@google.com
28°12'48"
3:30pm-5pm
One simple approach, which we won’t go into any detail here, involves two
phases. In the first phase, the code breaks the input into a list of tokens, for
example, ["3", ":", "30", "pm", "-", "5", "pm"]. In the second phase, the
next operator is found, suitable operands are looked for in the list of tokens, and
those are assembled into a tree structure.
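The first phase is easy to sketch with a regular expression; the token classes here (runs of digits, runs of letters, single punctuation marks) are an assumption that happens to fit the time-range example.

import re

def tokenize(text):
    # Break a string like "3:30pm-5pm" into a list of tokens.
    return re.findall(r"\d+|[A-Za-z]+|[^\w\s]", text)

print(tokenize("3:30pm-5pm"))    # ['3', ':', '30', 'pm', '-', '5', 'pm']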
For some problems, any goal node is acceptable (K or O); for other problems, you
want a minimum-depth goal node, that is, a goal node nearest the root (only K).
A depth-first search (DFS) explores a path all the way to a leaf before
backtracking (going back to a previous node and exploring from another child of
that node).
In the example tree, after visiting A (the root), the search proceeds to B, and then to
E. The search then backtracks to B and searches from F, which leads to I, then M,
and then successive backtracking to I leads to N and finally to O. Since O is a goal
node, the search is complete.
A breadth-first search (BFS) explores nodes nearest the root before exploring
nodes further away. In the example tree, a BFS would first visit A (the root), then
B, C, and D, then E, F, G, and H, and finally I, J, and the goal node K.
When this method succeeds, it returns some goal node but doesn’t report the path
taken to it.
function search(node):
    If node is a goal, return success.
    For each child c of node:
        If search(c) is successful,
            save the node, and
            report success.
    Return failure.
The stack only needs to be large enough to hold the deepest search path.
When the function succeeds, the (implicit) stack contains only the nodes on a path
from the root to a goal. As the recursion “unwinds” through multiple levels, those
nodes can be saved in some external data structure, such as a stack.
Since this method is basically DFS, when it succeeds, the path to a goal node can
be recovered by pushing the current node onto a stack just before the return
success statement.
We can now use this function to perform a depth-first iterative deepening search.
limit = 0.
found = false.
While not found:
    found = limitedDFS(root, limit, 0).
    limit = limit + 1.
This code searches to depth 0 (root only), then if that fails, it searches to depth 1
(root and its children), then if that fails, it searches to depth 2 (root and its
children and grandchildren), and so on.
Like BFS, if a goal node is found, it is a nearest node, and the path to it is on the
stack.
Like DFS, the required stack size is only the search depth (plus 1).
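Here is the whole idea as a Python sketch: a depth-limited DFS that saves the path as the recursion unwinds, and the deepening loop around it. The Node class and the names are mine; the loop assumes some goal is actually reachable.

class Node:
    def __init__(self, value, children=()):
        self.value, self.children = value, list(children)

def limited_dfs(node, limit, depth, is_goal, path):
    # DFS that refuses to look below the depth limit. On success, path ends up
    # holding the nodes from the goal back to the root.
    if is_goal(node):
        path.append(node)
        return True
    if depth == limit:
        return False                           # don't look any deeper
    for child in node.children:
        if limited_dfs(child, limit, depth + 1, is_goal, path):
            path.append(node)                  # saved as the recursion unwinds
            return True
    return False

def iterative_deepening(root, is_goal):
    limit = 0
    while True:                                # assumes a goal is reachable
        path = []
        if limited_dfs(root, limit, 0, is_goal, path):
            return path
        limit += 1

root = Node("A", [Node("B", [Node("E")]), Node("C", [Node("K")])])
print([n.value for n in iterative_deepening(root, lambda n: n.value == "K")])
# ['K', 'C', 'A']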
One apparent disadvantage is that when doing a limited DFS to depth n, all the
previous work (to depth n–1, n–2, etc.) is simply discarded. While true, this is less
of a waste than it may appear. When searching a binary tree to depth 7, a single
DFS requires searching 255 nodes, while iterative deepening requires searching
502 nodes. In general, iterative deepening takes about twice as long. With a tree
that has a branching factor of 4, DFS to depth 7 requires searching 21845 nodes,
while iterative deepening searches 29124 nodes—about 4/3 = 1.33 times as long.
The higher the branching factor, the lower the relative cost of iterative deepening
DFS. In general, if the branching factor is b, the difference is about b/(b–1).
The start state represents the initial problem. Applying an operator to a state in the
state space transforms it to another state in the state space. Some states may be
goal states; these represent solutions to the problem.
Example 1: Maze
A maze can be represented as a state space. Each state represents “where you
are” in the maze. The start state represents your starting position, and the
goal state represents the exit from the maze.
Operators (for a rectangular maze) are: move north, move south, move
east, and move west. Each operator takes you to a new state, which is simply
your location in the maze. Operators may not always apply because you are
not allowed to walk through walls.
See Section 11.8 for an example of a rectangular maze.
One of the best-known sliding block puzzles is the fifteen puzzle. It contains
15 tiles, numbered 1 through 15, in a 4x4 grid. The start state is some
apparently random configuration of the tiles, while the goal state is one
where the numbered tiles are in order; see Figure 10.4.
Note: In the fifteen puzzle, only half the possible configurations are
reachable from (or to) the goal state. If the state space is thought of as
an undirected graph, it has two distinct connected components. Hence,
the start state cannot be completely random.
We will explore the state space of this problem in somewhat more detail.
We need to represent the possible states, preferably in as simple a manner
as possible. A triple of numbers is enough: The number of angels on the left
bank, the number of demons on the left bank, and the number of canoes (zero
or one) on the left bank.
We will define five possible operations, named a, d, aa, dd, and ad: a ferries
one angel across the river, d one demon, aa two angels, dd two demons, and
ad one angel and one demon.
We don’t have to specify “west to east” or “east to west” because only one of
these will be possible at any given time.
Figure 10.6 shows the initial portion of a state-space search for this
problem. The search space continues after the vertex in the bottom right.
FIGURE 10.6 Initial part of the search space for angels and demons.
In Figure 10.6, each node shows in the top line what is on the left bank
and (redundantly) in the bottom line what is on the right bank. This
redundancy does not need to be reflected in the code; it’s in the figure to
make it easier to see when demons outnumber angels. A failure node
(marked with a skull and crossbones) occurs when demons outnumber angels
on either bank.
The search space is shown as an undirected graph. Most edges may be
traversed in either direction; the exception is that there is no exit from a
failure node. Because there are cycles in the graph, graph searching
techniques (see Section 11.4) are appropriate.
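Generating the legal moves from any state is straightforward. Here is a Python sketch that assumes three angels and three demons, with a state written as (angels on the left bank, demons on the left bank, canoes on the left bank).

TOTAL = 3                                      # assumed: three angels, three demons
MOVES = {"a": (1, 0), "d": (0, 1), "aa": (2, 0), "dd": (0, 2), "ad": (1, 1)}

def is_safe(angels, demons):
    # Demons may not outnumber angels on either bank (unless no angels are there).
    left_ok = angels == 0 or angels >= demons
    right_ok = (TOTAL - angels) == 0 or (TOTAL - angels) >= (TOTAL - demons)
    return left_ok and right_ok

def successors(state):
    angels, demons, canoe = state
    direction = -1 if canoe == 1 else 1        # canoe on the left: passengers go right
    result = []
    for name, (da, dd) in MOVES.items():
        a, d = angels + direction * da, demons + direction * dd
        if 0 <= a <= TOTAL and 0 <= d <= TOTAL and is_safe(a, d):
            result.append((name, (a, d, 1 - canoe)))
    return result

print(successors((3, 3, 1)))
# [('d', (3, 2, 0)), ('dd', (3, 1, 0)), ('ad', (2, 2, 0))]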
10.2.5 Pruning
In any kind of search, pruning—deleting (or just ignoring) subtrees that cannot
contain a goal node—can save considerable effort. For very large search trees, it
may also be advisable to prune subtrees that seem unlikely to contain a goal node
so that more promising subtrees may be searched to a greater depth.
Because such games result in very large search trees, it is generally not possible to
search deep enough to find a winning node, so heuristics are used to evaluate the
“goodness” of nodes. The idea of an alpha–beta search is to prune branches that
are unlikely to be taken, thus allowing deeper searches on more promising
branches.
An alpha cutoff occurs when it is your move, and you decide not to explore
certain subtrees because you have already found a more promising subtree of that
node. A beta cutoff occurs when it is your opponent’s move, and you believe that
your opponent will not move into that subtree because it is more desirable to you.
Figure 10.7 shows the result of an alpha–beta search on a tree that is just barely
big enough to show some examples. Your moves are shown as rectangular nodes;
your opponent’s moves are shown as round or rounded nodes.
a. The search extends down to node a, which has a heuristic value of 17.
b. The value 17 is brought up to node b. This is known as a preliminary backed-
up value (PBV); it can change.
c. The next child of node b, node c, is evaluated. It has a heuristic value of 20.
b. Since 20 is better than 17, the PBV at b is replaced by 20.
d. A PBV of 20 is brought up to node d.
e. From node d, we explore down to node e, which has a heuristic value of 12.
f. Node f is assigned a PBV of 12.
g. Node g is visited, and it has a heuristic value of 18.
f. The PBV of 12 at node f is replaced by the better value of 18.
d. Node d is the opponent’s move, so they will replace the PBV of 20 with the
(better for them, worse for you) value of 18.
h. The PBV of 18 is brought up to node h.
i, j, k, l. Node l is explored, and gets a PBV of 15. But 15 is worse than the
parent node’s PBV of 18, and your opponent will never bring up a larger
value; therefore, there is no point in exploring any further subtrees of l. This
is an alpha cutoff, and the value of 15 is not brought up to node h.
m, n, o, p. The leftmost child of p is explored, and p gets a PBV of 25.
q, r. Nodes q and r are explored, and a PBV of 32 is brought up to node r. But 32
is worse for your opponent than the PBV of 25 at p, so the value of 32 is not
brought up, and a beta cutoff occurs.
h. The PBV of 25 at node p is better than the PBV of 18 at node h, so it replaces
the value in node h.
According to what has been determined so far, you should move from h to p, your
opponent will likely move from p to n, and you should move from n to o.
However, after your opponent’s move, you will probably have an opportunity to
do another, deeper search and very likely get some different values.
Alpha–beta searching assumes that your opponent has the same heuristic function
as you (i.e., they assign the same heuristic values to nodes). This is probably an
incorrect assumption, but better heuristics and deeper searches tend to win out
over weaker heuristics and shallower searches.
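Here is the pruning itself as a compact Python sketch, run over a tree whose leaves are heuristic values and whose interior nodes are lists of children. Larger values are assumed to be better for you; the cutoff comments are kept neutral because the bookkeeping is the same in both cases.

def alpha_beta(node, maximizing, alpha=float("-inf"), beta=float("inf")):
    # Returns the backed-up value of node. Leaves are numbers; interior nodes
    # are lists of children. maximizing is True when it is your move.
    if isinstance(node, (int, float)):
        return node                            # heuristic value of a leaf
    if maximizing:
        best = float("-inf")
        for child in node:
            best = max(best, alpha_beta(child, False, alpha, beta))
            alpha = max(alpha, best)
            if alpha >= beta:
                break                          # cutoff: the rest cannot matter
        return best
    else:
        best = float("inf")
        for child in node:
            best = min(best, alpha_beta(child, True, alpha, beta))
            beta = min(beta, best)
            if alpha >= beta:
                break                          # cutoff: the rest cannot matter
        return best

# The opponent chooses between a subtree worth 20 and one worth 18 (loosely
# mirroring the values backed up to nodes b, f, and d in Figure 10.7).
print(alpha_beta([[17, 20], [12, 18]], maximizing=False))    # 18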
10.3 TRIES
A trie is a data structure used for storing and retrieving a very large collection of
strings—say, a complete lexicon or a large number of DNA nucleotide sequences.
The trie in Figure 10.8 represents 15 words: a, an, and, any, than, that, the,
there, these, those, what, when, where, who, and why. Black nodes indicate that a
complete word has been formed at that point. To locate a given word, start at the
root and follow the link labeled with the first letter; then follow the link labeled
with the second letter; and so on.
Letters are not explicit in the trie; they are implied by locations in the array. The
first location of the root array contains a link to a sub-trie for all words beginning
with ‘a,’ the second contains a link to a sub-trie for words beginning with ‘b,’ and
so on. Each sub-trie has the same form, with links to its own sub-tries.
To look up the word ‘any,’ start in the root array and follow the link in the first
location (a); from the array that link points to, follow the link in the fourteenth
location (n); from that array, follow the link in the twenty-fifth location (y). This
is a leaf, so a-n-y is a word. (So are ‘a’ and ‘a-n,’ but as they are not leaves, those
array locations must contain a Boolean to indicate they are also complete words.)
It may seem that the space complexity required for a trie could be exponential. In
theory, a trie using 26 letters might require 26 raised to the power of the
maximum word length. However, in practice, the space required is limited to a
constant times the actual number of strings. While this may represent a substantial
amount of storage, it isn’t exponential.
A trie can also be represented using hash maps instead of arrays. This approach
has the advantage of supporting a flexible set of characters (e.g., Unicode) rather
than a fixed-size set, but it loses the ease of generating all words (or other types of
strings) in alphabetical order.
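Here is a sketch of that hash-map version in Python: each node is a dictionary mapping a character to a child node, and a special key (here "*", an arbitrary choice) marks the end of a complete word.

class Trie:
    def __init__(self):
        self.root = {}

    def add(self, word):
        node = self.root
        for ch in word:
            node = node.setdefault(ch, {})    # follow or create the link for ch
        node["*"] = True                      # a complete word ends here

    def contains(self, word):
        node = self.root
        for ch in word:
            if ch not in node:
                return False
            node = node[ch]
        return node.get("*", False)

t = Trie()
for w in ["a", "an", "and", "any", "the", "there"]:
    t.add(w)
print(t.contains("an"), t.contains("them"))   # True False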
CHAPTER 11
Graphs
DOI: 10.1201/9781003625506-11
A GRAPH IS A DATA structure that consists of a collection of vertices
connected by edges. There are two kinds of graphs: directed graphs
(sometimes called digraphs) and undirected graphs (see Figure 11.1). Edges
in a digraph can be followed only in one direction, while edges in an
undirected graph may be followed in either direction.
There are many ways to implement graphs. But first, some terminology.
The size of a graph is the number of vertices it contains. The empty
graph has no vertices, so its size is zero.
If two vertices are connected by an edge, they are neighbors, and the
vertices are adjacent to each other.
The degree of a vertex is the number of edges it has.
For directed graphs,
If a directed edge goes from vertex S to vertex D, we call S the source
and D the destination of the edge.
An edge from S to D is an out-edge of S and an in-edge of D. S is a
predecessor of D, and D is a successor of S.
The in-degree of a vertex is the number of in-edges it has, while the
out-degree of a vertex is the number of out-edges it has.
A path is a list of vertices such that every vertex but the last is the
predecessor of the next vertex in the list.
A cycle is a path whose first and last vertices are the same (e.g., [rock,
scissors, paper, rock]).
A cyclic graph contains at least one cycle, while an acyclic graph does
not contain any cycles.
An undirected graph is connected if there is a path from every vertex to
every other vertex.
A directed graph is strongly connected if there is a path from every
vertex to every other vertex, and weakly connected if the underlying
undirected graph (ignoring edge direction) is connected.
Vertex X is reachable from vertex Y if there is a path from Y to X.
A subset of the vertices of a graph is a connected component (or just a
component) if there is a path from every vertex in the subset to every
other vertex in the subset.
The best way to implement a graph depends on how the graph is to be used;
here are some questions to consider:
This representation shows connections between vertices but does not support
storing data in those vertices. If that is needed, it must be done elsewhere.
The adjacency matrix for an undirected graph is symmetric about the main
diagonal, as shown in Figure 11.3. Each edge is represented by two marks,
except in the special case where an edge goes from a vertex back to the same
vertex.
For very dense graphs—ones in which there are edges between almost every
pair of vertices—matrix representations may be appropriate. However,
adjacency matrices require O(n²) space to represent n vertices, regardless of
the number of edges. For less dense graphs, this disadvantage can be
overcome by the use of a sparse array (see Section 11.5).
Many languages provide sets as a built-in data type. If not, they can be
implemented, perhaps as a hash table. Our focus in this section will be on
using sets, although if an ordering is desired, lists can be used instead.
If the edges p through v have no associated values, they can be elided. Each
vertex, instead of pointing to a set of edges, can point to the set of vertices
reachable by following those edges.
To avoid getting trapped in a cycle, keep track of which vertices you have
already searched, so you don’t repeat those searches. There are two ways to
do this: (1) keep a list of vertices you have already visited, or (2) put a mark
on the vertices you have visited. The latter approach is more intrusive—it
might interfere with later searches—and in a team setting, you may not have
the option to change how vertices are implemented.
We will compare the code for performing a depth-first search (DFS) on a tree
with doing the same kind of search on a graph. Other types of searches
require similar changes.
Here is how to do DFS on a tree:

Put the root onto a stack.
While the stack is not empty:
    Remove the top vertex from the stack.
    If it is a goal,
        return success,
    else,
        put all the successors of the vertex
        onto the stack.
If you get to here, return failure.
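For a graph, the essential change is to remember which vertices have already been visited. Here is a Python sketch that uses a set for that purpose, with the graph represented as a dictionary mapping each vertex to the set of its neighbors.

def graph_dfs(graph, start, is_goal):
    visited = set()                      # vertices already searched
    stack = [start]
    while stack:
        vertex = stack.pop()
        if vertex in visited:
            continue                     # don't repeat a search: this breaks cycles
        visited.add(vertex)
        if is_goal(vertex):
            return vertex                # success
        stack.extend(graph[vertex] - visited)
    return None                          # failure

g = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
print(graph_dfs(g, "A", lambda v: v == "D"))    # D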
You can use DFS to find the connected components of an undirected graph.
For each vertex in the graph, if it isn’t already in some component, create a
new component for it, then perform a DFS, and add every reachable vertex to
the same component. The result will be a set of components, that is, a set of
sets of vertices.
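The same idea in Python, as a sketch: every vertex not yet assigned to a component starts a new one, and a DFS sweeps every reachable vertex into it.

def connected_components(graph):
    components, assigned = [], set()
    for vertex in graph:
        if vertex in assigned:
            continue
        component, stack = set(), [vertex]     # a new component, explored by DFS
        while stack:
            v = stack.pop()
            if v in component:
                continue
            component.add(v)
            stack.extend(graph[v] - component)
        components.append(component)
        assigned |= component
    return components

g = {"A": {"B"}, "B": {"A"}, "C": set()}
print(connected_components(g))    # two components: {'A', 'B'} and {'C'}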
The same approach does not work for finding connected components of a
directed graph. To do that efficiently requires an algorithm (Union-Find) not
covered in this book.
To implement these two operations for a sparse array, you could use a hash
map (see Section 3.4), using some combination of the indices to compute a
hash code. That works, but it turns out to make access to rows and columns
very difficult.
Returning to the college example, you might ask: What courses has this
student taken? or, Which students have taken this course? With a hash map
representation, these would be very difficult questions to answer. In terms of
an array, these questions are equivalent to finding all the non-null values in a
row and finding all the non-null values in a column.
For each row, you can use a linked list. Each node in the linked list will
contain the column number (which would be hard to find otherwise) and the
value (grade) in that location of the array. You could then create a linked list
of these “row lists.” Similarly, you can create a linked list of all the values in
a column (along with their row numbers) and make a linked list of these.
Figure 11.5 shows a sparse array with both a list of “row lists” and a list of
“column lists.” Since a node can be approached from either direction (row or
column), it should contain both the row number and the column number,
along with whatever other value it may hold.
FIGURE 11.5 A sparse array and its representation. ⏎
If, as in the college example, you can expect almost every row and every
column to contain some meaningful values, you can use an array instead of a
list to hold links to the “row lists,” and another array of links for the “column
lists.”
Sparse arrays are also useful in linear algebra, where you may have large
matrices with almost all zero entries.
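Here is a sketch of the idea in Python. Instead of hand-built linked lists, it keeps one small map per row and one per column, which gives the same two kinds of access; the college-grades usage is just an illustration.

class SparseArray:
    def __init__(self):
        self.rows = {}        # row -> {column: value}
        self.cols = {}        # column -> {row: value}

    def set(self, row, col, value):
        self.rows.setdefault(row, {})[col] = value
        self.cols.setdefault(col, {})[row] = value

    def get(self, row, col, default=None):
        return self.rows.get(row, {}).get(col, default)

    def row_values(self, row):          # e.g., all courses taken by one student
        return self.rows.get(row, {})

    def col_values(self, col):          # e.g., all students who took one course
        return self.cols.get(col, {})

grades = SparseArray()
grades.set("jane", "CS101", "A")
grades.set("omar", "CS101", "B+")
print(grades.row_values("jane"), grades.col_values("CS101"))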
Dijkstra’s Algorithm builds a kind of “inverse tree,” where all paths lead to
the root rather than from the root. To find the best path from vertex X to
some vertex Y in this tree, we start at Y and follow the only available path
back to X.
Each edge in the graph has a cost (or distance) measure on it. For the
algorithm to work, no edges can have a negative cost.
For each vertex V in the graph, we keep track of three things:
The cost of the best path to V that has been found so far.
Whether the cost of the best path to V is final or is still only tentative.
Initially, all costs are infinite (or some suitably large number) and
tentative.
A link (directed edge) from that vertex back along the best path found so
far. Initially, all links are null.
We also need to keep a priority queue of vertices adjacent to the vertices that
have been visited so far.
For the graph in Figure 11.6, begin by putting vertex X into the priority
queue. The cost to get to X from itself is zero, that cost is final, and the
directed edge is null. Then, at each step:
Pull from the priority queue the vertex N with the smallest tentative cost
and mark that cost as final. (Any other path to N must cost more because it
must go through a vertex with a higher tentative cost.)
Add to the priority queue any vertices adjacent to N that are not already in
the priority queue and whose cost is still tentative. Add links from each of
those vertices back to N.
For each vertex V adjacent to N that does not already have a final cost,
compute the cost of getting to V by way of N (the cost of N plus the cost of
the edge from N to V). If this cost is less than the tentative cost of V, update
V’s tentative cost, and make it link back to N.
When the priority queue becomes empty, all vertices reachable from X have
been processed and the “inverse tree” is complete.
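Here is a Python sketch of the algorithm, with heapq as the priority queue. The graph maps each vertex to a dictionary of neighbor costs; the function returns the final costs and, for each vertex, its link back toward the start, from which the inverse tree (and any best path) can be read off.

import heapq

def dijkstra(graph, start):
    # graph[v] maps each neighbor of v to a non-negative edge cost.
    cost = {start: 0}                 # best cost found so far (final once popped)
    link = {start: None}              # each vertex's link back toward the start
    final = set()
    queue = [(0, start)]
    while queue:
        c, n = heapq.heappop(queue)   # smallest tentative cost
        if n in final:
            continue                  # a stale, more expensive entry for n
        final.add(n)                  # that cost is now final
        for v, edge in graph[n].items():
            new_cost = c + edge       # cost of getting to v by way of n
            if v not in final and new_cost < cost.get(v, float("inf")):
                cost[v] = new_cost
                link[v] = n
                heapq.heappush(queue, (new_cost, v))
    return cost, link

graph = {"X": {"A": 4, "B": 1}, "A": {"Y": 1}, "B": {"A": 2, "Y": 6}, "Y": {}}
cost, link = dijkstra(graph, "X")
print(cost["Y"], link["Y"], link["A"], link["B"])    # 4 A B X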
11.7 SPANNING TREES
A spanning tree of a connected, undirected graph is a connected subgraph
that includes all the vertices, but only enough of the edges to maintain
connectivity. It will have one fewer edge than vertices and no cycles.
A graph typically has many possible spanning trees; the ones you find
depend on the type of search you do. Figure 11.7 shows (a) an initial graph,
(b) one possible spanning tree after a breadth-first search, and (c) one
possible spanning tree after a DFS.
Suppose you want to supply a set of houses (say, in a new subdivision) with
electric power, water, sewage lines, telephone lines, and internet. To keep
costs down, you might want to connect some of these (such as water and
sewage) with a spanning tree.
However, the houses are not all equal distances apart, and longer pipes cost
more, so you might want to use a minimum-cost spanning tree. The cost of
a spanning tree is the sum of the costs of its edges.
There are two basic algorithms for finding minimum-cost spanning trees, and
both are greedy algorithms (see Section 13.4).
Prim’s algorithm starts with putting any one vertex into the spanning tree
and creating a set of edges adjacent to that vertex. The main loop then
consists of taking the cheapest edge from that set and testing whether the
vertex to which it leads is already in the spanning tree. If so, the edge is
discarded; otherwise, it and the new vertex are added to the spanning tree,
and the edges from that vertex are added to the set of edges. The algorithm
ends when the correct number of edges (or vertices) are in the spanning tree.
An edge of the lowest cost can be found with a priority queue, and testing for
a cycle is automatic. This makes Prim’s algorithm far simpler to implement
than Kruskal’s algorithm.
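A Python sketch of Prim's algorithm, again using heapq; the graph is an undirected adjacency map (each edge stored in both directions), and the edge tuples are ordered so that the heap compares costs first.

import heapq

def prim(graph, start):
    # Returns the spanning tree as a list of (cost, from_vertex, to_vertex).
    in_tree = {start}
    edges = [(cost, start, v) for v, cost in graph[start].items()]
    heapq.heapify(edges)
    tree = []
    while edges and len(in_tree) < len(graph):
        cost, u, v = heapq.heappop(edges)      # the cheapest available edge
        if v in in_tree:
            continue                           # would form a cycle: discard it
        in_tree.add(v)
        tree.append((cost, u, v))
        for w, c in graph[v].items():
            if w not in in_tree:
                heapq.heappush(edges, (c, v, w))
    return tree

g = {"A": {"B": 2, "C": 3}, "B": {"A": 2, "C": 1}, "C": {"A": 3, "B": 1}}
print(prim(g, "A"))    # [(2, 'A', 'B'), (1, 'B', 'C')]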
11.8 MAZES
Typically, every location in a maze is reachable from the starting location,
and there is only one path from start to finish. If the locations are “vertices”
and the open doors between cells are “edges,” this describes a spanning tree
(see Figure 11.8a).
Since there is exactly one path between any pair of cells, any cells can be
used as the “entrance” and “exit.” Often, both entrance and exit are to the
outside of the maze, but it is also common for one or the other to be near the
center of the maze.
It usually works well, when selecting the next cell from ADJ, to choose one
randomly.
All the previous searches have been blind searches: They make no use of
any knowledge of the problem. If we know something about the problem, we
can usually do much better by using heuristics.
For any given state, we can compute an estimate of the number of moves
required to reach the goal state. This will be our heuristic measure—the
smaller the measure, the more desirable the state. The goal state itself will
have a heuristic measure of zero (no moves required).
To compute the heuristic measure: For each piece, count how many moves it
would take to move the piece into its proper position if no other pieces were
in the way. Do this for every piece and add up the counts. The result is a
(very) optimistic measure of how many moves it will take to solve the
puzzle.
Note: The distance covered to get from one point to another when only
horizontal and vertical moves are allowed, is called the Manhattan
distance.
In Figure 11.9, the 3 is two moves from its proper location; the 10 is two
moves away; the 13 is five moves away; and so on.
The number of possible arrangements of the tiles is 16!, but only half of
these are reachable from the goal node, so there are 16!/2 =
10,461,394,944,000 states in the search space. Even our simple heuristic
measure leads to a very quick solution.
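Here is the heuristic as a Python sketch, assuming the board is a tuple of 16 entries in row order, with 0 standing for the empty square and the goal having tiles 1 through 15 in order followed by the blank.

def manhattan(board):
    # Sum, over all tiles, of how far each tile is from its home square.
    total = 0
    for position, tile in enumerate(board):
        if tile == 0:
            continue                               # the empty square doesn't count
        row, col = divmod(position, 4)
        goal_row, goal_col = divmod(tile - 1, 4)   # tile k belongs at index k-1
        total += abs(row - goal_row) + abs(col - goal_col)
    return total

goal_state = tuple(range(1, 16)) + (0,)
print(manhattan(goal_state))    # 0: the goal state needs no moves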
The A* (or A-star) algorithm is a best-first search with the additional feature
that the distance already traversed (from the root to each node) is added to
the heuristic function.
Let g(N) be the distance from the start state to node N. Let h(N) be a heuristic
estimate of the distance from node N to a goal node. Then, f(N) = g(N) +
h(N) is the (partially known and partially estimated) distance from the start
node to a goal node (see Figure 11.10).
FIGURE 11.10 A* algorithm: f(N) = g(N) + h(N).
The quality of the solution also depends on h(N). It can be proved that if
h(N) is optimistic (never overestimates the distance to a goal), then A* will
find an optimal solution, that is, one that has the shortest path to a goal.
In the previous section, our heuristic for the fifteen puzzle was (very)
optimistic; therefore, an A* search will find a solution with the fewest
possible moves.
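A generic sketch of A* in Python: the caller supplies the successor function and the heuristic h, and the search keeps f(N) = g(N) + h(N) in a priority queue. For the fifteen puzzle, successors would generate the boards reachable by one slide and h would be the manhattan function above.

import heapq
from itertools import count

def a_star(start, is_goal, successors, h):
    # successors(state) yields (next_state, step_cost) pairs.
    # Returns a list of states from start to a goal, or None.
    tie = count()                              # keeps the heap from comparing states
    queue = [(h(start), next(tie), 0, start, [start])]
    best_g = {start: 0}
    while queue:
        f, _, g, state, path = heapq.heappop(queue)
        if g > best_g[state]:
            continue                           # stale entry: a better route was found
        if is_goal(state):
            return path
        for nxt, cost in successors(state):
            g2 = g + cost                      # g(N): distance already traversed
            if g2 < best_g.get(nxt, float("inf")):
                best_g[nxt] = g2
                heapq.heappush(queue, (g2 + h(nxt), next(tie), g2, nxt, path + [nxt]))
    return None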
11.9.3 IDA*
In the worst case (that is, with a poor heuristic), A* is equivalent to a
breadth-first search and will require exponential storage. Iterative deepening,
which was described in Section 10.2.3, can be applied to the A* algorithm.
IDA* gives the same results as A*; however, because IDA* is essentially a
DFS, storage requirements are linear in the length of the path, instead of
growing exponentially with the depth of the search.
The best searches combine a basic blind search technique with heuristic
knowledge about the problem space, and A* and its variations, especially
IDA*, are the best heuristic search techniques known.
CHAPTER 12
Hypergraphs
DOI: 10.1201/9781003625506-12
The elements of a graph are vertices and edges, while the elements of a
hypergraph are vertices, edges, and graphs.
Now consider how a hypergraph like this might be generalized by using sets
instead of single values.
Finally, there is no reason to restrict the sources and targets of an edge to all
be in the same graph.
12.1 PLEXES
We can represent arbitrarily complex hypergraphs by means of a simple
data structure called a plex. A plex consists of some user data (e.g., “John”)
and four sets of plexes.
containers: The plexes that “contain” this plex. For example, a graph
may be a container for vertices and edges.
contents: The plexes “contained in” this plex. For example, a graph
may have vertices and edges as its contents.
origins: The plexes “from which” this plex comes. For example, the
origin (source) of an edge may be a vertex.
targets: The plexes “to which” this plex goes. For example, an edge
may have a vertex as its target.
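A plex is only a few lines of Python. In this sketch the four sets are ordinary Python sets, and it is up to the code that builds the structure to keep them mutually consistent; the tiny graph at the end is an illustration, not the book's example.

class Plex:
    # Some user data plus four sets of related plexes.
    def __init__(self, data):
        self.data = data
        self.containers = set()    # plexes that "contain" this plex
        self.contents = set()      # plexes contained in this plex
        self.origins = set()       # plexes from which this plex comes
        self.targets = set()       # plexes to which this plex goes

# A one-edge graph as a hypergraph: an edge from John to Mary, inside graph G.
g, john, mary, edge = Plex("G"), Plex("John"), Plex("Mary"), Plex("knows")
g.contents |= {john, mary, edge}
for p in (john, mary, edge):
    p.containers.add(g)
edge.origins.add(john)
john.targets.add(edge)
edge.targets.add(mary)
mary.origins.add(edge)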
There are two simple validity rules:
Plexes allow almost anything. For example, a plex that represents an edge
may have multiple sources and multiple targets. A plex that represents a
vertex may belong to multiple graphs, and it may have multiple graphs
within it. And so on.
Plex structures are extremely flexible; probably too flexible. They provide
an example of what can be done with data structures, but perhaps not what
should be done. If hypergraphs are ever actually needed for a project (which
is unlikely), plexes stand ready to serve.
CHAPTER 13
Algorithm Types
DOI: 10.1201/9781003625506-13
Depth-first searching of a tree has been described in Section 10.2.1, and the
modifications for searching a graph are described in Section 11.4.
Given a maze, the task is to find a path from start to finish. At each
intersection, you have to decide between three or fewer choices: You
can go straight, you can go left, or you can go right. You don’t have
enough information to choose correctly, and each choice leads to
another set of choices (another intersection). One or more sequences of
choices will, if the maze is solvable, lead to a solution.
The four-color theorem states that only four colors are required to
color any map so that any countries that share a border are different
colors.
To color a map, try to choose a color for the n-th country (initially
the first country) that isn’t used by any adjacent country. If you can,
and if this is the last country, report success; otherwise recursively
color the next country. If you can’t choose a color, report failure.
At each step, you don’t have enough information to choose the
correct color; each choice leads to another set of choices (or failure);
and one or more sequences of choices will lead to a solution (if the
map representation is correct).
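Here is the map-coloring backtracker as a Python sketch; the map is a dictionary from each country to the set of its neighbors, and the recursion mirrors the description above.

COLORS = ["red", "green", "blue", "yellow"]

def color_map(countries, neighbors, coloring=None):
    # Try to give every country a color its neighbors don't use.
    # Returns a dict of colors on success, or None on failure.
    coloring = {} if coloring is None else coloring
    if len(coloring) == len(countries):
        return coloring                            # the last country is done
    country = countries[len(coloring)]             # the n-th country
    for color in COLORS:
        if all(coloring.get(n) != color for n in neighbors[country]):
            coloring[country] = color              # a choice that may be wrong
            result = color_map(countries, neighbors, coloring)
            if result is not None:
                return result
            del coloring[country]                  # backtrack: undo the choice
    return None                                    # no color works: report failure

neighbors = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
print(color_map(list(neighbors), neighbors))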
In a peg jumping puzzle, all holes but one are filled with pegs. The
only allowable moves are to jump one peg over another peg, and
remove the jumped-over peg. The goal is to remove all pegs but one.
As in the other examples, you have to choose a move on the basis
of incomplete information, each move leads to other possible moves,
and (given a well-designed puzzle) some sequence of moves will lead
to a solution.
We search this virtual tree for a goal node—one that represents a solution to
the problem we are trying to solve. If we reach a non-goal node from which
we have no legal moves (a failure node), we backtrack to the most recent
node that has remaining choices.
After several more moves, we follow edge 17 to a goal node, and quit with
success. Had it failed, there was another choice from A, but we never
needed to explore it.
In quicksort, the array is partitioned into two parts and each part is sorted
independently. No additional work is required to combine the two sorted
parts.
In mergesort, the array is cut in half, and each half is sorted independently.
Then the two halves are merged.
For example, to make 42¢, choose the 25¢ coin, leaving 17¢. Then choose
the 10¢ coin, leaving 7¢. Then choose the 5¢ coin, leaving 2¢. Then choose
the 1¢ coin, leaving 1¢. Finally, choose the 1¢ coin, for a total of five coins.
The greedy method would not always work if we did not have 5¢ coins. To
make 42¢, it would result in nine coins, but it could be done with six. It also
would not always work if, in addition to the usual coins, we had 17¢ coins:
to make 34¢, the greedy method would use six coins (25 + 5 + 1 + 1 + 1 + 1),
but two 17¢ coins suffice.
To find the minimum number of coins for any given coin set, we need a
dynamic programming algorithm.
function fibonacci(n):
    Create an array to hold n integers.
    Set the first two array values to 1.
    For each remaining array location,
        set the array value to the sum of the
        two previous values.
    Return the value in the last array location.
The reader may notice that an array isn’t necessary for this problem; it can
be done with only a few variables, but the resultant code is harder to
understand.
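Both versions, as a sketch in Python:

def fibonacci(n):
    # The array version, matching the pseudocode above.
    values = [1] * n
    for i in range(2, n):
        values[i] = values[i - 1] + values[i - 2]   # sum of the two previous values
    return values[-1]

def fibonacci_two_vars(n):
    # The same computation with only two variables; shorter, but less obvious.
    a, b = 1, 1
    for _ in range(n - 2):
        a, b = b, a + b
    return b

print(fibonacci(10), fibonacci_two_vars(10))    # 55 55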
As noted earlier, the greedy algorithm finds an optimal solution for making
change with American coins (1, 5, 10, 25, and 50 cents). It does not work
well for every possible set of coins. For example, if the coins are 1, 3, 7, 10,
and 25 cents, the greedy algorithm for 15 cents would result in one 10¢
coin, one 3¢ coin, and two 1¢ coins, for a total of four coins; a better
solution is two 7¢ coins and one 1¢ coin, for a total of three coins.
We will consider two algorithms for solving the coin problem. The first is
basically a divide and conquer algorithm, with terrible (exponential)
running time.
To make K cents:
    If there is a K-cent coin,
        return 1 as the coin count for K.
    Otherwise, for each value i < K,
        Solve for i cents.
        Solve for K-i cents.
        If the sum is fewer coins for K
        than the best found so far,
            Save these two solutions.
    Return the combination of the two
    saved solutions.
Again taking the example of making 15¢ from 1, 3, 7, 10, and 25 cent
coins, this would compute the best solutions for 1 and 14 cents, then for 2
and 13 cents, and so on, for all combinations that add up to 15.
If the best solution for 15¢ turns out to be 7 cents (one 7¢ coin) plus 8 cents
(one 7¢ coin and one 1¢ coin), then the algorithm would combine these to
get two 7¢ coins and one 1¢ coin.
This algorithm works. For 20¢, you may have to wait a bit for the answer;
for 50¢, it's infeasible.
The second solution uses dynamic programming, and it’s lightning fast. The
trick is to solve for one cent, then two cents, then three cents, all the way up
to the desired amount. As the solution is found for each value, it is stored
and never computed again.
For M from 1 to K:
    If there is an M-cent coin,
        that one coin is the minimum;
        save 1 as the coin count for M.
    Otherwise,
        Store a very large coin count for M.
        For each value i < M,
            Look up the coin counts for i cents
            and for M-i cents.
            If the sum is better than the
            saved coin count for M,
                save this as the coin count for M.
Return the coin count for K.
The running time for this algorithm is O(KN), where K is the desired amount
and N is the number of different kinds of coins.
In the coin example, if the optimal way to make change for K involves
making change for A and for B, where K = A + B, then that is the optimal
way to make change for A and the optimal way to make change for B.
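Here is a dynamic-programming solution as a Python sketch. It differs slightly from the pseudocode above: at each amount it tries each coin denomination rather than each pair of sub-amounts (which is what gives an O(KN) bound), and it also remembers which coin was chosen so that the actual change can be recovered.

def min_coins(K, coins):
    # Fewest coins needed to make K cents from the given denominations.
    INF = float("inf")
    count = [0] + [INF] * K        # count[m] = best known coin count for m cents
    used = [0] * (K + 1)           # the last coin chosen for m cents
    for m in range(1, K + 1):
        for c in coins:            # try every coin that fits
            if c <= m and count[m - c] + 1 < count[m]:
                count[m] = count[m - c] + 1
                used[m] = c
    change, m = [], K              # walk back through used[] to list the coins
    while m > 0:
        change.append(used[m])
        m -= used[m]
    return count[K], change

print(min_coins(15, [1, 3, 7, 10, 25]))    # (3, [1, 7, 7])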
Whether a brute force algorithm is adequate for a given task depends on the
problem size and the algorithmic complexity. Discovering a randomly
generated password has exponential complexity. The traveling salesman
problem (see Section 2.18) is a typical example of a problem for which the
best-known solution requires exponential time.
Sum of subsets is another such problem. Suppose you are given a list or set
of n positive numbers, such as [22, 26, 31, 39, 43, 56], and are asked
to find a subset of the numbers that total to a certain amount, say 100. Each
number either is or is not in the solution, so there are 2ⁿ possible subsets to
try—or, in this tiny example, 2⁶ = 64 subsets. Clearly, this is an exponential
problem.
As is often the case, pruning (see Section 10.2.5) can help considerably. For
example, if the count has reached 000111, the total is 138, and all remaining
numbers of the form xxx111 can be ignored. Other stratagems can be
employed to further reduce the number of subsets examined, but in the end,
this problem remains stubbornly exponential in difficulty.
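Here is the brute-force search with that kind of pruning, as a Python sketch: subsets are explored recursively, and a branch is abandoned as soon as its running total overshoots the target (safe because all the numbers are positive).

def subset_sum(numbers, target, chosen=(), total=0, index=0):
    # Returns a tuple of numbers summing to target, or None.
    if total == target:
        return chosen
    if total > target or index == len(numbers):
        return None                      # prune: overshooting can't be repaired
    n = numbers[index]
    return (subset_sum(numbers, target, chosen + (n,), total + n, index + 1)
            or subset_sum(numbers, target, chosen, total, index + 1))

print(subset_sum([22, 26, 31, 39, 43, 56], 100))    # (26, 31, 43)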
More often, randomized algorithms are used for problems in which choices
must be made and there is no good way to make them. In these problems,
multiple attempts are made to solve the problem, making random choices,
and either a solution is found, or the program keeps track of the “best so
far” solution.
Here’s a real-life example from my teaching experience. I had my students
doing pair programming—two people working together on the same
assignment. For the first assignment, pairs were chosen randomly. (Assume,
for simplicity, that I had an even number of students.)
absolute address 2
abstract data type 38
acyclic graph 113
adjacency matrix 115
adjacency set of graph 117
adjacent vertices 113
ADT 38
algorithm 7
algorithm types 135
alpha-beta search 106
alpha cutoff 106
analysis of algorithms 7
ancestor of a node 69
angels and demons 103
arithmetic expression 42
array 2, 3
of strings 2
A-star (A*) algorithm 128
average time 8
AVL-trees 74
backtracking 98, 135
balancing brackets 52
base case 44
best-first search 126, 128
beta cutoff 106
BFS 98
Big-O 6
time comparisons 27
Big-ϴ 28
Big-Ω 28
binary logarithms 15, 16
binary search 13, 16
recursive 17
binary search tree 72
binary tree 68
as an array 78
balanced 69
balancing 73
traversal 71
blind search 126
block 86, 87
board games 95
brackets, balancing 52
branching factor 93
breadth-first search 98, 99
brute force algorithm 143
bubble sort 10
call stack 54
card number 26
card sorter 26
characteristic operation 12
circular linked list 64
coalescing blocks 88
coin counting 139, 144
college courses example 119
collision 34
common logarithms 15
component of a graph 114
connected component of a graph 114
connected graph 113
constant time 8
cons onto a list 57
contents, in a plex 132
contract 39
Control Data 6600 xii
convert letters to numbers 110
cutting in half 16
cycle 113
cyclic graph 113
dangling reference 89
data compression 81
data representation 38
data structure 1
data type 38
deletion from a binary tree 72
depth-first iterative deepening 100, 101
depth-first search 98
depth-limited searching 100
depth of a binary tree 69
depth of a node 70
deque (data structure) 50, 56
dequeue (operation) 55, 57
dereferencing 3
descendant of a node 69
destination of an edge 113
deterministic 35
DFS 98
on a graph 118
dictionary 31
digraphs 112
Dijkstra’s algorithm 121
directed graphs 112
directories 94
divide and conquer algorithm 138
DLL 57, 64
documenting code 40
don’t look down 44, 47
doubly linked list 57, 64
game trees 95
garbage collection 90
general tree 92
gif files 81
global variable 46
goal node 97, 137
goal states 102
graph 112
applications 114
searching 117
greedy algorithm 139
handle 65, 89
hashCode 33
hash function 32, 34
hash map 31, 66
as list 66
hash table 32, 66
deletions 37
as list 67
size 36
header node 62, 87
head of a list 57
heap 4, 75, 86
two meanings 75
heapify (method) 76
heap implementation 87
heap property 75
redefined 85
heapsort 75
analysis 80
complete algorithm 79
heuristic 126
measure 127
search 126
hiding the implementation 38
Huffman encoding 81
hypergraph 131
IDA* 129
implementation hiding 38
implicit stack 99
in-degree 113
indentation xvi, xvii
in-edge 113
inorder traversal 71
insertion sort 13
invariant 11
loop invariant 10
inverse tree 121
iterative-deepening A* 129
leaf 68
least cost path in a graph 121
left child of a node 68
formula 79
left-justified binary tree 70
left rotation 73
linear algebra 121
linear time 9
link 2
list 57
implementation 62
in Python 57, 65
little-o 28
localizing errors 39
local variables 43
logarithms 15
logarithms are exponents 15
lookup table 38
loop invariant 10
magic function 32
Manhattan distance 127
mark and sweep 91
Martian example 131
maze 102, 124, 136
memory leak 89
merge sort 23
missionaries and cannibals 103
modulo 33
natural logarithms 15
neighbor of a vertex 113
node 1
non-local variables 46
O(1) 9
O(log n) 9
O(n²) 10
optimistic heuristic 129
optimization problems 142
Optimizing 30
ordered tree 93
origins, in a plex 133
out-degree 113
out-edge 113
overflow
of a priority queue 81
of a queue 56
of a stack 52
quadratic time 10
queue 50, 54
quicksort 18
quit operator 53
ragged array 3
raising to a power 15
randomized algorithm 144
reachable vertex 114
recursion 41
indirect 42
infinite 45
recursive cases 45
recursive definitions 41
red-black trees 74
reference 1
reference counting 90
reheap (method) 77
relative address 2
return address 54
reverse binary tree traversals 71
right child of a node 68
formula 79
right rotation 73
root node 68
removal 77
rule of thumb 126
satisficing 30
searching
alpha-beta 106
binary 13, 16, 17
graph 117
tree 97
selection sort 14
set 116
shared subtrees 94
siblings of a node 69
sift up (method) 76
simple recursive algorithm 135
singly linked list 57
size of a binary tree 70
size of the input 7
sliding block puzzle 102
SLL 57
sort algorithm
bubble sort 10
insertion sort 13
merge sort 23
quicksort 18
selection sort 14
sorted binary tree 72
source of an edge 113
space complexity 7
spanning tree 122, 125
sparse array 119
specification 39
stable sort 14
stack 50
as linked list 60
mapped into an array 51
reversing 59
stack frames 53
state space 102
storage allocation 86
strict typing 62
string array 4
strongly connected graph 113
structure sharing 63
sublist 62
successor vertex 113
sum of subsets 143
superexponential time 28
surprise, measure of 81
table 31
tagged unions 63
tail of a list 57
targets, in a plex 133
tentative cost 121
time complexity 7
traveling salesman problem 29, 124
traversal of binary tree 71
tree, general 92
searching 97
trie 109
underflow
of queue 56
of stack 52
undirected graphs 112
Union-Find algorithm 119, 124
unique prefix property 82
variant records 63
vertex, vertices 112
virtual trees 137
yes-no question 44
zip files 81