0% found this document useful (0 votes)

32 views6 pages

Code Tuning

The document discusses the importance of code tuning for improving program efficiency without sacrificing maintainability. It presents various examples of code optimization techniques, such as profiling to identify performance bottlenecks and using caching to enhance memory allocation. Additionally, it emphasizes that while tuning can yield significant performance gains, it should be approached carefully to avoid diminishing returns and ensure correctness.

Uploaded by

shahbazyounas636

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

32 views6 pages

Code Tuning

Uploaded by

shahbazyounas636

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 6

programming

by Jon Bentley pears

CODE TUNING
Some programmers pay too much attention to effi- quest by a quick reference to that list rather than by
ciency; by worrying too soon about little "optimiza- invoking the general storage allocator; this reduced the
tions" they create ruthlessly clever programs that are total run time of his program to just 45 percent of what
insidiously difficult to maintain. Others pay too little it had been previously (so the storage allocator now
attention; they end up with beautifully structured pro- took just 30 percent of the total time). An additional
grams that are utterly inefficient and therefore useless. benefit was that the reduced fragmentation of the mod-
One has to keep perspective on efficiency: it is just one ified allocator made more efficient use of main mem-
of many problems, but it is sometimes very important. ory.
There are usually many ways to achieve efficiency. This story illustrates the art of code tuning at its best.
Previous columns discussed high-level approaches: By spending a few hours adding about 20 lines to a
careful problem definition, algorithm design, and data 3,000-line program, Van Wyk doubled its speed without
structure selection. This column is about a low-level altering the users' view of the program or decreasing its
approach. "Code tuning" locates the expensive parts of maintainability. He used general tools to achieve the
an existing program and then modifies that code to speedup. Profiling identified the "hot spot" of his pro-
improve its performance. It's not always the right ap- gram and caching reduced the time spent there.
proach to follow and it's rarely glamorous, but it can
sometimes make a big difference in a program's per- A First Aid Quiz
formance. We'll turn now from a beginning-to-end story to a quiz
made of three composite stories. Each describes a typi-
A Typical Story cal problem that arises in several applications. The
Chris Van Wyk of AT&T Bell Laboratories and I chat- problems consumed most of the run time in their appli-
ted about code tuning early one afternoon. He then cations; the solutions use general principles.
wandered off to try it on a program. By 5:00 p.m. he The first problem arises in text processing, compilers,
had halved the run time of a 3,000-line program. macro processors, and command interpreters.
Given a textual description of a picture, his program
Problem One--Character Classification. Given a se-
produces commands to draw the picture on a photo-
quence of one million characters, classify each as an
typesetter. Although the run time for typical pictures
upper-case letter, lower-case letter, digit, or other.
was much shorter, the program took ten minutes to
draw an extremely complicated picture. Van Wyk's The obvious solution uses a complicated sequence of
first step was to profile the program by setting a com- comparisons for each character. In the ASCII character
piler switch that caused the system to report how much code this approach uses six comparisons to determine
time was spent in each procedure. Running the pro- that a particular character is of type "other," while the
gram on 10 test pictures showed that it spent almost 70 same result requires 14 comparisons for EBCDIC. Can
percent of its time in the memory allocation subrou- you do better?
tine. The key is to view a character as an index into an
His next step was to investigate the memory alloca- array of character types; most programming languages
tor. He found that the program allocated the most popu- provide a way to do this. For an example, let's assume
lar kind of record 68,000 times, while the runner-up that the character code has eight bits, so each character
was allocated only 2,000 times. Given that you knew can be viewed as an integer in the range 0..255. Type-
the majority of the program's time was spent looking Table[O..255] is initialized with code in the spirit of Fig-
through storage for a single type of record, how would ure 1 on p.92 (although the wise programmer would do
you modify it to make it faster? the job with loops).
Van Wyk solved his problem by applying the princi- The type of the character C can then be found in
ple of caching: data that is accessed most often should TypeTable[C], which replaces a complicated sequence
be kept in storage with the least access time. He modi- of tests with a single array access. This change can
fied his program to keep a list of the most common type reduce the time spent in character classification by an
of free records. He could then handle the common re- order of magnitude.

February 1984 Volume 27 Number 2 Communications of the ACM 91

Programming Pearls

for I:=0 to 255 do T y p e T a b l e [ I ] := O t h e r

T y p e T a b l e [ 'a'] :. . . . := T y p e T a b l e [ 'z'] := L C L e t t e r
T y p e T a b l e [ 'A'] :. . . . := T y p e T a b l e [ 'Z'] := U C L e t t e r
T y p e T a b l e [ '0'] := ... := T y p e T a b l e [ '9'] := D i g i t

FIGURE 1. Initialize Typetable

The next problem arises in applications ranging from in all bytes of word W, but it is just as easy and
set representation to coding theory. probably a little faster to unroll the loop as shown in
Figure 2 on p. 93.
Problem Two--Counting Bits. Given a sequence of one
(The code isolates the bytes by shifting and then "AND-
million 32-bit words, tell how many one bits are in
ing" off all but the eight low-order bits; this operation is
each word.
language- and machine-dependent) While the original
The obvious program uses a loop to do 32 shifts and solution would use over a hundred machine instruc-
logical AND's; can you think of a better way? tions, the above approach can usually be implemented
The principle is the same as before: a table whose I th in about a dozen instructions; it is not uncommon for
entry contains the number of one bits in the binary this change to result (again) in a speedup of an order of
representation of I. Unfortunately, most computers magnitude.
don't have enough room to store a table with 232 en- The final problem is typical of applications that deal
tries, and the time to initialize the four billion entries with geographic and geometric data.
would also be prohibitive. We can get around this by
trading a few additional operations for a smaller table. Problem Three--Computing Spherical Distances. The
We'll set up a table giving the counts for all eight-bit first part of the input is a set S of 5,000 points on the
bytes, and then answer the 32-bit question by summing surface of a globe; each point is represented by its
the answers to four eight-bit questions. The count table latitude and longitude. After those points are stored
is initialized with two loops that have the effect of the in a data structure of our choice, the program reads
following assignment statements (see Problem 2). the second part of the input: a sequence of 20,000
points, each represented by latitude and longitude.
C o u n t T a b l e [ 0 ] :=0 For every point in that sequence, the program must
C o u n t T a b l e [ 1 ] :=I tell which point in S is closest to it, where distance is
C o u n t T a b l e [ 2 ] :=I measured as the angle between the rays from the
C o u n t T a b l e [ 3 ] :=2 center of the globe to the two points.
. . •

Margaret Wright of Stanford University encountered a

C o u n t T a b l e [ 2 5 4 ] :=7
problem similar to this in the preparation of maps to
C o u n t T a b l e [ 255 ] :=8
summarize data on the global distribution of certain
W e could set up a four-iteration loop to count the bits genetic traits. Her straightforward solution represented

Tuning the Federal Government's COBOL Code year minimum life of the systems, they will save almost
Readers interested in the role of code tuning in large $400,000.
data processing shops should see the General Account- In an Army application, one staff-day of optimization
ing Office report entitled "Improving COBOL Applica- reduced the typical elapsed time of a program from 7,5
tions Can Recover Significant Computer Resources" hours to less than two hours; this change saved $3,500
(dated I April 1982, it is available from the National in its first year and resulted in the output being deliv-
Technical Information Service, Springfield, Virginia ered to the user more reliably.
22161 with order code PB82-198540). Tuning several The report warns that code tuning should not be
application systems within the Department of Housing done haphazardly; other considerations such as correct-
and Urban Development resulted in the following. ness and maintainability must be given their rightful
high priority. It points out that there is usually a point
of diminishing returns in tuning either an individual
Percent Per-Year Total
CPU Time Dollar Cost To program or a set of programs on a system: work beyond
Reduced Savings Optimize that point will be very difficult and have little positive
impact.
82 $37,400 $5,500
30 4,400 2,400 The recommendations in the report include the fol-
19 9,000 900 lowing: "Heads of Federal agencies should require peri-
45 45,000 1,200 odic review of the machine resource consumption of
9 7,000 9,000 COBOL applications at their installations, and, where
feasible, require action to reduce the consumption of
The optimizations cost a total of $19,000; over the four- the expensive applications."

92 Communicationsof the ACM February 1984 Volume 27 Number 2

ProgrammingPearls

W o r d C o u n t :=
CountTable[W and 11111111B]
+ CountTable[(W rshift 8) a n d 11111111B]
+ CountTable[(W rshift 16) and 11111111B]
+ CountTable[(W rshift 2q) and 11111111B]

FIGURE2. Countthe One-Bitsin W

the set S by an array of latitude and longitude values. nometric functions translate its latitude and longitude
The nearest neighbor to each point in the sequence was into x, y and z coordinates, and we then compute its
found by calculating its distance to every point in S distance to every point in S. Its distance to a point in S
using a complicated trigonometric formula involving 10 is computed as the sum of the squares of the differences
sine and cosine functions. While the program was sim- in the three dimensions, which is usually cheaper than
ple to code and produced fine maps for small data sets, computing one trigonometric function, let alone 10.
it required several hours of mainframe time to produce This method computes the correct answer because the
large maps, which was well beyond the budget. angle between two points increases monotonically with
Because I had previously worked on geometric prob- the square of their Euclidean distance.
lems, Wright asked me to try my hand at this problem. Although this approach does require additional stor-
After spending much of a weekend on it, I developed age, it yields substantial benefits. When Wright incor-
several fancy algorithms and data structures for solving porated the change into her program, the run time for
it. Fortunately (in retrospect), each would have re- complicated maps was reduced from several hours to
quired many hundreds of lines of code, so I didn't try to less than half a minute. In this case, code tuning solved
program any of them. When I described the data struc- the problem with a couple of dozen lines of code, while
tures to Andrew Appel of Carnegie-Mellon University, algorithmic and data structure changes would have re-
he had a key insight: rather than approaching the prob- quired many hundreds of lines.
lem at the level of data structures, why not use the
simple data structure of keeping the points in an array, Major Surgery--Binary Search
but tune the code to reduce the cost of computing the We'll turn from first aid to major surgery on a program
distance between points. Any ideas? that is one of the most subtle examples I know of code
The cost can be greatly reduced by changing the rep- tuning. The problem was given in the December 1983
resentation of points: rather than using latitudes and column: perform a binary search in a table of 1,O0O
longitudes, we'll represent a point's location on the sur- integers. As we go through this exercise, keep in mind
face of the globe by its x, y and z coordinates. Thus the that code tuning is usually not needed in this pro-
data structure is an array that holds each point's lati- gram--the binary search algorithm is so efficient that
tude and longitude (which may be needed for other code tuning is often superfluous. Therefore, in the De-
operations) as well as its three Cartesian coordinates. cember 1983 column we ignored efficiency entirely (at
As each point in the sequence is processed, a few trigo- the level of code tuning) and concentrated on achieving

The History of Binary Search studied it was effective for N less than a dozen (see his
Several readers of the September 1983 column pointed Exercises 11 and 23 and pp. 127-129 of Writing Efficient
out that the performance of its binary search could be Programs).
improved. David Gries of Cornell and Ed Cohen of Even though it is not particularly well suited to im-
Prime Computer, Inc. both observed that the three-way plementation on Knuth's MIX computer, the loop-un-
test of that code could be replaced by the two-way test rolled binary search program has been used quite prof-
in code like the first new binary search in this column. itably in many different contexts. I first learned of the
Denis Wilson of the University of Aberdeen wrote that code from Guy Steele of Tartan Laboratories: he reports
in the early 1970s he and the late Iain Houstoun were that it was widely used in the ITS operating system on
able to remove an increment or decrement from each a Digital Equipment Corporation PDP-10 at MIT in the
iteration of the loop in the September column by per- late 1960s (each comparison compiled into two instruc-
forming a single extra division. tions, one-and-a-half of which were executed on the
Knuth describes many techniques for increasing the average). I later found that Vic Vyssotsky of AT&T Bell
performance of binary search programs in Section 6.2.1 Laboratories had used exactly this code on an IBM 7090
of his Sorting and Searching, including all the techniques in 1961 to achieve a binary search in which each com-
I used in this column. His techniques include changes parison compiled into three machine instructions.
at the source code level as well as methods for efficient I'd like to find out more from the readers of Communi-
assembly coding. Knuth found that replacing the ter- cations about the history of efficient binary search pro-
nary comparison by a binary comparison was effective grams. Could you please communicate to me any his-
in MIX only for very large values of N he cites a tory or efficiency techniques you know beyond those in
number over 64 billion--while on some machines I Section 6.2.1 of Knuth's third volume?

February 1984 Volume27 Number 2 Communications of the ACM 93

Programming Pearls

a simple, correct, and maintainable program. Every ment of X in each iteration of the loop. The previous
now and then, though, the tuned version can make a program sometimes had to test two such outcomes,
big difference in a system. The next version of the program uses a different rep-
We'll develop the fast binary search in a sequence of resentation of a range: instead of representing L.. U by
four programs. They are subtle, but there is good reason its lower and upper values, we'll represent it by its
to follow them closely: the final program is usually two lower value L and an increment I such that L + I = U.
or three times faster than the binary search we saw in The code wiI1 ensure that I is at all times a power of
the December 1983 column, which was two; this property is easy to keep once we have it, but it
is hard to get originally (recall that the size of the array
L:=I ; U : = N
is N = 1000). The main body of the program is there-
loop
fore preceded by an assignment and i f statement to
/~ i n v a r i a n t : if T is in X,
ensure that the range being searched is initially of size
it is in X [ L . . U ] ~/
512, the largest power of two less than 1000; thus L and
if L > U t h e n
L + I are either 0..512 or 489..1001. Translating the
P:=0; break
previous program to this new representation of a range
M := (L+U) d i v 2
yields the following code.
case
X [ M ] < T: L:=M+I I:=512
X[M] = T: P:=M; break if X [ 5 1 2 ] > = T then
X [ M ] > T: U:=M-I L:=0
endloop else
L:=I000+I-512
Before reading on, can you spot any obvious waste in
w h i l e I~I do
that code?
/$ i n v a r i a n t : X[L]<T
Our development of the fast binary search will start
a n d X [ L + I ]>=T
with the modified problem of locating the first occur-
and I=2~j ~/
rence of the integer T in the integer array X[1.. N] (the
N e x t I := I d i v 2
above code might return any one of multiple occur-
if X [ L N e x t I ] < T then
rences of T). The main loop of the program is similar to
L :=L+NextI ; I :=NextI
the one above; we'll keep indices L and U into the array
else
that bracket the position of T, but we'll use the invar-
I := N e x t I
iant relation that X[L] < T, X[U] ~_ T and L < U. We'll
/~ a s s e r t I=I a n d X [ L ] < T
assume that X[0] < T and that X[N + 1] ~ T, but the
and X[L+I]>=T ~/
program will never access those extreme elements.
P:=L+I
Starting with that invariant, the first program is easy
if P > I 0 0 0 or X [ P ] ~ T t h e n P : = 0
to write.
The correctness proof of this program has exactly the
L:=0; U:=N+I
same flow as the proof of the previous program. This
w h i l e L+I # U do
code is usually slower than its predecessor, but it opens
/~ i n v a r i a n t : X[L]<T and X[U]>=T
the door for future speedups.
a n d L < U ~/
The next program is a simplification of the above,
M := (L+U) d i v 2
incorporating some optimizations that a smart compiler
if X [ M ] < T t h e n
might perform. The first i f statement is simplified, the
L:=M
variable NextI is removed, and the assignments to NextI
else
are removed from the inner i f statement.
U:=M
/~ a s s e r t L + I = U a n d X [ L ] < T I:=512; L:=0
and X[U]>=T ~/ if X [ 5 1 2 ] < T then L:=I000+I-512
P:=U w h i l e I#I do
if P > N or X [ P ] ~ T t h e n P : = 0 /~ i n v a r i a n t : X[L]<T
a n d X [ L + I ]>=T
The first statement initializes the invariant. As the loop
and I=2~j ~/
is repeated, the invariant is maintained by the i f state-
I:= I d i v 2
ment; it's easy to check that both branches maintain
if X [ L + I ] < T then
the invariant. Upon termination we know that if T is
L := L + I
anywhere in the array, then its first occurrence is in
/~ a s s e r t I=I a n d X [ L ] < T
position U (that fact is stated more formally in the as-
and X[L+I]>=T ~/
sert comment). The final two statements set P to the
P := L + I
index of the first occurrence of T in X if it is present,
if P > I 0 0 0 or X [ P ] ~ T t h e n P : = 0
and to zero if it is not present.
While this binary search solves a more difficult prob- Although the correctness argument for the code (still)
lem than the previous program, it is potentially more has exactly the same structure, we can now understand
efficient: it makes only one comparison of T to an ele- its operation on a more intuitive level as well. When

94 Communications of the ACM February 1984 Volume 27 Number 2

Programming Pearls

the first test fails (that is, when L stays zero), the pro- Save concern for efficiency for when it matters.
gram computes the bits of P in left-to-right order, most Profiling. When efficiency is important, the first step
significant bit first. is to profile the system to find out where the processor
The final version of the code is not for the faint of spends most of its time. This activity usually shows that
heart. It removes the overhead of loop control and the most of the time is going to a few hot spots and that the
division of I by two by unrolling the entire loop. In rest of the code is almost never executed. Profiling
other words, because I assumes only a few distinct val- points to the critical areas; for the rest of the program
ues in this particular problem, we can write them all we follow the wise m a x i m of, "If it ain't broke, don't fix
down in the program, and thereby avoid computing it."
them over and over again at run time. Design Levels. When the run time of the critical mod-
ules must be reduced, there are usually several ways to
L:=0
proceed. Sometimes it is fruitful to take the high-level
if X[512]<T then L:=I000+I-512
approaches of changing problem definition, algorithms,
/* assert X[L]<T and X[L+512]>=T ~/
or data structures. Sometimes it is most effective to
if X[L+256]<T then L:=L+256
work at a lower level and write in assembly language
/~ assert X[L]<T and X[L+256]>=T */
or microcode or even build special-purpose hardware.
if X[L+I28]<T then L:=L+128
Before we start to tune code, we should make sure that
if X[L+64]<T then L:=L+64
other approaches don't provide a more effective solu-
if X[L+32]<T then L:=L+32
tion.
if X[L+I6]<T then L:=L+16
The above discussion considers w h e t h e r and when to
if X[L+8]<T then L:=L+8
tune code; once we decide to do so, that still leaves the
if X[L+4]<T then L:=L+4
question of how. In my book Writing Efficient Programs I
if X[L+2]<T then L:=L+2
tried to answer that question with a list of general rules
/~ assert X[L]<T and X[L+2]>=T */
for code tuning. All the examples we've seen can be
if X[L+I]<T then L:=L+I
explained in terms of those principles; I'll do that now
/~ assert X[L]<T and X[L+I]>=T ~/
with the names of the rules in italics.
P:=L+I
if P>I000 or X[P]#T then P:=0 Van Wyk's Drawing Program. The general strategy of
Van Wyk's solution was to Exploit Common Cases; his
The correctness of this code is easy to prove by insert- particular exploitation involved caching a list of the
ing the complete string of assertions like those sur- most common kind of record.
rounding the test of X[L + 256]; once you do the two-
Problem O n e - - C h a r a c t e r Classification. The solution
case analysis to see how that i f statement behaves, all
with a table indexed by a character Precomputes A
the other i f statements fall into line.
Logical Function.
I've compared the binary search of the December
1983 column with this fine-tuned binary search, and Problem T w o - - C o u n t i n g Bits. The table of byte
the results are consistent across several machines and counts is closely related to the previous solution; it
languages: the tuned search is usually between two and Stores Precomputed Results.
three times faster. Sometimes the results can be even Problem T h r e e - - C o m p u t i n g Spherical Distances.
more dramatic: on machines that skip after a successful Storing Cartesian coordinates along with latitudes
comparison, smart compilers can produce assembly and longitudes is an example of Data Structure Aug-
code for the tuned search that is five times faster than mentation; using the cheaper Euclidean distance
the original code. rather than the angular distance Exploits An Algebraic
This derivation is an idealized account of code tuning Identity.
at its most extreme. We replaced the obvious binary Binary Search. Combining Tests reduced the number
search program (which doesn't appear to have much fat of array comparisons per inner loop from two to one,
on it) with a super-lean version that is several times Exploiting An Algebraic Identity changed representa-
faster. The program verification tools of the December tions from a lower and upper bound to a lower bound
1983 column played a crucial role in the task. Because and an increment, and Loop Unrolling expanded the
we used them, we can believe that the final program is program to remove all loop overhead.
correct; when I first saw the final code presented with-
So far we've looked upon code tuning as a means to
out verification, I looked upon it as magic for months.
reduce run time. One can also tune code for other pur-
poses, such as reducing paging or increasing a cache hit
Principles ratio. Perhaps the most common use of code tuning
The most important principle about code tuning is that
(other than reducing run time) is to reduce the space
it should be done rarely. That sweeping generalization
required by a program. Problem 4 gives a taste of that
is explained by the following considerations.
endeavor, but the big picture of space reduction will
The Role of Efficiency. Other properties of software are have to wait for a column of its own.
as important as efficiency, if not more so. Don Knuth
has observed that premature optimization is the root of Problems
much programming evil; it can compromise the correct- 1. The character classification program in the text as-
ness, functionality, and maintainability of programs. sumed that the character classes were disjoint. How

February 1984 Volume 27 Number 2 Communications of the ACM 95

ProgrammingPearls

would you write a routine to test membership in your machine when both programs are equally
overlapping character classes (such as lower case let- tuned? Is it a function of the level of tuning?
ters, upper case letters, letters, digits, and alphanu-
merics)? 7. D. B. Lomet of IBM Watson Research Center ob-
serves that hashing may solve the 10OO-integer
2. Write a code fragment that given N (a power of two), search problem more efficiently than the tuned bi-
initializes CountTable[O.. N - 1] as described in the nary search. Implement a fast hashing program and
text. compare it to the tuned binary search; how do they
compare in terms of speed and space?
3. How do the various binary search algorithms behave
if they are (against specification) applied to unsorted
arrays? Solutions to Problems in the December 1983 Column
1. To show that overflow does not occur we add the
4. In the very early days of programming, Fred Brooks conditions 1 <__L ~ N + 1 and O <__U ~ N to the
(then of Harvard, now of the University of North invariant; we can then bound L + U. The same con-
Carolina) faced the problem of representing a large ditions can be used to show that elements outside
table on a small computer. He couldn't store the the array bounds are never accessed. We can for-
entire table in an array because there was room for mally define MustBe(L, U) as X[L - 1] < T and
only a few bits for each table entry (actually, there X[U + 1] > T if we define the boundary elements
was one decimal digit available for each entry--I X[0] and X[N + 1] as we did in this column.
said that it was in the early days!). His second ap-
proach was to use numerical analysis to fit a func- 2, 6. A binary search for the first occurrence of a value
tion through the table. Although that resulted in a and a fast binary search were both given in this
column.
function that was quite close to the true table (no
3. This problem called for verifying your own pro-
entry was more than a couple of units off the true
grams.
entry) and required an unnoticeably small amount
4. The process always removes either zero or two
of memory, legal constraints meant that the approxi-
white beans from the coffee can, so it leaves invar-
mation wasn't good enough. How could Brooks get
iant the parity (oddness or evenness) of the number
the required accuracy in the limited space?
of white beans. Thus the last remaining bean is
5. A common example of code tuning speeds up se- white if and only if an odd number of white beans
quential search in an array by placing the element were in the can originally. (David Gries attributes
being searched for in a "sentinel" position at the end this problem to Carel Scholten.)
of the array. This eliminates the test of whether the
5. Because the line segments that form the rungs of the
current index is within array bounds, and typically
ladder are in increasing y-order, the ones that
reduces the run time of a sequential search program
bracket a given point can be found by binary search.
by 20 or 30 percent. How can sentinels decrease the
The basic comparison in the search tells whether the
search time in tables represented by linked lists,
point is below, on, or above a given line segment.
hash tables, or binary search trees?
The next two problems call for implementing pro-
grams to determine when one approach becomes more
efficient than another. Please let me know the results of Further Reading
your implementations; they will be reported in a later The only book I know that is devoted entirely to code
column. tuning is my own Writing Efficient Programs, published
by Prentice-Hall in 1982. The heart of the book is a 50
6. Because sequential search is simpler than binary
page discussion of the efficiency rules I mentioned
search, it is usually more efficient for small tables.
above; each is stated in general terms and then illus-
On the other hand, the logarithmic number of com-
trated by application to small code fragments and by
parisons made by binary search implies that it will
"war stories" of its application in real systems. Other
be faster than the linear time of sequential search
parts of the book discuss the role of efficiency in soft-
for large tables. The break-even point is a function of
ware systems and the application of the rules to several
how much each program is tuned. How low (or high)
important subroutines.
can you make that break-even point? What is it on

96 Communicationsof the ACM February 1984 Volume27 Number 2

417176
No ratings yet
417176
204 pages
Jon Louis Bentley - More Programming Pearls - Confessions of A Coder (1988)
No ratings yet
Jon Louis Bentley - More Programming Pearls - Confessions of A Coder (1988)
204 pages
Bentley J. More Programming Pearls (Addison-Wesley, 1988) (T) (204s) - CsAl
No ratings yet
Bentley J. More Programming Pearls (Addison-Wesley, 1988) (T) (204s) - CsAl
204 pages
Data Structures & Algorithms Intro
No ratings yet
Data Structures & Algorithms Intro
5 pages
Introduction To Programming Latest
No ratings yet
Introduction To Programming Latest
23 pages
C Program Optimization Guide
No ratings yet
C Program Optimization Guide
2 pages
9210 International Gcse Computer Science Question Paper 2 v1.0
No ratings yet
9210 International Gcse Computer Science Question Paper 2 v1.0
21 pages
Optimization of Computer Programs in C
No ratings yet
Optimization of Computer Programs in C
37 pages
Lab 1 - Intro
No ratings yet
Lab 1 - Intro
36 pages
Algorithm and Computer Science Terms
No ratings yet
Algorithm and Computer Science Terms
8 pages
Software Development Units (BooksRack - Net)
No ratings yet
Software Development Units (BooksRack - Net)
174 pages
Caie Igcse Computer Science 0478 Practical v1
No ratings yet
Caie Igcse Computer Science 0478 Practical v1
29 pages
Programming Abstractions in C++
100% (2)
Programming Abstractions in C++
682 pages
Flow-Based Programming
No ratings yet
Flow-Based Programming
251 pages
Quiz Game: Information Technology
No ratings yet
Quiz Game: Information Technology
83 pages
Mackenzie CodedCharSets
No ratings yet
Mackenzie CodedCharSets
535 pages
Problem Solving
No ratings yet
Problem Solving
6 pages
Halstead 1979
No ratings yet
Halstead 1979
54 pages
AP Computer Science Principles-EXAM1-5 Steps To A 5-MCQ-ANSWERS
No ratings yet
AP Computer Science Principles-EXAM1-5 Steps To A 5-MCQ-ANSWERS
12 pages
Problem Solving & Algorithm Notes
No ratings yet
Problem Solving & Algorithm Notes
33 pages
Softw Pract Exp - 2013 - Lemire - Decoding Billions of Integers Per Second Through Vectorization
No ratings yet
Softw Pract Exp - 2013 - Lemire - Decoding Billions of Integers Per Second Through Vectorization
29 pages
Unit 2 Notes
No ratings yet
Unit 2 Notes
12 pages
Java Course Notes CS3114 - 09212011
No ratings yet
Java Course Notes CS3114 - 09212011
344 pages
ELE 276 Note2025 Final
No ratings yet
ELE 276 Note2025 Final
88 pages
Problem Solving and Program Design
No ratings yet
Problem Solving and Program Design
30 pages
14 Tools: 14.1 The "Code Coverage" Tool
No ratings yet
14 Tools: 14.1 The "Code Coverage" Tool
6 pages
CS3114 09212011
No ratings yet
CS3114 09212011
344 pages
Form and Content in Computer Science: 1 9 6 9 Turing Award
No ratings yet
Form and Content in Computer Science: 1 9 6 9 Turing Award
24 pages
CS Revision Booklet 2
No ratings yet
CS Revision Booklet 2
13 pages
Unit 1
No ratings yet
Unit 1
13 pages
Unit 2
No ratings yet
Unit 2
8 pages
Introduction To Programming in Java An Interdisciplinary Approach Second Edition 9780134512396 0134512391 Compress Pages
No ratings yet
Introduction To Programming in Java An Interdisciplinary Approach Second Edition 9780134512396 0134512391 Compress Pages
16 pages
Software Optimization Techniques
No ratings yet
Software Optimization Techniques
49 pages
Structnotes
No ratings yet
Structnotes
196 pages
1St Unit Aop
No ratings yet
1St Unit Aop
19 pages
Rajib Mall Pose
0% (1)
Rajib Mall Pose
2,358 pages
Basic Programming Principles (2nd Edition)
No ratings yet
Basic Programming Principles (2nd Edition)
458 pages
PIC - Module 1 - Part 1
No ratings yet
PIC - Module 1 - Part 1
12 pages
DAA Notes
No ratings yet
DAA Notes
173 pages
Unit7. Algorithm Design and Problem Solving Revision Notes IXC
No ratings yet
Unit7. Algorithm Design and Problem Solving Revision Notes IXC
17 pages
(Ebook PDF) Programming, Problem Solving and Abstraction With C
No ratings yet
(Ebook PDF) Programming, Problem Solving and Abstraction With C
42 pages
Data Structures
No ratings yet
Data Structures
225 pages
I. Types of Data
No ratings yet
I. Types of Data
33 pages
TMI-Sample Session Plan
No ratings yet
TMI-Sample Session Plan
3 pages
Computer Systems Servicing: Self-Learning Module 3
No ratings yet
Computer Systems Servicing: Self-Learning Module 3
8 pages
Product Data Sheet: Tesys LRD Thermal Overload Relays - 30... 40 A - Class 10A
No ratings yet
Product Data Sheet: Tesys LRD Thermal Overload Relays - 30... 40 A - Class 10A
2 pages
Part Xe Nâng - Geni S85 PDF
No ratings yet
Part Xe Nâng - Geni S85 PDF
364 pages
Effects of Whatsapp Media On Customers Patronage of Fashionister Services
No ratings yet
Effects of Whatsapp Media On Customers Patronage of Fashionister Services
67 pages
Turning Buzz Into Gold
No ratings yet
Turning Buzz Into Gold
26 pages
Devicenet: Leoni Special Cables GMBH
No ratings yet
Devicenet: Leoni Special Cables GMBH
2 pages
Post-Tensioned Girder
100% (1)
Post-Tensioned Girder
30 pages
Erm4004-1 Series: Eastrising Technology Co., Limited
No ratings yet
Erm4004-1 Series: Eastrising Technology Co., Limited
20 pages
Hon EMEA14 Enhanced TPS Upgrade Solutions
No ratings yet
Hon EMEA14 Enhanced TPS Upgrade Solutions
22 pages
23SC1101-course Handout
No ratings yet
23SC1101-course Handout
32 pages
Controlador XC2003
No ratings yet
Controlador XC2003
58 pages
2024 CF Moto - 450sr S - SM
100% (3)
2024 CF Moto - 450sr S - SM
219 pages
Todoist Setup for Beginners
No ratings yet
Todoist Setup for Beginners
14 pages
Android App Development Test
No ratings yet
Android App Development Test
2 pages
Grade 12 Mil q1 Module 1 SHSPH
No ratings yet
Grade 12 Mil q1 Module 1 SHSPH
21 pages
Telehealth Access in Nepal Pandemic
No ratings yet
Telehealth Access in Nepal Pandemic
124 pages
Synopsis Final
No ratings yet
Synopsis Final
15 pages
Chinon Ce 4 Text Images
No ratings yet
Chinon Ce 4 Text Images
48 pages
Wiring Diagram DSE 7320 AMF
100% (1)
Wiring Diagram DSE 7320 AMF
1 page
Multiattribute Networks in CBF Systems
No ratings yet
Multiattribute Networks in CBF Systems
9 pages
Alternative Bom
No ratings yet
Alternative Bom
16 pages
Takeuchi Tb225 Parts Manual
No ratings yet
Takeuchi Tb225 Parts Manual
338 pages
G8 - 2nd Quarterly Exam
No ratings yet
G8 - 2nd Quarterly Exam
3 pages
CRM Strategies for Marketers
No ratings yet
CRM Strategies for Marketers
25 pages
Smart Meter - Wireless - Iot - Wifi - GSM Network
No ratings yet
Smart Meter - Wireless - Iot - Wifi - GSM Network
3 pages
Manual Home Theater Panasonic SC-HT680
No ratings yet
Manual Home Theater Panasonic SC-HT680
32 pages
Mould Steel for Injection Moulding
No ratings yet
Mould Steel for Injection Moulding
1 page
CC 11
No ratings yet
CC 11
3 pages
Photoshop CC 2018 Key Features Guide
No ratings yet
Photoshop CC 2018 Key Features Guide
8 pages

Code Tuning

Uploaded by

Code Tuning

Uploaded by

programming

by Jon Bentley pears

February 1984 Volume 27 Number 2 Communications of the ACM 91

for I:=0 to 255 do T y p e T a b l e [ I ] := O t h e r

FIGURE 1. Initialize Typetable

Margaret Wright of Stanford University encountered a

92 Communicationsof the ACM February 1984 Volume 27 Number 2

FIGURE2. Countthe One-Bitsin W

February 1984 Volume27 Number 2 Communications of the ACM 93

94 Communications of the ACM February 1984 Volume 27 Number 2

February 1984 Volume 27 Number 2 Communications of the ACM 95

96 Communicationsof the ACM February 1984 Volume27 Number 2

You might also like