Optimizing Selective Search in Chess
Omid David-Tabibi mail@omiddavid.com
Department of Computer Science, Bar-Ilan University, Ramat-Gan 52900, Israel
Moshe Koppel koppel@cs.biu.ac.il
Department of Computer Science, Bar-Ilan University, Ramat-Gan 52900, Israel
Nathan S. Netanyahu nathan@cs.biu.ac.il
arXiv:1009.0550v1 [cs.AI] 2 Sep 2010
Department of Computer Science, Bar-Ilan University, Ramat-Gan 52900, Israel, and Center for Automation
Research, University of Maryland, College Park, MD 20742
Abstract searchers, and started their own reign which has con-
tinued ever since; they have won all World Computer
In this paper we introduce a novel method for Chess Championships since 1992. Deep Blue (Ham-
automatically tuning the search parameters milton and Garber, 1997; Hsu, 1999) was probably the
of a chess program using genetic algorithms. last brute-force searcher.
Our results show that a large set of parame-
ter values can be learned automatically, such Nowadays, top tournament-playing programs use a
that the resulting performance is comparable range of methods for adding selectivity to their search.
with that of manually tuned parameters of The most popular methods include null-move prun-
top tournament-playing chess programs. ing, futility pruning (Heinz, 1998), multi-cut pruning
(Björnsson and Marsland, 1998; Björnsson and Mars-
land, 2001), and selective extensions (Anantharaman,
1991; Beal and Smith, 1995). For each of these meth-
1. Introduction ods, a wide range of parameter values can be set. For
Until the mid-1970s most chess programs attempted to example, different reduction values can be used for
perform search by mimicking the way humans think, null-move pruning, various thresholds can be used for
i.e., by generating “plausible” moves. By using exten- futility pruning, etc.
sive chess knowledge, these programs selected at each For each chess program, the parameter values for
node a few moves which they considered plausible, various selective search methods are manually tuned
thereby pruning large parts of the search tree. How- through years of experiments and manual optimiza-
ever, as soon as brute-force search programs like Tech tions. In this paper we introduce a novel method for
(Gillogly, 1972) and Chess 4.x (Slate and Atkin, automatically tuning the search parameters of a chess
1983) managed to reach depths of 5 plies and more, program using genetic algorithms (GA).
plausible move generating programs frequently lost to
these brute-force searchers due to their significant tac- In the following section, we review briefly the main
tical weaknesses. Brute-force searchers rapidly domi- methods that have been used for selective search. For
nated the computer chess field. each of these methods, we enumerate the parameters
that need to be optimized. Section 3 provides a re-
The introduction of null-move pruning (Beal, 1989; view of past attempts at automatic learning of various
Donninger, 1993) in the early 1990s marked the end parameters in chess. In Section 4 we present our au-
of an era, as far as the domination of brute-force pro- tomatic method of optimizing the parameters in ques-
grams in computer chess is concerned. Unlike other tion, which is based on the use of genetic algorithms,
forward-pruning methods which had great tactical and in Section 5 we provide experimental results. Sec-
weaknesses, null-move pruning enabled programs to tion 6 contains concluding remarks.
search more deeply with minor tactical risks. Forward-
pruning programs frequently outsearched brute-force
2. Selective Search in Chess
Appearing in Proceedings of the ICML Workshop on Ma- In this section we review several popular methods for
chine Learning and Games. Copyright 2010 by the au-
thor(s)/owner(s). selective search. All these methods work within the al-
Optimizing Selective Search in Chess
phabeta/PVS framework and introduce selectivity in rather than a fixed value could be selected for the re-
various forms. A simple alphabeta search requires the duction factor. By using R = 3 in upper parts of the
search tree to be developed to a fixed depth in each it- search tree and R = 2 in its lower parts (close to the
eration. Forward pruning methods, such as null-move leaves) pruning can be achieved at a smaller cost (as
pruning, futility pruning, and multi-cut pruning, en- null-move searches will be shallower in comparison to
able the program to prune some parts of the tree at using a fixed reduction value of R = 2) while maintain-
an earlier stage, and devote the time gained to other, ing the overall tactical strength. An in-depth review
more promising parts of the search tree. of null-move pruning and our extended null-move re-
ductions improvement can be found in (David-Tabibi
Selective extensions, on the other hand, extend cer-
and Netanyahu, 2008b).
tain parts of the tree to be searched deeper, due to
tactical considerations associated with a position in Over the years many variations of null-move pruning
question. The following subsections briefly cover each have been suggested, but the set of key parameters to
of these pruning and extension methods, and specify be determined has remained the same. These param-
which parameters should be tuned for each method. eters are: (1) the reduction value R, (2) the Boolean
adaptivity variable, and (3) the adaptivity depth for
2.1. Null-Move Pruning which the decremented value of R is applied.
Null-move pruning (Beal, 1989; David-Tabibi and Ne- 2.2. Futility Pruning
tanyahu, 2008b; Donninger, 1993) is based on the as-
sumption that “doing nothing” in every chess position Futility pruning and extended futility pruning (Heinz,
(i.e., doing a null-move) is not the best choice even if 1998) suggest pruning nodes near a leaf where the sum
it were a legal option. In other words, the best move of the current static evaluation value and some thresh-
in any position has to be better than the null-move. old (e.g., the value of a knight) is smaller than α. In
This assumption enables the program to establish a these positions, assuming that the value gained in the
lower bound α on the position by conducting a null- remaining moves until reaching the leaf is not greater
move search. The idea is to make a null-move, i.e., than the threshold, it is safe to assume that the po-
merely swap the side whose turn it is to move. (Note sition is “weak enough”, i.e., that it is worth pruning
that this cannot be done in positions where the side (as its score will not be greater than α). Naturally,
to move is in check, since the resulting position would the larger the threshold, the safer it is to apply futility
be illegal. Also, two null-moves in a row are forbid- pruning, although fewer nodes will be pruned.
den, since they result in nothing.) A regular search is
The main parameters to be set for futility pruning are:
then conducted with reduced depth R. The returned
(1) the futility depth and (2) the futility thresholds for
value of this search can be treated as a lower bound
various depths (usually up to a depth of 3 plies).
on the position’s strength, since the value of the best
(legal) move has to be better than that obtained from
the null-move search. In a negamax framework, if the 2.3. Multi-Cut Pruning
returned value is greater than or equal to the current Björnsson and Marsland’s multi-cut pruning (1998;
upper bound (i.e., value ≥ β), it results in a cutoff 2001) suggests searching the moves at a given posi-
(fail-high). Otherwise, if the value is greater than the tion to a shallower depth first, such that if several of
current lower bound (i.e., α < value ≤ β), we define them result in a cutoff, the current node is pruned
a narrower search window, as the returned value be- without conducting a full depth search. The idea is
comes the new lower bound. If the value is smaller that if there are several moves that produce a cutoff
than the current lower bound, it does not contribute at a shallower depth, there is a high likelihood that at
to the search in any way. The main benefit of the least one of them will produce a cutoff if searched to
null-move concept is the pruning obtained due to the a full depth. In order to apply multi-cut pruning only
cutoffs, which take place whenever the returned value to potentially promising nodes, it is applied only to
of the null-move search is greater than the current up- cut-nodes (i.e., nodes at which a cutoff has occurred
per bound. Thus, the best way to apply null-move previously, according to a hash table indication).
pruning is by conducting a minimal-window null-move
search around the current upper bound β, since such a The primary parameters that should be set in multi-
search will require a reduced search effort to determine cut pruning are: (1) the depth reduction value, (2) the
if a cutoff takes place. depth for which multi-cut is applied, (3) the number
of moves to search, and (4) the number of cutoffs to
Donninger (1993) was the first to suggest an adap- require.
tive rather than a fixed value for R. Experiments
conducted by Heinz in his article on adaptive null-
move pruning (1999) showed that, indeed, an adaptive
Optimizing Selective Search in Chess
2.4. Selective Extensions Temporal difference learning has been successfully ap-
plied in backgammon and checkers (Schaeffer, Hlynka,
Selective extensions (Anantharaman, 1991; Beal and and Jussila, 2001; Tesauro, 1992). Although the lat-
Smith, 1995) are used for extending potentially critical ter has also been applied to chess (Baxter, Tridgell,
moves to be searched deeper. The following is a list of and Weaver, 2000), the results show that after three
major extensions used in most programs: days of learning, the playing strength of the program
Check extension: Extend the move if it checks the was only 2150 Elo, which is a very low rating for a
opponent’s king. chess program. Block et al. (2008) reported that using
One-reply extension: Extend the move if it is the reinforcement learning, their chess program achieves
only legal move. a playing strength of only 2016 Elo. Veness et al.’s
Recapture extension: Extend the move if it is a (2009) work on bootstrapping from game tree search
recapture of a piece captured by the opponent (such improved upon previous work, but their resulting chess
moves are usually forced). program reached a performance of between 2154 to
Passed pawn extension: Extend the move if it in- 2338 Elo, which is still considered a very low rating
volves moving a passed pawn (usually to 7th rank). for a chess program. Kocsis and Szepesvári’s (2006)
Mate threat extension: Extend the move if the work on universal parameter optimization in games
null-move search returns a mate score (the idea is that based on SPSA does not provide any implementation
if doing a null-move results in being checkmated, a for chess.
potential danger lies at the horizon, so we extend the Björnsson and Marsland (2002) presented a method
search to find the threat). for automatically tuning search extensions in chess.
For each of the above extensions, fractional extensions Given a set of test positions (for which the correct
have been widely employed. These are implemented move is predetermined) and a set of parameters to be
usually by defining one ply to be a number greater than optimized (in their case, four extension parameters),
one (e.g., 1 ply = 4 units), such that several fractional they tune the values of the parameters using gradient-
extensions along a line (i.e., a series of moves from the descent optimization. Their program processes all the
root to a leaf) cause a full ply extension. For example, positions and records, for each position, the number
if a certain extension is defined as half a ply, two such of nodes visited before the solution is found. The goal
extensions must occur along a line in order to result in is to minimize the total node count over all the po-
an actual full ply extension. For each extension type, sitions. In each iteration of the optimization process,
a value is defined (e.g., assuming that 1 ply = 4 units, their method modifies each of the extension parame-
an extension has a value between 0 to 4). ters by a small value, and records the total node count
over all the positions. Thus, given N parameters to
From this brief overview of selective search, there are a optimize (e.g., N = 4), their method processes in each
number of parameters for each method which have to iteration all the positions N times. The parameter val-
be set and tuned. Currently, top tournament-playing ues are updated after each iteration, so as to minimize
programs use manually tuned values which take years the total node count. Björnsson and Marsland ap-
of trial and improvement to fine tune. In the next sec- plied their method for tuning the parameter values of
tion we review the limited success of past attempts at the four search extensions: check, passed pawn, recap-
automatic learning of the values of these search pa- ture, and one-reply extensions. Their results showed
rameters, and in Section 4 we present our GA-based that their method optimizes fractional ply values for
method for doing so. the above parameters, as the total node count for solv-
ing the test set is decreased.
3. Automatic Tuning of Search Despite the success of this gradient-descent method for
Parameters tuning the parameter values of the above four search
The selective search methods covered in the previ- extensions, it is difficult to use it efficiently for optimiz-
ous section are employed by most of the current top ing a considerably larger set of parameters, which con-
tournament-playing chess programs. They use manu- sists of all the selective search parameters mentioned
ally tuned parameter values that were arrived at after in the previous section. This difficulty is due to the
years of experiments and manual optimizations. fact that unlike the optimization of search extensions
for which the parameter values are mostly indepen-
Past attempts at automatic optimization of search pa- dent, other search methods (e.g., multi-cut pruning)
rameters have resulted in limited success. Moriarty are prone to a high interdependency between the pa-
and Miikkulainen (1994) used neural networks for tun- rameter values, resulting in multiple local maxima in
ing the search parameters of an Othello program, but the search space, in which case it is more difficult to
as they mention in their paper, their method is not apply gradient-descent optimization.
easily applicable to more complex games such as chess.
Optimizing Selective Search in Chess
In the next section we present our method for auto- poses. Each of these test positions has a predeter-
matically tuning all the search parameters mentioned mined “correct move”, which the program has to find.
in the previous section by using genetic algorithms. In each generation, each organism searches all the 879
test positions and receives a fitness score based on its
performance. As noted, instead of using the number of
4. Genetic Algorithms for Tuning of
solved positions as a fitness score, we take the number
Search Parameters of nodes the organism visits before finding the cor-
In David-Tabibi et al. (2008a; 2009; 2010) we showed rect move. We record this parameter for each position
that genetic algorithms (GA) can be used to efficiently and compute the total node count for each organism
evolve the parameter values of a chess program’s eval- over the 879 positions. Since the search cannot con-
uation function. Here we present a GA-based method tinue endlessly for each position, a maximum limit of
for optimizing a program’s search parameters. We first 500,000 nodes per position is imposed. If the organ-
describe how the search parameters are represented as ism does not find the correct move when reaching this
a chromosome, and then discuss the details of the fit- maximum node count for the position, the search is
ness function. stopped and the node count for the position is set to
500,000. Naturally, the higher the maximum limit, the
The parameters of the selective search methods which larger the number of solved positions. However, more
were covered in Section 2 can be represented as a bi- time will be spent on each position and subsequently,
nary chromosome, where the number of allocated bits the whole evolution process will take more time.
for each parameter is based on a reasonable value range
of the parameter. Table 1 presents the chromosome The fitness of the organism will be inversely propor-
and the range of values for each parameter (see Sec- tionate to its total node count for all the positions. Us-
tion 2 for a description of each parameter). Note that ing this fitness value rather than the number of solved
for search extensions fractional ply is applied, where 1 positions has the benefit of deriving more fitness in-
ply = 4 units (e.g., an extension value of 2 is equivalent formation per position. Rather than obtaining a 1-bit
to half a ply, etc.). information for solving the position, a numeric value
is obtained which also measures how quickly the posi-
Parameter Value range Bits tion is solved. Thus, the organism is not only “encour-
aged” to solve more positions, it is rewarded for finding
Null-move use 0–1 1
quicker solutions for the already solved test positions.
Null-move reduction 0–7 3
Null-move use adaptivity 0–1 1 Other than the special fitness function described
Null-move adaptivity depth 0–7 3 above, we use a standard GA implementation with
Futility depth 0–3 2 Gray coded chromosomes, fitness-proportional selec-
Futility threshold depth-1 0–1023 10 tion, uniform crossover, and elitism (the best organism
Futility threshold depth-2 0–1023 10 is copied to the next generation). All the organisms
Futility threshold depth-3 0–1023 10 are initialized with random values. The following pa-
Multi-cut use 0–1 1 rameters are used for the GA: population size = 10,
Multi-cut reduction 0–7 3 crossover rate = 0.75, mutation rate = 0.05, number
Multi-cut depth 0–7 3 of generations = 50.
Multi-cut move num 0–31 5 The next section contains the experimental results
Multi-cut cut num 0–7 3 using the GA-based method for optimization of the
Check extension 0–4 3 search parameters.
One-reply extension 0–4 3
Recapture extension 0–4 3
5. Experimental Results
Passed pawn extension 0–4 3
Mate threat extension 0–4 3 We used the Falcon chess engine in our experiments.
Total chromosome length 70 Falcon is a grandmaster-level chess program which
has successfully participated in three World Com-
puter Chess Championships. Falcon uses NegaS-
Table 1. Chromosome representation of 18 search parame-
ters (length: 70 bits).
cout/PVS search, with null-move pruning, internal
iterative deepening, dynamic move ordering (history
+ killer heuristic), multi-cut pruning, selective exten-
For the GA’s fitness function we use a similar opti-
sions (consisting of check, one-reply, mate-threat, re-
mization goal to the one used by Björnsson and Mars-
capture, and passed pawn extensions), transposition
land (2002), namely the total node count. A set of
table, and futility pruning near leaf nodes.
879 tactical test positions from the Encyclopedia of
Chess Middlegames (ECM) is used for training pur-
Optimizing Selective Search in Chess
Each organism is a copy of Falcon (i.e., has the same Match Result W% RD
evaluation function, etc.), except that its search pa- Falcon - Crafty 173.5 - 126.5 57.8% +55
rameters, encoded as a 70-bit chromosome (see Ta- Evol* - Crafty 178.5 - 121.5 59.5% +67
ble 1), are randomly initialized rather than manually Evol* - Falcon 152.5 - 147.5 51.1% +6
tuned. Evol* - RandOrg 714.0 - 286.0 71.4% +159
The results of the evolution show that the total node
count for the population average drops from 239 mil- Table 3. Falcon vs. Crafty, and Evol* vs. Crafty,
lion nodes to 206 million nodes, and the node count Falcon, and randomly initialized organisms (W% is the
for the best organism drops from 226 million nodes to winning percentage, and RD is the Elo rating difference).
199 million nodes. The number of solved positions in-
creases from 488 in the first generation to 547 in the
50th generation. For comparison, the total node count tuned one over many years of world championship level
for the 879 positions due to Björnsson and Marsland’s performance, is by itself a clear demonstration of the
optimization was 229 million nodes, and the number capabilities achieved due to the automatic evolution of
of solved positions was 508 (Björnsson and Marsland, search parameters.
2002).
The results further show that Evol* outperforms
To measure the performance of the best evolved or- Crafty, not only in terms of solving more tactical test
ganism (we call this organism Evol*), we compared positions, but more importantly in its overall strength.
it against the chess program Crafty (Hyatt, Gower, These results establish that even though the search
and Nelson, 1990). Crafty has successfully partic- parameters are evolved from scratch (with randomly
ipated in numerous World Computer Chess Champi- initialized chromosomes), the resulting organism out-
onships (WCCC), and is a direct descendent of Cray performs a grandmaster-level chess program.
Blitz, the WCCC winner of 1983 and 1986. It is fre-
quently used in the literature as a standard reference. 6. Conclusions
First, we let Evol*, Crafty, and the original manu-
In this paper we presented a novel method for au-
ally tuned Falcon process the ECM test suite with 5
tomatically tuning the search parameters of a chess
seconds per position. Table 2 provides the results. As
program. While past attempts yielded limited success
can be seen, Evol* solves significantly more problems
in tuning a small number of search parameters, the
than Crafty and a few more than Falcon.
method presented here succeeded in evolving a large
number of parameters for several search methods, in-
Evol* Falcon Crafty
cluding complicated interdependent parameters of for-
652 645 593 ward pruning search methods.
The search parameters of the Falcon chess engine,
Table 2. Number of ECM positions solved by each program
which we used for our experiments, have been man-
(time: 5 seconds per position).
ually tuned over the past eight years. The fact that
GA manages to evolve the search parameters auto-
The superior performance of Evol* on the ECM test matically, such that the resulting performance is on
set is not surprising, as it was evolved on this training par with the highly refined parameters of Falcon is
set. Therefore, in order to obtain an unbiased per- in itself remarkable.
formance comparison, we conducted a series of 300
Note that the evolved parameter sets are not nec-
matches between Evol* and Crafty, and between
essarily the best parameter sets for every chess pro-
Evol* and Falcon. In order to measure the rating
gram. Undoubtedly, running the evolutionary process
gain due to evolution, we also conducted 1,000 matches
mentioned in this paper on each chess program will
between Evol* and 10 randomly initialized organisms
yield a different set of results which are optimized
(RandOrg). Table 3 provides the results. The table
for the specific chess program. This is due to the
also contains the results of 300 matches between Fal-
fact that the performance of the search component
con and Crafty as a baseline.
of the program depends on other components as well,
The results of the matches show that the evolved pa- most importantly the evaluation function. For exam-
rameters of Evol* perform on par with those of Fal- ple, in a previous paper on extended null-move pruning
con, which have been manually tuned and refined for (David-Tabibi and Netanyahu, 2008b), we discovered
the past eight years. Note that the performance of that while the common reduction value for null-move
Falcon is by no means a theoretical upper bound pruning is R = 2 or R = 3, a more aggressive reduc-
for the performance of Evol*, and the fact that the tion value of adaptive R = 3 ∼ 4 performs better for
automatically evolved program matches the manually Falcon. It is interesting to note that our GA-based
Optimizing Selective Search in Chess
method managed to independently find that these ag- O. David-Tabibi, M. Koppel, and N.S. Netanyahu
gressive reduction values work better for Falcon. (2010). Expert-Driven Genetic Algorithms for
Simulating Evaluation Functions. Genetic Pro-
gramming and Evolvable Machines, accepted
References
for publication (online version available at
T.S. Anantharaman (1991). Extension heuristics. www.springerlink.com/content/3346t8432n718821).
ICCA Journal, 14(2):47–65.
C. Donninger (1993). Null move and deep search: Se-
J. Baxter, A. Tridgell, and L. Weaver (2000). Learning lective search heuristics for obtuse chess programs.
to play chess using temporal-differences. Machine ICCA Journal, 16(3):137–143.
Learning, 40(3):243–263.
J.J. Gillogly (1972). The technology chess program.
D.F. Beal (1989). Experiments with the null move. Ad- Artificial Intelligence, 3(1-3):145–163.
vances in Computer Chess 5, ed. D.F. Beal, pages S. Hammilton and L. Garber (1997). Deep Blue’s
65–79. Elsevier Science, Amsterdam. hardware-software synergy. IEEE Computer,
30(10):29–35.
D.F. Beal and M.C. Smith (1995). Quantification of
search extension benefits. ICCA Journal, 18(4):205– E.A. Heinz (1998). Extended futility pruning. ICCA
218. Journal, 21(2):75–83.
Y. Björnsson and T.A. Marsland (1998). Multi-cut E.A. Heinz (1999). Adaptive null-move pruning. ICCA
pruning in alpha-beta search. In Proceedings of the Journal, 22(3):123–132.
First International Conference on Computers and
Games, pages 15–24, Tsukuba, Japan, November F.-h. Hsu (1999). IBM’s Deep Blue chess grandmas-
1998. ter chips. IEEE Micro, 19(2):70–80.
R.M. Hyatt, A.E. Gower, and H.L. Nelson (1990).
Y. Björnsson and T.A. Marsland (2001). Multi-cut Cray Blitz. Computers, Chess, and Cognition,
alpha-beta-pruning in game-tree search. Theoretical eds. T.A. Marsland and J. Schaeffer, pages 227–237.
Computer Science, 252(1-2):177–196. Springer-Verlag, New York.
Y. Björnsson and T.A. Marsland (2002). Learning con- L. Kocsis and C. Szepesvári (2006). Universal parame-
trol of search extensions. In Proceedings of the 6th ter optimisation in games based on SPSA. Machine
Joint Conference on Information Sciences, pages Learning, 63(3):249–286.
446–449, Durham, NC, March 2002.
D.E. Moriarty and R. Miikkulainen (1994). Evolving
M. Block, M. Bader, E. Tapia, M. Ramirez, K. Gun- neural networks to focus minimax search. In Pro-
narsson, E. Cuevas, D. Zaldivar, R. Rojas (2008). ceedings of the 12th National Conference on Artifi-
Using reinforcement learning in chess engines. Re- cial Intelligence, pages 1371–1377. Seattle, WA, July
search in Computing Science, No. 35, pages 31–40. 1994.
O. David-Tabibi, M. Koppel, and N.S. Netanyahu J. Schaeffer, M. Hlynka, and V. Jussila (2001). Tempo-
(2008). Genetic algorithms for mentor-assisted eval- ral difference learning applied to a high-performance
uation function optimization. In Proceedings of the game-playing program. In Proceedings of the 17th
2008 Genetic and Evolutionary Computation Con- International Joint Conference on Artificial Intelli-
ference, pages 1469–1476. Atlanta, GA, July 2008. gence, pages 529–534. Seattle, WA, August 2001.
O. David-Tabibi and N.S. Netanyahu (2008). Ex- D.J. Slate and L.R. Atkin (1983). Chess 4.5 - The
tended null-move reductions. In Proceedings of the Northwestern University chess program. Chess Skill
6th International Conference on Computers and in Man and Machine, ed. P.W. Frey, pages 82–118.
Games, eds. H.J. van den Herik, X. Xu, Z. Ma, and Springer-Verlag, New York.
M.H.M. Winands, pages 205–216. Springer (LNCS G. Tesauro (1992). Practical issues in temporal differ-
5131), Beijing, China, October 2008. ence learning. Machine Learning, 8(3-4):257–277.
O. David-Tabibi, H.J. van den Herik, M. Koppel, and J. Veness, D. Silver, W. Uther, and A. Blair (2009).
N.S. Netanyahu (2009). Simulating human grand- Bootstrapping from game tree search. Advances
masters: Evolution and coevolution of evaluation in Neural Information Processing Systems 22, eds.
functions. In Proceedings of the 2009 Genetic and Y. Bengio, D. Schuurmans, J. Lafferty, C.K.I.
Evolutionary Computation Conference, pages 1483– Williams, and A. Culotta.
1489. Montreal, Canada, July 2009.