KEMBAR78
Game Theory for CS Educators | PDF | Game Theory | Mathematical Concepts
0% found this document useful (0 votes)
301 views39 pages

Game Theory for CS Educators

Nim is a two-player, zero-sum, discrete, finite, deterministic game of perfect information. Players take turns removing matches from piles until one player is forced to remove the last match, losing the game. For example, in II-Nim there are initially two piles with two matches each, and players alternate removing matches until one pile is empty.

Uploaded by

ramachandra
Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
301 views39 pages

Game Theory for CS Educators

Nim is a two-player, zero-sum, discrete, finite, deterministic game of perfect information. Players take turns removing matches from piles until one player is forced to remove the last match, losing the game. For example, in II-Nim there are initially two piles with two matches each, and players alternate removing matches until one pile is empty.

Uploaded by

ramachandra
Copyright
© Attribution Non-Commercial (BY-NC)
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 39

Algorithms for

Playing and Solving


games*
Andrew W. Moore
um
Professor P l a y e r Zero-s
* Two
e Finite mes of
School of Computer Science Discret tic Ga
er m i n is
De t f o rmation
e c t In
Carnegie Mellon University Perf
www.cs.cmu.edu/~awm
awm@cs.cmu.edu Small Print

412-268-7599

Note to other teachers and users of these slides. Andrew would be delighted if you found this source
material useful in giving your own lectures. Feel free to use these slides verbatim, or to modify them to fit
your own needs. PowerPoint originals are available. If you make use of a significant portion of these
slides in your own lecture, please include this message, or the following link to the source repository of
Andrew’s tutorials: http://www.cs.cmu.edu/~awm/tutorials . Comments and corrections gratefully received.
Slide 1
Overview
• Definition of games and game terminology
• Game trees and game-theoretic values
• Computing game-theoretic values with recursive
minimax.
• Other ways to compute game-theoretic value: Dynamic
Programming copes with stalemates.
• Alpha-beta algorithm (good news.. it’s not really as fiddly
as is looks)
• Playing games in real-time
• Non-determinism

Slide 2
2-player zero-sum discrete finite
deterministic games of perfect information
What do these terms mean?
• Two player: Duh!
• Zero-sum: In any outcome of any game, Player A’s gains
equal player B’s losses. (Doesn’t mean fairness: “On average, two equal
players will win or lose equal amounts” not necessary for zero-sum.)

• Discrete: All game states and decisions are discrete values.


• Finite: Only a finite number of states and decisions.
• Deterministic: No chance (no die rolls).
• Games: See next page
• Perfect information: Both players can see the state, and
each decision is made sequentially (no simultaneous moves).

Slide 3
Which of these are: 2-player zero-sum discrete finite
deterministic games of perfect information
• Two player: Duh!

• Zero-sum: In any outcome of any


game, Player A’s gains equal player B’s
losses.

• Discrete: All game states and decisions


are discrete values.

• Finite: Only a finite number of states and


decisions.

• Deterministic: No chance (no die


rolls).

• Games: See next page

• Perfect information: Both players


can see the state, and each decision is
made sequentially (no simultaneous
moves).

Slide 4
Which of these are: 2-player zero-sum discrete finite
deterministic games of perfect information
• Two player:
astic
Duh!

fin ite oc h • Zero-sum:


d d en
t i io n
In any outcome of any

N ot S game, Player A’s gains equal player B’s H a t


rm
Info
losses.

• Discrete: All game states and decisions


are discrete values.

• Finite: Only a finite number of states and


decisions.

• Deterministic: No chance (no die


rolls).

y e r • Games: See next page

pl a
One
• Perfect information: Both players
can see the state, and each decision is
made sequentially (no simultaneous
moves).

y e r
u ltipla b ab le
M r o
Imp ior
lves v
Invo al Beha
Anim

Slide 5
Definition
A Two-player zero-sum discrete finite deterministic game of perfect information is a
quintuplet: ( S , I , Succs , T , V ) where

S = a finite set of states (note: state includes information


sufficient to deduce who is due to move)
I = the initial state
Succs = a function which takes a state as input and returns a set of
possible next states available to whoever is due to move
T = a subset of S. It is the terminal states: the set of states at
which the game is over
V = a mapping from terminal states to real numbers. It is the
amount that A wins from B. (If it’s negative A loses money
to B).

Convention: assume Player A moves first.


For convenience: assume turns alternate.
Slide 6
Nim: informal description
1. We begin with a number of piles of matches.
2. In one’s turn one may remove any number of matches from one pile.
3. The last person to remove a match loses.

In II-Nim, one begins with two piles, each with two matches…
S= ( _ , _ )-A ( _ , i )-A ( _ , ii )-A
( i , _ )-A ( i , i )-A ( i , ii )-A
( ii , _ )-A ( ii , i )-A ( ii , ii )-A

( _ , _ )-B ( _ , i )-B ( _ , ii )-B


( i , _ )-B ( i , i )-B ( i , ii )-B
( ii , _ )-B ( ii , i )-B ( ii , ii )-B

Slide 7
Nim: informal description
at es a re
o n e
he s t em
o f t k e t h
1. We begin with a number of piles s o of ematches.. Ma
m r
e t ry , ) - A ) n e v e
2. In one’s turn one smay y m mremove any
n d (ii,_
number off pi le from the pile.
matches
t
k : B y , ii ) - A a
(e . g . le
3. Thenlast
o tricperson(eto.gremove. (_ a match i p ti o nloses.
c o m m a l e n t d e s cr
A q u iv n i ca l
ally e ca n o
iviII-Nim,
trIn m e
soone begins with two matches, each with two piles…
b y ) .
state than right
r
l ar g e S= ( _ , _ )-A ( _ , i )-A ( _ , ii )-A
( i , i )-A ( i , ii )-A
( ii , ii )-A

( _ , _ )-B ( _ , i )-B ( _ , ii )-B


( i , i )-B ( i , ii )-B
( ii , ii )-B

Slide 8
II-Nim
a finite set of states (note:
S = state includes information
( _ , _ )-A ( _ , i )-A ( _ , ii )-A ( i , i )-A ( i , ii )-A ( ii , ii )-A
sufficient to deduce who is
due to move) ( _ , _ )-B ( _ , i )-B ( _ , ii )-B ( i , i )-B ( i , ii )-B ( ii , ii )-B
the initial state
I = ( ii , ii )-A
a function which takes a Succs(_,i)-A = { (_,_)-B } Succs(_,i)-B = { (_,_)-A }
Succs = state as input and returns a
set of possible next states Succs(_,ii)-A = { (_,_)-B , (_,i)-B } Succs(_,ii)-B = { (_,_)-A , (_,i)-A }
available to whoever is due
to move Succs(i,i)-A = { (_,i)-B } Succs(i,i)-B = { (_,i)-A }
Succs(i,ii)-A = { (_,i)-B (_,ii)-B (i,i)-B} Succs(i,ii)-B = { (_,i)-A , (_,ii)-A (i,i)-A }
Succs(ii,ii)-A = { (_,ii)-B , (i,ii)-B } Succs(ii,ii)-B = { (_,ii)-A , (i,ii)-A }
a subset of S. It is the
T = terminal states ( _ , _ )-A ( _ , _ )-B
Maps from terminal states
V = to real numbers. It is the V( _ , _ )-A = +1 V( _ , _ )-B = -1
amount that A wins from B.

Slide 9
S = ( _ , _ )-A ( _ , i )-A ( _ , ii )-A ( i , i )-A ( i , ii )-A ( ii , ii )-A
( _ , _ )-B ( _ , i )-B ( _ , ii )-B ( i , i )-B ( i , ii )-B ( ii , ii )-B

II-Nim Game I

Succs
=
( ii , ii )-A
Succs(_,i)-A = { (_,_)-B } Succs(_,i)-B = { (_,_)-A }

Tree
=
Succs(_,ii)-A = { (_,_)-B , (_,i)-B } Succs(_,ii)-B = { (_,_)-A , (_,i)-A }
Succs(i,i)-A = { (_,i)-B } Succs(i,i)-B = { (_,i)-A }
Succs(i,ii)-A = { (_,i)-B (_,ii)-B (i,i)-B} Succs(i,ii)-B = { (_,i)-A , (_,ii)-A (i,i)-A }
Succs(ii,ii)-A = { (_,ii)-B , (i,ii)-B } Succs(ii,ii)-B = { (_,ii)-A , (i,ii)-A }

T = ( _ , _ )-A ( _ , _ )-B
(ii ii) A V = V( _ , _ )-A = +1 V( _ , _ )-B = -1

(i ii) B (- ii) B

(- ii) A (i i) A (- i) A (- i) A (- -) A +1

(- i) B (- -) B -1 (- i) B (- -) B -1 (- -) B -1

(- -) A +1 (- -) A +1
Slide 10
S = ( _ , _ )-A ( _ , i )-A ( _ , ii )-A ( i , i )-A ( i , ii )-A ( ii , ii )-A
( _ , _ )-B ( _ , i )-B ( _ , ii )-B ( i , i )-B ( i , ii )-B ( ii , ii )-B

II-Nim Game I

Succs
=
( ii , ii )-A
Succs(_,i)-A = { (_,_)-B } Succs(_,i)-B = { (_,_)-A }

Tree
=
Succs(_,ii)-A = { (_,_)-B , (_,i)-B } Succs(_,ii)-B = { (_,_)-A , (_,i)-A }
Succs(i,i)-A = { (_,i)-B } Succs(i,i)-B = { (_,i)-A }
Succs(i,ii)-A = { (_,i)-B (_,ii)-B (i,i)-B} Succs(i,ii)-B = { (_,i)-A , (_,ii)-A (i,i)-A }
Succs(ii,ii)-A = { (_,ii)-B , (i,ii)-B } Succs(ii,ii)-B = { (_,ii)-A , (i,ii)-A }

T = ( _ , _ )-A ( _ , _ )-B
(ii ii) A V = V( _ , _ )-A = +1 V( _ , _ )-B = -1

(i ii) B (- ii) B

(- ii) A (i i) A (- i) A (- i) A (- -) A +1

(- i) B +1 (- -) B -1 (- i) B (- -) B -1 (- -) B -1

(- -) A +1 (- -) A +1
Slide 11
S = ( _ , _ )-A ( _ , i )-A ( _ , ii )-A ( i , i )-A ( i , ii )-A ( ii , ii )-A
( _ , _ )-B ( _ , i )-B ( _ , ii )-B ( i , i )-B ( i , ii )-B ( ii , ii )-B

II-Nim Game I

Succs
=
( ii , ii )-A
Succs(_,i)-A = { (_,_)-B } Succs(_,i)-B = { (_,_)-A }

Tree
=
Succs(_,ii)-A = { (_,_)-B , (_,i)-B } Succs(_,ii)-B = { (_,_)-A , (_,i)-A }
Succs(i,i)-A = { (_,i)-B } Succs(i,i)-B = { (_,i)-A }
Succs(i,ii)-A = { (_,i)-B (_,ii)-B (i,i)-B} Succs(i,ii)-B = { (_,i)-A , (_,ii)-A (i,i)-A }
Succs(ii,ii)-A = { (_,ii)-B , (i,ii)-B } Succs(ii,ii)-B = { (_,ii)-A , (i,ii)-A }

T = ( _ , _ )-A ( _ , _ )-B
(ii ii) A V = V( _ , _ )-A = +1 V( _ , _ )-B = -1

(i ii) B (- ii) B

(- ii) A (i i) A (- i) A (- i) A (- -) A +1

(- i) B +1 (- -) B -1 (- i) B +1 (- -) B -1 (- -) B -1

(- -) A +1 (- -) A +1
Slide 12
S = ( _ , _ )-A ( _ , i )-A ( _ , ii )-A ( i , i )-A ( i , ii )-A ( ii , ii )-A
( _ , _ )-B ( _ , i )-B ( _ , ii )-B ( i , i )-B ( i , ii )-B ( ii , ii )-B

II-Nim Game I

Succs
=
( ii , ii )-A
Succs(_,i)-A = { (_,_)-B } Succs(_,i)-B = { (_,_)-A }

Tree
=
Succs(_,ii)-A = { (_,_)-B , (_,i)-B } Succs(_,ii)-B = { (_,_)-A , (_,i)-A }
Succs(i,i)-A = { (_,i)-B } Succs(i,i)-B = { (_,i)-A }
Succs(i,ii)-A = { (_,i)-B (_,ii)-B (i,i)-B} Succs(i,ii)-B = { (_,i)-A , (_,ii)-A (i,i)-A }
Succs(ii,ii)-A = { (_,ii)-B , (i,ii)-B } Succs(ii,ii)-B = { (_,ii)-A , (i,ii)-A }

T = ( _ , _ )-A ( _ , _ )-B
(ii ii) A V = V( _ , _ )-A = +1 V( _ , _ )-B = -1

(i ii) B (- ii) B

(- ii) A +1 (i i) A +1 (- i) A -1 (- i) A -1 (- -) A +1

(- i) B +1 (- -) B -1 (- i) B +1 (- -) B -1 (- -) B -1

(- -) A +1 (- -) A +1
Slide 13
S = ( _ , _ )-A ( _ , i )-A ( _ , ii )-A ( i , i )-A ( i , ii )-A ( ii , ii )-A
( _ , _ )-B ( _ , i )-B ( _ , ii )-B ( i , i )-B ( i , ii )-B ( ii , ii )-B

II-Nim Game I

Succs
=
( ii , ii )-A
Succs(_,i)-A = { (_,_)-B } Succs(_,i)-B = { (_,_)-A }

Tree
=
Succs(_,ii)-A = { (_,_)-B , (_,i)-B } Succs(_,ii)-B = { (_,_)-A , (_,i)-A }
Succs(i,i)-A = { (_,i)-B } Succs(i,i)-B = { (_,i)-A }
Succs(i,ii)-A = { (_,i)-B (_,ii)-B (i,i)-B} Succs(i,ii)-B = { (_,i)-A , (_,ii)-A (i,i)-A }
Succs(ii,ii)-A = { (_,ii)-B , (i,ii)-B } Succs(ii,ii)-B = { (_,ii)-A , (i,ii)-A }

T = ( _ , _ )-A ( _ , _ )-B
(ii ii) A V = V( _ , _ )-A = +1 V( _ , _ )-B = -1

(i ii) B -1 (- ii) B -1

(- ii) A +1 (i i) A +1 (- i) A -1 (- i) A -1 (- -) A +1

(- i) B +1 (- -) B -1 (- i) B +1 (- -) B -1 (- -) B -1

(- -) A +1 (- -) A +1
Slide 14
S = ( _ , _ )-A ( _ , i )-A ( _ , ii )-A ( i , i )-A ( i , ii )-A ( ii , ii )-A
( _ , _ )-B ( _ , i )-B ( _ , ii )-B ( i , i )-B ( i , ii )-B ( ii , ii )-B

II-Nim Game I

Succs
=
( ii , ii )-A
Succs(_,i)-A = { (_,_)-B } Succs(_,i)-B = { (_,_)-A }

Tree
=
Succs(_,ii)-A = { (_,_)-B , (_,i)-B } Succs(_,ii)-B = { (_,_)-A , (_,i)-A }
Succs(i,i)-A = { (_,i)-B } Succs(i,i)-B = { (_,i)-A }
Succs(i,ii)-A = { (_,i)-B (_,ii)-B (i,i)-B} Succs(i,ii)-B = { (_,i)-A , (_,ii)-A (i,i)-A }
Succs(ii,ii)-A = { (_,ii)-B , (i,ii)-B } Succs(ii,ii)-B = { (_,ii)-A , (i,ii)-A }

T = ( _ , _ )-A ( _ , _ )-B
(ii ii) A -1 V = V( _ , _ )-A = +1 V( _ , _ )-B = -1

(i ii) B -1 (- ii) B -1

(- ii) A +1 (i i) A +1 (- i) A -1 (- i) A -1 (- -) A +1

(- i) B +1 (- -) B -1 (- i) B +1 (- -) B -1 (- -) B -1

(- -) A +1 (- -) A +1
Slide 15
Game theoretic value
Game theoretic value (also know as the minimax value) of a state is:
“the value of a terminal that will be reached assuming both players use
their optimal strategy.”
Easy to fill in the tree bottom up to find minimax values of all states:

Let D = max depth of game tree


= 1 + maximum number of
For i = D to 1
moves in any possible game
For each node n at depth i
If n is a terminal node
MMV(n) = V(n) This must’ve been
Else if Player A is due to move at node n defined because it is
at depth i+1
MMV( n ) = max MMV( n' )
n '∈Succs ( n )

Else (Player B must be due to move and..)


MMV( n ) = min MMV( n' ) Ditto
n '∈Succs ( n )
Slide 16
Game theoretic value
Game theoretic value (also know as the minimax value) of a state is:
“the value of a terminal that will be reached assuming botht he players use
their optimal strategy.”ves in
m o values
n dD
Easy to fill in the tree bottom up to find minimax bD ) of all states:
(
r b a c e O e?
c t o s p a p a c
Let D = max depth g fa n
of gameatree d s s s
i n e l e
a
For i =rD ntoc1h es tim g w ith=moves
1 + maximum number of
t h B t a k h i n in any possible game
Wi For e t h is node n at depth
each
a m e ti
ga m h e s
If n is a tterminal node
e o
dMMV(n)
a n w = V(n) This must’ve been
C Else if Player A is due to move at node n defined because its
at depth i+1
MMV( n ) = max MMV( n' )
n '∈Succs ( n )

Else (Player B must be due to move and..)


MMV( n ) = min MMV( n' ) Ditto
n '∈Succs ( n )
Slide 17
Minimax Algorithm
Is it really necessary to explicitly store the whole tree in memory? Of
course not. We can do the same trick that Depth First Search and use
only O(D) space

MinimaxValue(S)=
If (S is a terminal)
return V(S)
Else
Let { S1, S2, … Sk } = Succs(S)
Let vi = MinimaxValue(Si) for each i
If Player-to-move(S) = A
return max Vi
i∈{1, 2Kk }
else
return min Vi
i∈{1, 2Kk }

Slide 18
Questions • What if there are loops
possible in the game?
MinimaxValue(S)=
If (S is a terminal)
return V(S)
Else
Let { S1, S2, … Sk } = Succs(S)
Let vi = MinimaxValue(Si) for each i
If Player-to-move(S) = A
return max Vi
i∈{1, 2Kk }
else
return i∈{min
1, 2Kk }
Vi

• This is a depth-first search


algorithm. Would a breadth-
first version be possible?
How would it work?

Slide 19
Questions • What if there are loops
possible in the game?
MinimaxValue(S)=
If (S is a terminal) • Is our recursive-minimax
return V(S) guaranteed to succeed?
Else • Is our recursive-minimax
Let { S1, S2, … Sk } = Succs(S) guaranteed to fail?
Let vi = MinimaxValue(Si) for each i • What problems do loops
If Player-to-move(S) = A cause for our definition
of minimax value (i.e.
return max Vi
i∈{1, 2Kk } game-theoretic value)?
else • How could we fix our
return i∈{min
1, 2Kk }
Vi
recursive minimax
program?
• This is a depth-first search
algorithm. Would a breadth-
first version be possible?
How would it work?

Slide 20
Dynamic Programming
Say you have a game with N states. The length of the
game is usually l moves. There are b successors of each
state.
Minimax requires O(bl) states expanded.
This is best-case as well as worst-case (unlike DFS for
simple search problems, which in best-case could be O(l)).
What if the number of states is smaller than bl? e.g.. in
chess, bl=10120, but N= a mere 1040
Dynamic Programming is a better method in those cases, if
you can afford the memory.
DP costs only O(Nl)

Slide 21
DP for Chess Endgames
Suppose one has only, say, 4 pieces in total left on the board.
With enough compute power you can compute, for all such
positions, whether the position is a win for Black, White, or a
draw.

Assume N such positions.

1. With each state, associate an integer. A state code, so there’s a


1-1 mapping between board positions and integers from 0…N-1.
2. Make a big array (2 bits per array entry) of size N. Each element
in the array may have one of three values:
• ?: We don’t know who wins from this state
• W: We know white’s won from here
• B: We know black’s won from here

Slide 22
DP for Chess Endgames (ctd)
3. Mark all terminal states with their values (W or B)
4. Look through all states that remain marked with ?.
For states in which W is about to move:
• If all successor states are marked B, mark the current state as B.
• If any successor state is marked W, mark the current state as W.
• Else leave current state unchanged.
For states in which B is about to move:
• If all successor states are marked W, mark the current state as W.
• If any successor state is marked B, mark the current state as B.
• Else leave current state unchanged
5. Goto 4, but stop when one whole iteration of 4 produces no
changes.
6. Any state remaining at ? is a state from which no-one can force a
win.

Slide 23
Suppose you knew that the only possible outcomes of the
game were -1 and 1. What computation could be saved?
(ii ii) A

(i ii) B (- ii) B

(- ii) A (i i) A (- i) A (- i) A (- -) A +1

(- i) B (- -) B -1 (- i) B (- -) B -1 (- -) B -1

(- -) A +1 (- -) A +1

Slide 24
Suppose you knew that the only possible outcomes of the
game were -1 and 1. What computation could be saved?
(ii ii) A

(i ii) B (- ii) B

(- ii) A (i i) A (- i) A (- i) A (- -) A +1

(- i) B (- -) B -1 (- i) B (- -) B (- -) B -1

(- -) A +1 (- -) A +1

Answer: in general a lot (though not much here). If any successor is a forced
win for the current player, don’t bother with expanding further successors.
What if you didn’t know the range of possible outcome values? We’ll see
that this is an important question.
Slide 25
How can you cut-off
(ii ii) A
with arbitrary terminal
values?
(i ii) B (- ii) B

(- ii) A (i i) A (- i) A (- i) A (- -) A +3.7

(- i) B (- -) B -1.9 (- i) B (- -) B -0.3 (- -) B -8.1

(- -) A +2.4 (- -) A +0.08

Slide 26
How can you cut-off
(ii ii) A
with arbitrary terminal
values?
(i ii) B (- ii) B

(- ii) A (i i) A (- i) A (- i) A (- -) A +3.7

(- i) B (- -) B -1.9 (- i) B (- -) B -0.3 (- -) B -8.1

(- -) A +2.4 +0.08
(- -) A
u d is c ov e r s o mething that
h as n o rm a l, but when yo e r w ith th e rest of
th first se arc ’t b o th
Just do the dep ite ly n o t c h oose you, don
n s y ou r pa re n t would defin
mea n cestors.
c c e ss o rs . a n y o f y o ur a
your su u ld w orr y about, but
u s ho
’s n o t ju s t y ou r parent yo
In fact, it Slide 27
An ancestor causing cut-off
( )-a

( )-b ( )-b

+2 ( )-a ( )-a

( )-b ( )-b

( * )-a ( )-a

+1

Suppose we’ve so far done a full depth first search, expanding left-most
successors first, and have arrived at the node marked (and
discovered its value is +1).
*
What can we cut off in the rest of the search, and why?
General rule. We can be sure a node will not be visited if we’re sure
that either player has a better alternative at any ancestor of that node.
Slide 28
The general cutoff rule
( )-a
In example: let α = max(v1, v3, ?
v1 ( )-b
v5). If min(v6, v7)≤α, then we can v2
be certain that it is worthless
searching the tree from the v3 ( )-a ? ?
current node or the sibling on its
right. ( )-b ?
?
In general: if at a B-move node,
let α = max of all A’s choices v4
expanded on current path. Let β ( )-a
= min of B’s choices, including
those at current node. Cutoff is
v5
β ≤ α. ( )-b
In general: Converse rule at an v6
A-move node. ?
v7
Current
Node
Slide 29
alpha-beta pruning* (from Russell)
function Max-Value (s,α,β)
inputs:
s: current state in game, A about to play
α: best score (highest) for A along path to s
β: best score (lowest) for B along path to s
output: min(β , best-score (for A) available from s)
if ( s is a terminal state )
then return ( terminal value of s )
else for each s’ in Succ(s)
α := max( α , Min-value(s’,α,β))
if ( α ≥ β ) then return β Gujar
ey a
return α
an ks to Am an earlier
Th ti ng out rsion
function Min-Value(s’,α,β) o i n
for p re. The ve
output: max(α , best-score (for B) available from s ) e
typo h see is the
ow
if ( s is a terminal state ) you n version.
t
then return ( terminal value of s) correc
else for each s’ in Succs(s)
β := min( β , Max-value(s’,α,β))
if (β ≤ α ) then return α *Assumes moves are alternate
return β

Slide 30
How useful is alpha-beta?
What is the best possible case performance of alpha beta? Suppose
that you were very lucky in the order in which you tried all the node
successors. How much of the tree would you examine?
In the best case, the number of nodes you need to search in the tree is
O(bd/2)…the square root of the recursive minimax cost.
Questions:
Does alpha-beta behave sensibly with loops?
What can we do about large realsized games with huge numbers of
states (e.g. chess)?

Slide 31
Game-Playing and Game-Solving
Two very different activities.
So far, we have been solely concerned with finding the true game-
theoretic value of a state.
But what do real chess-playing programs do?
They have a couple of interesting features that the search and planning
problems we’ve discussed to date on this course don’t have:
⌂ They cannot possibly find guaranteed solution.
⌂ They must make their decisions quickly, in real time.
⌂ It is not possible to pre-compute a solution.
The overwhelmingly popular solution to these problems are the well-
known heuristic evaluation functions for games.

Slide 32
Eval. functions in games
An evaluation function maps states to a number. The larger the
number, the larger the true game-theoretic position is estimated to be.
Search a tree as deeply as affordable.
Leaves of the tree you search are not leaves of the game tree, but
are instead intermediate nodes.
The value assigned to the leaves are from the evaluation function.

Intuitions
Visibility: the evaluation function will be more accurate nearer the end
of the game, so worth using heuristic estimates from there.
Filtering: if we used the evaluation function without searching, we’d be
using a handful of inaccurate estimates. By searching we are
combining thousands of these estimates, & we hope, eliminating noise.
Dubious intuition. Counter-examples. But often works very well in real games.

Slide 33
Other important issues for real
game playing programs
• How to decide how far to search if you only have a fixed time to
make a decision. What’s the obvious sensible answer?
• Quiescence. What if you stop the search at a state where
subsequent moves dramatically change the evaluation?
• The solution to the quiescence problem is a sensible technique called
quiescence search.
• The horizon problem. What if s is a state which is clearly bad
because your opponent will inevitably be able to do something bad
to you? But you have some delaying tactics. The search algorithm
won’t recognize the state’s badness if the number of delaying moves
exceeds the search horizon.
• Endgames are easy to play well. How?
• Openings fairly easy to play well. How?
Slide 34
What if you think you’re certainly going to lose?
(ii ii) A

(i ii) B (- ii) B

(- ii) A (i i) A (- i) A (- i) A (- -) A +1

(- i) B (- -) B -1 (- i) B (- -) B -1 (- -) B -1

(- -) A +1 (- -) A +1

What should A do in this situation?


What heuristics/assumptions could be used to cause A to
make that decision? Two common methods.
Slide 35
Solving Games
Solving a game means proving the game-theoretic value of the start
state.
Some games have been solved. Usually by brute force dynamic
programming. (e.g. Four-in-a-row, many chess endgames)
Or brute force dynamic programming back from end of game, to create
an end-game database, in combination with alpha-beta search from the
start of the game. (Nine men’s morris)
Or mostly brute force, with some game specific analysis (Connect-4)
Checkers may not be far from being solved.

Solving a game is often very different from playing well at the game.

Slide 36
2 player zero-sum finite NONdeterministic
games of perfect information Nondeterministic
= stochastic

The search tree now includes states where neither player


makes a choice, but instead a random decision is made
according to a known set of outcome probabilities.
( )-a

( )-chance

p=0.5 p=0.5 ( )-b

( )-b ( )-b
( )-chance
+4 -20
p=0.5 p=0.2

( )-a ( )-a ( )-a

-5 +10 +3
Game theory value of a state is the expected final value if both players
are optimal.
If no loops, computing this is almost as easy as recursive minimax. Is there alpha-beta
version?

Slide 37
What you should know
• What makes a game a Two Player Zero-Sum Discrete
Finite Deterministic Game of Perfect Information
• What is the formal definition of the above
• What is a Game Tree
• What is the minimax value of a game
• What assumptions minimax makes about the game
• Minimax Search
• Alpha Beta Search
• Use of Evaluation Functions for very big games
• Why it’s easy to extend this to Two Player Zero-Sum
Discrete Finite Stochastic Game of Perfect Information

Slide 38
What you should know
• What makes a game a Two Player Zero-Sum Discrete
Finite Deterministic Game of Perfect Information
• What is the formal definition of the above
• What is a Game Tree p t ion,
ec e
• What is the minimax value of
ff i n
a g , d
game a b out
i n g b lu r t ai nty hing
e q u ir un ce v e ryt !
• What p: assumptions m es, minimax
r
g an d makes
n t … e aboutor ld the game
t U fg a i n w a al w
N•exMinimax s o c h e m a l l y e r e
c l asse Searche a k y s
n d s ” re rt in th
O ther a nd sn led “frie king pa
• ltAlpha
ru ism Beta o- c aSearch
l f o r ta
a y o u rs n e ed
hat ofysEvaluation
• wUse tem s Functions for very big games
I s
o ur A
• Why it’s easy to extend this to Two Player Zero-Sum
Discrete Finite Stochastic Game of Perfect Information

Slide 39

You might also like