Combinatorial Search
Li Yin1
March 29, 2020
1
www.liyinscience.com
ii
Combinatorial Search
So far, we have learned the most fundamental search strategies on general
data structures such as array, linked list, graph, and tree. In this chapter,
instead of searching on explicit and well defined data structures, we extend
and discuss more exhaustive search algorithms that can solve rather obscure
and challenging combinatorial problems, such as sudoku and the famous
Travels Salesman Problem. For combinatorial problems, we have to figure
out the potential search space, and rummage a solution.
0.1 Introduction
Combinatorial search problems consists of n items and a requirement to
find a solution, i.e., a set of L < N items that satisfy specified conditions or
constraints. For example, a sudoku problem where a 9 × 9 grid is partially
filled with number between 1 and 9, fill the empty spots with numbers that
satisfy the following conditions:
Figure 1: A Sudoku puzzle and its solution
1. Each row has all numbers form 1 to 9.
2. Each column has all numbers form 1 to 9.
iii
iv COMBINATORIAL SEARCH
3. Each sub-grid (3 × 3) has all numbers form 1 to 9.
This sudoku together with one possible solution is shown in Fig. 1. In this
case, we have 81 items, and we are required to fill 51 empty items with the
above three constraints.
Model Combinatorial Search Problems We can model the combina-
torial search solution as a vector s = (s0 , s1 , ..., sL−1 ), where each variable si
is selected from a finite set A, which is called the domain for each variable.
Such a vector might represent an arrangement where si contains the i-th
item of a permutation, in the combination problem, a boolean denotes if
the i-th item is selected already, or it can represent a path in a graph or a
sequence of moves in a game. In the sudoku problem, each si can choose
from a number in range [1, 9].
Problem Categories Combinatorial search problems arise in many areas
of computer science such as artificial intelligence, operations search, bioinfor-
matics, and electronic commerce. These problems typically involve finding
a grouping, ordering, or assignment of a discrete, finite set of objects that
satisfy given conditions or constraints. We introduce two well-studied types
of problems that are more likely to be NP-hard and of at least exponential
complexity:
1. Constraint Satisfaction Problems (CSP) are mathematical questions
defined as a set of variables whose state must satisfy a number of con-
straints or limitations(mathematical equations or inequations), such
as Sudoku, N-queen, map coloring, Crosswords, and so on. The size
of the search space of CSPs can be roughly given as:
O(cdL ) (1)
Where there are L variables, each with domain size d, and there are c
constraints to check out.
2. Combinatorial optimization problems consist of searching for maxima
or minima of an objective function F whose domain is a discrete but
large configuration space. Some classic examples are:
• Travelling Salesman Problems (TSP): given position (x, y) of n
different cities, find the shortest possible path that visits each
city exactly once.
• Integer Linear Programming: maximize a specified linear com-
bination of a set of integers X1 , .., Xn subject to a set of linear
constraints each of the form:
a1 X1 + ... + an Xn ≤ c (2)
0.1. INTRODUCTION v
• Knapsack Problems: Given a set of items, each with a weight
and a value, determine the number of each item to include in a
collection so that the total weight is less than or equal to a given
limit and the total value is as large as possible.
Search Strategies From Chapter Discreet Programming, we have learned
the basic enumerative combinatorics, including counting principles and knowl-
edge on permutations, combinations, partitions, subsets, and subsequences.
Combinatorial Search builds atop this subject, and together through differ-
ent search strategies such as depth-first search and best-first search, it is
able to enumerate the search space and find the solution(s) with necessary
speedup methods. In this chapter, we only discuss about complete search
and only acknowledge the existence of approximate search techniques.
Backtracking is a process of depth-first based search where it “builds”
the search tree on the fly incrementally instead of having a tree/graph struc-
ture beforehand to search through. Backtracking fits to solve combinatorial
search problems because:
1. It is space efficient for the usage of a DFS and the candidates are built
incrementally and their validity to fit a solution is checked right away.
2. It is time efficient for that some partial candidates can be pruned if the
algorithm believes that it will not lead to our final complete solution.
Because the ordering of variables s0 , ..., sL−1 can potentially affect the
size of the search space sometimes. Thus, backtracking search relies on one
or more heuristics to select which variable to consider next. Look-ahead is
one such heuristic that is preferably applied to check the effects of choosing
a given variable to evaluate or to decide the order of values to give to it.
There are other Breath-first Search based strategies that might work
better than backtracking, such as for combinatorial optimization problems,
best-first branch and bound search might be more efficient than its depth-
first counterpart.
Speedups The speedup methods are well studied in computer science,
and we list two general ways to prune unqualified or unnecessary branches
during the search of backtracking:
1. Branch and Prune: This method prunes the unqualified branches with
constraints of the problems. This is usually applied to solve constraint
restricted problems (CSPs).
2. Branch and Bound: This method prunes unnecessary branches via
comparing an estimation of a partial candidate with a found global
best solution. If the estimation states that the partial candidate will
vi COMBINATORIAL SEARCH
never lead us to a better solution, we cut off this branch. This tech-
nique can be applied to solve a general optimization problems, such
as Travel Salesman Problems (TSP), knapsack problems, and so.
0.2 Backtracking
In this section, we first introduce the technique of backtracking, and then
demonstrate it by implementing common enumerative combinatorics seen
in Chapter Discreet Programming.
0.2.1 Introduction
Backtracking search is an exhaustive search algorithm(depth-first search)
that systematically assigns all possible combinations of values to the vari-
ables and checks if these assignments constitute a solution. Backtracking is
all about choices and consequences and it shows the following two properties:
1. No Repetition and Completion: It is a systematic generating
method that enumerates all possible states exactly at most once: it will
not miss any valid solution but avoids repetitions. If there exists “cor-
rect” solution(s), it is guaranteed to be found. This property makes
it ideal for solving combinatorial problems where the search space has
to be constructed and enumerated. Therefore, the worst-case running
time of backtracking search is exponential with the length of the state
(bL , b is the average choice for each variable in the state).
2. Search Pruning: Along the way of working with partial solutions,
in some cases, it is possible for us to decide if they will lead to a valid
complete solution. As soon as the algorithm is confident to say the
partial configuration is either invalid or nonoptimal, it abandons this
partial candidate, an then “backtracks” (return to the upper level),
and resets to the upper level’s state so that the search process can
continue to explore the next branch for the sake of efficiency. This
is called search pruning with which the algorithm ends up amortizely
visiting each vertex less than once. This property makes backtracking
the most promising way to solve CSPs and combinatorial optimization
problems.
Solving sudoku problem with backtracking algorithm, each time at a level
in the DFS, it tries to extend the last partial solution s = (s0 , s1 , ..., sk ) by
trying out all 9 numbers at sk+1 , say we choose 1 at this step. It testifies
the partial solution with the desired solution:
1. If the partial solution s = (s0 , s1 , ..., sk , 1) is still valid, move on to the
next level and work on trying out sk+2 .
0.2. BACKTRACKING vii
2. If the partial solution is invalid and is impossible to lead to a complete
solution, it “backtracks” to the last level and resets the state as s =
(s0 , s1 , ..., sk ) so that it can try our other choices if there are some
left(which in our example, we will try sk+1 = 2) or keep “backtracking”
to even upper level.
The process should be way clearer once we have learned the examples in the
following subsections.
0.2.2 Permutations
Given a list of items, generate all possible permutations of these items. If
the set has duplicated items, only enumerate all unique permutations.
No Duplicates(L46. Permutations)
When there are no duplicates, from Chapter Discreet Programming, we
know the number of all permutations are:
n!
p(n, m) = (3)
(n − m)!
where m is the number of items we choose from the total n items to make
the permutations.
For example :
a = [1 , 2 , 3]
There a r e 6 t o t a l permutations :
[1 , 2 , 3] , [1 , 3 , 2] ,
[2 , 1 , 3] , [2 , 3 , 1] ,
[3 , 1 , 2] , [3 , 2 , 1]
Analysis Let us apply the philosophy of backtracking technique. We have
to build a state with length 3, and each variable in the state has three choices:
1, 2, and 3. The constraint here comes from permutation which requires that
no two variables in the state will be having the same value. To build this
incrementally with backtracking, we state with an empty state []. At first,
we have three options, we get three partial results [1], [2], and [3]. Next, we
handle the second variable in the state: for [1], we can choose either 2 or
3,getting [1, 2] and [1, 3]; same for [2], where we end up with [2, 1] and [2, 3];
for [3], we have [3, 1] and [3, 2]. At last, each partial result has only one
option, we get all permutations as shown in the example. We visualize this
incrementally building candidates in Fig. 2.
However, we only managed to enumerate the search space, but not sys-
tematically or recursively with the Depth-first search process. With DFS,
we depict the traverse order of the vertexes in the virtual search space with
red arrows in Fig. 2. The backward arrows mark the “backtracking” process,
where we have to reset the state to the upper level.
viii COMBINATORIAL SEARCH
Figure 2: The search tree of permutation
Implementation We use a list of boolean bUsed to track which item is
used in the search process. n is the total number of items, d is the depth of
the depth-first search process, curr is the current state, and ans is to save
all permutations. The following code, we generate p(n, m)
1 d e f p_n_m( a , n , m, d , used , c u r r , ans ) :
2 i f d == m: #end c o n d i t i o n
3 ans . append ( c u r r [ : : ] )
4 return
5
6 f o r i in range (n) :
7 i f not used [ i ] :
8 # g e n e r a t e t h e next s o l u t i o n from c u r r
9 c u r r . append ( a [ i ] )
10 used [ i ] = True
11 print ( curr )
12 # move t o t h e next s o l u t i o n
13 p_n_m( a , n , m, d + 1 , used , c u r r , ans )
14 #b a c k t r a c k t o p r e v i o u s p a r t i a l s t a t e
15 c u r r . pop ( )
16 used [ i ] = F a l s e
17 return
Check out the running process in the source code.
Alternative: Swapping Method We first start with a complete state,
such that s = [1, 2, 3] in our case. By swapping 1 and 2, we get [2, 1, 3] and
[2, 3, 1] can be obtained by swapping 1 and 3 on top of [2, 1, 3]. With all
permutations as leaves in the search space, the generating process is similar
to Fig. 2. We show this alternative process in Fig. 3. At first, we swap index
0 with all other indexes, including 0, 1, and 2. At the second layer, we move
on to swap index 1 with all other successive indexes, and so on for all other
layers. The Python code is as:
0.2. BACKTRACKING ix
Figure 3: The search tree of permutation by swapping. The indexes of items
to be swapped are represented as a two element tuple.
1 ans = [ ]
2 d e f permutate ( a , d ) :
3 g l o b a l ans
4 i f d == l e n ( a ) :
5 ans . append ( a [ : : ] )
6 f o r i in range (d , len ( a ) ) :
7 a[ i ] , a[d] = a[d] , a[ i ]
8 permutate ( a , d+1)
9 a[ i ] , a[d] = a[d] , a[ i ]
10 return
There is Johnson-Trotter algorithm that utilizes such swapping method,
which avoids recursion, and instead computes the permutations by an iter-
ative method.
With Duplicates(47. Permutations II)
We have already know that p(n, n) is further decided by the duplicates
within the n items. Assume we have in total of d items are repeated, and
each item is repeated xi times, then the number of all arrangements pd(n, n)
are:
p(n, n)
pd(n, n) = , (4)
x0 !x1 !...xd−1
d−1
(5)
X
w.r.t xi ≤ n
i=0
For example, when a = [1, 2, 2, 3], there are 4!
2! unique permutations, which
is 12 in total, and are listed as bellow:
x COMBINATORIAL SEARCH
[1 , 2, 2, 3] , [1 , 2, 3, 2] , [1 , 3, 2, 2] ,
[2 , 1, 2, 3] , [2 , 1, 3, 2] , [2 , 2, 1, 3] ,
[2 , 2, 3, 1] , [2 , 3, 1, 2] , [2 , 3, 2, 1] ,
[3 , 1, 2, 2] , [3 , 2, 1, 2] , [3 , 2, 2, 1]
Figure 4: The search tree of permutation with repetition
Analysis The enumeration of these all possible permutations can be ob-
tained with backtracking exactly the same as if there are no duplicates.
However, this is not efficient since it has doubled the search space with re-
peated permutations. Here comes to our first time applying the Branch and
Prune method: we avoid repetition by pruning off redundant branches.
One main advantage of backtracking is not to save all intermediate states,
thus we should find a mechanism that avoids generating these intermediate
states at the first place. One solution is that we sort all n items, making all
repeat items adjacent to each other. We know if the current intermediate
state is redundant by simply comparing this item with its predecessor: if it
equals, we move on from building state with this item to the next item in
line. The search tree of our example is shown in Fig. 4.
Implementation The implementation is highly similar to previous stan-
dard permutation code other than three different points:
1. Before the items are called by permutate, they are sorted first.
2. A simple condition check to avoid generating repeat states.
3. We used a dictionary data structure tracker which has all unique
items as keys and each item’s corresponding occurrence as values to
replace the boolean vector used for slightly better space efficiency.
The Python code is as:
1 from c o l l e c t i o n s import Counter
2 d e f permuteDup ( nums , k ) :
3 ans = [ ]
0.2. BACKTRACKING xi
4 d e f permutate ( d , n , k , c u r r , t r a c k e r ) :
5 n o n l o c a l ans
6 i f d == k :
7 ans . append ( c u r r )
8 return
9 f o r i in range (n) :
10 i f t r a c k e r [ nums [ i ] ] == 0 :
11 continue
12 i f i − 1 >= 0 and nums [ i ] == nums [ i − 1 ] :
13 continue
14 t r a c k e r [ nums [ i ] ] −= 1
15 c u r r . append ( nums [ i ] )
16
17 permutate ( d+1, n , k , c u r r [ : ] , t r a c k e r )
18 c u r r . pop ( )
19 t r a c k e r [ nums [ i ] ] += 1
20 return
21
22 nums . s o r t ( )
23 permutate ( 0 , l e n ( nums ) , k , [ ] , Counter ( nums ) )
24 r e t u r n ans
Can you extend the swap method based permutation to
handle duplicates?
Discussion
From the example of permutation, we have demonstrated how backtracking
works to construct candidates with an implicit search tree structure: the root
node is the initial state, any internal node represents intermediate states,
and all leaves are our candidates which in this case there are n! for p(n, n)
permutation. In this subsection, we want to point out the unique properties
and its computational and space complexities.
Two Passes Backtracking builds an implicit search tree on the fly, and
it does not memorize any intermediate state. It visits the vertices in the
search tree in two passes:
1. Forward pass: it builds the solution incrementally and reaches to the
leaf nodes in a DFS fashion. One example of forward pass is []− >
[1]− > [1, 2]− > [1, 2, 3].
2. Backward pass: as the returning process from recursion of DFS, it
also backtracks to previous state. One example of backward pass is
[1, 2, 3]− > [1, 2], − > [1].
xii COMBINATORIAL SEARCH
First, the forward pass to build the solution incrementally. The change of
curr in the source code indicates all vertices and the process of backtracking,
it starts with [] and end with []. This is the core character of backtracking.
We print out the process for the example as:
[] − >[1] − >[1 , 2] − >[1 , 2 , 3]−> b a c k t r a c k : [ 1 , 2 ]
backtrack : [ 1 ]
[ 1 , 3] − >[1 , 3 , 2]−> b a c k t r a c k : [ 1 , 3 ]
backtrack : [ 1 ]
backtrack : [ ]
[2] − >[2 , 1] − >[2 , 1 , 3]−> b a c k t r a c k : [ 2 , 1 ]
backtrack : [ 2 ]
[ 2 , 3] − >[2 , 3 , 1]−> b a c k t r a c k : [ 2 , 3 ]
backtrack : [ 2 ]
backtrack : [ ]
[3] − >[3 , 1] − >[3 , 1 , 2]−> b a c k t r a c k : [ 3 , 1 ]
backtrack : [ 3 ]
[ 3 , 2] − >[3 , 2 , 1]−> b a c k t r a c k : [ 3 , 2 ]
backtrack : [ 3 ]
backtrack : [ ]
Time Complexity of Permutation In the search tree of permutation
in Fig. 2, there are in total V nodes, which equals to ni=0 pkn . Because
P
in a tree the number of edges |E| is |v| − 1, making the time complexity
O(|V | + |E|) the same as of O(|V |). Since p(n, n) itself alone takes n! time,
making the permutation an NP-hard problem.
Space Complexity A standard depth-first search consumes O(bd) space
in worst-case to execute, where b is branching factor and d is the depth of
the search tree. In the combinatorial search problems, usually depth and
branching is decided by the total number of variables in the state, making
b ∼ d ∼ n. In backtracking, we have space complexity O(n2 ). However, in
normal standard DFS, the input–tree or graph data structure–is given and
not attributed to space complexity. For a NP-hard combinatorial search
problem, this input is often exponential. Backtracking search outcompetes
the standard DFS by avoiding such space consumption; it only keeps a
dynamic data structure(curr) to construct node on the fly.
0.2.3 Combinations
Given a list of n items, generate all possible combinations of these items. If
the input has duplicated items, only enumerate unique combinations.
0.2. BACKTRACKING xiii
No Duplicates (L78. Subsets )
From Chapter Discrete Programming, we list the powerset–all m-subset,
m ∈ [0, n] as:
P (n, m) n!
C(n, m) = = (6)
P (m, m) (n − m)!m!
For example, when a = [1, 2, 3], there are in total 7 m-subsets, they are:
C( 3 , 0) : [ ]
C( 3 , 1) : [ 1 ] , [ 2 ] , [ 3 ]
C( 3 , 2) : [ 1 , 2 ] , [ 1 , 3 ] , [ 2 , 3 ]
C( 3 , 3) : [ 1 , 2 , 3 ]
Figure 5: The Search Tree of Combination.
Analysis We can simply reuse the method of permutation, but with a
problem that it generates lots of duplicates. For example, P (3, 2) includes
[1, 2] and [2, 1] which are indeed the same subset. Of course, we can check
redundancy with saved m-subsets, but its not ideal. A systematical solution
that avoids duplicates all along is preferred. If we limit the items we put
into the m-subsets to be only increasing(of indexes of items or of values of
items), in which case [2, 1], [3, 1], and [3, 2] will never be generated. The
enumeration of combination through backtracking search is shown in Fig. 5.
Implementation Two modifications based on permutation code:
1. for loop: in the loop to iterate all possible candidates, we limit the
candidates to be having larger indexes only.
2. We do not have to use a data structure to track the state of each candi-
date because any candidate that has larger index is a valid candidate.
xiv COMBINATORIAL SEARCH
We use start to track the starting position of valid candidates. The code
of combination is:
1 d e f C_n_k( a , n , k , s t a r t , d , c u r r , ans ) :
2 i f d == k : #end c o n d i t i o n
3 ans . append ( c u r r [ : : ] )
4 return
5
6 f o r i in range ( s t a r t , n) :
7 c u r r . append ( a [ i ] )
8 C_n_k( a , n , k , i +1 , d+1, c u r r , ans )
9 c u r r . pop ( )
10 return
Alternative: 0 and 1 Selection We have discussed that a powerset
written as P (S). With each item either being appear or not appear in the
resulting set makes the value set {0, 1}, resulting |P (S)| = 2n . Follow this
pattern, with our given example, we can alternatively generate a powerset
like this:
s sets
1 { 1 } , {}
2 { 1 , 2 } , { 1 } , { 2 } , {}
3 { 1 , 2 , 3 } , { 1 , 2 } , { 1 , 3 } , { 3 } , { 2 , 3 } , { 2 } , { 3 } , {}
This process can be better visualized in a tree as in Fig. ??. We can see this
process results 2n leaves compared with our previous implementation which
has a total of 2n nodes is slightly less efficient. The code is as:
1 d e f p o w e r s e t ( a , n , d , c u r r , ans ) :
2 i f d == n :
3 ans . append ( c u r r [ : : ] )
4 return
5
6 # Case 1 : s e l e c t item
7 c u r r . append ( a [ d ] )
8 p o w e r s e t ( a , n , d + 1 , c u r r , ans )
9 # Case 2 : not s e l e c t item
10 c u r r . pop ( )
11 p o w e r s e t ( a , n , d + 1 , c u r r , ans )
12 return
Time Complexity The total nodes within the implicit search space of
combination shown in Fig. 5 is nk=0 Cnk = 2n , which was explained in
P
Chapter Discreet Programming. Thus, the time complexity of enumerating
the powset is O(2n ) and is less compared with O(n!) that comes with the
permutation.
Space Complexity Similarly, combination with backtracking search uses
slightly less space. But, we can still acclaim the upper bound to be O(n2 ).
0.2. BACKTRACKING xv
With Duplicates(L90. Subsets II)
Assume we have m unqiue items, and the frequency of each is marked as xi ,
with m−1
i=0 xi = n.
P
n m−1
c(n, k) = (xi + 1) (7)
X Y
k=0 i=0
For example, when a = [1, 2, 2, 3], there are 2 × 3 × 2 = 12 combinations
in the powerset, they are listed as bellow:
[ ] , [1] , [2] , [3] , [1 , 2] , [1 , 3] , [2 , 2] , [2 , 3] ,
[1 , 2 , 2] , [1 , 2 , 3] , [2 , 2 , 3] ,
[1 , 2 , 2 , 3]
However, counting c(n, k) with duplicates in the input replies on the specific
input with specific distribution of these items. We are still able to count by
enumerating with backtracking search.
Analysis and Implementation The enumeration of the powerset with
backtracking search is the same as handling the iterations of choice in the
enumeration of permutation with duplicates. We first sort our items in
increasing order of the values. Then we replace the for loop from the above
code with the following code snippet to handle the repetition of items from
the input:
1 f o r i in range ( s t a r t , n) :
2 i f i − 1 >= s t a r t and a [ i ] == a [ i − 1 ] :
3 continue
4 ...
0.2.4 More Combinatorics
In this section, we supplement more use cases of backtracking search in the
matter of other types of combinatorics.
All Paths in Graph
For a given acyclic graph, enumerate all paths from a starting vertex s. For
example, for the graph shown in Fig. 6, and a starting vertex 0, print out
the following paths:
0 , 0−>1, 0−>1−>2, 0−>1−>2−>5, 0−>1−>3, 0−>1−>4, 0−>2, 0−>2−>5
Analysis The backtracking search here is the same as how to apply a DFS
on an explicit graph, with rather one extra point: a state path which might
have up to n items ( the total vertices of a graph). In the implementation,
the path vector will dynamically be modified to track all paths constructed
as the go of the DFS. The code is offered as:
xvi COMBINATORIAL SEARCH
Figure 6: Acyclic graph
1 d e f a l l _ p a t h s ( g , s , path , ans ) :
2 ans . append ( path [ : : ] )
3 for v in g [ s ] :
4 path . append ( v )
5 a l l _ p a t h s ( g , v , path , ans )
6 path . pop ( )
You can run the above code in the Goolge Colab to see how it works on our
given example.
Subsequences(940. Distinct Subsequences II)
Given a string, list all unique subsequences. There may or may not exist
duplicated characters in the string. For example, when s =0 1230 , there are
in total 7 subsequences, which are:
'', '1 ' , '2 ' , '3 ' , '12 ' , '13 ' , '23 ' , '123 '
When s =0 12230 which comes with duplicates, there are 12 subsequences:
' ' , '1 ' , '2 ' , '3 ' , '12 ' , '13 ' , '22 ' , '23 ' ,
'122 ' , '123 ' , '223 ' ,
'1223 '
Analysis From Chapter Discrete Programming, we have explained that
we can count the number of unique subsequences through recurrence relation
and pointed out the relation of subsquences with subsets(combinations).
Let the number of unique subsequences of a sequence as seq(n) and the
number of unique subsets of a set as set(n) with n items in the input.
All subsequences are within subsets, and the subsequence set has larger
cardinality than subsets, |seq(n)| ≥ |set(n)|. From the above example, we
can also see that when there are only unique items in the sequence or when
there are duplicates but all duplicates of an item are adjacent to each other:
• The cardinality of subsequences and subsets equals, |seq(n) = set(n)|.
0.2. BACKTRACKING xvii
• The subsequences and subsets share the same items when the ordering
of the subsequences are ignored.
This indicates that the process of enumerating subsequences is almost the
same as of enumerating a powerset. This should give us a good start.
Implementation However, if we change the ordering of the duplicated
characters in the above string as s =0 12320 , there are in total 14 subse-
quences instead:
' ' , '1 ' , '2 ' , '3 ' , '12 ' , '13 ' , '23 ' , '22 ' , '32 ' ,
'123 ' , '122 ' , '132 ' , '232 ' ,
'1232 '
Figure 7: The Search Tree of subsequences.The red circled nodes are redun-
dant nodes. Each node has a variable s to indicate the starting index of
candidates to add to current subsequence. i indicate the candidate to add
to the current node.
Therefore, our code to handle duplicates should differ from that of a pow-
erset. In the case of powerset, the algorithm first sorts items so that all
duplicates are adjacent to each other, making the checking of repetition as
simple as checking the equality of item with its predecessor. However, in
a given sequence, the duplicated items are not adjacent most of the time,
we have to do things differently. We draw the search tree of enumerating
all subsequences of string “1232” in Fig. 7. From the figure, we can observe
that to avoid redundant branches, we simply check if a current new item in
the subsequence is repeating by comparing it with all of its predecessors in
range [s, i]. The code for checking repetition is as:
xviii COMBINATORIAL SEARCH
1 def check_repetition ( start , i , a) :
2 f o r j in range ( s t a r t , i ) :
3 i f a [ i ] == a [ j ] :
4 r e t u r n True
5 return False
And the code to enumerate subsequences is:
1 d e f s u b s e q s ( a , n , s t a r t , d , c u r r , ans ) :
2 ans . append ( ' ' . j o i n ( c u r r [ : : ] ) )
3 i f d == n :
4 return
5
6 f o r i in range ( s t a r t , n) :
7 i f check_repetition ( start , i , a) :
8 continue
9 c u r r . append ( a [ i ] )
10 s u b s e q s ( a , n , i +1, d+1 , c u r r , ans )
11 c u r r . pop ( )
12 return
0.2.5 Backtracking in Action
So far, we have applied backtracking search to enumerate combinatorics.
In this section, we shall see how backtracking search along with search
pruning speedup methods solve two types of challenging NP-hard problems:
Constraint Satisfication Problems (CSPs) and Combinatorial Optimization
Problems.
As we have briefly introduced the speedup methods needed to solve larger
scale of CSPs and COPs. For example, assume within the virtual search tree,
the algorithm is currently at level 2 with state s = [s0 , s1 ]. If there are c
choices for state s1 , and if one choice is testified to be invalid, this will prune
off 1c of the whole search space. In this section, we demonstrate backtracking
search armored with Branch and Prune method solving CSPs and Branch
and Bound solving COPs.
0.3 Solving CSPs
Officially, a constraint satisfaction problem(CSP) consists of a set of n vari-
ables, each denoted as si , i ∈ [0, n − 1]; their respective value domains, each
denoted as di ; and a set of m constraints, each denoted as cj , j ∈ [0, m−1]. A
solution to a CSP is an assignment of values to all the variables such that no
constraint is violated. A binary CSP is one in which each of the constraints
involves at most two variables. A CSP can be represented by a constraint
graph which has a node for each variable and each constraint, and an arc
connecting variable nodes contained in a constraint to the corresponding
constraint node.
0.3. SOLVING CSPS xix
We explain a few strategies from the CSP-solver’s arsenal that can po-
tentially speedup the process:
1. Forward Checking: The essential idea is that when a variable X from
si is instantiated with a value x from its domain di , the domain of each
future uninstantiated variable Y is examined. If a value y is found such
that X = x conflicts with Y = y, then y is temporarily removed from
the domain of Y .
2. Variable Ordering: The order in which variables are considered while
solving a CSP method can have a substantial impact on the search
space. One effective ordering is always select the next variable with
the smallest remaining domain. In a dynamic variable ordering, the
order of variables is determined as the search progresses, and often
goes with forward checking which keeps updating the uninstantiated
variables’ domains. Selecting variable with the minimal domain first
can pinpoint the solution quickly given the fact that the branch is still
early on, and branch pruning at this stage is more rewarding. Another
reasoning is that each step, when we are multiplying di to the cost, we
are adding the least expensive one, making this a greedy approach.
Sudoku (L37)
Figure 8: A Sudoku puzzle and its solution
A Sudoku grid shown in Fig. 8 is a n2 × n2 grid, arranged into n n × n
mini-grids each containing the values 1, ..., n such that no value is repeated
in any row, column (or mini-grid).
Search Space First, we analyze the number of distinct states in the search
space which relies on how we construct the intermediate states and our
knowledge in Enumerative combinatorics. We discuss two different formu-
lations on 9 × 9 grid:
xx COMBINATORIAL SEARCH
1. For each empty cell in the puzzle, we create a set by taking values
1, ..., 9 and removing from it those values that appear as a given in the
same row, column, or mini-grid as that cell. Assume we have m spots
and the corresponding candidate set of each spot is ci , and initial cost
estimation can be obtained which is:
m−1
T (n) = (8)
Y
ci
i=0
2. Each row can be presented by a 9-tuples, there will be 9 rows in to-
tal, resulting 9 9-tuples to represent the search state. With ci as the
number of non-given values in the i-th 9-tuples, there are ci ! ways of
ordering these values by permuting.The number of different states in
the search space is thus:
8
T (n) = ci ! (9)
Y
i=0
The two different ways each takes a different approach to formulate the state
space, making its corresponding backtracking search differs too. We mainly
focus on the first formulation with backtracking search.
Speedups Assume we have known all empty spots(variables) to fill in
and we construct the search tree using backtracking. In our source code, we
did an experiment comparing the effect of ordering variables with minimal
domain first rule with arbitrary ordering. The experiment shows that the
first method is more than 100 times faster than the second solving the our
exemplary Sudoku puzzle. Therefore, we decide to always select the variable
that has the least domain set to proceed next in the backtracking.
Further, we apply forward checking, for the current variable and a value
we are able to assign, we recompute all the remaining empty spots’ domain
sets, and use the updated domain sets to decide:
• If this assigment will lead to empty domain for any of other remaining
spots, and if so, we terminate the search and backtrack.
• The spot to select next time with the ordering rule we choose.
Implementation We set aside three vectors of length 9, row_state, col_state,
and block_state to track the state of all 9 rows, columns, and grids. The
list has set() data structures as items, saving the numbers filled already in
that row, col, and grid respectively. Two stages in the implementation:
1. Initialization: We scan the whole each spot in the 9 × 9 grid to record
the states of the filled spots and to find all empty spots that waiting to
0.3. SOLVING CSPS xxi
be filled in. With (i, j) to denote the position of a spot, it corresponds
to row_state[i], col_state[j], and block_state[i//3][j//3]. We
also write two functions to set and reset state with one assignment in
the backtracking. The Python code is as follows:
1 from copy import deepcopy
2 c l a s s Sudoku ( ) :
3 d e f __init__ ( s e l f , board ) :
4 s e l f . org_board = deepcopy ( board )
5 s e l f . board = deepcopy ( board )
6
7 def i n i t ( s e l f ) :
8 s e l f .A = s e t ( [ i f o r i i n r a n g e ( 1 , 1 0 ) ] )
9 s e l f . row_state = [ s e t ( ) f o r i i n r a n g e ( 9 ) ]
10 s e l f . col_state = [ s e t ( ) f o r i in range (9) ]
11 s e l f . block_state = [ [ s e t ( ) f o r i in range (3) ] f o r i in
range (3) ]
12 self . unfilled = []
13
14 f o r i in range (9) :
15 f o r j in range (9) :
16 c = s e l f . org_board [ i ] [ j ]
17 i f c == 0 :
18 s e l f . u n f i l l e d . append ( ( i , j ) )
19 else :
20 s e l f . row_state [ i ] . add ( c )
21 s e l f . c o l _ s t a t e [ j ] . add ( c )
22 s e l f . b l o c k _ s t a t e [ i / / 3 ] [ j / / 3 ] . add ( c )
23
24 def set_state ( s e l f , i , j , c ) :
25 s e l f . board [ i ] [ j ] = c
26 s e l f . row_state [ i ] . add ( c )
27 s e l f . c o l _ s t a t e [ j ] . add ( c )
28 s e l f . b l o c k _ s t a t e [ i / / 3 ] [ j / / 3 ] . add ( c )
29
30 def reset_state ( s e l f , i , j , c ) :
31 s e l f . board [ i ] [ j ] = 0
32 s e l f . row_state [ i ] . remove ( c )
33 s e l f . c o l _ s t a t e [ j ] . remove ( c )
34 s e l f . b l o c k _ s t a t e [ i / / 3 ] [ j / / 3 ] . remove ( c )
2. Backtracking search with speedups: In the initialization, we have an-
other variable A used as the domain set of the current processing spot.
To get the domain set according to the constraints, a simple set opera-
tion is executed as: A−(row_state[i]|col_state[j]|block_state[i//3][j//3]).
In the solver, each time, to pick a spot, we first update all remaining
spots in the unfilled and then choose the one with minimal domain.
This process takes O(n) which is trivial compared with the cost of the
searching, with 9 for computing domain set of a single spot, 9n for n
spots, and adding another n to 9n to choose the one with the smallest
size. The solver is implemented as:
xxii COMBINATORIAL SEARCH
1 d e f _ret_len ( s e l f , a r g s ) :
2 i , j = args
3 o p t i o n = s e l f .A − ( s e l f . row_state [ i ] | s e l f . c o l _ s t a t e [ j
] | s e l f . b l o c k _ s t a t e [ i //3 ] [ j / / 3 ] )
4 return len ( option )
5
6 def solve ( s e l f ) :
7 i f l e n ( s e l f . u n f i l l e d ) == 0 :
8 r e t u r n True
9 # Dynamic v a r i a b l e s o r d e r i n g
10 i , j = min ( s e l f . u n f i l l e d , key = s e l f . _ret_len )
11 # Forward l o o k i n g
12 o p t i o n = s e l f .A − ( s e l f . row_state [ i ] | s e l f . c o l _ s t a t e [ j
] | s e l f . b l o c k _ s t a t e [ i //3 ] [ j / / 3 ] )
13 i f l e n ( o p t i o n ) == 0 :
14 return False
15 s e l f . u n f i l l e d . remove ( ( i , j ) )
16 for c in option :
17 s e l f . set_state ( i , j , c )
18 i f s e l f . solve () :
19 r e t u r n True
20 # Backtracking
21 else :
22 s e l f . reset_state ( i , j , c )
23 # Backtracking
24 s e l f . u n f i l l e d . append ( ( i , j ) )
25 return False
0.4 Solving Combinatorial Optimization Problems
Combinatorial optimization is an emerging field at the forefront of combi-
natorics and theoretical computer science that aims to use combinatorial
techniques to solve discrete optimization problems. From a combinatorics
perspective, it interprets complicated questions in terms of a fixed set of
objects about which much is already known: sets, graphs, polytopes, and
matroids. From the perspective of computer science, combinatorial opti-
mization seeks to improve algorithms by using mathematical methods either
to reduce the size of the set of possible solutions or to make the search itself
faster.
Genuinely, the inner complexity of a COP is at least of exponential, and
its solutions fall into two classes: exact methods and heuristic methods. In
some cases, we may be able to find efficient exact algorithms with either
greedy algorithms or dynamic programming technique such as finding the
shortest paths on a graph can be solved by the Dijkstra (greedy) or Bellman-
Ford algorithms(dynamic programming) to provide exact optimal solutions
in polynomial running time. For more complex problems, COP can be
mathematically formulated as a Mixed Linear Programming(MILP) model
and which is generally solved using a linear-programming based branch-and-
0.4. SOLVING COMBINATORIAL OPTIMIZATION PROBLEMS xxiii
bound algorithm. But, in other cases no exact algorithms are feasible, and
the following randomized heuristic search algorithms though we do not cover
in this section should be applied:
1. Random-restart hill-climbing.
2. Simulated annealing.
3. Genetic Algorithms.
4. Tabu search.
Model Combinatorial Optimization Problems It is a good prac-
tice to formulate COPs with mathematical equations/inequations, which
includes three steps:
1. Choose the decision variables that typically encode the result we are
interested in, such that in a superset problem, each item is a variable,
and each variable includes two decisions: take or not take, making its
value set as 0, 1.
2. Express the problem constraints in terms of these decision variables
to specify what the feasible solutions of the problem are.
3. Express the objective function to specify the quality of each solution.
There are generally many ways to model a COP.
Branch and Bound Branch and bound (BB, B&B, or BnB) is an al-
gorithm design paradigm for discrete and combinatorial optimization prob-
lems, as well as mathematical optimization. A branch-and-bound algorithm
consists of a systematic enumeration of candidate solutions by means of
state space search: the set of candidate solutions is thought of as forming a
rooted tree with the full set at the root. The algorithm explores branches
of this tree, which represent subsets of the solution set. Before enumerating
the candidate solutions of a branch, the branch is checked against upper
and lower estimated bounds on the optimal solution, and is discarded if it
cannot produce a better solution than the best one found so far by the algo-
rithm. “Branching” is to split problem into a number of subproblems, and
“bounding” is to find an optimistic estimation of the best solution to the
the subproblems to either maximize the upper bound or minimize the lower
bound. To get the optimistic estimation, we have to relax constraints. In this
section, we will exemplify both the minimization(TSP) and maximization
problem(Knapsack).
xxiv COMBINATORIAL SEARCH
Search Strategies In practice, we can apply different search strategies to
enumerate the search space of the problem, such as depth-first, best-first,
and least-discrepancy search. The way of how each listed strategy is applied
in the combinatorial optimization problems is:
• Depth-First: it prunes when a node estimation is worse than the best
found solution.
• Best-First: it selects the node with the best estimation among the
frontier set to expand each time. Worst scenario, the whole search tree
have to be saved as long the best estimation is extremely optimistic
and not a single branch is pruned in the process.
• Least-Discrepancy: it trusts a greedy heuristic, and then move away
from the heuristic in a very systematic fashion.
In this section, we discuss exact algorithms using Branch and Bound
with a variation of search strategies. During the interviews, questions that
have polynomial exact solutions are more likely to appear, proving your mas-
tery of dynamic programming or greedy algorithms design methodologies.
However, it is still good to discuss this option.
0.4.1 Knapsack Problem
Given n items with weight and value indicated by two vectors W and V
respectively. Now, given a knapsack with capacity c, maximize the value of
items selected into the knapsack with the total weight being bounded by c.
Each item can be only used at most once. For example, given the following
data, the optimal solution is to choose item 1 and 3, with total weight of 8,
and optimal value of 80.
c = 10
W = [5 , 8 , 3]
V = [ 4 5 , 48 , 35]
Search Space In this problem, xi denotes each item, and wi , vi for its
corresponding weight and value, with i ∈ [0, n − 1]. Each item can either
be selected or left behind, indicating xi ∈ 0, 1. The selected items can not
exceed the capacity, making n−1 w x ≤ c. And we capture the total value
P
Pn−1 i=0 i i
of the selected items as i=0 vi xi . Putting it all together:
n−1
max (10)
X
vi xi
v,x
i=0
n−1
s.t. (11)
X
wi x i ≤ c
i=0
xi ∈ 0, 1 (12)
0.4. SOLVING COMBINATORIAL OPTIMIZATION PROBLEMS xxv
With each variable having two choices, our search space is as large as 2n .
Branch and Bound To bound the search, we have to develop a heuristic
function to estimate an optimistic–maximum–total value a branch can lead
to.
In the case of knapsack problem, the simplest estimation is summing up
the total values of selected items so far, and estimate the maximum value
by adding the accumulated values of all remaining unselected items along
the search.
A tighter heuristic function can be obtained with constraint relax-
ation. By relaxing the condition of simply choose {0, 1} to [0, 1], that a
fraction of an item can be chosen at any time. By sorting the items by
the value per unit wvii , then a better estimate can be obtained by filling
the remaining capacity of knapsack with unselected items, with larger unit
value first be considered. A branch is checked on the optimal solution so far
against the lower estimated bound in our case, and is discarded if it cannot
produce a better solution than the best one found so far by the algorithm.
Both heuristic functions are more optimistic compared with the true value,
but the later is a tighter bound, being able to prune more branches along
the search and making it more time efficient. We demonstrate branch and
bound with two different search strategies: DFS(backtracking) and Best-
First search.
Depth-First Branch and Bound
Figure 9: Depth-First Branch and bound
xxvi COMBINATORIAL SEARCH
We set up a class BranchandBound to implement this algorithm. First,
in the initiation, we add additional wvii to mark each item’s value per unit,
and sort these items by this value in decreasing order. Second, we have a
function estimate which takes three parameters: idx as start index of the
remaining items, curval is the total value based on all previous decision,
and left_cap as the left capacity of the knapsack. The code snippet is:
1 import heapq
2
3 c l a s s BranchandBound :
4 d e f __init__ ( s e l f , c , v , w) :
5 s e l f . best = 0
6 self .c = c
7 s e l f . n = len (v)
8 s e l f . i t e m s = [ ( v i / wi , wi , v i ) f o r _, ( v i , wi ) i n enumerate (
z i p ( v , w) ) ]
9 s e l f . i t e m s . s o r t ( key=lambda x : x [ 0 ] , r e v e r s e=True )
10
11 d e f e s t i m a t e ( s e l f , idx , c u r v a l , l e f t _ c a p ) :
12 est = curval
13 # u s e t h e v/w t o e s t i m a t e
14 f o r i i n r a n g e ( idx , s e l f . n ) :
15 r a t i o , wi , _ = s e l f . i t e m s [ i ]
16 i f l e f t _ c a p − wi >= 0 : # u s e a l l
17 e s t += r a t i o ∗ wi
18 l e f t _ c a p −= wi
19 e l s e : # use part
20 e s t += r a t i o ∗ ( l e f t _ c a p )
21 left_cap = 0
22 return est
In the Depth-first search process, it consists of two main branches: one
considering to choose the current item, and the other to handle the case
while the item is not selected. For the first branch, it has to be bounded
by the capacity and the comparison of the best found solution against to
the estimation. Additional status is to assist to visualize the process of
the search, which tracks the combination of items. The process is shown in
Fig. 9. And the code is as:
1 d e f d f s ( s e l f , idx , e s t , v a l , l e f t _ c a p , s t a t u s ) :
2 i f i d x == s e l f . n :
3 s e l f . b e s t = max( s e l f . b e s t , v a l )
4 return
5 p r i n t ( status , val , left_cap , e s t )
6
7 _, wi , v i = s e l f . i t e m s [ i d x ]
8 # Case 1 : c h o o s e t h e item
9 i f l e f t _ c a p − wi >= 0 : # prune by c o n s t r a i n t
10 # Bound by e s t i m a t e , i n c r e a s e v a l u e and volume
11 i f est > s e l f . best :
12 s t a t u s . append ( True )
13 n e s t = s e l f . e s t i m a t e ( i d x +1, v a l+v i , l e f t _ c a p − wi )
14 s e l f . d f s ( i d x +1, n e s t , v a l+v i , l e f t _ c a p − wi , s t a t u s )
0.4. SOLVING COMBINATORIAL OPTIMIZATION PROBLEMS xxvii
15 s t a t u s . pop ( )
16
17 # Case 2 : not c h o o s e t h e item
18 i f est > s e l f . best :
19 s t a t u s . append ( F a l s e )
20 n e s t = s e l f . e s t i m a t e ( i d x +1 , v a l , l e f t _ c a p )
21 s e l f . d f s ( i d x +1, n e s t , v a l , l e f t _ c a p , s t a t u s )
22 s t a t u s . pop ( )
23 return
Best-First Branch and Bound
Within Best-First search, we use priority queue with the estimated value,
and each time the one with the largest estimated value within the frontier
set is expanded first. Similarly, with branch and bound, we prune branch
that has estimated value that would never surpass the best solution up till
then. The search space is the same as in Fig. 9 except that the search process
is different from depth-first. In the implementation, the priority queue is
implemented with a min-heap where the minimum value is firstly popped
out, thus we use the negative estimated value to make it always pop out the
largest value conveniently instead of write code to implement a max-heap.
1 def bfs ( s e l f ) :
2 # t r a c k v a l , cap , and i d x i s which item t o add next
3 q = [( − s e l f . e s t i m a t e ( 0 , 0 , s e l f . c ) , 0 , s e l f . c , 0 ) ] #
estimate , val , left_cap , idx
4 s e l f . best = 0
5 while q :
6 e s t , v a l , l e f t _ c a p , i d x = heapq . heappop ( q )
7 e s t = −e s t
8 _, wi , v i = s e l f . i t e m s [ i d x ]
9 i f i d x == s e l f . n − 1 :
10 s e l f . b e s t = max( s e l f . b e s t , v a l )
11 continue
12
13 # Case 1 : c h o o s e t h e item
14 n e s t = s e l f . e s t i m a t e ( i d x + 1 , v a l + v i , l e f t _ c a p − wi )
15 i f nest > s e l f . best :
16 heapq . heappush ( q , (− n e s t , v a l + v i , l e f t _ c a p − wi , i d x
+ 1) )
17
18 # Case 2 : not c h o o s e t h e item
19 nest = s e l f . estimate ( idx + 1 , val , left_cap )
20 i f nest > s e l f . best :
21 heapq . heappush ( q , (− n e s t , v a l , l e f t _ c a p , i d x + 1 ) )
22 return
0.4.2 Travelling Salesman Problem
Given a set of cities and the distances between every pair, find the shortest
possible path that visits every city exactly once and returns to the origin
xxviii COMBINATORIAL SEARCH
Figure 10: A complete undirected weighted graph.
city. For example, with the graph shown in Fig. 10, such shortest path is
[0, 1, 3, 2, 0] with a path weight 80.
Search Space In TSP, a possible complete solution is a Hamiltonian cycle,
a graph cycle that visits each vertex exactly once. Since it is a cycle, it does
not matter where it starts. For convenience, we choose vertex 0 as the
origin city. Therefore, in our example, our path starts and ends at 0, and
the remaining n − 1 vertices between will be a permutation of these vertices,
making the complexity as (n − 1)!.
Because this is a complete graph, it might be tempting to apply back-
tracking on the graph to enumerate all possible paths and find and check
possible solutions. However, this path searching will build a n − 1-ary search
(n−1)n −1
tree with height equals to n − 1, making the complexity as n−2 , which
is larger than the space of permutation among n − 1 items. Therefore, in
our implementation, we apply backtracking to enumerate all permutations
of n − 1 vertices and check its corresponding cost.
Speedups Since we only care about the minimum cost, then any partial
result that has cost larger than the minimum cost of all known complete
solutions can be prunned. This is the Branch and bound method that we
have introduced that is often used in the combinatorial optimization.
Implementation We built the graph as a list of dictionaries, each dictio-
nary stores the indexed vertex’s other cites and its corresponding distance
as key and value respectively. Compared with standard permutation with
backtracking, we add four additional variables: start to track the starting
vertex, g to pass the graph to refer the distance information, mincost to
save the minimum complete solution so far found, and cost to track the
current partial path’s cost. The code is shown as:
1 d e f t s p ( a , d , used , c u r r , ans , s t a r t , g , mincost , c o s t ) :
2 i f d == l e n ( a ) :
0.5. EXERCISES xxix
3 # Add t h e c o s t from l a s t v e r t e x t o t h e s t a r t
4 c = g [ curr [ −1]][ start ]
5 c o s t += c
6 i f c o s t < mincost [ 0 ] :
7 mincost [ 0 ] = c o s t
8 ans [ 0 ] = c u r r [ : : ] + [ s t a r t ]
9 return
10
11 for i in a :
12 i f not used [ i ] and c o s t + g [ c u r r [ − 1 ] ] [ i ] < m i n c o s t [ 0 ] :
13 c o s t += g [ c u r r [ − 1 ] ] [ i ]
14 c u r r . append ( i )
15 used [ i ] = True
16 t s p ( a , d + 1 , used , c u r r , ans , s t a r t , g , mincost , c o s t )
17 c u r r . pop ( )
18 c o s t −= g [ c u r r [ − 1 ] ] [ i ]
19 used [ i ] = F a l s e
20 return
TSP is a NP-hard problem, and there is no known polynomial time solution
so far.
Other Solutions
Whenever we are faced with optimization, we are able to consider the other
two algorithm design paradigm–Dynamic Programming and Greedy Algo-
rithms. In fact, the above two problems both have its corresponding dynamic
programming solutions: for knapsack problem, polynomial solution is pos-
sible; for TSP, though it is still of exponential time complexity, it is much
better than O(n!). We will further discuss these two problems in Chapter
Dynamic Programming.
0.5 Exercises
1. 77. Combinations
2. 17. Letter Combinations of a Phone Number
3. 797. All Paths From Source to Target
4. N-bit String: enumerate all n-bit strings with backtracking algorithm,
for example:
n = 3 , a l l 3− b i t s t r i n g s a r e :
0 0 0 , 0 0 1 , 0 1 0 , 0 1 1 , 1 0 0 , 1 0 1 , 1 1 0 , 111
5. 940. Distinct Subsequences II
6. N-queen
xxx COMBINATORIAL SEARCH
7. Map-coloring
8. 943. Find the Shortest Superstring (hard). Can be moduled as trav-
eling salesman problem and dynamic programming