EC3051E: SEARCH
Application: route finding
❑ Objective: shortest? fastest? most
scenic?
❑ Actions: go straight, turn left, turn
right
Real-world examples
❑Traveling salesman problem
❑Robot motion planning
❑Route-finding problems (Google maps)
❑VLSI layout: position million of components and connections on a chip
to minimize area, shorten delays.
❑Protein folding: sequence of amino acids that will fold into a 3D protein
with the right properties to cure some disease.
➢AlphaFold2
Search
❑Agents work towards a goal.
❑Agent function: Action or series of action that lead to the goal.
❑Search through possible solutions.
Key: need to consider future consequences of an action!
Example
Agent in Arad. Goal: Reach Bucharest
Problem Formulation
❑Initial State: State in which agent starts
➢Arad
❑State space: All states reachable from initial state by any sequence of
actions.
❑Actions: Given a particular state s, ACTIONS(s) returns the set of actions
that can be executed in state s
❑Transition model: Result(s, a) returns the state that results from doing
action a in state s
➢ Result(Arad, Go(Zerind) ) = Zerind
❑Goal Test: Determine if the state is a goal test
❑Path cost (Performance measure): C(s, a, s’) Cost of taking action a in state
s and reaching s’
Trees: Data structure used for search
Search Tree
❑Root: Initial state
❑Branches: Actions
❑Nodes: resulting from actions
❑Expand: A function that given a node, creates all children nodes
General Tree search
➢Frontier
◦ Set of leaf nodes available
for expansion at given point
General Tree Search: Example
How to handle repeated states?
Graph Search
❑Augment the Tree search
with Explored set
➢Set which “remembers”
every expanded node
Search Strategy
❑A strategy is defined by picking the order of node expansion
❑Performance measures:
➢Completeness: Does it always find a solution if one exists?
➢Time complexity: How long does it take to find a solution?
➢Space complexity: How much memory is needed to perform the search?
➢Optimality: Does the strategy find the optimal solution.
More on performance measures
❑b: Maximum number of
successor nodes of any node
❑d: Depth of the shallowest
goal node
❑m: maximum length of any
path in the state space
Breadth first search (BFS)
❑Explores every node at a given depth before next depth of nodes
❑Complete: Yes (If b is finite)
❑Time complexity: Let each node have b successors and d be the solution depth
➢𝑏 + 𝑏 2 + 𝑏 3 + ⋯ + 𝑏 𝑑 = Ο(𝑏𝑑 )
❑Space complexity: Ο 𝑏 𝑑−1 in the explored set and Ο 𝑏 𝑑 in the frontier
Tutorial: Asymptotic order of growth - Asymptotic upper bound — big-Oh O( )
𝑇(𝑛)
Analytical definition. T(n) is O( f (n)) if lim sup <∞
𝑛→∞ 𝑓(𝑛)
BFS: Pseudo-code
❑Implemented using a FIFO
queue for the frontier
How bad is BFS?
❑Exponential time complexity
b=10; 1 million nodes per second; 1,000 bytes per node.
Depth first search (DFS)
DFS: Pseudo-code
❑Implemented using
LIFO
DFS: Performance Measures
❑Complete: Yes (Finite state space)
❑Time complexity: 𝑏 + 𝑏 2 + 𝑏 3 + ⋯ + 𝑏 𝑚 = Ο(𝑏𝑚 )
➢Notice we have m instead of d
➢bad if m is much larger than d
❑Space complexity: Storage of only Ο(𝑏𝑚)!
Space complexity: DFS
❑Ο(𝑏𝑚)
❑Backtracking DFS:
Generate only one successor
at the frontier
➢Ο(𝑚)
Variants of DFS: Depth limited search (DLS)
❑Limit m to a predetermined length 𝑙
❑Completeness: No if 𝑙 < 𝑑
❑Useful if 𝑙 can be determined from prior knowledge
❑Eg. In the Romania example, there are 20 cities, so 𝑙 = 19
❑Time complexity: Ο(𝑏 𝑙 )
❑Space complexity: Ο(𝑏𝑙)
Iterative deepening DFS (IDS)
❑Idea: Iteratively increase the search limit until the depth of the shallowest
solution d is reached.
❑Applies DLS with increasing limits.
❑The algorithm will stop if a solution is found or if DLS returns a failure
Tradeoff in IDS
❑Space complexity: Ο(𝑏𝑑) (Same as DFS)
❑Time complexity
➢Bottom level (at depth d) are generated once
➢Above bottom level twice
➢1 𝑏 𝑑 + 2 𝑏 𝑑−1 + … + 𝑑 𝑏 = Ο(𝑏𝑑 ) (Same as BFS)
Uniform cost search
❑Cost of search graph has not been considered before.
❑Uniform cost: BFS will find optimal solution
❑Idea: Modify BFS to use path cost (past cost + link cost)
❑Implementation: Store the frontier as a priority queue
❑ Assumptions
➢edge costs are always positive
➢all edge costs are >= b, where b is some minimum non-zero cost.
Pseudo-code
Example
❑Steps
1. Choose between Fagras (99)and
Rimnicu Vilcea (80)
2. Expand Rimnicu
3. Pathcost(Pitesti) = 80+97= 177
4. Least cost node = Fagaras
5. Expand Fagaras
6. Pathcost (Bucharest) = 310
7. Least cost node = Pitesti
8. Expand Pitesti
9. Pathcost(Bucharest) = 177 + 101 =
278
Performance measure
❑Complete: Yes
❑Optimal: Yes (See next slide)
❑Time:
➢Suppose 𝐶 ∗ : cost of the optimal solution
➢Every action costs at least ε
𝐶∗
➢The effective depth is roughly
ϵ
𝐶∗
➢Ο(𝑏 ) (Could be much greater than BFS)
ϵ
𝐶∗
❑Space: All nodes until we find optimal sol: Ο(𝑏 )
ϵ
Optimality of UCS
❑Theorem: Once UCS expands a node s, it has found the optimal path
from the start node to s
❑Proof: Inductive argument
Informed search
❑Problem specific information (domain knowledge).
❑Use a heuristic to guide the search methodology.
Greedy strategy
❑Assumption: There exists a heuristic function h(n)
❑h(n) is an “estimate” of the cost from state n to the goal
❑Eg. Route-finding problem in Romania
➢ℎ𝑆𝐿𝐷 (𝑛): Straight line distance from any location to Bucharest
❑Greedy search: Expand the node that “appears” to be closest to the goal.
Greedy search: Pseudo-code
❑Implementation: Similar to
UCS, except we use
heuristic instead of path
cost.
Example
Performance measure
❑Complete: No
❑Time and space complexity: Ο(𝑏 𝑚 )
A* search
❑Heuristic f(n) = g(n) + h(n)
➢g(n): path cost from start to node n
➢h(n): estimated cost of cheapest solution from n
❑Advantage: Complete and optimal (under certain conditions)
Example
Example (cont)
Admissible Heuristics
❑An admissible heuristic never overestimates the cost to reach the goal,
that is it is optimistic
❑A heuristic is admissible if
❑ℎ∗ (𝑛): true cost to reach the goal from node n
❑ℎ𝑆𝐿𝐷 (𝑛) Straight line distance heuristic is admissible
➢Shortest distance between two points
Consistent heuristic
❑A heuristic is consistent if for every node n and every successor node n’
❑General Triangle Inequality
❑Consistent => Admissible
Optimality of A*
❑if h(n) is consistent, then the values of f(n) along any path are nondecreasing.
❑whenever A∗ selects a node n for expansion, the optimal path to that node has
been found.
➢Proof by contradiction: there would have to be another frontier node n′ on
the optimal path from the start node to n
❑Hence, the first goal node selected for expansion must be an optimal solution
because f is the true cost for goal nodes (which have h=0) and all later goal
nodes will be at least as expensive.
Performance measure
❑Complete: Yes (finite)
❑Time: exponential
❑Space: keeps every node in memory, the biggest problem
❑Optimal: Yes!
Exercise
Other exercises: Ex 3.3, 3.9 (Russell and Norvig)
Thank you