Lecture 03: Heuristic Search on Trees

TBA

Contents

1 Overview
2 Search Problem
3 Best-First Search
4 Best-g Search
5 Best-h Search
6 A* Search
  6.1 Admissibility
  6.2 Optimality
  6.3 Completeness
7 Examples
8 Summary

1 Overview

In this lecture, we expand our notion of basic search problems to incorporate costs. We generalize the notion of optimality discussed in the previous lecture from depth (i.e., all edges are of cost 1) to edges with arbitrary costs. In this extended framework, we introduce heuristic functions, which estimate the cost of reaching a goal node from a state in the search space. Based on these heuristic functions, we present several examples of heuristic search algorithms: best-first search, including greedy search and A* search, and iterative deepening A* (IDA*) search. Note that unlike blind search algorithms, which are equally applicable to any basic search problem, heuristic search algorithms make use of domain-specific knowledge.

2 Search Problem

A search problem is a 5-tuple $\langle X, S, G, \delta, c \rangle$, where

  • $\langle X, S, G, \delta \rangle$ is a basic search problem
  • $c : X \times X \to \mathbb{R}$ is a cost function

For all $y \in \delta(x)$, $c(x, y)$ denotes the cost of reaching $y$ from $x$. Now given a path $\{n_0, \ldots, n_i, n_{i+1}, \ldots, n_{k+1}\}$, where $n_0 \in S$, $n_{k+1} = n$, and $n_{i+1} \in \delta(n_i)$ for all $0 \le i \le k$, we write $g(n)$ to denote the cost of reaching node $n$:

$$g(n) = \sum_{i=0}^{k} c(n_i, n_{i+1}) \tag{1}$$

Examples of cost functions include: $g(n) = \mathrm{depth}(n)$ and $g(n) = \mathrm{distance}(n)$.
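As a minimal sketch of Equation 1, the path cost can be computed by summing edge costs over consecutive pairs of nodes. The function name `path_cost`, the unit-cost function, and the node labels are illustrative assumptions, not part of the lecture.

```python
def path_cost(path, c):
    """Sum the edge costs c(n_i, n_{i+1}) along a path, as in Equation 1."""
    return sum(c(a, b) for a, b in zip(path, path[1:]))

# Hypothetical unit-cost function: every edge costs 1, so g(n) = depth(n).
def unit_cost(x, y):
    return 1

# Hypothetical path from a start node "S" to a node "n" at depth 3.
print(path_cost(["S", "a", "b", "n"], unit_cost))  # prints 3
```

With unit costs, the result equals the depth of the final node, which is the first example of a cost function given above.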

3 Best-First Search

The main idea of the best-first search class of algorithms is to expand the lowest-cost node on the fringe, according to some evaluation function $e : X \to \mathbb{R}$.

Best-First($X, S, G, \delta, c, e$)

Inputs: search problem; evaluation function $e$
Output: (path to) goal node
Initialize: $O = S$, the priority queue of open nodes

while ($O$ is not empty) do

  1. delete node $n \in O$ s.t. $e(n)$ is minimal
  2. if $n \in G$, return (path to) $n$
  3. for all $m \in \delta(n)$
    1. compute $e(m)$
    2. insert $m$ into $O$ with priority $e(m)$

fail

Table 1: Best-First Search. Best-g search is the special case of best-first search in which $e = g$. Best-h search is the special case of best-first search in which $e = h$. A* search is the special case of best-first search in which $e = f = g + h$.
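The procedure in Table 1 can be sketched in Python with a binary heap as the priority queue. This is an illustrative sketch under stated assumptions, not the lecture's reference implementation: the evaluation function here takes the path to a node rather than the node itself (on a tree the two are interchangeable), a counter breaks ties in insertion order, and the tree and node names are hypothetical.

```python
import heapq
from itertools import count

def best_first(starts, is_goal, successors, e):
    """Generic best-first search on a tree.

    starts:     iterable of start nodes (the set S)
    is_goal:    predicate for membership in G
    successors: function mapping a node to its successors (delta)
    e:          evaluation function on paths (tuples of nodes); lower is better
    """
    tie = count()  # insertion counter, so the heap never compares paths
    open_queue = [(e((s,)), next(tie), (s,)) for s in starts]
    heapq.heapify(open_queue)
    while open_queue:
        _, _, path = heapq.heappop(open_queue)  # entry with minimal e
        n = path[-1]
        if is_goal(n):
            return path
        for m in successors(n):
            new_path = path + (m,)
            heapq.heappush(open_queue, (e(new_path), next(tie), new_path))
    return None  # fail

# Hypothetical tree; G is the only goal node.
tree = {"A": ["B", "C"], "B": ["D"], "C": ["G"], "D": [], "G": []}
depth = lambda path: len(path) - 1  # e = depth gives breadth-first order
result = best_first(["A"], lambda n: n == "G", lambda n: tree[n], depth)
print(result)  # ('A', 'C', 'G')
```

Swapping in a different `e` (path cost, a heuristic, or their sum) yields the special cases named in the caption of Table 1.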

BFS is the special case of best-first search in which the evaluation function $e(n) = \mathrm{depth}(n)$ for node $n$. For this choice of evaluation function, the worst-case complexity of best-first search is that of BFS: exponential in the depth of the goal for both time and space.

Best-first search can also visit nodes in depth-first search order: e.g., let $e(n) = 0$ for all $n \in X$, with ties broken in favor of the most recently inserted node. For this choice of evaluation function, best-first search is neither complete nor optimal.

4 Best-g Search

The main idea of best-g search is to expand the lowest-cost node on the fringe, according to the cost function $g$ defined in Equation 1. (See Figure 3.)

Best-g is complete, except in search spaces that contain infinitely many nodes $n$ with $g(n) < g^*$ (e.g., an infinite path with finite cost), where $g^*$ is the optimal cost.

Best-g is also optimal: that is, it is guaranteed to find the lowest-cost goal, whenever $g$ is a monotonically nondecreasing function of depth: i.e., for all $n \in X$, for all $m \in \delta(n)$, $g(m) \ge g(n)$. Monotonicity is violated whenever $c(n, m) < 0$ for some node $n$ and its successor $m$.

Figure 1 depicts two search trees. In both spaces, S is the start state, Y and Z are goal nodes, and Z is optimal. The search tree on the LHS contains an infinite path of finite cost. Best-g search never reaches either goal node. The search tree on the RHS contains an edge of negative cost. Best-g search proceeds directly to the suboptimal goal node Y.

Figure 1: (LHS) A search space that contains an infinite path of finite cost. (RHS) A search space that contains an edge of negative cost. In both search spaces, S is the start state, Y and Z are goal nodes, and Z is optimal.
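The effect of a negative edge can be reproduced on a small example. The tree below is hypothetical (it is not the tree drawn in Figure 1, whose edge costs are not given in the text); it only shares the property that the optimal goal Z lies behind a negative-cost edge.

```python
import heapq

# Hypothetical tree: each node maps to a list of (child, edge cost) pairs.
edges = {"S": [("Y", 2), ("a", 3)], "a": [("Z", -2)], "Y": [], "Z": []}
goals = {"Y", "Z"}

def best_g(start):
    """Best-first search with e = g; returns (goal, cost) or None."""
    open_queue = [(0, start)]  # entries are (g(n), n)
    while open_queue:
        g, n = heapq.heappop(open_queue)
        if n in goals:
            return n, g
        for m, cost in edges[n]:
            heapq.heappush(open_queue, (g + cost, m))
    return None

print(best_g("S"))  # ('Y', 2), although Z is reachable at cost 3 - 2 = 1
```

Here g is not monotone along the path S, a, Z, so the node a looks worse than the goal Y at the moment Y is removed from the queue.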

5 Best-h Search

The main idea of best-h search is to expand the lowest-cost node on the fringe, according to some domain-specific heuristic function h. Heuristics are used to guide the search process. The degree of optimality of best-h search depends on the quality of the heuristic, as do any completeness guarantees. (See Figure 4.)

A heuristic function $h : X \to \mathbb{R}$ computes an estimate of the distance from node $n$ to a goal node. In the sliding tiles puzzle, one heuristic function $h_1(n)$ is simply the number of misplaced tiles. A second heuristic function $h_2(n)$ is the Manhattan distance: i.e., the number of moves required to place each tile correctly, summed over all misplaced tiles.

    Start state      Goal state
    1 3 5            1 2 3
    7 2 4            4 5 6
    6 8 _            7 8 _

Figure 2: (LHS) Start State. (RHS) Goal State. The blank cell is shown as _. $h_1(n) = 6$ and $h_2(n) = 10$.

Figure 2 depicts an arbitrary state n and the goal of the 8-puzzle—the sliding tiles puzzle with 8 tiles. In this state n, there are 6 misplaced tiles, and the Manhattan distance evaluates to 10 (h2(3)=1, h2(5)=2, h2(7)=1, h2(2)=1, h2(4)=2, h2(6)=3).
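The two heuristic values quoted for Figure 2 can be checked with a short sketch. The state encoding (a 9-tuple read row by row, with 0 for the blank) is an assumption, as is the placement of the blank in the bottom-right cell of both states; that placement is the one consistent with the per-tile distances listed above.

```python
# States are 9-tuples read row by row; 0 stands for the blank.
START = (1, 3, 5,
         7, 2, 4,
         6, 8, 0)
GOAL = (1, 2, 3,
        4, 5, 6,
        7, 8, 0)

def h1(state, goal=GOAL):
    """Number of misplaced tiles (the blank is not counted)."""
    return sum(1 for s, t in zip(state, goal) if s != 0 and s != t)

def h2(state, goal=GOAL):
    """Sum over tiles of the Manhattan distance to the tile's goal cell."""
    total = 0
    for idx, tile in enumerate(state):
        if tile == 0:
            continue
        goal_idx = goal.index(tile)
        total += abs(idx // 3 - goal_idx // 3) + abs(idx % 3 - goal_idx % 3)
    return total

print(h1(START), h2(START))  # 6 10, matching Figure 2
```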

Exercise: Give other examples of heuristics for the sliding tiles puzzle.

6 A* Search

Let $f(n) = g(n) + h(n)$, where $g(n)$ is the cost of reaching node $n$ from the start state and $h(n)$ is a heuristic estimate of the distance from node $n$ to the nearest goal node. The main idea of A* search is to expand the lowest-cost node on the fringe, according to the evaluation function $f$. (See Figure 5.) Like best-g and best-h searches, A* is a special case of the best-first search algorithm. Nonetheless, we present the A* algorithm in its entirety in Table 2.

A*($X, S, G, \delta, c, h$)

Inputs: search problem; heuristic function $h$
Output: (path to) optimal goal node
Initialize: $O = S$, the priority queue of open nodes

while ($O$ is not empty) do

  1. delete node $n \in O$ s.t. $f(n)$ is minimal
  2. if $n \in G$, return (path to) $n$
  3. for all $m \in \delta(n)$
    1. compute $h(m)$
    2. $g(m) = g(n) + c(n, m)$
    3. $f(m) = g(m) + h(m)$
    4. insert $m$ into $O$ with priority $f(m)$

fail

Table 2: A* Search.
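A minimal Python sketch of the procedure in Table 2 follows. It assumes a single start node, a tree-shaped search space, and insertion-order tie-breaking; the example tree, its edge costs, and the heuristic values are hypothetical and chosen so that the heuristic is admissible.

```python
import heapq
from itertools import count

def a_star(start, is_goal, successors, cost, h):
    """A* on a tree: best-first search with f = g + h.

    Returns (path, g) for the first goal removed from the queue, or None.
    """
    tie = count()  # insertion counter, so the heap never compares paths
    open_queue = [(h(start), next(tie), 0, (start,))]  # (f, tie, g, path)
    while open_queue:
        _, _, g, path = heapq.heappop(open_queue)
        n = path[-1]
        if is_goal(n):
            return path, g
        for m in successors(n):
            g_m = g + cost(n, m)  # g(m) = g(n) + c(n, m)
            heapq.heappush(open_queue, (g_m + h(m), next(tie), g_m, path + (m,)))
    return None  # fail

# Hypothetical tree: Y is a goal at cost 6, Z is the optimal goal at cost 5.
edges = {"S": {"a": 1, "b": 4}, "a": {"Y": 5}, "b": {"Z": 1}, "Y": {}, "Z": {}}
h_values = {"S": 4, "a": 5, "b": 1, "Y": 0, "Z": 0}  # never overestimates

result = a_star("S",
                lambda n: n in {"Y", "Z"},
                lambda n: edges[n],
                lambda n, m: edges[n][m],
                h_values.get)
print(result)  # (('S', 'b', 'Z'), 5)
```

Although the edge into a is cheaper than the edge into b, the heuristic makes f(a) = 6 exceed f(b) = 5, so the search expands b first and reaches Z without ever expanding a.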

6.1 Admissibility

Let $h^*(n)$ be the true cost from node $n$ to the nearest goal node. A heuristic function $h(n)$ is said to be admissible iff $h(n) \le h^*(n)$, for all nodes $n$. In other words, admissible heuristics are optimistic: in minimization problems, admissible heuristics never overestimate the distance to a goal; in maximization problems, admissible heuristics never underestimate the value of a goal.

The sample heuristics h1 and h2 in the sliding tiles puzzle are both admissible. The heuristic function h1 is admissible since it requires at least one move to move each misplaced tile to its correct position. The heuristic function h2 is admissible since, more accurately, it requires at least the Manhattan distance to move each misplaced tile to its correct position.

The most useful admissible heuristics are those which most closely approximate $h^*(n)$ without going over. An admissible heuristic $h$ dominates an alternative admissible heuristic $h'$ iff $h(n) \ge h'(n)$ for all nodes $n$. Intuitively, a dominant heuristic is more informed than the heuristic it dominates. For example, the Manhattan distance $h_2$ dominates $h_1$.

Given two admissible heuristics $h'$ and $h''$, it need not be the case that one dominates the other. In this case, we can construct a composite heuristic of the form $h(n) = \max\{h'(n), h''(n)\}$. The new heuristic $h$ is admissible and it dominates the individual heuristics $h'$ and $h''$.

Exercise: Prove this claim.
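The composite construction is a one-liner in code. In the sketch below, the helper name `compose` and the two heuristics (given as lookup tables over two hypothetical nodes p and q) are illustrative assumptions; neither table dominates the other.

```python
def compose(*heuristics):
    """Return the pointwise maximum of the given heuristic functions."""
    return lambda n: max(h(n) for h in heuristics)

# Hypothetical heuristic values: h_a is larger at p, h_b is larger at q.
h_a = {"p": 3, "q": 1}
h_b = {"p": 2, "q": 4}

h = compose(h_a.get, h_b.get)
print(h("p"), h("q"))  # 3 4
```

At every node the composite takes whichever estimate is larger, so it is at least as large as each individual heuristic.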

One “heuristic” for constructing admissible heuristics is to remove one or more of the problem’s constraints. In the sliding tiles puzzle, moves are constrained in three ways: a tile can only be moved into the blank space; a tile must be moved along the grid; and a tile can only be moved into an adjacent cell. If we relax only the first constraint, this yields the Manhattan distance ($h_2$). If we relax the first and the second constraints, this yields another heuristic function, Euclidean distance, which we call $h'$. If we relax all three constraints, this yields the heuristic function $h_1$. Clearly, $h_2$ dominates $h'$, since $h_2$ enforces more constraints than $h'$; and $h'$ dominates $h_1$, since $h'$ enforces more constraints than $h_1$.
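The domination chain among the three relaxations can be checked numerically on the start state of Figure 2. This sketch assumes the same state encoding as before (a 9-tuple read row by row, 0 for the blank, with the blank assumed to sit in the bottom-right cell); the helper name `offsets` is hypothetical.

```python
import math

START = (1, 3, 5, 7, 2, 4, 6, 8, 0)  # Figure 2 start state; blank position assumed
GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)

def offsets(state, goal=GOAL):
    """Yield (row offset, column offset) between each tile and its goal cell."""
    for idx, tile in enumerate(state):
        if tile != 0:
            j = goal.index(tile)
            yield abs(idx // 3 - j // 3), abs(idx % 3 - j % 3)

def h1(state):
    """All three constraints relaxed: number of misplaced tiles."""
    return sum(1 for dr, dc in offsets(state) if (dr, dc) != (0, 0))

def h_euclid(state):
    """First two constraints relaxed: summed straight-line distances."""
    return sum(math.hypot(dr, dc) for dr, dc in offsets(state))

def h2(state):
    """Only the first constraint relaxed: summed Manhattan distances."""
    return sum(dr + dc for dr, dc in offsets(state))

print(h2(START), round(h_euclid(START), 3), h1(START))
```

On this state the three values are ordered as the text predicts, with the Euclidean estimate falling strictly between the other two.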

6.2 Optimality

We now prove that A* search is optimal, assuming the heuristic function $h$ is nonnegative and admissible.

Definition: For some $\epsilon \ge 0$, a heuristic function $h$ is said to be $\epsilon$-admissible iff $h(n) \le h^*(n) + \epsilon$, for all nodes $n$.

Theorem: If $h$ is nonnegative and $\epsilon$-admissible, then A* search is $\epsilon$-optimal: i.e., if A* returns goal node $m$, then $g(m) \le g(n^*) + \epsilon$, where $n^*$ is an optimal goal node.

Proof: Suppose A* returns goal node $m$ before it returns optimal goal node $n^*$. It follows that there exists node $n$ (possibly $n^*$ itself) on the priority queue that is also on the path to $n^*$ s.t. $f(m) \le f(n)$. But then $m$ is $\epsilon$-optimal (i.e., $g(m) \le g(n^*) + \epsilon$), by the following reasoning:

$$
\begin{aligned}
g(m) &\le g(m) + h(m) && \text{since $h$ is nonnegative} \\
     &= f(m) && \text{by definition} \\
     &\le f(n) && \text{by assumption} \\
     &= g(n) + h(n) && \text{by definition} \\
     &\le g(n) + h^*(n) + \epsilon && \text{by $\epsilon$-admissibility} \\
     &= g(n^*) + \epsilon && \text{distance to optimal goal}
\end{aligned}
$$

Corollary: If the heuristic function $h$ is nonnegative and admissible, then A* search is optimal.

Proof: Let $\epsilon = 0$.

As long as we assume that all costs are nonnegative (i.e., $c(n, m) \ge 0$, for all nodes $n$ and their successors $m$), then any reasonable heuristic $h$ will also be nonnegative. Why admit negative heuristic values if costs themselves are never negative?

Note also that nothing in the algorithm’s specification precludes running A* search with an inadmissible heuristic. Indeed, it is often implemented in this way in the interest of time, but then it is not guaranteed to return an optimal goal.

6.3 Completeness

A* search is complete in search spaces that do not contain infinitely many nodes $n$ with $f(n) < f^*$ (e.g., an infinite path of finite value), where $f^*$ is the $f$-cost of an optimal goal.

7 Examples

Figure 3: Sample search tree, labeled with costs g. Boxes indicate goal nodes. Best-g returns the optimal goal node G.

Best-g Search The tree shown in Figure 3 has cost function g(n) = depth(n). Best-g on this search space is precisely BFS: it finds the optimal goal node G. Nodes are expanded as follows: A0, B1C1D1, C1D1E2F2, D1E2F2G2, E2F2G2, F2G2, G2H3I3, goal!

Best-h Search The tree depicted in Figure 4 has cost function $h(n)$. Best-h search returns the suboptimal goal node H in this example. The priority queue is maintained as follows: A0, B1C1D1, E2F2C1D1, F2C1D1, H3I3C1D1, goal!

A* and IDA* Search The tree depicted in Figure 5 has cost function $f(n) = g(n) + h(n)$. A* search returns the optimal goal node G in this example. Nodes are expanded as follows: A0, B1C2D3, E2C2F3D3, C2F3D3, G2F3D3, goal! Or, if ties are broken otherwise, nodes could be expanded in an alternative order: A0, B1C2D3, C2E2D3F3, E2G2D3F3, G2D3F3, goal! Since $h$ is admissible, A* is optimal. IDA* expands nodes as follows, for $\beta = 1$: $f = 0$: A0; $f = 1$: A0B1; $f = 2$: A0, B1C2, E2C2, C2, G2, goal!
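The lecture does not spell out IDA* in pseudocode, so the following is only a sketch of one common reading that matches the trace above: a sequence of depth-first searches, each pruned at an f-limit, with the limit starting at 0 and growing by β between iterations. The parameter `max_limit` is a hypothetical safety cap, and the example tree, costs, and heuristic values are made up for illustration.

```python
def ida_star(start, is_goal, successors, cost, h, beta=1, max_limit=100):
    """Iterative deepening A*: repeated depth-first searches, each pruned
    at an f-limit that grows by beta between iterations."""
    def dfs(path, g, limit):
        n = path[-1]
        if g + h(n) > limit:
            return None  # prune: f(n) exceeds the current limit
        if is_goal(n):
            return path, g
        for m in successors(n):
            found = dfs(path + (m,), g + cost(n, m), limit)
            if found is not None:
                return found
        return None

    limit = 0
    while limit <= max_limit:  # max_limit is an assumed cap, not part of IDA*
        found = dfs((start,), 0, limit)
        if found is not None:
            return found
        limit += beta
    return None

# Hypothetical tree: Y is a goal at cost 6, Z is the optimal goal at cost 5.
edges = {"S": {"a": 1, "b": 4}, "a": {"Y": 5}, "b": {"Z": 1}, "Y": {}, "Z": {}}
h_values = {"S": 4, "a": 5, "b": 1, "Y": 0, "Z": 0}

result = ida_star("S",
                  lambda n: n in {"Y", "Z"},
                  lambda n: edges[n],
                  lambda n, m: edges[n][m],
                  h_values.get)
print(result)  # (('S', 'b', 'Z'), 5)
```

Each iteration stores only the current path, which is the usual motivation for trading the priority queue of A* for repeated depth-first passes.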

Figure 4: Sample search tree, labeled with heuristic values h. Boxes indicate goal nodes. Best-h search returns the suboptimal goal node H.

Figure 5: Sample search tree, labeled with costs g and heuristic values h. Boxes indicate goal nodes. A* search returns the optimal goal node G.

8 Summary

Best-g

  • Time: $O(b^d)$: BFS, if $g = \mathrm{depth}$
  • Space: $O(b^d)$: BFS, if $g = \mathrm{depth}$
  • Completeness: YES, if there do not exist infinitely many nodes $n$ s.t. $g(n) < g^*$
  • Optimality: YES, if $g$ is monotonically nondecreasing in depth

Best-h

  • Time: $O(b^d)$: BFS, if $h = \mathrm{depth}$
  • Space: $O(b^d)$: BFS, if $h = \mathrm{depth}$
  • Completeness: NO, if nodes are visited in DFS order
  • Optimality: NO, if nodes are visited in DFS order

A*

  • Time: $O(b^d)$: BFS, if $g = \mathrm{depth}$ and $h = 0$
  • Space: $O(b^d)$: BFS, if $g = \mathrm{depth}$ and $h = 0$
  • Completeness: YES, if there do not exist infinitely many nodes $n$ s.t. $f(n) < f^*$
  • Optimality: YES, if $h$ is nonnegative and admissible, which makes sense if $c \ge 0$, which implies $g$ is monotonically nondecreasing in depth