Lo que Dios no puede hacer - Vida Sin Límites-clifford Goldstein

The most widely known form of best-first search is called A∗ search (pronounced “A-star search”). It evaluates nodes by combining g(n), the cost to reach the node, and h(n), the estimated cost to get from the node to the goal:

f(n) = g(n) + h(n) .

Since g(n) gives the path cost from the start node to node n, and h(n) is the estimated cost of the cheapest path from n to the goal, we have

f(n) = estimated cost of the cheapest solution through n .

Thus, if we are trying to find the cheapest solution, a reasonable thing to try first is the node with the lowest value of g(n) + h(n). It turns out that this strategy is more than just reasonable: provided that the heuristic function h(n) satisfies certain conditions, A∗ search is both complete and optimal. The algorithm is identical to UNIFORM-COST-SEARCH except that A∗ uses g + h instead of g.
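As an illustration of f(n) = g(n) + h(n) in use, the following Python sketch runs a graph-search A∗ on a fragment of the Romania map. The step costs and straight-line distances are read off the f = g + h node labels of Figure 3.24; the names ROADS, H_SLD, and a_star are hypothetical, and the best_g table is one common implementation choice rather than the book's own pseudocode.

```python
import heapq

# Fragment of the Romania map, derived from the node labels in Figure 3.24.
# Cities without an entry are treated as dead ends in this fragment (an assumption).
ROADS = {
    "Arad": {"Sibiu": 140, "Timisoara": 118, "Zerind": 75},
    "Sibiu": {"Arad": 140, "Fagaras": 99, "Oradea": 151, "Rimnicu Vilcea": 80},
    "Rimnicu Vilcea": {"Sibiu": 80, "Craiova": 146, "Pitesti": 97},
    "Fagaras": {"Sibiu": 99, "Bucharest": 211},
    "Pitesti": {"Rimnicu Vilcea": 97, "Craiova": 138, "Bucharest": 101},
}
H_SLD = {"Arad": 366, "Sibiu": 253, "Timisoara": 329, "Zerind": 374,
         "Fagaras": 176, "Oradea": 380, "Rimnicu Vilcea": 193,
         "Craiova": 160, "Pitesti": 100, "Bucharest": 0}


def a_star(start, goal, roads, h):
    """Return (cost, path) of a cheapest route, or None if the goal is unreachable."""
    frontier = [(h[start], 0, start, [start])]  # entries are (f, g, state, path)
    best_g = {start: 0}                          # cheapest known g per state
    while frontier:
        f, g, state, path = heapq.heappop(frontier)
        if state == goal:                        # goal test on expansion, not generation
            return g, path
        if g > best_g.get(state, float("inf")):
            continue                             # stale queue entry, a cheaper path exists
        for nxt, cost in roads.get(state, {}).items():
            g2 = g + cost
            if g2 < best_g.get(nxt, float("inf")):
                best_g[nxt] = g2
                heapq.heappush(frontier, (g2 + h[nxt], g2, nxt, path + [nxt]))
    return None


print(a_star("Arad", "Bucharest", ROADS, H_SLD))
```

Replacing h with a table of zeros turns the same function into uniform-cost search, which is the relationship the paragraph above describes.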

Conditions for optimality: Admissibility and consistency

The first condition we require for optimality is that h(n) be an admissible heuristic. An admissible heuristic is one that never overestimates the cost to reach the goal. Because g(n) is the actual cost to reach n, and f(n) = g(n) + h(n), we have as an immediate consequence that f(n) never overestimates the true cost of a solution through n.

Admissible heuristics are by nature optimistic, because they think the cost of solving the problem is less than it actually is. An obvious example of an admissible heuristic is the straight-line distance h_SLD that we used in getting to Bucharest. Straight-line distance is admissible because the shortest path between any two points is a straight line, so the straight line cannot be an overestimate. In Figure 3.24, we show the progress of an A∗ tree search for Bucharest. The values of g are computed from the step costs in Figure 3.2, and the values of h_SLD are given in Figure 3.22. Notice in particular that Bucharest first appears on the frontier at step (e), but it is not selected for expansion because its f-cost (450) is higher than that of Pitesti (417). Another way to say this is that there might be a solution through Pitesti whose cost is as low as 417, so the algorithm will not settle for a solution that costs 450.
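On a small explicit graph, admissibility can be checked mechanically by computing the true cost-to-go h∗(n) for every node and comparing it with h(n). The sketch below does this with Dijkstra's algorithm run backward from the goal. The graph, both heuristic tables, and the function names are hypothetical and chosen only for illustration.

```python
import heapq


def true_costs_to_goal(edges, goal):
    """Cheapest cost from each node to the goal (Dijkstra over reversed edges)."""
    reverse = {}
    for (u, v), cost in edges.items():
        reverse.setdefault(v, []).append((u, cost))
    dist = {goal: 0}
    queue = [(0, goal)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist[node]:
            continue  # stale entry
        for prev, cost in reverse.get(node, []):
            if d + cost < dist.get(prev, float("inf")):
                dist[prev] = d + cost
                heapq.heappush(queue, (d + cost, prev))
    return dist


def is_admissible(edges, h, goal):
    """True if h never overestimates the true cost to reach the goal."""
    h_star = true_costs_to_goal(edges, goal)
    return all(h[n] <= h_star[n] for n in h_star)


# Hypothetical directed graph with goal G.
EDGES = {("S", "A"): 2, ("S", "B"): 5, ("A", "B"): 1, ("A", "G"): 6, ("B", "G"): 2}
OPTIMISTIC = {"S": 4, "A": 3, "B": 1, "G": 0}
PESSIMISTIC = {"S": 4, "A": 5, "B": 1, "G": 0}   # h(A) = 5 exceeds the true cost 3

print(is_admissible(EDGES, OPTIMISTIC, "G"), is_admissible(EDGES, PESSIMISTIC, "G"))
```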

A second, slightly stronger condition called consistency (or sometimes monotonicity) is required only for the graph-search version of A∗. A heuristic h(n) is consistent if, for every node n and every successor n′ of n generated by any action a, the estimated cost of reaching the goal from n is no greater than the step cost of getting to n′ plus the estimated cost of reaching the goal from n′:

h(n) ≤ c(n, a, n′) + h(n′) .

This is a form of the general triangle inequality, which stipulates that each side of a triangle cannot be longer than the sum of the other two sides. Here, the triangle is formed by n, n′, and the goal G_n closest to n. For an admissible heuristic, the inequality makes perfect sense: if there were a route from n to G_n via n′ that was cheaper than h(n), that would violate the property that h(n) is a lower bound on the cost to reach G_n.
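The consistency condition h(n) ≤ c(n, a, n′) + h(n′) is a per-edge test, so it can be checked by a single pass over the edges. The sketch below applies it to the Romania fragment recoverable from the node labels of Figure 3.24, and to a tiny hypothetical graph whose heuristic is admissible but not consistent. All names are illustrative assumptions.

```python
def inconsistent_edges(edges, h):
    """Return every edge (n, n2, cost) that violates h(n) <= cost + h(n2)."""
    return [(n, n2, c) for (n, n2), c in edges.items() if h[n] > c + h[n2]]


# Two-way roads with step costs derived from the labels in Figure 3.24.
ROAD_COSTS = {
    ("Arad", "Sibiu"): 140, ("Arad", "Timisoara"): 118, ("Arad", "Zerind"): 75,
    ("Sibiu", "Fagaras"): 99, ("Sibiu", "Oradea"): 151,
    ("Sibiu", "Rimnicu Vilcea"): 80, ("Rimnicu Vilcea", "Craiova"): 146,
    ("Rimnicu Vilcea", "Pitesti"): 97, ("Fagaras", "Bucharest"): 211,
    ("Pitesti", "Craiova"): 138, ("Pitesti", "Bucharest"): 101,
}
ROMANIA_EDGES = {}
for (u, v), cost in ROAD_COSTS.items():
    ROMANIA_EDGES[(u, v)] = cost   # roads can be driven in both directions
    ROMANIA_EDGES[(v, u)] = cost

H_SLD = {"Arad": 366, "Sibiu": 253, "Timisoara": 329, "Zerind": 374,
         "Fagaras": 176, "Oradea": 380, "Rimnicu Vilcea": 193,
         "Craiova": 160, "Pitesti": 100, "Bucharest": 0}

# Hypothetical chain A -> B -> G: h(A) = 4 equals the true cost 4 (admissible),
# but h(A) > c(A, B) + h(B) = 1 + 1, so the heuristic is not consistent.
CHAIN_EDGES = {("A", "B"): 1, ("B", "G"): 3}
CHAIN_H = {"A": 4, "B": 1, "G": 0}

print(inconsistent_edges(ROMANIA_EDGES, H_SLD))
print(inconsistent_edges(CHAIN_EDGES, CHAIN_H))
```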

It is fairly easy to show (Exercise 3.37) that every consistent heuristic is also admissible. Consistency is therefore a stricter requirement than admissibility, but one has to work quite hard to concoct heuristics that are admissible but not consistent. All the admissible heuristics we discuss in this chapter are also consistent. Consider, for example, h_SLD. We know that the general triangle inequality is satisfied when each side is measured by the straight-line distance, and that the straight-line distance between n and n′ is no greater than c(n, a, n′). Hence, h_SLD is a consistent heuristic.

Section 3.5. Informed (Heuristic) Search Strategies 97

Each panel lists the nodes newly added to the search tree at that stage, labeled f = g + h.

(a) The initial state
    Arad 366=0+366
(b) After expanding Arad
    Sibiu 393=140+253, Timisoara 447=118+329, Zerind 449=75+374
(c) After expanding Sibiu
    Arad 646=280+366, Fagaras 415=239+176, Oradea 671=291+380, Rimnicu Vilcea 413=220+193
(d) After expanding Rimnicu Vilcea
    Craiova 526=366+160, Pitesti 417=317+100, Sibiu 553=300+253
(e) After expanding Fagaras
    Sibiu 591=338+253, Bucharest 450=450+0
(f) After expanding Pitesti
    Bucharest 418=418+0, Craiova 615=455+160, Rimnicu Vilcea 607=414+193

Figure 3.24 Stages in an A∗ search for Bucharest. Nodes are labeled with f = g + h. The h values are the straight-line distances to Bucharest taken from Figure 3.22.

Optimality of A*

As we mentioned earlier, A∗ has the following properties: the tree-search version of A∗ is optimal if h(n) is admissible, while the graph-search version is optimal if h(n) is consistent. We will show the second of these two claims, since it is more useful. The argument essentially mirrors the argument for the optimality of uniform-cost search, with g replaced by f, just as in the A∗ algorithm itself.

The first step is to establish the following: if h(n) is consistent, then the values of f(n) along any path are nondecreasing. The proof follows directly from the definition of consistency. Suppose n′ is a successor of n; then g(n′) = g(n) + c(n, a, n′) for some action a, and we have

f(n′) = g(n′) + h(n′) = g(n) + c(n, a, n′) + h(n′) ≥ g(n) + h(n) = f(n) .
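As a numeric illustration of this inequality, the short Python sketch below recomputes f = g + h along the route Arad, Sibiu, Rimnicu Vilcea, Pitesti, Bucharest, using the g and h values shown on the node labels of Figure 3.24, and checks that the sequence never decreases. The names ROUTE, f_values, and is_nondecreasing are hypothetical.

```python
# (city, g, h) triples read off the node labels of Figure 3.24.
ROUTE = [
    ("Arad", 0, 366),
    ("Sibiu", 140, 253),
    ("Rimnicu Vilcea", 220, 193),
    ("Pitesti", 317, 100),
    ("Bucharest", 418, 0),
]


def f_values(route):
    """f(n) = g(n) + h(n) for each node along the route."""
    return [g + h for _, g, h in route]


def is_nondecreasing(values):
    """True if each value is at least as large as the one before it."""
    return all(a <= b for a, b in zip(values, values[1:]))


print(f_values(ROUTE), is_nondecreasing(f_values(ROUTE)))
```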

The next step is to prove that whenever A∗ selects a node n for expansion, the optimal path to that node has been found. Were this not the case, there would have to be another frontier node n′ on the optimal path from the start node to n, by the graph separation property of Figure 3.9; because f is nondecreasing along any path, n′ would have lower f-cost than n and would have been selected first.

From the two preceding observations, it follows that the sequence of nodes expanded by A∗ using GRAPH-SEARCH is in nondecreasing order of f(n). Hence, the first goal node selected for expansion must be an optimal solution, because f is the true cost for goal nodes (which have h = 0) and all later goal nodes will be at least as expensive.

The fact that f-costs are nondecreasing along any path also means that we can draw contours in the state space, just like the contours in a topographic map. Figure 3.25 shows an example. Inside the contour labeled 400, all nodes have f(n) less than or equal to 400, and so on. Then, because A∗ expands the frontier node of lowest f-cost, we can see that an A∗ search fans out from the start node, adding nodes in concentric bands of increasing f-cost.

With uniform-cost search (A∗ search using h(n) = 0), the bands will be “circular” around the start state. With more accurate heuristics, the bands will stretch toward the goal state and become more narrowly focused around the optimal path. If C∗ is the cost of the optimal solution path, then we can say the following:

• A∗ expands all nodes with f(n) < C∗.

• A∗ might then expand some of the nodes right on the “goal contour” (where f(n) = C∗) before selecting a goal node.

Completeness requires that there be only finitely many nodes with cost less than or equal to C∗, a condition that is true if all step costs exceed some finite ε and if b is finite.

Notice that A∗ expands no nodes with f(n) > C∗. For example, Timisoara is not expanded in Figure 3.24 even though it is a child of the root. We say that the subtree below Timisoara is pruned; because h_SLD is admissible, the algorithm can safely ignore this subtree while still guaranteeing optimality. The concept of pruning (eliminating possibilities from consideration without having to examine them) is important for many areas of AI.

Figure 3.25 Map of Romania showing contours at f = 380, f = 400, and f = 420, with Arad as the start state. Nodes inside a given contour have f-costs less than or equal to the contour value.

One final observation is that among optimal algorithms of this type (algorithms that extend search paths from the root and use the same heuristic information), A∗ is optimally efficient for any given heuristic function. That is, no other optimal algorithm is guaranteed to expand fewer nodes than A∗ (except possibly through tie-breaking among nodes with f(n) = C∗). This is because any algorithm that does not expand all nodes with f(n) < C∗ runs the risk of missing the optimal solution.

That A∗ search is complete, optimal, and optimally efficient among all such algorithms is rather satisfying. Unfortunately, it does not mean that A∗ is the answer to all our searching needs. The catch is that, for most problems, the number of states within the goal contour search space is still exponential in the length of the solution. The details of the analysis are beyond the scope of this book, but the basic results are as follows. For problems with constant step costs, the growth in run time is analyzed in terms of the absolute error or the relative error of the heuristic. The absolute error is defined as Δ ≡ h∗ − h, where h∗ is the actual cost of getting from the root to the goal, and the relative error is defined as ε ≡ (h∗ − h)/h∗.

For a state space that is a tree, the time complexity of A∗ is exponential in the absolute error, i.e., O(b^Δ). For constant step costs, we can write this as O(b^(εd)), where d is the solution depth. For almost all heuristics in practical use, the absolute error is at least proportional to the path cost h∗, so ε is constant or growing and the time complexity is exponential in d. We can also see the effect of a more accurate heuristic: O(b^(εd)) = O((b^ε)^d), so the effective branching factor (defined more formally in the next section) is b^ε.
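To get a feel for the identity O(b^(εd)) = O((b^ε)^d), the following Python sketch evaluates both sides for some made-up numbers. The branching factor 4, depth 10, and the two relative errors are hypothetical values chosen for easy arithmetic, and the results are growth terms inside the O(·) bound, not node counts measured on any real problem.

```python
import math


def effective_branching_factor(b, eps):
    """b^eps: the per-level growth factor implied by relative error eps."""
    return b ** eps


def growth_term(b, eps, d):
    """b^(eps * d): the term inside the O(.) bound for solution depth d."""
    return b ** (eps * d)


b, d = 4, 10                      # hypothetical branching factor and depth
for eps in (0.5, 0.25):           # hypothetical relative errors
    per_level = effective_branching_factor(b, eps)
    total = growth_term(b, eps, d)
    # Both forms of the bound agree up to floating-point rounding.
    assert math.isclose(per_level ** d, total)
    print(f"eps={eps}: effective branching factor {per_level:.3f}, growth term {total:.0f}")
```

With these numbers, halving the relative error from 0.5 to 0.25 reduces the effective branching factor from 2 to about 1.41 and the growth term from 1024 to 32.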

There can be exponentially many states with f(n) < C∗ even if the absolute error is bounded by a constant. For example, consider a simplified version of the vacuum world where the agent can clean up any square for unit cost without even having to visit it: in that case, squares can be cleaned in any order. With N initially dirty squares, there are 2^N states where some subset has been cleaned, and all of them are on an optimal solution path, and hence satisfy f(n) < C∗, even if the heuristic has an error of 1.
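The counting argument can be reproduced for small N. The Python sketch below enumerates every subset of cleaned squares and counts the states with f(n) < C∗, under the assumption (not stated in the text) that the heuristic is the number of remaining dirty squares minus one, floored at zero. Function names are hypothetical.

```python
from itertools import combinations


def states(n_dirty):
    """Every subset of the n_dirty squares that may already have been cleaned."""
    return [frozenset(c)
            for k in range(n_dirty + 1)
            for c in combinations(range(n_dirty), k)]


def f_cost(cleaned, n_dirty):
    g = len(cleaned)              # unit cost per square cleaned so far
    remaining = n_dirty - g       # true cost-to-go h*
    h = max(remaining - 1, 0)     # assumed heuristic: underestimates by at most 1
    return g + h


N = 4
C_STAR = N                        # the optimal solution cleans each square once
inside = [s for s in states(N) if f_cost(s, N) < C_STAR]
print(len(states(N)), len(inside))
```

Under this assumed heuristic, every state except the goal itself has f = N − 1, so all but one of the 2^N states fall strictly inside the goal contour.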

The complexity of A∗ often makes it impractical to insist on finding an optimal solution. One can use variants of A∗ that find suboptimal solutions quickly, or one can sometimes design heuristics that are more accurate but not strictly admissible. In any case, the use of a good heuristic still provides enormous savings compared to the use of an uninformed search. In Section 3.6, we will look at the question of designing good heuristics.

Computation time is not, however, A∗’s main drawback. Because it keeps all generated nodes in memory (as do all GRAPH-SEARCH algorithms), A∗ usually runs out of space long before it runs out of time. For this reason, A∗ is not practical for many large-scale problems. Recently developed algorithms have overcome the space problem without sacrificing optimality or completeness, at a small cost in execution time. We discuss these next.

3.5.3 Memory-bounded heuristic search

The simplest way to reduce memory requirements for A∗ is to adapt the idea of iterative deepening to the heuristic search context, resulting in the iterative-deepening A∗ (IDA∗) algorithm. The main difference between IDA∗ and standard iterative deepening is that the cutoff used is the f-cost (g + h) rather than the depth; at each iteration, the cutoff value is the smallest f-cost of any node that exceeded the cutoff on the previous iteration. IDA∗ is practical for many problems with unit step costs and avoids the substantial overhead associated with keeping a sorted queue of nodes. Unfortunately, it suffers from the same difficulties with real-valued costs as does the iterative version of uniform-cost search described in Exercise 3.24. This section briefly examines two more recent memory-bounded algorithms, called RBFS and MA∗.
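The IDA∗ cutoff rule described above can be sketched in Python as a depth-first search that reports the smallest f-cost exceeding the current bound. The map fragment is derived from the node labels of Figure 3.24, the function and variable names are hypothetical, and skipping states already on the current path is an added assumption to avoid looping, not part of the description in the text.

```python
import math

# Fragment of the Romania map derived from Figure 3.24; missing cities are dead ends here.
ROADS = {
    "Arad": {"Sibiu": 140, "Timisoara": 118, "Zerind": 75},
    "Sibiu": {"Arad": 140, "Fagaras": 99, "Oradea": 151, "Rimnicu Vilcea": 80},
    "Rimnicu Vilcea": {"Sibiu": 80, "Craiova": 146, "Pitesti": 97},
    "Fagaras": {"Sibiu": 99, "Bucharest": 211},
    "Pitesti": {"Rimnicu Vilcea": 97, "Craiova": 138, "Bucharest": 101},
}
H_SLD = {"Arad": 366, "Sibiu": 253, "Timisoara": 329, "Zerind": 374,
         "Fagaras": 176, "Oradea": 380, "Rimnicu Vilcea": 193,
         "Craiova": 160, "Pitesti": 100, "Bucharest": 0}


def ida_star(start, goal, roads, h):
    """Return (cost, path, cutoffs tried), or None if no solution exists."""

    def search(path, g, bound):
        state = path[-1]
        f = g + h[state]
        if f > bound:
            return f, None                 # report the f-cost that exceeded the cutoff
        if state == goal:
            return f, list(path)
        smallest = math.inf
        for nxt, cost in roads.get(state, {}).items():
            if nxt in path:                # assumption: avoid cycles on the current path
                continue
            path.append(nxt)
            t, found = search(path, g + cost, bound)
            path.pop()
            if found is not None:
                return t, found
            smallest = min(smallest, t)
        return smallest, None

    bound, cutoffs = h[start], []
    while True:
        cutoffs.append(bound)
        t, found = search([start], 0, bound)
        if found is not None:
            return t, found, cutoffs
        if t == math.inf:
            return None                    # nothing exceeded the cutoff: no solution
        bound = t                          # smallest f-cost that exceeded the old cutoff


print(ida_star("Arad", "Bucharest", ROADS, H_SLD))
```

On this fragment the cutoff rises through 366, 393, 413, 415, 417, and 418, so six iterations are needed even though the solution has only four steps. This hints at why real-valued costs, where almost every node has a distinct f-cost, are troublesome for IDA∗.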

Recursive best-first search (RBFS) is a simple recursive algorithm that attempts to mimic the operation of standard best-first search, but using only linear space. The algorithm is shown in Figure 3.26. Its structure is similar to that of a recursive depth-first search, but rather than continuing indefinitely down the current path, it uses the f_limit variable to keep track of the f-value of the best alternative path available from any ancestor of the current node. If the current node exceeds this limit, the recursion unwinds back to the alternative path. As the recursion unwinds, RBFS replaces the f-value of each node along the path with a backed-up value: the best f-value of its children. In this way, RBFS remembers the f-value of the best leaf in the forgotten subtree and can therefore decide whether it’s worth reexpanding the subtree at some later time. Figure 3.27 shows how RBFS reaches Bucharest.

RBFS is somewhat more efficient than IDA∗, but still suffers from excessive node regeneration. In the example in Figure 3.27, RBFS first follows the path via Rimnicu Vilcea, then “changes its mind” and tries Fagaras, and then changes its mind back again. These mind changes occur because every time the current best path is extended, there is a good chance that its f-value will increase, since h is usually less optimistic for nodes closer to the goal. When this happens, particularly in large search spaces, the second-best path might become the best path, so the search has to backtrack to follow it. Each mind change corresponds to an iteration of IDA∗, and could require many reexpansions of forgotten nodes to recreate the best path and extend it one more node.

function RECURSIVE-BEST-FIRST-SEARCH(problem) returns a solution, or failure
    return RBFS(problem, MAKE-NODE(problem.INITIAL-STATE), ∞)

function RBFS(problem, node, f_limit) returns a solution, or failure and a new f-cost limit
    if problem.GOAL-TEST(node.STATE) then return SOLUTION(node)
    successors ← [ ]
    for each action in problem.ACTIONS(node.STATE) do
        add CHILD-NODE(problem, node, action) into successors
    if successors is empty then return failure, ∞
    for each s in successors do /* update f with value from previous search, if any */
        s.f ← max(s.g + s.h, node.f)
    loop do
        best ← the lowest f-value node in successors
        if best.f > f_limit then return failure, best.f
        alternative ← the second-lowest f-value among successors
        result, best.f ← RBFS(problem, best, min(f_limit, alternative))
        if result ≠ failure then return result

Figure 3.26 The algorithm for recursive best-first search.
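The pseudocode of Figure 3.26 can be transcribed fairly directly into Python. The sketch below is one such transcription under stated assumptions: the map fragment is derived from the node labels of Figure 3.24, cities without an entry are treated as dead ends, the goal is assumed reachable (otherwise the loop may not terminate), and the Node class and function names are hypothetical.

```python
import math

ROADS = {
    "Arad": {"Sibiu": 140, "Timisoara": 118, "Zerind": 75},
    "Sibiu": {"Arad": 140, "Fagaras": 99, "Oradea": 151, "Rimnicu Vilcea": 80},
    "Rimnicu Vilcea": {"Sibiu": 80, "Craiova": 146, "Pitesti": 97},
    "Fagaras": {"Sibiu": 99, "Bucharest": 211},
    "Pitesti": {"Rimnicu Vilcea": 97, "Craiova": 138, "Bucharest": 101},
}
H_SLD = {"Arad": 366, "Sibiu": 253, "Timisoara": 329, "Zerind": 374,
         "Fagaras": 176, "Oradea": 380, "Rimnicu Vilcea": 193,
         "Craiova": 160, "Pitesti": 100, "Bucharest": 0}


class Node:
    def __init__(self, state, parent=None, g=0):
        self.state, self.parent, self.g = state, parent, g
        self.f = 0  # set by the caller; later overwritten with backed-up values


def recursive_best_first_search(start, goal, roads, h):
    """Return (cost, path); assumes the goal is reachable from start."""

    def rbfs(node, f_limit):
        if node.state == goal:
            return node, node.f
        successors = [Node(s, node, node.g + cost)
                      for s, cost in roads.get(node.state, {}).items()]
        if not successors:
            return None, math.inf
        for s in successors:
            # Inherit the parent's backed-up f-value if it is larger.
            s.f = max(s.g + h[s.state], node.f)
        while True:
            successors.sort(key=lambda n: n.f)
            best = successors[0]
            if best.f > f_limit:
                return None, best.f        # unwind and report the backed-up value
            alternative = successors[1].f if len(successors) > 1 else math.inf
            result, best.f = rbfs(best, min(f_limit, alternative))
            if result is not None:
                return result, best.f

    root = Node(start)
    root.f = h[start]
    goal_node, _ = rbfs(root, math.inf)
    path, node = [], goal_node
    while node is not None:                # follow parent links back to the root
        path.append(node.state)
        node = node.parent
    return goal_node.g, path[::-1]


print(recursive_best_first_search("Arad", "Bucharest", ROADS, H_SLD))
```

Tracing this by hand reproduces the behaviour in Figure 3.27: Rimnicu Vilcea is abandoned with backed-up value 417, Fagaras is abandoned with backed-up value 450, and the search then returns to Rimnicu Vilcea and continues through Pitesti to Bucharest.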

Like A∗ tree search, RBFS is an optimal algorithm if the heuristic function h(n) is admissible. Its space complexity is linear in the depth of the deepest optimal solution, but its time complexity is rather difficult to characterize: it depends both on the accuracy of the heuristic function and on how often the best path changes as nodes are expanded.

IDA∗ and RBFS suffer from using too little memory. Between iterations, IDA∗ retains only a single number: the current f-cost limit. RBFS retains more information in memory, but it uses only linear space: even if more memory were available, RBFS has no way to make use of it. Because they forget most of what they have done, both algorithms may end up reexpanding the same states many times over. Furthermore, they suffer the potentially exponential increase in complexity associated with redundant paths in graphs (see Section 3.3).

(a) After expanding Arad, Sibiu, and Rimnicu Vilcea
(b) After unwinding back to Sibiu and expanding Fagaras
(c) After switching back to Rimnicu Vilcea and expanding Pitesti

Figure 3.27 Stages in an RBFS search for the shortest route to Bucharest. The f-limit value for each recursive call is shown on top of each current node, and every node is labeled with its f-cost. (a) The path via Rimnicu Vilcea is followed until the current best leaf (Pitesti) has a value that is worse than the best alternative path (Fagaras). (b) The recursion unwinds and the best leaf value of the forgotten subtree (417) is backed up to Rimnicu Vilcea; then Fagaras is expanded, revealing a best leaf value of 450. (c) The recursion unwinds and the best leaf value of the forgotten subtree (450) is backed up to Fagaras; then Rimnicu Vilcea is expanded. This time, because the best alternative path (through Timisoara) costs at least 447, the expansion continues to Bucharest.

It seems sensible, therefore, to use all available memory. Two algorithms that do this are MA∗ (memory-bounded A∗) and SMA∗ (simplified MA∗). We will describe SMA∗, which is, well, simpler. SMA∗ proceeds just like A∗, expanding the best leaf until memory is full. At this point, it cannot add a new node to the search tree without dropping an old one. SMA∗ always drops the worst leaf node, the one with the highest f-value. Like RBFS, SMA∗ then backs up the value of the forgotten node to its parent. In this way, the ancestor of a forgotten subtree knows the quality of the best path in that subtree. With this information, SMA∗ regenerates the subtree only when all other paths have been shown to look worse than the path it has forgotten. Another way of saying this is that, if all the descendants of a node n are forgotten, then we will not know which way to go from n, but we will still have an idea of how worthwhile it is to go anywhere from n.


The complete algorithm is too complicated to reproduce here, but there is one subtlety worth mentioning. We said that SMA∗ expands the best leaf and deletes the worst leaf. What if all the leaf nodes have the same f-value? To avoid selecting the same node for deletion and expansion, SMA∗ expands the newest best leaf and deletes the oldest worst leaf. These coincide when there is only one leaf, but in that case, the current search tree must be a single path from root to leaf that fills all of memory. If the leaf is not a goal node, then even if it is on an optimal solution path, that solution is not reachable with the available memory. Therefore, the node can be discarded exactly as if it had no successors.

SMA∗ is complete if there is any reachable solution—that is, if d, the depth of the shallowest goal node, is less than the memory size (expressed in nodes). It is optimal if any optimal solution is reachable; otherwise it returns the best reachable solution. In practical terms, SMA∗ is a fairly robust choice for finding optimal solutions, particularly when the state space is a graph, step costs are not uniform, and node generation is expensive compared to the overhead of maintaining the frontier and explored set.

On very hard problems, however, it will often be the case that SMA∗ is forced to switch back and forth continually among many candidate solution paths, only a small subset of which can fit in memory. (This resembles the problem of thrashing in disk paging systems.) Then the extra time required for repeated regeneration of the same nodes means that problems that would be practically solvable by A∗, given unlimited memory, become intractable for SMA∗. That is to say, memory limitations can make a problem intractable from the point of view of computation time. Although there is no theory to explain the tradeoff between time and memory, it seems that this is an inescapable problem. The only way out is to drop the optimality requirement.
