
Simple Search Techniques

The two simplest search techniques are known as depth first search and breadth first search. These are described in [Rich & Knight, pp 38..] and [Luger & Stubblefield, pp 92]. These two techniques can be described by looking at how the search tree (representing all the possible paths from the start state) will be traversed. In this case the search tree looks much like the map itself:

From the start town (or start state) a1 there are two possible ways to go - to a2 or to a4. From a2 you can only go to a3, but from a4 there is a choice of a5 or a8, and so on. a2 and a4 are successors of a1, a3 is a successor of a2 and so on.

In breadth first search you would search for a route by trying towns in the following order: a1, then a2 and a4, then a3, a5 and a8, then a6, which is the goal state. So we will first be trying paths of length 1, then length 2, then length 3, and so on. The algorithm for breadth first search uses a list (or agenda) of nodes (in this case, towns) still to be explored in the search. Initially the agenda contains just the initial state (ie, [a1]). The search involves repeatedly taking the first node off the agenda, finding the possible successors of that node, and putting them on the end of the agenda. This continues until either the first node on the agenda is the goal (or target) state, or the agenda is empty. If the first node is the goal state the algorithm signals success (e.g., returns TRUE); if the agenda becomes empty the search fails.
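As a concrete illustration, here is a minimal sketch of this agenda-based breadth first search (in Python, which is not part of the original notes; the function name and the successors table are mine, with the table encoding the successor relationships of the example map, and towns the notes never expand given no successors):

  # Successor table for the example map: a1 leads to a2 and a4,
  # a2 to a3, a4 to a5 and a8, a5 to a6 and a7.
  successors = {
      'a1': ['a2', 'a4'], 'a2': ['a3'], 'a3': [],
      'a4': ['a5', 'a8'], 'a5': ['a6', 'a7'],
      'a6': [], 'a7': [], 'a8': [],
  }

  def breadth_first(start, goal):
      agenda = [start]                     # initially just the initial state
      while agenda:
          node = agenda.pop(0)             # take the first node off the agenda
          if node == goal:
              return True                  # first node is the goal: signal success
          agenda.extend(successors[node])  # successors go on the END of the agenda
      return False                         # agenda empty: the search fails

  print(breadth_first('a1', 'a6'))         # prints True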

So, in our example, initially the agenda is:

[a1]

We take a1 off the agenda, find that possible successors are a2 and a4 and put them on the agenda, which is now:

[a2, a4]

a2 is taken off the agenda, has successor a3, so the new agenda is:

[a4, a3]

a4 has two successors, a5 and a8. These are put on the end of the agenda:

[a3, a5, a8]

a3 doesn't have any successors, so is just removed from the agenda:

[a5, a8]

a5 has two successors, added to the end of the agenda:

[a8, a6, a7]

[We could vary our algorithm so that it checked whether a node was a goal state as soon as it was added to the agenda, but we'll assume that we only check when nodes are taken off.]

Anyway, a8 has no successors, so the agenda is:

[a6, a7]

a6 is the goal state, so we exit with success.
As it stands the algorithm simply signals success or failure. In general we may want it to return the successful path that allows us to get from the initial state to the goal state. One way to do this is to make each item in the agenda a path rather than a state - we can still check for success by, for example, checking whether the first element of the path is the goal state, and we can modify our search algorithm to return the path once a successful one is found. This is what is done in exercise 5. We'd start with an agenda like:

[[a1]]

(ie, one item in the agenda, which is a path containing one item). Then after finding a1's successors we'd have:

[[a2,a1], [a4,a1]]

and so on. At each stage we'd check whether the first item in the first path (a2 in the agenda above) is the goal state.
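A minimal sketch of this path-returning variant (again hypothetical Python, reusing the successors table from the sketch above; paths are stored most-recent-town-first, as in the agendas shown):

  def breadth_first_path(start, goal):
      agenda = [[start]]                    # one item: a path containing just the start state
      while agenda:
          path = agenda.pop(0)
          if path[0] == goal:               # is the first element of the path the goal?
              return list(reversed(path))   # return the path in start-to-goal order
          for s in successors[path[0]]:
              agenda.append([s] + path)     # extended paths go on the END of the agenda
      return None                           # no path found

  print(breadth_first_path('a1', 'a6'))     # prints ['a1', 'a4', 'a5', 'a6']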

In depth first search you keep on going down one path until you get to a dead end; you then back up to try alternatives. The order of nodes searched in our example would be: a1, a2, a3, a4, a5, a6, a7, a8. The algorithm for depth first search can again be described in terms of how we manipulate an agenda, and is the same as for breadth first except that successor nodes are added to the front of the agenda. I'll leave it as an exercise to verify that this algorithm will result in the search sequence above.
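The corresponding depth first sketch differs only in where the new nodes go (hypothetical Python again; pass in the same successors table as in the first sketch):

  def depth_first(start, goal, successors):
      agenda = [start]
      visited = []                              # record the order in which nodes are searched
      while agenda:
          node = agenda.pop(0)
          visited.append(node)
          if node == goal:
              return visited
          agenda = successors[node] + agenda    # successors go on the FRONT of the agenda
      return visited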

Both search techniques can also be applied if we want to reason backwards from the goal state to the initial state, rather than forwards from the initial state to the goal. In this example it would be much more efficient to reason backwards: there is only one possible path to consider (a6, a5, a4, a1). However, this is just because we simplified the problem artificially to illustrate search techniques. In general it will be less clear whether forward or backward search will be better. (Consider searching for a route from the CS dept to your home - you could search all the routes starting from your home, or all those starting from the department. Which is better will depend on whether you live at a major junction of routes, or in an isolated village where there is only one sensible route out.) More generally, the choice of direction of search depends on the branching factor (e.g., the number of routes out of each town) in each direction.

Both depth first and breadth first search algorithms are made a little more complex if you are searching general graphs rather than trees, especially if those graphs involve loops (so, if we're not careful, we may end up going back to where we came from). If we naively applied the above algorithm and used depth first search, we might get stuck in an infinite loop and never find the solution. One way to avoid this (used in exercise 5) is to make the items in the agenda paths rather than nodes, and whenever you try to extend a path, check that the successor node does not already occur in that path (ie, that you don't have a loop). If it does, you don't consider that successor.
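A minimal sketch of that check, assuming the path-per-agenda-item representation used above (hypothetical Python, not the exercise 5 code; `neighbours` stands for whatever successor function the graph provides):

  def extend(path, neighbours):
      # Extend a path (stored most-recent-node-first) by one node in every
      # legal way, discarding successors that already occur in the path.
      return [[s] + path
              for s in neighbours(path[0])
              if s not in path]             # skip successors that would form a loop

Used in place of the plain successor expansion in the sketches above, this guarantees the search terminates on a finite graph even when the graph contains loops.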

This approach is OK, and sorts out the problem of infinite loops, but it may not be very efficient for searching certain graphs. Consider the following simple graph.

It doesn't involve loops, but if we use the above algorithm to search for a route from A to H we will do some redundant work. In particular, we will explore all the paths from D onward that first went through B, and then again all the paths from D onward that first went through C. What we want to be able to do is to spot, when we get to C, that we have already checked out everything leading from D, so that it can be ignored (and we go straight to H). (Try using the above algorithm on this graph to convince yourself of this problem.) For bigger graphs this may be a really big problem, and the naive search algorithm will be inadequate.

To deal with this we need two lists of nodes - an open node list, which is like the agenda we used above, and a closed node list, which contains all the nodes we've already searched and is used to avoid repeating ourselves (and also avoids going round in circles). Now, when we take a node off our open node list we put it onto the closed node list so it won't be forgotten. When we obtain the successors of a node we only add to the open node list those that don't already appear on the open or closed node lists. The breadth first graph search algorithm can now be stated as follows:

  1. Start with open = [initial-state] and closed = [].
  2. While open is not empty do:
    1. Remove the first node from open - call it X.
    2. If X is a goal state, return with success.
    3. Put X on the closed list.
    4. Generate all the successor nodes of X.
    5. Check whether any of the successors are on the open or closed lists. Put any nodes which are NOT on these lists onto the back of the open node list.

For depth first search we would put new nodes on the front of the open node list, but otherwise it's the same (see the sketch below). Note that in the general algorithm we can have more than one goal state.
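That algorithm might be sketched as follows (hypothetical Python; the names `graph_search` and `neighbours` are mine, and `goals` is a collection, to allow more than one goal state). Setting depth_first=True puts new nodes on the front of open rather than the back:

  def graph_search(start, goals, neighbours, depth_first=False):
      open_list = [start]                      # 1. open = [initial-state]
      closed = []                              #    closed = []
      while open_list:                         # 2. while open is not empty
          x = open_list.pop(0)                 # 2.1 remove the first node - call it X
          if x in goals:                       # 2.2 X is a goal state: return with success
              return x
          closed.append(x)                     # 2.3 put X on the closed list
          new = [s for s in neighbours(x)      # 2.4 generate the successors of X
                 if s not in open_list
                 and s not in closed]          # 2.5 keep only nodes not on open or closed
          if depth_first:
              open_list = new + open_list      #     depth first: new nodes on the front
          else:
              open_list = open_list + new      #     breadth first: new nodes on the back
      return None                              # open is empty: failure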
Suppose now that we have a more complex (and realistic) map involving possible loops:

The successors of a (town) node are just those towns directly connected to it by a road. We'll now assume that we can travel either way along a road, so the successors of a town are all the towns directly connected to it by a road.

Suppose we are searching for a route from a2 to a7, using (this time) depth first search. Initially we have:

open = [a2] closed = []

a2 is taken off the open list, put on the closed list, and its successors generated: a4 and a1. Neither a4 nor a1 is on the open or closed lists, so both are added to the front of the open list:

open = [a4, a1] closed = [a2]

(The algorithm does not define the order in which a4 and a1 are put on the list - normally the first discovered successor will be at the front of the list.)
Now a4 is taken off the open list, added to the closed list, and its successors generated: a2 and a6. a2 is on the closed list, so just a6 is added to the front of the open list:

open = [a6, a1] closed = [a4, a2]

a6's successors are a4 and a5. a4 is on closed, so the new lists are:

open = [a5, a1] closed = [a6, a4, a2]

a5 has successors a6, a7 and a3, with a6 already on closed:

open = [a7, a3, a1] closed = [a5, a6, a4, a2]

Now, the first node on open is the goal state, so the algorithm will return with success. However, we can see what would happen if the search continued (maybe to find other possible paths). a7 would have no successors not already on closed, so would just be removed from open (and put on closed). a3 has two successors, a1 and a5, but these are both already on open or closed, so a3 is just put on closed. a1 has two successors, a2 and a3, but these are already on closed, so the search ends as open is now empty.
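For reference, the trace above can be reproduced with a short sketch like this. The adjacency list is reconstructed from the successor relationships quoted in the trace (the map figure itself is not shown here), roads are assumed to run both ways, and the loop is the depth first graph search just described:

  roads = {                                   # assumed adjacency, rebuilt from the trace
      'a1': ['a2', 'a3'], 'a2': ['a4', 'a1'], 'a3': ['a1', 'a5'],
      'a4': ['a2', 'a6'], 'a5': ['a6', 'a7', 'a3'],
      'a6': ['a4', 'a5'], 'a7': ['a5'],
  }

  open_list, closed = ['a2'], []
  while open_list:
      x = open_list.pop(0)
      if x == 'a7':                           # goal state reached
          print('success:', x)
          break
      closed = [x] + closed                   # most recently closed node shown first
      new = [s for s in roads[x] if s not in open_list and s not in closed]
      open_list = new + open_list             # depth first: new nodes on the front
      print('open =', open_list, 'closed =', closed)

This prints exactly the open and closed lists shown in the trace above.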
Using graph search versions of depth first or breadth first search will involve more memory, though it will usually result in searching fewer nodes. It also involves more overhead as we search each node, since we check whether we've met it before. So, although it is important for problems where there are lots of ways of getting to the same state, for some problems it may be better to stick with the simple tree searching algorithms, even if this occasionally results in some redundant search when we reach a node that has been explored before.

The graph search algorithm can of course be extended so that the final path(s) from the initial to the goal state can easily be returned - you just keep paths on the open and closed lists, rather than nodes. However, things get tricky if you want to obtain all the possible paths from an initial to a final state: then, when you find a node you've searched before, you need to somehow record that there's another path to it. We won't worry too much about this though.
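One way to sketch the simpler, single-path version of that extension (hypothetical Python, breadth first): keep whole paths on both lists, and treat a node as already seen if it is the most recent node of any path on open or closed:

  def graph_search_path(start, goal, neighbours):
      open_list = [[start]]                     # open holds paths, most recent node first
      closed = []                               # closed holds the paths already expanded
      while open_list:
          path = open_list.pop(0)
          if path[0] == goal:
              return list(reversed(path))       # return the path in start-to-goal order
          closed.append(path)
          seen = {p[0] for p in open_list + closed}   # nodes already on open or closed
          for s in neighbours(path[0]):
              if s not in seen:
                  seen.add(s)
                  open_list.append([s] + path)  # extended paths go on the back
      return None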





