Depth First Search

- Search begins from start vertex v

- Visiting implies printing of nodes vertex field.

- Select an unvisited vertex w from v's adjacency list.

- Carry out depth first search on w.

- The current position on adjacency list is preserved by placing it on stack.

- Eventually, search reaches a vertex u that has no unvisited vertex on its adjacency list.

- For adjacency list, Time complexity is O(E).

- For adjacency matrix, it is O(N2).

The basic object of the traversal is to visit each node at least once, but the algorithm will actually give us much more than that. The DFS algorithm is quite famous and can be used to solve a variety of graph problems. Three main concepts are involved:

1) Colouring nodes.

As the algorithm proceeds, nodes are coloured, reflecting the passage of time:

Colouring the nodes acts as a memory device - it is like dropping bread crumbs in a maze in order to know where they've already been.

2) Keeping time.

Each node u has two times associated with it: the discovery time d[u] and the finish time f[u]. The discovery time corresponds to the colour change from white to grey and the finish time corresponds to the colour change from grey to black. Time starts at 1, and with each labelling of an event (discovery or finish) the time is incremented by one. One drawback of this is that with many nodes, the storage required for all of these times becomes huge; however, even if they are not stored, the times are useful as abstract concepts to aid us in understanding the algorithm.

3) DFS tree.

Each connected component of the graph can be made into a tree by removing some of the edges. Each node u in the tree will have a pointer to its parent p[u], making this a parent-pointer tree.

The DFS Algorithm

This fragment performs an initialization and starts the depth-first-search of a connected graph:

for all nodes u do

{

colour[u]=white;

p[u]=NULL;

}

time = 0;

for all nodes u do

if colour[u] == white then DFS(u);

This is the algorithm itself, which performs the depth-first search: DFS(u):

visit(u);

time = time + 1;

d[u] = time;

colour[u] = grey;

for all nodes v adjacent to u do

{

if colour[v] == white then

{

p[v] = u;

DFS(u);

}

}

time = time + 1;

f[u] = time;

colour[u] = black;

Example Traversal

For a convention, the neighbours in the following example are considered in alphabetical order. The traversal starts at the I.

    1. I turns grey, d[I] = 1

    2. E turns grey, d[E] = 2

    3. E has two neighbours, B and F. By the convention B is visited first: B turns grey, d[B] = 3

    4. Similarly, A, C, & D are visited

    5. D has 4 neighbours: A, C, G, H. A and C are grey so G is visited first, then F.

    6. All of F's neighbours are grey so F is finished: f[F] = 9, F turns black.

    7. Similarly G is finished

    8. H is visited because we have not finished visiting all of D's neighbours

    9. The remaining nodes are finished in the reverse order, backtracking along the original path, and ending with I.

The resultant parent-pointer tree has its root at I, since this is the first node visited. Each new node visited becomes the child of the most recently-visited node. Also, note that while I is the first node to turn grey, it is the last node to turn black (within the main connected component). This is due to the recursion - each of I's neighbours must be processed and finished (turned black) before I can be finished (turned black). As well, all edges of the graph which are not included in the tree (not used in the traversal) are between a node and its ancestor (because all neighbours of a node are visited from that node); these are known as vertical edges. This property of the DFS tree contrasts it with the BFS tree.

The maximum time index in the example is 18, equal to 2 times the number of nodes. This is because time is incremented exactly when a discovery time or a finish time is assigned, and each node is assigned one of each during its lifetime. The time should not be confused with the complexity of the algorithm.