CS211: Design and Analysis of Algorithms

DFS uses a stack while BFS uses a queue. We shall discuss how to use lists and arrays to represent graphs., and discuss the trade-offs between the two different representations.

Representing Graphs. Using an adjacency list or adjacency matrix, are ways we can use to represent a graph. We shall focus on using an adjacency list. n = number of nodes, m = number of edges We shall use these two as parameters of our running times form now on. How do we know if O(n^2) is faster than O(m^3)? Is it important to know the relationship between the two. WE aim to implement the basic graph search algorithms in time O(m+n). This is referred to as linear time.With connected graphs, O(m+n) is the same as O(m), since m >= n-1.

Adjacency matrix. The simplest way to represent a graph. Every entry is either 1 or 0. If the graph is undirected, its adj matrix is symmetric. An adj matrix allows us to check in O(1) time if a given edge is present in the graph. Disadvantages of an adj matrix are: it takes O(n^2) space, and it needs to examine all edges incident to a given node v which may take linear time.

Adjacency list. Works best on sparse graphs.(those with many fewer n^2 edges) It is an array of lists containing nodes(in an array) of a graph and their edges in lists. For an undirected graph, each edge occurs on two adjacency lists. Checking if a particular edge, say (u,v) is in the list takes degree O(Nv). We basically have to check the list of u to see if v is in the list. It requires only O(m+n) space. Page 88 The bound O(m+n) is always better than O(n^2), and it is much better with sparse graphs with m much smaller than n^2. Exploring the graph is easier since you can access a nodes neighbors when you reach it in the array.

Queues and Stacks.

A queue: A set in which we can extract elements in, FIFO(First-in, First-out) A Stack: A set in which we can extract elements in, LIFO(Last-in, First-out) Both of these can be implemented via a doubly linked list.

Implementing BFS. The adjacency list data structure is ideal for implementing BFS. The algorithm examines the edges leaving a given node one by one. When we are exploring the edges of u and reach v, we check if v has been previously discovered. We thus maintain an array Discovered of length n and set Discovered[v] = true as soon as our search first sees v.To maintain the nodes in a layer Li, we have a list L[i] for each i=0,1,2,3… The algorithm for BFS is on Page 90. The BFS algorithm runs in O(m+n) time if the graph is given by an adjacency list representation(Proven on page 91). We described the algorithm using up to n separate lists L[i] for each layer Li. We can implement the algorithm using a single list L that we maintain as a queue. In this way, the algorithm processes nodes in the order they are first discovered. In this way all nodes will appear in the queue in order according to their layers.

Implementing DFS. We presented DFS as a recursive procedure earlier. However, it is almost identical to BFS, except that DFS uses a stack instead of a queue. Essentially, the recursive structure of DFS can be viewed as pushing nodes onto a stack for later processing, while moving onto more freshly discovered nodes. As we stated earlier, BFS and DFS are different in a sense that once BFS starts exploring a node u in layer Li, we added all its newly discovered neighbors to the next layer and so on till none is found undiscovered. DFS, in contrast, is more impulsive. When it explores a node u, it scans the neighbors of u until it finds the first not-yet-explored node v(if any), and then shifts attention to exploring v. We first add all the nodes adjacent to u to our list of nodes to be considered. Then, we proceed to exploring a neighbor node to u, v.We do so in stack order so that we can explore u before exploring v. We use an array Explored, similar to Discovered of BFS, but we only set Explored[v] to true when we scan v's incident edges. The full algorithm is on page 93. DFS is underspecified since the adjacency list of a node being explored can be processed in any order. Note that the node v may be in the stack multiple times as it can be adjacent to multiple nodes u that we explore. Thus we have to keep track of the parent of each node v by simply overwriting the value of its parent every time we add a new copy of v to the stack. The main step in the algorithm is to add and delete nodes to and from the stack S, which takes O(1) time. The running time of the DFS algorithm is O(n+m) if the graph is give by the adjacency list.

Finding the set of all connected components. We can use either DFS or BFS to find all connected component of a graph as we discussed earlier. Although the running times of BFS and DFS are O(m+n), where n is the total number of nodes, and m the total number of edges, both of these algorithms spend work only on edges and nodes in the connected component containing the starting node. Thus, although the above algorithm may run a number of times, it only spends a constant amount of work on a given edge or node on the iteration when the connected component it belongs to is under consideration. Hence the overall running time is still O(m+n).

This section has a scale of readability of 8.5/10.