This chapter introduces basic graph definitions and algorithms.
A graph consists of a collection of nodes (V) and edges (E). Directed graphs have “one-way” edges to indicate asymmetric relationships while undirected graphs have “two-way” edges to indicate symmetric relationships. Some examples of graphs include transportation, communication, information and social networks.
A path in a graph is a sequence of nodes where each consecutive pair is connected by an edge. A path is simple if all the vertices are distinct. A path is a cycle if the first and last node are the same. We say a graph is connected if there is a path from one node to every other. The distance from one node to another is the length of the shortest path between them, measured in number of edges.
A graph is a tree if it is connected and does not contain a cycle. Trees represent hierarchies.
Theorem: Any two of the following imply the third (i) G is connected, (ii) G does not contain a cycle, (iii) G has n-1 edges
Readability: 8
In this section, we want to determine if there is a path from node s to node t. One way to do this is using Breadth First Search (BFS). In BFS we find all the nodes connected to a given layer before moving on. If we start at s, we know there is a path to t if t is in some layer j and that in this case, the distance from s to t is j. We also know BFS produces a BFS tree, rooted at s and that if two nodes are connected by an edge, they will be at most one layer apart in the tree. The set R produced by BFS is the connected component of G containing s.
Another way to determine if there is a path is using Depth First Search (DFS). DFS follows a path until it reaches a dead-end and then backtracks. DFS visits the same nodes as BFS, but in a different order. It produces a DFS tree rooted at s that tends to be narrow. If two nodes are connected in our graph, then one is ancestor of the other in the DFS tree.
Both BFS and DFS produce the connected component containing s. It is useful to recognize that for two nodes, their connected components are either identical or disjoint.
Readability: 8
Now we need to implement graphs, DFS and BFS. Graphs can be implemented as an adjacency matrix or an adjacency list. The adjacency matrix representation requires O(n^2) space while the adjacency list representation requires only O(m+n) space.
We can use stacks and queues to help implement BFS and DFS. Queues support first in first out access. Stacks support last in first out access. For BFS, we can manage each list of the nodes on a layer as either a stack or a queue. BFS runs in O(m+n) time. In DFS, we need to use a stack to process the nodes since we will need to backtrack. DFS also runs in O(m+n) time.
Readability: 8
We can use BFS to test whether a graph is bipartite. Recall, a graph is bipartite if you can color its nodes red and blue so every edge has one red end and one blue end.
To see if a graph is bipartite, we will pick a node to start and color it red. Then, we will color all of its neighboring nodes (layer 1) blue and all those node's neighbors (layer 2) red, etc. This is exactly like BFS, except with an extra color array.
Note that if there is no edge of G joining two nodes in the same layer, then G is bipartite. If there is an edge of edge of G joining two nodes of the same layer, G contains an odd length cycle and is not bipartite.
Readability: 8
In this section, we begin to consider directed graphs. We represent directed graphs with two adjacency lists - one of the nodes to which s has edges and one of the nodes from which it has edges.
BFS and DFS are almost the same as in undirected graphs. Both discover the set of nodes reachable from s, but in directed graphs, it is not guaranteed that we can get back to s from these nodes. If there is a path from s to t and from t to s, then we say s and t are mutually reachable. We say a directed graph is strongly connected if every pair of nodes are mutually reachable.
The strong component containing s in a directed graph is the set of all v such that s and v are mutually reachable. The strong component is similar to the connected component for an undirected graph. Like the connected component, strong components are either identical or disjoint.
Readability: 8
A special type of directed graph that contains no cycles is a directed acyclic graph (DAG). DAGs can be used to represent precedence relations or dependencies. One natural problem relating to DAGs is how to create a topological ordering (an ordering in which all edges point forward) from a DAG. This is the equivalent to finding the order in which we may do tasks which depend on each other. It is easy to see if a graph has a topological ordering, it is a DAG, but it is not obvious every DAG has a topological ordering.
In fact, every DAG does have a topological ordering. We can find it by finding a node with no incoming edges, ordering it next in our topological ordering, deleting it from our graph, and repeating until we have added every node. This algorithm has O(m+n) run time. (p 103)
Readability: 8