This is an old revision of the document!
Table of Contents
Chapter 3
3.1 Basic Definitions and Applications
- Graph: encodes pairwise relationships among a set of objects; consists of a collection V of nodes and a collection E of edges
- Directed Graph: consists of a set of nodes V and a set of directed edges E' (each e' is an ordered pair - u (tail) and v (head) are not interchangeable)
Graph Examples:
- Transportation Networks: airport nodes, flight path edges
- Communication Networks: connection of computers
- Information Networks: World Wide Web and links to various pages
- Social Networks: people nodes, friendships edges
- Dependency Networks: interdependencies among a collection of objects (i.e.-course offerings w/prerequisites
Paths and Connectivity:
- Simple Path: all vertices are distinct from one another
- Cycle: the sequence of nodes “cycles back to where it begins
- Directed Path/Cycle: the sequence of nodes in the path or cycle must respect the directionality of edges
- Connected Undirected Graph: for every pair of nodes u and v, there is a path from u to v
- Strongly Connected Directed Graph: for every two nodes, there is a path in both directions
- Short Path: route with as few “hops” as possible
Trees:
- Connected undirected graph that does not have a cycle
- w is a descendant of v if v lies on the path from the root to w
- x is a leaf if it has no descendants
- Hierarchical
- Every n-node tree has exactly n-1 edges
- Any two of the following statements implies the third
- G is connected
- G does not contain a cycle
- G has n-1 edges
Personal Thoughts
This section was pretty simple, as it reviewed past concepts that were taught in multiple other classes. I like to have examples for applications of the various structures, so I appreciated the graph examples listed in this section. I think this section set a good foundation for working with graphs and trees. Readability: 10 Interesting: 8
3.2 Basic Definitions and Applications
Breadth-First Search
- add nodes one “layer” at a time
- start with s, include all nodes that are joined to s by an edge
- Therefore, Lj+1 consists of all nodes that do not belong to an earlier layer and that have an edge to a node in layer Lj
- For each j >= 1, layer Lj produced by BFS consists of all nodes at distance exactly j from s. There is a path from s to t if and only if t appears in the same layer.
- Produces a tree rooted at s on the set of nodes reachable from s.
BFS Execution:
- Starting from node 1, Layer L<sub>1</sub> consists of the nodes {2,3}
- Layer L<sub>2</sub> is then grown by considering the nodes in layer L<sub>1</sub> in order.For each node, the incident edges are examined. If an edge connects to a node that has already been discovered, it isn't added to the BFS.
- Consider the nodes in the next layer in order.
- When no new nodes are left to be discovered, the algorithm terminates.
Exploring a Connected Component
- R = {s}, where R is the connected component of G containing s
- Find an edge (u,v) where u is in R, but v is not. → Add v to r
- The set R produced at the end of the algorithm is precisely the connected component of G containing s.
- Note: algorithm is underspecified, no specific order for choosing edges
Depth-First Search
- Explores G by going as deeply as possible and only retreating when necessary.
- Start by declaring all nodes to be not explored
- For a given recursive call DFS(u), all nodes that are marked “Explored” between the invocation and end of this recursive call are descendants of u in T.
Similarities and Differences between DFS and BFS
- Both build connected components containing s
- Both create a natural rooted tree T on the component containing s
- DFS typically visits the same set of nodes in a different order than BFS
- DFS probes down long paths unlike, so the tree will have a very different structure than one from BFS
- DFS tree is narrow and deep; BFS tree is short and bushy
The Set of All Connected Components
- For any two nodes s ant t in a graph, their connected components are either identical or disjoint
3.3 Implementing Graph Traversal Using Queues and Stacks
Representing Graphs
There are two basic ways to represent graphs:
- Adjacency Matrix: nXn matrix
- Adjacency List
- With at most one edge between any pair of nodes, the number of edges m can be at most (n choose 2) ⇐ n².
- Connected graphs: at least m >= n-1 edges
Advantages and disadvantages of Adjacency Matrix vs. List
- The adjacency matrix representation of a graph requires O(n²) space, while the adjacency list representation requires only O(m+n) space.
- Matrix: allows O(1) time to check if a given edge exists in the graph, but representation takes O(n²) space and checking for edges takes considerable O(n) time
- List: works better for sparse graphs, there is an array containing a list of all nodes adjacent to node v, requires O(m+n) space, length of all lists in 2m → O(m)
- Degree of a node v is the number of incident edges it has
Queues and Stacks
- Order of element selection is crucial in DFS and BFS
- Queue: set from which we extract elements in FIFO order
- Stack: set from which we extract elements in LIFO order
- Use a doubly linked list as representations
- Queue: new element is added to the end of the lists as the last element
- Stack: new element is added to the first position
- Insertions are O(1)
Implementing Breadth-First Search
- Use an adjacency list
- Array of Discovered nodes, length n
- Can use either a queue or stack in the below algorithm since order in each layer doesn't matter.
- This algorithm runs in O(m+n) time when using the adjacency list representation.
- Algorithm can be implemented using a single list L maintained as a queue; nodes are processed in the order they are first discovered, hence all nodes in a layer are processed as a contiguous sequence
Implementing Depth-First Search
- Nodes are processed in a stack, rather than in a queue.
- Explores a node u, scans neighbors of u, shifts attention to first not-yet explored node v, explores v
- Set array Explored[v] to be true when we scan v's incident edges
- Algorithm is underspecified: adjacency list of a node being explored can be processed in any order
- Have an array parent; set parent[v] = u when node v is added to stack; when node u (but not s) is marked as Explored, edge can be added to the tree
- Adding and deleting nodes from stack is O(1)
- Total number of nodes added to S: O(m+n)
Finding the Set of All Connected Components
- O(m+n): although algorithm may run BFS/DFS multiple times, only a constant amount of work on a given edge or node is done in the iteration when the connected component it belongs to is under consideration
3.4 Testing Bipartiteness: An Application of Breadth-First Search
- Bipartite Graph: Node set V can be partitioned into sets X and Y in such a way that every edge has one end in X and the other end in Y
- Examples: cycles of odd lengths or graphs that contain an odd cycle are not bipartite
Designing the Algorithm
- Assume graph G is connected
- Identical procedure as BFS, move outward from s coloring the nodes
- Implement this on top of BFS, taking BFS implementation and adding an extra array Color over the nodes (colors are assigned based on even/odd)
- Runtime: O(n)
Analyzing the Algorithm
- Let G be a connected graph, and let L1, L2,…be the layers produced by BFS starting at node s. Then exactly of of the following two things must hold:
- There is no edge of G joining two nodes of the same layer: Bipartite
- There is an edge of G joining two nodes of the same layer: Not Bipartite
3.5 Connectivity in Directed Graphs
- Directed Graph: edge goes from u to v (asymmetric relationship)
Representing Directed Graphs
- Use an adjacency list representation where each node has two lists associated with it
- Consists of nodes to which it has edges
- Consists of nodes from which it has edges
The Graph Search Algorithms
- BFS & DFS are very similar if they are in undirected graphs
- BFS: start at node s, define a first layer of nodes w/ all those to which s has an edge, define a second layer…, etc.
- Nodes in layer j are precisely those for which the shortest path from s has exactly j edges
- Running time: O(m+n)
- DFS: also linear, recursively launches a depth-first search in order for each node to which u has an edge
Strong Connectivity
- A graph is strongly connected if for every two nodes u and v, there is a path from u to v and a path from v to u: mutually reachable
- Linear time algorithm to test if a directed graph is strongly connected
- For any two nodes s and t in a directed graph, their strong components are either identical or disjoint
- if two nodes s and t are mutually reachable, we claim that the strong components containing s and t are identical
- for any node v, if s and v are mutually reachable, then t and v are also mutually reachable
- Compute the strong components for all nodes: O(m+n)
3.6 Directed Acyclic Graphs and Topological Ordering
- Undirected graph with no cycles: each of its connected components is a tree
- Directed Graph with no cycles is a directed ayclic graph (DAG)
The Problem
- DAGs can be used to encode precedence relations or dependencies (i.e.-prerequisites among tasks)
- Node for each task, and a directed edge (i,j) whenever i must be done before j
- No cycles allowed as there would be no way to do any of the tasks then
- Topological Ordering of G: ordering of its nodes so that for every edge, the first node is less than the second node (all edges point “forward” in the ordering)
- If G has a topological ordering, then G is a DAG
Designing and Analyzing the Algorithm
- Every DAG has a topological ordering
- First node cannot have any incoming edges (violation of topological ordering)
- Thus, in every DAG G, there is a node with no incoming edges
- If this is not true, then there is a cycle, so it is not a DAG
- Identifying node v with no incoming edges & deleting it from G: O(n)
- Algorithm runs for n iterations, running time: O(n²)
- Can achieve a running time of O(m+n) using the same algorithm, if we iteratively delete nodes with no incoming edges.
- Declare a node to be “active” if it has not yet been deleted by the algorithm
- for each node w, the number of incoming edges that w has from active nodes
- the set S of all active noes in G that have no incoming edges from other active nodes
- In the beginning, all nodes are active (initialize both of above)
- Each iteration, selects a node v from set S and deletes it
- Go through all nodes w to which v had an edge, subtract one from the number of active incoming edges for w
- If zero, add w to S
- This leads to constant work per edge
