This is an old revision of the document!

Chapter 3

3.1 Basic Definitions and Applications

Graph: encodes pairwise relationships among a set of objects; consists of a collection V of nodes and a collection E of edges
Directed Graph: consists of a set of nodes V and a set of directed edges E' (each e' is an ordered pair - u (tail) and v (head) are not interchangeable)

Graph Examples:

Transportation Networks: airport nodes, flight path edges
Communication Networks: connection of computers
Information Networks: World Wide Web and links to various pages
Social Networks: people nodes, friendships edges
Dependency Networks: interdependencies among a collection of objects (i.e.-course offerings w/prerequisites

Paths and Connectivity:

Simple Path: all vertices are distinct from one another
Cycle: the sequence of nodes “cycles back to where it begins
Directed Path/Cycle: the sequence of nodes in the path or cycle must respect the directionality of edges
Connected Undirected Graph: for every pair of nodes u and v, there is a path from u to v
Strongly Connected Directed Graph: for every two nodes, there is a path in both directions
Short Path: route with as few “hops” as possible

Trees:

Connected undirected graph that does not have a cycle
w is a descendant of v if v lies on the path from the root to w
x is a leaf if it has no descendants
Hierarchical
Every n-node tree has exactly n-1 edges
Any two of the following statements implies the third
1. G is connected
2. G does not contain a cycle
3. G has n-1 edges

Personal Thoughts

This section was pretty simple, as it reviewed past concepts that were taught in multiple other classes. I like to have examples for applications of the various structures, so I appreciated the graph examples listed in this section. I think this section set a good foundation for working with graphs and trees. Readability: 10 Interesting: 8

3.2 Basic Definitions and Applications

Breadth-First Search

add nodes one “layer” at a time
start with s, include all nodes that are joined to s by an edge
- Therefore, L_j+1 consists of all nodes that do not belong to an earlier layer and that have an edge to a node in layer L_j
For each j >= 1, layer L_j produced by BFS consists of all nodes at distance exactly j from s. There is a path from s to t if and only if t appears in the same layer.
Produces a tree rooted at s on the set of nodes reachable from s.

 BFS Execution:
   - Starting from node 1, Layer L<sub>1</sub> consists of the nodes {2,3}
   - Layer L<sub>2</sub> is then grown by considering the nodes in layer L<sub>1</sub> in order.For each node, the incident edges are examined. If an edge connects to a node that has already been discovered, it isn't added to the BFS.
   - Consider the nodes in the next layer in order. 
   - When no new nodes are left to be discovered, the algorithm terminates.

Exploring a Connected Component

R = {s}, where R is the connected component of G containing s
Find an edge (u,v) where u is in R, but v is not. → Add v to r
The set R produced at the end of the algorithm is precisely the connected component of G containing s.
Note: algorithm is underspecified, no specific order for choosing edges

Depth-First Search

Explores G by going as deeply as possible and only retreating when necessary.

Start by declaring all nodes to be not explored
For a given recursive call DFS(u), all nodes that are marked “Explored” between the invocation and end of this recursive call are descendants of u in T.

Similarities and Differences between DFS and BFS

Both build connected components containing s
Both create a natural rooted tree T on the component containing s
DFS typically visits the same set of nodes in a different order than BFS
DFS probes down long paths unlike, so the tree will have a very different structure than one from BFS
DFS tree is narrow and deep; BFS tree is short and bushy

The Set of All Connected Components

For any two nodes s ant t in a graph, their connected components are either identical or disjoint

3.3 Implementing Graph Traversal Using Queues and Stacks

Representing Graphs

There are two basic ways to represent graphs:

Adjacency Matrix: nXn matrix
Adjacency List

With at most one edge between any pair of nodes, the number of edges m can be at most (n choose 2) ⇐ n².
Connected graphs: at least m >= n-1 edges

Advantages and disadvantages of Adjacency Matrix vs. List

The adjacency matrix representation of a graph requires O(n²) space, while the adjacency list representation requires only O(m+n) space.
Matrix: allows O(1) time to check if a given edge exists in the graph, but representation takes O(n²) space and checking for edges takes considerable O(n) time
List: works better for sparse graphs, there is an array containing a list of all nodes adjacent to node v, requires O(m+n) space, length of all lists in 2m → O(m)

Degree of a node v is the number of incident edges it has

Queues and Stacks

Order of element selection is crucial in DFS and BFS
Queue: set from which we extract elements in FIFO order
Stack: set from which we extract elements in LIFO order
Use a doubly linked list as representations
- Queue: new element is added to the end of the lists as the last element
- Stack: new element is added to the first position
- Insertions are O(1)

Implementing Breadth-First Search

Use an adjacency list
Array of Discovered nodes, length n

Can use either a queue or stack in the below algorithm since order in each layer doesn't matter.

This algorithm runs in O(m+n) time when using the adjacency list representation.
Algorithm can be implemented using a single list L maintained as a queue; nodes are processed in the order they are first discovered, hence all nodes in a layer are processed as a contiguous sequence

Implementing Depth-First Search

Nodes are processed in a stack, rather than in a queue.
Explores a node u, scans neighbors of u, shifts attention to first not-yet explored node v, explores v
Set array Explored[v] to be true when we scan v's incident edges
Algorithm is underspecified: adjacency list of a node being explored can be processed in any order

Have an array parent; set parent[v] = u when node v is added to stack; when node u (but not s) is marked as Explored, edge can be added to the tree
Adding and deleting nodes from stack is O(1)
Total number of nodes added to S: O(m+n)

Finding the Set of All Connected Components

O(m+n): although algorithm may run BFS/DFS multiple times, only a constant amount of work on a given edge or node is done in the iteration when the connected component it belongs to is under consideration

3.4 Testing Bipartiteness: An Application of Breadth-First Search

Bipartite Graph: Node set V can be partitioned into sets X and Y in such a way that every edge has one end in X and the other end in Y
Examples: cycles of odd lengths or graphs that contain an odd cycle are not bipartite

Designing the Algorithm

Assume graph G is connected
Identical procedure as BFS, move outward from s coloring the nodes
Implement this on top of BFS, taking BFS implementation and adding an extra array Color over the nodes (colors are assigned based on even/odd)
Runtime: O(n)

Analyzing the Algorithm

Let G be a connected graph, and let L₁, L₂,…be the layers produced by BFS starting at node s. Then exactly of of the following two things must hold:
1. There is no edge of G joining two nodes of the same layer: Bipartite
2. There is an edge of G joining two nodes of the same layer: Not Bipartite

3.5 Connectivity in Directed Graphs

Directed Graph: edge goes from u to v (asymmetric relationship)

Representing Directed Graphs

Use an adjacency list representation where each node has two lists associated with it
1. Consists of nodes to which it has edges
2. Consists of nodes from which it has edges

The Graph Search Algorithms

BFS & DFS are very similar if they are in undirected graphs
BFS: start at node s, define a first layer of nodes w/ all those to which s has an edge, define a second layer…, etc.
Nodes in layer j are precisely those for which the shortest path from s has exactly j edges
Running time: O(m+n)
DFS: also linear, recursively launches a depth-first search in order for each node to which u has an edge

Strong Connectivity

A graph is strongly connected if for every two nodes u and v, there is a path from u to v and a path from v to u: mutually reachable
Linear time algorithm to test if a directed graph is strongly connected
For any two nodes s and t in a directed graph, their strong components are either identical or disjoint
- if two nodes s and t are mutually reachable, we claim that the strong components containing s and t are identical
- for any node v, if s and v are mutually reachable, then t and v are also mutually reachable
- Compute the strong components for all nodes: O(m+n)

3.6 Directed Acyclic Graphs and Topological Ordering

Undirected graph with no cycles: each of its connected components is a tree
Directed Graph with no cycles is a directed ayclic graph (DAG)

The Problem

DAGs can be used to encode precedence relations or dependencies (i.e.-prerequisites among tasks)
Node for each task, and a directed edge (i,j) whenever i must be done before j
No cycles allowed as there would be no way to do any of the tasks then
Topological Ordering of G: ordering of its nodes so that for every edge, the first node is less than the second node (all edges point “forward” in the ordering)
If G has a topological ordering, then G is a DAG

Designing and Analyzing the Algorithm

Every DAG has a topological ordering
First node cannot have any incoming edges (violation of topological ordering)
- Thus, in every DAG G, there is a node with no incoming edges
- If this is not true, then there is a cycle, so it is not a DAG

Identifying node v with no incoming edges & deleting it from G: O(n)
Algorithm runs for n iterations, running time: O(n²)
Can achieve a running time of O(m+n) using the same algorithm, if we iteratively delete nodes with no incoming edges.
- Declare a node to be “active” if it has not yet been deleted by the algorithm
  - for each node w, the number of incoming edges that w has from active nodes
  - the set S of all active noes in G that have no incoming edges from other active nodes

In the beginning, all nodes are active (initialize both of above)
Each iteration, selects a node v from set S and deletes it
Go through all nodes w to which v had an edge, subtract one from the number of active incoming edges for w
If zero, add w to S
This leads to constant work per edge

Table of Contents