

Chapter 3: Graphs

Section 1: Basic Definitions and Applications

  • Summary:

Graphs are useful in discrete math, and they show up EVERYWHERE if you're looking at things that way! A graph (a.k.a. undirected graph) consists of a collection V of nodes (a.k.a. vertices) and a collection E of edges (each edge joins two nodes and is represented by a two-element subset of V). Edges indicate a symmetric relationship. Directed graphs have asymmetric relationships (directed edges represented by ordered pairs of nodes). Examples of graphs: transportation networks (ex. airlines - airports are nodes, nonstop flights are edges), communication networks (ex. computer networks modeled by a node for each machine and an edge for each direct physical link), information networks (ex. web pages as nodes and directed edges when there's a link from one to another), social networks (people are nodes, friendships are edges), dependency networks (ex. directed graphs to show courses that have prerequisites) … etc.

A path in an undirected graph is a sequence of nodes such that each consecutive pair is joined by an edge. A cycle is a path that starts and ends at the same node. The same definitions apply to directed graphs, but a path has to follow the directionality of the edges. If a graph is connected, that means there's a path from every node to every other node. A directed graph is strongly connected if there's a path from every node to and from every other node. The distance between two nodes is the minimum number of edges in a path from one to the other.

A graph is a tree if it's connected and doesn't contain a cycle. Deleting any edge will disconnect a tree. Rooting a tree is like grabbing one node and letting the rest of the tree hang down. A node's parent is the node that directly precedes it on the path from the root, and a node is its parent's child. An ancestor of a node is any node on the path from the root to it (and that node is the ancestor's descendant). A node is a leaf if it doesn't have any descendants. Rooted trees encode the notion of a hierarchy (ex. employees are nodes and report to the employee at their parent node). Rooting a tree can make it easy to answer some questions … how many edges are there in a tree? Well, every node except the root has exactly one edge to its parent, so the tree has one fewer edge than it has nodes. In fact, any two of these statements imply the third: G is connected, G does not contain a cycle, G has n-1 edges.
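
To make the definitions concrete, here's a minimal Python sketch (the graphs and labels are made up for illustration, not taken from the book): an undirected graph stored as a node set plus two-element edge sets, a directed graph stored as ordered pairs, and a quick check of the fact that a tree on n nodes has n-1 edges.

<code python>
# Hypothetical toy graphs, just to illustrate the definitions above.

# Undirected graph: each edge is a two-element subset of V (symmetric relationship).
V = {"a", "b", "c", "d"}
E = {frozenset({"a", "b"}), frozenset({"b", "c"}), frozenset({"c", "d"})}

# Directed graph: each edge is an ordered pair (asymmetric relationship),
# e.g. "this course is a prerequisite of that one".
V_dir = {"intro", "data structures", "algorithms"}
E_dir = {("intro", "data structures"), ("data structures", "algorithms")}

# The tree fact from the summary: a tree on n nodes has exactly n - 1 edges
# (every node except the root has one edge up to its parent).
assert len(E) == len(V) - 1  # this undirected graph happens to be a tree (a path)
</code>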

  • Insights and Questions:

Seems like pretty basic stuff. It's interesting to learn about the different examples of things that can be represented by graphs/trees (but I don't think the book did a very good job of listing the most interesting ones … we mentioned food chains and pixels on a screen in class and those seem way more interesting than the ones listed … maybe not as useful, though).

  • Readability:

I thought it was pretty easy to read, but it would have been clearer if they'd fully discussed undirected graphs and then gone into directed graphs instead of flip-flopping between the two.

Section 2: Graph Connectivity and Graph Traversal

  • Summary:

Given two nodes, s and t, how do we efficiently figure out if there is a path between them (s-t connectivity a.k.a. maze-solving problem)?

1. Breadth-First Search: explore outward from s in all possible directions, adding nodes one “layer” at a time. The first layer is all nodes directly connected to s. The second layer is all nodes connected directly to any of the first-layer nodes (that haven't already been explored), and so on until no new nodes are encountered. We “flood” the graph starting at s with a “wave” that expands to visit all of the nodes that it can possibly reach. The layer a node is in is equal to its distance from s (i.e. BFS determines shortest paths from s to every node it can reach). It also produces a tree rooted at s (the “breadth-first search tree”). If there's an edge between two nodes, then they must be either in neighboring layers or in the same layer (the book gives a proof of this and some more formal definitions and algorithms for BFS … these can also be found in our notes).
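
Here's a rough Python sketch of this layer-by-layer idea (the adjacency-dict representation and the example graph are my own choices, not the book's pseudocode); the index of the layer a node lands in is its distance from s.

<code python>
def bfs_layers(graph, s):
    """Explore graph outward from s, one layer at a time.

    graph: dict mapping each node to an iterable of its neighbors.
    Returns a list of layers; layer i holds the nodes at distance i from s.
    """
    discovered = {s}
    layers = [[s]]
    while layers[-1]:                    # stop once a layer comes up empty
        next_layer = []
        for u in layers[-1]:
            for v in graph[u]:
                if v not in discovered:  # only take nodes not already reached
                    discovered.add(v)
                    next_layer.append(v)
        layers.append(next_layer)
    return layers[:-1]                   # drop the final empty layer

# Example: layers (= distances) from node "a" in a small made-up graph.
g = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a", "d"], "d": ["b", "c"]}
print(bfs_layers(g, "a"))  # [['a'], ['b', 'c'], ['d']]
</code>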

When you do a search like this, the set of nodes that ends up in the tree is the “connected component” of G containing s. Actually, the order you visit the nodes in doesn't matter; you just keep growing the set R of reachable nodes until there are no more edges leading out of R.

2. Depth-First Search: since the order you explore the nodes/edges in doesn't matter, here's another pattern for exploring them. You start at node s and try the first edge out of it, leading to some node v. Then you follow the first edge out of v, and continue until you reach a “dead end.” Then you backtrack until you get to a node with an edge to an unexplored node. This algorithm, depth-first search, goes as deep as possible into the graph and retreats only when necessary. DFS (usually) visits nodes in a very different order than BFS. It also produces a tree rooted at s, but that tree is narrow and deep instead of short and fat (and it does not contain shortest paths). Nontree edges can only connect ancestors to descendants (the book proves this).
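
A small Python sketch of this go-deep-then-backtrack pattern (recursion does the backtracking; the graph format and example are the same made-up adjacency dict as before, not the book's pseudocode). It records each node's parent, which gives the depth-first search tree rooted at s.

<code python>
def dfs_tree(graph, s):
    """Depth-first search from s; returns {node: parent} for the DFS tree.

    graph: dict mapping each node to an iterable of its neighbors.
    The root s gets parent None; returning from a call is the "backtrack"
    to the most recent node that still has an unexplored neighbor.
    """
    parent = {s: None}

    def explore(u):
        for v in graph[u]:
            if v not in parent:   # unexplored: go deeper along this edge
                parent[v] = u
                explore(v)
        # every edge out of u leads somewhere already explored: dead end

    explore(s)
    return parent

g = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a", "d"], "d": ["b", "c"]}
print(dfs_tree(g, "a"))  # {'a': None, 'b': 'a', 'd': 'b', 'c': 'd'}
</code>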

The set of all connected components: there's a connected component associated with each node (containing all of the nodes that are reachable from it), but what's the relationship between these connected components? They're either identical or disjoint (the book gives a proof & we went over it in class, so it's in the notes/slides).
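
A sketch of computing the set of all connected components by growing a set R from each node that isn't already in some component (Python, my own formulation rather than the book's). Because components are either identical or disjoint, every node ends up in exactly one of them.

<code python>
def connected_components(graph):
    """Return a list of components, each a set of mutually reachable nodes.

    graph: dict mapping each node to an iterable of its neighbors.
    From each node not yet placed, grow a set R until no edges leave R;
    the order the nodes are visited in doesn't matter.
    """
    components = []
    placed = set()
    for s in graph:
        if s in placed:
            continue
        component = {s}
        frontier = [s]
        while frontier:                  # keep growing while edges leave R
            u = frontier.pop()
            for v in graph[u]:
                if v not in component:
                    component.add(v)
                    frontier.append(v)
        components.append(component)
        placed |= component
    return components

# Two disjoint pieces: {a, b, c} and {d}.
g = {"a": ["b", "c"], "b": ["a"], "c": ["a"], "d": []}
print(connected_components(g))  # e.g. [{'a', 'b', 'c'}, {'d'}]
</code>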

  • Insights and Questions:

The book's way of drawing DFS trees looked like how we started to draw graphs for topological ordering. I'm interested to see how/if we use DFS to obtain that ordering.

  • Readability:

I didn't like the way they represented DFS trees (graphically). I understood it, and it looks a lot like what we started to discuss at the end of the last class with directed graphs and their topological order, but I think for a first-time look at DFS trees, it's easier to conceive of them in other ways.

Section 3: Implementing Graph Traversal Using Queues and Stacks

  • Summary:

This section goes into the implementation details of graphs, BFS, and DFS. A graph can be represented with an adjacency matrix (an n-by-n matrix where cell (i, j) contains a 1 if there's an edge between the i-th and j-th nodes) or an adjacency list (a list of length n where the i-th element is a list of all of the nodes that the i-th node has an edge to). When discussing running times here, we talk about them in terms of the number of nodes (n) and the number of edges (m). It's not always clear which running times are better because it depends on the relationship between m and n. The aim is a running time of O(m+n), which is considered linear because that's the amount of time it takes just to read the input. For connected graphs, O(m+n) is the same as O(m) because m >= n-1. The adjacency matrix takes n-by-n space and is simple, but it takes O(n) time just to find all the edges out of a single node, which isn't efficient for graphs that are sparse (have few edges). Adjacency lists only take O(m+n) space (the book goes into proofs of this). The book says the adjacency list is generally better and more intuitive, and that's what it'll use.

Algorithms for graph processing often have an inner step of processing a set of elements. Store those in a linked list, but in what order should we consider them? Two natural options are a queue (first-in, first-out order) or a stack (last-in, first-out order). BFS is implemented using a queue: each time a node is discovered, it's added to the queue, and the algorithm always processes the edges out of the node that's currently first in the queue (the book gives a more specific algorithm for BFS and proves that it runs in O(m+n) time). DFS is implemented using a stack but is otherwise almost identical to BFS. There are also some differences in how nodes are “discovered” and “explored” (DFS is more “impulsive” about exploring), and there's (again) an algorithm and proofs that this explores the graph in the same way as described in the previous section and that it runs in O(m+n) time (around p. 93).
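
A rough Python sketch of these implementation choices (assuming nodes labeled 0..n-1 and a made-up edge list; not the book's pseudocode): build the adjacency list in O(m+n) space, then run one traversal loop where pulling from the front of a queue gives BFS and pulling from the back (a stack) gives DFS. Each node and edge is handled a constant number of times, so both traversals are O(m+n).

<code python>
from collections import deque

def adjacency_list(n, edges):
    """Adjacency-list representation: O(m + n) space.

    n: number of nodes (labeled 0..n-1); edges: list of (u, v) pairs.
    (An adjacency matrix would be an n-by-n table of 0/1 entries instead:
    n^2 space, and O(n) time just to list the edges out of one node.)
    """
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)          # undirected: record the edge in both lists
    return adj

def traverse(adj, s, use_queue):
    """One traversal loop: a queue (FIFO) gives BFS, a stack (LIFO) gives DFS.

    A node can be added to the list more than once, but it's only processed
    the first time it comes off, so the total work is O(m + n).
    """
    to_process = deque([s])
    explored = set()
    order = []
    while to_process:
        u = to_process.popleft() if use_queue else to_process.pop()
        if u in explored:
            continue
        explored.add(u)
        order.append(u)
        for v in adj[u]:
            if v not in explored:
                to_process.append(v)
    return order

adj = adjacency_list(5, [(0, 1), (0, 2), (1, 3), (2, 4)])
print(traverse(adj, 0, use_queue=True))   # BFS order: [0, 1, 2, 3, 4]
print(traverse(adj, 0, use_queue=False))  # DFS order: [0, 2, 4, 1, 3]
</code>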

  • Insights and Questions:

I actually find it EASIER to think about BFS and DFS in terms of stacks and queues than abstractly.

  • Readability:

I thought it was very readable. It's awesome how consistent they are with the order of explaining things from chapter to chapter and section to section (i.e. BFS before DFS always).

Section 4: Testing Bipartiteness: An Application of Breadth-First Search

  • Summary:
  • Insights and Questions:
  • Readability:

Section 5: Connectivity in Directed Graphs

  • Summary:
  • Insights and Questions:
  • Readability:

Section 6: Directed Acyclic Graphs and Topological Ordering

  • Summary:
  • Insights and Questions:
  • Readability: