This is an old revision of the document!

Chapter 3: Graphs

Section 1: Basic Definitions and Applications

A graph is a data structure that provides a way of “encoding pairwise relationships among a set of objects”. It contains a set of nodes, which can also be called vertices, and collection of edges, where each edge joins a pair of nodes. If you are creating a graph where edges indicate a “symmetric relationship” between the nodes they connect, you are creating an undirected graph. However, if you want to code asymmetric relationships, you want to use a directed graph, where each edge has a head and a tail node. Graphs are useful for representing a number of things, which include: Transportation networks, communication networks, information networks, social networks, and dependency networks. Transportation networks can be represented by nodes as the cities or particular locations, and the edges representing the paths of travel between the locations. Communication networks utilize the nodes as devices, and the edges as representing the link between them. Information networks represent the webpages as the nodes, and hyperlinks between two webpages as the edges. Social networks represent people as the nodes, and social connections between them as the edges. Dependency networks represent elements as the nodes, and dependencies/requisites as the edges. “One of the fundamental operations in a graph is traversing a sequence of nodes connected by edges”. A simple path is a path in which all vertices are distinct from each other, meaning no vertex is visited more than once. A cycle is a path that begins at a node and ends at the same node where the nodes visited in the path are only visited once. A connected graph is a graph that maintains a path between every pair of vertices. The distance between two nodes is the minimum number of edges that must be traversed to get from one node to another. A graph can be defined as a tree if it is connected and does not contain a cycle. Trees look very similar to the heaps discussed in the previous chapter where each node has a parent and descendants. Rooted trees are trees that have a designated root node. If g is an undirected graph with n nodes, then if it has two of these three characteristics it automatically has the third, and thus means that it is a tree: 1) G is connected, 2) G does not contain a cycle, 3) G has n-1 edges.

Section 2: Graph Connectivity and Graph Traversal

One of the main questions regarding graphs is the problem determining s-t connectivity. This problem looks at a two nodes, s and t, within a graph and attempts to determine whether or not there is a path between them. For smaller graphs, it is very easy to just look at them to attempt to find a path, however the goal is to obtain an algorithm that will solve the problem when dealing with very large graphs. There are two main algorithms that are used to solve this problem, which are breadth-first search (BFS) and depth-first search (DFS). Breadth-first search starts at the designated start node, and explores outwards in all directions, moving along layer by layer. Each layer of nodes is a specific distance away from the start node, and are connected to the start node through nodes contained in the previous layers. All nodes that are one edge away from the start node are included in layer 1. All nodes that are 1 layer away from nodes in layer 1 AND are not included in layer 1 are then included in layer 2. This process repeats until all nodes in the graph are contained within a layer. This search algorithm can be thought of as a “flood”, where we slowly move outwards in all directions passing over nodes. 'For each j >= 1, layer Lj produced by BFS consists of all nodes at distance exactly j from s. There is a path from s to t if and only if t appears in some layer'. One property of a BFS is that it produces a tree as its result, with s as the root node. 'Let T be a breadth-first search tree, let x and y be nodes in T belonging to layers Li and Lj respectively, and let (x, y) be an edge of G. Then i and j differ by at most 1.' 'The set R produced at the end of the BFS algorithm is precisely the connected component of G containing S.' The BFS determines the distance between a node and s by result of what layer the node is located in. The depth-first search takes the approach if exploring a path all the way to the very end, and then backtracking until it finds another unexplored path to follow and repeats the process using a recursive form. Once a node has been explored, it is marked to that the DFS knows where it has already traversed. DFS also produces a tree with s as the root, however the tree has a much different structure. The BFS tree is very wide but short, whereas the DFS tree is very skinny but tall. 'Let T be a depth-first search tree, let x and y be nodes in T, and let (x, y) be an edge of G that is not an edge of T. Then one of x or y is an ancestor of the other.' 'For any two nodes s and t in a graph, their connected components are either identical or disjoint.' That is, they can either be divided into two separate graphs or are connected in some way. This section was very easy to read and understand, given I have prior experience with these algorithms, as well as talking about them extensively in class. I'd give this section a 9/10.

Section 3: Implementing Graph Traversal Using Queues and Stacks

In this chapter, we see that BFS and DFS essentially only differ in that one uses a queue and the other uses a stack. There are two main ways of representing graphs: as adjacency matrices, and as adjacency lists. Graphs have two main inputs, which are nodes and edges. When discussing the run time of algorithms that deal with graphs, we denote run times using the edges and nodes. Adjacency matrices are nxn matrices where a 1 denotes an edge between a row/column, and a 0 means there is no edge. This representation allows is to check in constant time whether a particular edge exists in the graph. However this representation takes O(n^2) space, and takes O(n) to see find all edges incident to a particular node. Adjacency lists word better for 'sparse graphs', which have fewer than n^2 edges. In adjacency lists, there is an array of lists that contain all the edges for each node. Adjacency lists take up O(m+n) space, take O(degree(u)) to see if an edge exists, and takes O(m) to find all the edges incident to a node. 'A queue is a set from which we extract elements in first-in, first-out (FIFO) order (select elements in the same order in which they were added. A stack is a set from which we extract elements in last-in, first-out (LIFO) order (select elements in the opposite order in which they were added). In both structures, you select the first element in the list, however in queues you add to the end of the list and in stacks you add to the front of the list. The adjacency list structure is ideal for the BFS algorithm, as for each node visited you want to see what nodes it is connected to. 'The BFS algorithm runs in O(m+n) if the graph is given by the adjacency list representation.' For the BFS either a queue or a stack can be used. For the DFS, you use a stack to maintain the set of nodes, which maintains its recursive structure. 'The DFS algorithm runs in O(m+n) if the graph is represented by an adjacency matrix.' This section was quite easy to traverse and understand. I would give it an 8/10.

Section 4: Testing Bipartiteness: An Application of Breadth-First Search

'A bipartite graph is a graph where the node set can be partitioned into sets X and Y in such a way that ever edge has one end in X and the other end in Y.' A way to restate this is a graph that can have its nodes split into two colored groups where no node in one group has an edge to another node in that group. Every edge of the graph should have a node of one color on one end and a node of a different color on the other end. One characteristic of a bipartite graph is that it cannot contain an odd cycle. A BFS of an algorithm must have 1 of the following two characteristics: 'there is no edge of G joining two nodes of the same layer', or 'there is an edge of G joining two nodes in the same layer'. If the first condition is true, then the graph can be colored in such a way that the definition of bipartiteness is true, however if the second condition is true then the graph has an odd length cycle and therefore is not bipartite. I have prior experience with graphs and conditions for bipartite graphs, so this section was very easy to understand. Talking about in class before reading this section also made it very easy to follow and understand. I'd give it a 9/10.

Table of Contents

Chapter 3: Graphs

Section 1: Basic Definitions and Applications

Section 2: Graph Connectivity and Graph Traversal

Section 3: Implementing Graph Traversal Using Queues and Stacks

Section 4: Testing Bipartiteness: An Application of Breadth-First Search

Section 5

Section 6