This is an old revision of the document!
Table of Contents
Chapter 3: Graphs
Section 1: Basic Definitions and Applications
A graph is a data structure that provides a way of “encoding pairwise relationships among a set of objects”. It contains a set of nodes, which can also be called vertices, and collection of edges, where each edge joins a pair of nodes. If you are creating a graph where edges indicate a “symmetric relationship” between the nodes they connect, you are creating an undirected graph. However, if you want to code asymmetric relationships, you want to use a directed graph, where each edge has a head and a tail node. Graphs are useful for representing a number of things, which include: Transportation networks, communication networks, information networks, social networks, and dependency networks. Transportation networks can be represented by nodes as the cities or particular locations, and the edges representing the paths of travel between the locations. Communication networks utilize the nodes as devices, and the edges as representing the link between them. Information networks represent the webpages as the nodes, and hyperlinks between two webpages as the edges. Social networks represent people as the nodes, and social connections between them as the edges. Dependency networks represent elements as the nodes, and dependencies/requisites as the edges. “One of the fundamental operations in a graph is traversing a sequence of nodes connected by edges”. A simple path is a path in which all vertices are distinct from each other, meaning no vertex is visited more than once. A cycle is a path that begins at a node and ends at the same node where the nodes visited in the path are only visited once. A connected graph is a graph that maintains a path between every pair of vertices. The distance between two nodes is the minimum number of edges that must be traversed to get from one node to another. A graph can be defined as a tree if it is connected and does not contain a cycle. Trees look very similar to the heaps discussed in the previous chapter where each node has a parent and descendants. Rooted trees are trees that have a designated root node. If g is an undirected graph with n nodes, then if it has two of these three characteristics it automatically has the third, and thus means that it is a tree: 1) G is connected, 2) G does not contain a cycle, 3) G has n-1 edges.
Section 2: Graph Connectivity and Graph Traversal
One of the main questions regarding graphs is the problem determining s-t connectivity. This problem looks at a two nodes, s and t, within a graph and attempts to determine whether or not there is a path between them. For smaller graphs, it is very easy to just look at them to attempt to find a path, however the goal is to obtain an algorithm that will solve the problem when dealing with very large graphs. There are two main algorithms that are used to solve this problem, which are breadth-first search (BFS) and depth-first search (DFS). Breadth-first search starts at the designated start node, and explores outwards in all directions, moving along layer by layer. Each layer of nodes is a specific distance away from the start node, and are connected to the start node through nodes contained in the previous layers. All nodes that are one edge away from the start node are included in layer 1. All nodes that are 1 layer away from nodes in layer 1 AND are not included in layer 1 are then included in layer 2. This process repeats until all nodes in the graph are contained within a layer. This search algorithm can be thought of as a “flood”, where we slowly move outwards in all directions passing over nodes. 'For each j >= 1, layer Lj produced by BFS consists of all nodes at distance exactly j from s. There is a path from s to t if and only if t appears in some layer'. One property of a BFS is that it produces a tree as its result, with s as the root node. 'Let T be a breadth-first search tree, let x and y be nodes in T belonging to layers Li and Lj respectively, and let (x, y) be an edge of G. Then i and j differ by at most 1.' 'The set R produced at the end of the BFS algorithm is precisely the connected component of G containing S.' The BFS determines the distance between a node and s by result of what layer the node is located in. The depth-first search takes the approach if exploring a path all the way to the very end, and then backtracking until it finds another unexplored path to follow and repeats the process using a recursive form. Once a node has been explored, it is marked to that the DFS knows where it has already traversed. DFS also produces a tree with s as the root, however the tree has a much different structure. The BFS tree is very wide but short, whereas the DFS tree is very skinny but tall. 'Let T be a depth-first search tree, let x and y be nodes in T, and let (x, y) be an edge of G that is not an edge of T. Then one of x or y is an ancestor of the other.' 'For any two nodes s and t in a graph, their connected components are either identical or disjoint.' That is, they can either be divided into two separate graphs or are connected in some way.
