Lee's Journal
Preface:
The authors' goal is to teach the reader how to precisely describe a problem, how to recognize which design principle is appropriate for a particular problem, and how to develop efficient algorithms. The authors list areas that use algorithmic design principles, such as network flow, artificial intelligence, and data mining. According to the authors, algorithms are not only an area within computer science, but also a powerful tool to apply to other areas of computer science. The Preface provided a clear overview of the subject and goals of the course, and is a good choice as our first reading assignment. I would rate the Preface as a 7/10.
Chapter 1: Introduction to Representative Problems
The first example the authors use to accomplish the above-mentioned goals is the Stable Matching Problem. The Stable Matching Problem originated in 1962 with two economists, David Gale and Lloyd Shapley, who were concerned with designing a recruiting process that was self-enforcing. A self-enforcing process would encourage applicants and companies to keep their decisions final, resulting in stability. The goal of the algorithm is to produce stable matchings, i.e., matchings with no instabilities, in which no participant has an incentive to change his or her current assignment. In order to design an algorithm to solve this problem, Gale and Shapley focused on a simpler model in which members of two sets are matched in marriages. The authors of the book then analyze the algorithm with direct proofs to ensure it works. The authors then discuss other famous problems such as Interval Scheduling and Bipartite Matching, both of which are special cases of the Independent Set Problem. The Independent Set Problem is an example of an NP-complete problem: a problem that is difficult to solve, but whose proposed solutions can easily be verified. The last example they discuss is the Competitive Location Problem, which is an example of a PSPACE-complete problem; PSPACE-complete problems are believed to be difficult both to solve and to verify. I enjoyed reading Chapter 1, particularly when NP-complete and PSPACE-complete problems were mentioned. I would rate the reading as an 8/10.
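As a concrete illustration, here is a minimal Python sketch of the proposal process the chapter describes; the data layout (dictionaries of ranked preferences) and the example participants are my own hypothetical choices, not the book's pseudocode.

```python
# A minimal sketch of the Gale-Shapley algorithm (hypothetical data layout:
# each side ranks the other side from most to least preferred).
from collections import deque

def stable_matching(men_prefs, women_prefs):
    # rank[w][m] = position of m in w's preference list (lower is better)
    rank = {w: {m: i for i, m in enumerate(prefs)}
            for w, prefs in women_prefs.items()}
    next_proposal = {m: 0 for m in men_prefs}   # index of next woman to propose to
    engaged_to = {}                              # woman -> man
    free_men = deque(men_prefs)

    while free_men:
        m = free_men.popleft()
        w = men_prefs[m][next_proposal[m]]
        next_proposal[m] += 1
        if w not in engaged_to:                      # w is free: engage
            engaged_to[w] = m
        elif rank[w][m] < rank[w][engaged_to[w]]:    # w prefers m: swap partners
            free_men.append(engaged_to[w])
            engaged_to[w] = m
        else:                                        # w rejects m, he stays free
            free_men.append(m)
    return {m: w for w, m in engaged_to.items()}

# Hypothetical example: two participants on each side.
men = {"a": ["x", "y"], "b": ["y", "x"]}
women = {"x": ["b", "a"], "y": ["a", "b"]}
print(stable_matching(men, women))  # {'a': 'x', 'b': 'y'}
```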
Chapter 2: Basics of Algorithm Analysis
2.1 - 2.2 The goal of algorithm analysis is to determine whether an efficient solution exists for a given problem. In the first section of Chapter 2, the authors propose various definitions of “efficiency”. They ultimately decide that an algorithm should be described as “efficient” only if it has a worst-case running time that is polynomial; in practice, polynomial-time algorithms tend to have running times proportional to n, n log n, n^2, or n^3, whereas problems with no known polynomial-time algorithm tend to be very difficult to solve. The authors then discuss asymptotic upper, lower, and tight bounds. An asymptotic upper bound, O(f(n)), bounds the worst-case running time from above, while a lower bound, Ω(f(n)), bounds it from below. A bound is described as “tight”, written Θ(f(n)), when the upper and lower bounds match, so the growth rate of the running time is characterized exactly. I thought 2.2 was a little more difficult to understand. I would rate the reading as a 5/10.
2.3 - 2.5 The authors then discuss how to implement the Gale-Shapley Algorithm using lists and arrays. The trade-off of using only lists is that although insertion and deletion take O(1), one cannot find the ith element of a list in O(1) as with an array; arrays, on the other hand, cannot perform insertion and deletion in O(1). As a result, the authors implement the preference lists with arrays and the list of free men with a linked list. The authors then discuss some common running times. A linear running time, O(n), is a constant times the size of the input. An example of an algorithm with a linear running time is searching for a particular element within a list; in the worst case the element could be the nth element of the list. An O(n log n) running time arises when an algorithm splits its input into two equal pieces, solves each piece recursively, and combines the two solutions in linear time. A quadratic running time, O(n^2), usually arises with nested loops, or when an algorithm iterates over all pairs drawn from two n-sized inputs. A running time of O(n^k) arises when an algorithm must examine all subsets of size k of an n-element input, and truly exponential running times such as O(2^n) arise when all subsets must be considered. Running times help algorithm designers determine whether their design is an improvement over a less elegant brute-force algorithm. The authors then discuss another data structure, the heap, and use it to implement a data structure known as a priority queue. In a priority queue, the element with the highest priority (the smallest key) is the first to be removed. A heap is a natural data structure for implementing a priority queue, since the root of the heap always holds the smallest key. If an element is added to the heap, the Heapify-up procedure compares the new element's key with its parent's and, if the child's key is smaller than the parent's, swaps the parent and child and continues up the heap. If an element is deleted from the heap, the element in the last position is moved into the resulting “hole”; the algorithm then calls Heapify-up if that element is smaller than its new parent, or Heapify-down if it is larger than one of its children. I thought the chapter was pretty easy to understand. I would rate it as an 8/10.
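To make the heap operations concrete, here is a minimal sketch of a heap-based priority queue with Heapify-up and Heapify-down; the 0-indexed array layout and the method names are hypothetical choices of mine, not the book's pseudocode.

```python
# A minimal sketch of a heap-based priority queue (smallest key at the root).
class MinHeap:
    def __init__(self):
        self.items = []

    def _heapify_up(self, i):
        while i > 0:
            parent = (i - 1) // 2
            if self.items[i] < self.items[parent]:       # child smaller: swap up
                self.items[i], self.items[parent] = self.items[parent], self.items[i]
                i = parent
            else:
                break

    def _heapify_down(self, i):
        n = len(self.items)
        while True:
            left, right, smallest = 2 * i + 1, 2 * i + 2, i
            if left < n and self.items[left] < self.items[smallest]:
                smallest = left
            if right < n and self.items[right] < self.items[smallest]:
                smallest = right
            if smallest == i:
                break
            self.items[i], self.items[smallest] = self.items[smallest], self.items[i]
            i = smallest

    def insert(self, key):
        self.items.append(key)               # add at the end, then restore heap order
        self._heapify_up(len(self.items) - 1)

    def extract_min(self):
        top = self.items[0]
        last = self.items.pop()              # last element fills the hole at the root
        if self.items:
            self.items[0] = last
            self._heapify_down(0)
        return top

h = MinHeap()
for k in [5, 3, 8, 1]:
    h.insert(k)
print(h.extract_min(), h.extract_min())  # 1 3
```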
Chapter 3: Graphs
Graphs are useful expressions of discrete problems, so the authors discuss graph connectivity and traversal. There are two algorithms that perform graph traversal: breadth-first search and depth-first search. In breadth-first search, the algorithm first visits all the nodes directly connected to the root node and appends them to a conceptual layer labelled Layer 1. From there, all nodes not already in some layer that are connected to a node in layer L are grouped into layer L+1. Consequently, a breadth-first search tree tends to be bushier than a depth-first search tree. In depth-first search, the algorithm explores one child of the parent node at a time until it reaches a dead end; from there it backtracks and explores the descendants of the next unexplored node. A queue is used to implement breadth-first search and a stack is used to implement depth-first search. In section 3.4, the authors then discuss how to test whether a graph is bipartite using breadth-first search. A bipartite graph has the following properties: its nodes can be split into two sets such that every edge connects a node in one set to a node in the other, and it contains no odd cycle. Because breadth-first search groups nodes into layers, the algorithm can detect whether two nodes of the same layer are connected, which would indicate an odd cycle. In section 3.5, connectivity is discussed for directed graphs. Traversing a directed graph is very similar to traversing an undirected graph, except that to find the nodes that can reach a given node the search must also be run on the reverse of the graph. In the final section, the authors discuss directed acyclic graphs (DAGs), which, as the name states, contain no cycles. A DAG necessarily has a topological ordering, and a graph with a topological ordering is necessarily a DAG. A topological ordering is an ordering of a directed graph's nodes such that every edge points from an earlier node to a later node; in particular, the first node has no incoming edges. I found the chapter very interesting. I would rate it as a 9/10.
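A minimal sketch of the BFS-based bipartiteness test described above, assuming the graph is given as an adjacency-list dictionary and is connected (the example graphs are hypothetical):

```python
# Color nodes layer by layer with BFS; an edge joining two nodes of the same
# color signals an odd cycle, so the graph is not bipartite.
from collections import deque

def is_bipartite(graph, source):
    color = {source: 0}                  # layer parity acts as the color
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in color:
                color[v] = 1 - color[u]  # next layer gets the opposite color
                queue.append(v)
            elif color[v] == color[u]:   # same-color edge => odd cycle
                return False
    return True

triangle = {1: [2, 3], 2: [1, 3], 3: [1, 2]}            # odd cycle, not bipartite
square = {1: [2, 4], 2: [1, 3], 3: [2, 4], 4: [1, 3]}   # even cycle, bipartite
print(is_bipartite(triangle, 1), is_bipartite(square, 1))  # False True
```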
Chapter 4: Greedy Algorithms
The chapter begins with a quote from the 1980s film Wall Street, in which Michael Douglas's character espouses the benefits of greed. This film reference sets up a good introduction to the idea of greedy algorithms: if a greedy algorithm can produce an optimal solution for a problem, that implies the structure of the problem is such that making locally optimal decisions leads to a globally optimal solution.
Section 4.1: Interval Scheduling
In section 4.1, the concept of greedy algorithms is introduced. Greedy algorithms are algorithms that make a locally optimal choice at each iteration. Greedy algorithms do not always produce optimal solutions, but an example problem for which a greedy algorithm does provide the optimal solution is the Interval Scheduling Problem. The goal of the Interval Scheduling Problem is to find the largest subset of some n jobs that can be scheduled together without conflicts. The best greedy rule is to repeatedly select the job with the earliest finish time. The algorithm can be found on pg 118. To prove that “greedy stays ahead”, let O be an optimal solution to the scheduling problem. If A, the set returned by the greedy algorithm, is optimal, it will have the same number of elements as O. Since A is built by the greedy rule, for each position r the rth job added to A finishes no later than the rth job in O, and from this it follows that A is optimal. The proof is on page 120. To implement the algorithm, sort all requests by finish time and label them, and place the start times in an array. The algorithm then makes a single pass, inserting into A each job that has the earliest finish time and does not conflict with the previously inserted jobs. The sorting of the jobs and the array initialization take O(n log n); the selection pass itself is O(n). Thus, the algorithm has a run time of O(n log n).
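A minimal sketch of the earliest-finish-time rule, assuming jobs are given as hypothetical (start, finish) pairs:

```python
# Greedy Interval Scheduling: sort by finish time, then sweep once.
def interval_schedule(jobs):
    jobs = sorted(jobs, key=lambda job: job[1])   # sort by finish time: O(n log n)
    selected, last_finish = [], float("-inf")
    for start, finish in jobs:                    # single linear pass
        if start >= last_finish:                  # compatible with the jobs kept so far
            selected.append((start, finish))
            last_finish = finish
    return selected

jobs = [(1, 4), (3, 5), (0, 6), (5, 7), (3, 9), (5, 9), (6, 10), (8, 11)]
print(interval_schedule(jobs))   # [(1, 4), (5, 7), (8, 11)]
```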
Section 4.2: Scheduling To Minimize Lateness: Exchange Argument
In section 4.2, greedy algorithms are applied to a variation of the scheduling problem in which the goal is to schedule all the jobs without conflicts while minimizing the maximum lateness. In contrast to the original scheduling problem, the jobs in this variation do not have fixed start and end times, but only a duration and a deadline. To produce an optimal solution, the greedy algorithm schedules the jobs in order of ascending deadline. The algorithm can be found on pg 127-128. The algorithm produces an optimal solution, which can be proved with an exchange argument. To prove the greedy algorithm's optimality with an exchange argument, suppose there is an optimal schedule O that differs from the schedule A returned by the greedy algorithm. If O has an inversion, then there is a pair of consecutively scheduled jobs, i and j, whose deadlines are out of order, and swapping them does not increase the maximum lateness. If swapping continues, the differences between O and A are eliminated while the maximum lateness never increases, proving that A is an optimal solution.
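A minimal sketch of the earliest-deadline-first rule, assuming each job is a hypothetical (duration, deadline) pair and all jobs run back to back from a common start time:

```python
# Schedule jobs in order of ascending deadline and track the maximum lateness.
def schedule_min_lateness(jobs, start_time=0):
    ordered = sorted(jobs, key=lambda job: job[1])    # ascending deadline
    t, max_lateness, schedule = start_time, 0, []
    for duration, deadline in ordered:
        t += duration                                 # jobs run with no idle time
        max_lateness = max(max_lateness, t - deadline)
        schedule.append((duration, deadline, t))      # record the finish time
    return schedule, max_lateness

jobs = [(2, 4), (1, 2), (3, 6)]
print(schedule_min_lateness(jobs))
# ([(1, 2, 1), (2, 4, 3), (3, 6, 6)], 0): every job meets its deadline here
```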
Section 4.4: Shortest Paths in a Graph
In section 4.4, Dijkstra's algorithm, a greedy algorithm that determines the shortest path from one node to the others, is discussed. The algorithm begins with an explored set of nodes S containing only the source node s. The distance from the source node s to itself is 0; since the distances from s to all other nodes are unknown, they are initially treated as infinite. At each iteration, the algorithm considers the edges leaving the explored set S, selects the unexplored node that can be reached most cheaply through one of them, adds that node to S, and records the total distance of the path from s to it. It repeats this process until every reachable node has been explored (or the destination node is reached). When the algorithm terminates, it has discovered the shortest path from the source node to each explored node. The algorithm can be found on pg 138. To implement Dijkstra's algorithm, place the nodes in a priority queue keyed by their tentative distances; with a heap-based priority queue, each of the operations is O(log n). Therefore, the algorithm's run time is O(m log n).
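A minimal sketch of Dijkstra's algorithm, using Python's heapq module as the priority queue and a hypothetical adjacency-list graph:

```python
# Dijkstra's algorithm: repeatedly settle the cheapest reachable node.
import heapq

def dijkstra(graph, source):
    dist = {source: 0}
    pq = [(0, source)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):     # stale queue entry, skip it
            continue
        for v, cost in graph[u]:
            if d + cost < dist.get(v, float("inf")):
                dist[v] = d + cost            # found a shorter path to v
                heapq.heappush(pq, (dist[v], v))
    return dist

graph = {"s": [("a", 1), ("b", 4)], "a": [("b", 2), ("t", 6)],
         "b": [("t", 3)], "t": []}
print(dijkstra(graph, "s"))  # {'s': 0, 'a': 1, 'b': 3, 't': 6}
```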
Section 4.5: Minimum Spanning Tree
In section 4.5, the authors discuss three greedy algorithms that produce a minimum spanning tree of a graph. A minimum spanning tree is a set of edges of minimum total cost that keeps the graph connected. The three algorithms that produce such a set are the following: Kruskal's Algorithm, Prim's Algorithm, and the Reverse-Delete Algorithm.
Prim's algorithm is the simplest and is similar to Dijkstra's. In Prim's Algorithm, a starting node s is added to a set T; at each step, the cheapest edge with exactly one endpoint in T is selected and its other endpoint is added to T. This process continues until all nodes have been reached.
In Kruskal's Algorithm, edges are considered in order of increasing cost and added to an initially empty set E. If adding an edge to E would create a cycle, that edge is discarded, and the algorithm continues with the remaining edges that do not violate this condition.
Reverse-Delete works in the opposite direction from Kruskal's. In Reverse-Delete, edges are deleted from the graph in order of descending cost. If an edge's deletion would destroy connectivity, it is skipped.
The above algorithms are optimal because they all rely on one of the following two properties: the Cut Property and the Cycle Property. Kruskal's and Prim's Algorithms' optimality is justified by the Cut Property. The Cut Property states that for any subset of nodes S, not equal to all of the graph's nodes V, the minimum-cost edge e with one endpoint in S and the other outside S is necessarily part of the minimum spanning tree. Kruskal's and Prim's Algorithms are structured to include only such edges at each iteration. Reverse-Delete's optimality is justified by the Cycle Property. The Cycle Property states that if e is the most expensive edge on some cycle C, then e will not be part of the minimum spanning tree. Reverse-Delete is structured so that only the most expensive edges that do not destroy connectivity are removed from the graph, thereby producing a minimum spanning tree.
Prim's Algorithm can be implemented with a priority queue with a run time of O(m log n).
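A minimal sketch of Prim's Algorithm along these lines, using heapq as the priority queue and a hypothetical adjacency-list graph:

```python
# Prim's algorithm: grow the tree one cheapest crossing edge at a time.
import heapq

def prim_mst(graph, start):
    in_tree, mst_edges = {start}, []
    pq = [(cost, start, v) for v, cost in graph[start]]
    heapq.heapify(pq)
    while pq and len(in_tree) < len(graph):
        cost, u, v = heapq.heappop(pq)
        if v in in_tree:                       # this edge would close a cycle
            continue
        in_tree.add(v)
        mst_edges.append((u, v, cost))
        for w, c in graph[v]:
            if w not in in_tree:
                heapq.heappush(pq, (c, v, w))
    return mst_edges

graph = {"a": [("b", 1), ("c", 4)], "b": [("a", 1), ("c", 2)],
         "c": [("a", 4), ("b", 2)]}
print(prim_mst(graph, "a"))  # [('a', 'b', 1), ('b', 'c', 2)]
```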
Section 4.6: Implementing Kruskal's Algorithm
In section 4.6, the authors discuss efficient implementations of Kruskal's Algorithm. To avoid recomputing the connected components with every insertion of an edge, the Union-Find data structure efficiently determines whether an edge's insertion would create a cycle. The Union-Find data structure has three operations: MakeUnionFind, Find, and Union. The first operation takes a set of nodes S of size n and returns n sets of size 1. Find returns the name of the set containing the input node. The Find operation is used to determine whether the two endpoints of an edge belong to the same set; if they do, then adding that edge would create a cycle, and if they are members of two different sets, then it is valid to add the edge. Three different implementations of Union-Find are also discussed. The array-based implementation gives Find an O(1) run time, MakeUnionFind is O(n), and any sequence of k Union operations takes at most O(k log k) time. With the pointer-based implementation, the Union operation is O(1), MakeUnionFind is O(n), and Find is O(log n). The third implementation is a variation of the pointer-based implementation that adds path compression. Path compression speeds up later Find operations by making every node on the path from a node s to its set's root point directly to that root. With this implementation, Kruskal's Algorithm has a run time of O(m log n). I would rate the reading an 8/10.
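A minimal sketch of Kruskal's Algorithm on top of a Union-Find structure with path compression; the union-by-size rule and the (cost, u, v) edge format are my own hypothetical choices, not the book's pseudocode:

```python
# Kruskal's algorithm with Union-Find (path compression + union by size).
def kruskal(nodes, edges):
    parent = {v: v for v in nodes}          # MakeUnionFind: n singleton sets
    size = {v: 1 for v in nodes}

    def find(v):                            # follow pointers, compressing the path
        root = v
        while parent[root] != root:
            root = parent[root]
        while parent[v] != root:
            parent[v], v = root, parent[v]
        return root

    def union(a, b):                        # attach the smaller set to the larger
        ra, rb = find(a), find(b)
        if size[ra] < size[rb]:
            ra, rb = rb, ra
        parent[rb] = ra
        size[ra] += size[rb]

    mst = []
    for cost, u, v in sorted(edges):        # increasing cost; sorting dominates
        if find(u) != find(v):              # different sets => no cycle created
            union(u, v)
            mst.append((cost, u, v))
    return mst

edges = [(1, "a", "b"), (2, "b", "c"), (4, "a", "c")]
print(kruskal(["a", "b", "c"], edges))  # [(1, 'a', 'b'), (2, 'b', 'c')]
```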
Section 4.7 : Clustering
In section 4.7, the authors discuss how to optimally partition a set of points into k clusters, where an optimal clustering is one of maximum spacing: the minimum distance between points in different clusters is as large as possible. The algorithm repeatedly adds an edge between the closest pair of points lying in different clusters, merging their clusters in a graph H. If a pair of points is already in the same cluster, the edge, which would have created a cycle, is not added. This process continues until the desired k clusters remain. The above process is exactly Kruskal's Algorithm, stopped before the k-1 most expensive edges of the spanning tree are added. The proof of optimality is on page 160.
Section 4.8: Huffman Coding
In section 4.8, the authors discuss Huffman's Algorithm and its use in data compression. Before data is transmitted, it is compressed so that fewer bits need to be sent. In order for the encoding to be efficient, the most frequent letters in some alphabet S should have encodings that are shorter than those of the less frequent letters. For the encoding to be decodable without ambiguity, all encodings should form a prefix code; a prefix code is an encoding in which no codeword is a prefix of another. Huffman's Algorithm meets these requirements optimally by using a bottom-up approach. At every iteration, the algorithm takes the two lowest-frequency letters (or subtrees), y and z, and creates a parent node x whose frequency is the sum of the frequencies of y and z. It then recurs until only one node is left, which is the root of the tree defining the prefix code for the entire input alphabet. The algorithm is on page 172 and the proof is on page 174.
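A minimal sketch of Huffman's bottom-up merging using heapq; the frequency table is hypothetical:

```python
# Huffman coding: repeatedly merge the two lowest-frequency trees.
import heapq

def huffman_codes(freqs):
    # Each heap entry is (frequency, tie_breaker, tree); a tree is either a
    # letter or a (left, right) pair. The tie-breaker keeps comparisons on
    # numbers only.
    heap = [(f, i, letter) for i, (letter, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)     # two lowest-frequency trees
        f2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, counter, (t1, t2)))
        counter += 1

    codes = {}
    def assign(tree, prefix):               # walk the tree to read off prefix codes
        if isinstance(tree, tuple):
            assign(tree[0], prefix + "0")
            assign(tree[1], prefix + "1")
        else:
            codes[tree] = prefix
    assign(heap[0][2], "")
    return codes

print(huffman_codes({"a": 45, "b": 13, "c": 12, "d": 5}))
# {'b': '00', 'd': '010', 'c': '011', 'a': '1'}: the most frequent letter
# gets the shortest code.
```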
Chapter 5: Divide and Conquer
Divide and conquer algorithms divide a problem into q smaller subproblems and recursively solve those subproblems. To analyze the efficiency of such algorithms it is necessary to solve a recurrence relation. The three methods for solving recurrence relations are unrolling the recursion, substitution, and partial substitution.
Section 5.1: A First Recurrence: The MergeSort Algorithm
In section 5.1, the authors solve the recurrence relation of MergeSort as an example. MergeSort divides the problem into two subproblems and recursively solves each of them; the cost of recombining the sub-solutions is O(n). Therefore, MergeSort's recurrence relation is the following: T(n) ≤ 2T(n/2) + cn for n > 2, and T(2) ≤ c. The steps to unroll the recursion, substitute, and perform partial substitution are found on pages 212-213. The recurrence is bounded by O(n log n).
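A minimal MergeSort sketch matching this recurrence (two recursive calls plus a linear-time merge); the example list is hypothetical:

```python
# MergeSort: T(n) <= 2T(n/2) + cn, giving O(n log n).
def merge_sort(items):
    if len(items) <= 1:
        return items
    mid = len(items) // 2
    left, right = merge_sort(items[:mid]), merge_sort(items[mid:])  # two subproblems
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):       # linear-time combine step
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]

print(merge_sort([5, 2, 8, 1, 9, 3]))  # [1, 2, 3, 5, 8, 9]
```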
Section 5.2: Further Recurrence Relations
In section 5.2, the authors generalize the recurrence to obtain the run times of recursive algorithms that divide a problem of size n into q subproblems of size n/2 and combine the results in linear time. If q = 1, the run time is O(n); if q = 2, the run time is O(n log n); if q > 2, the run time is O(n^(log_2 q)).
Section 5.3: Counting Inversions
In section 5.3, the authors use a divide and conquer algorithm modeled on MergeSort to count the number of inversions in a list. Knowing the number of inversions is useful for determining how similar two rankings are. Using a variant of MergeSort, the authors develop an algorithm, Sort-and-Count, that divides a list and, while sorting each sub-list, also counts the inversions within it; during the merge operation, the inversions between the two sub-lists are counted. The run time of the algorithm is O(n log n).
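A minimal sketch of Sort-and-Count, counting the cross inversions during the merge step (the example list is hypothetical):

```python
# Sort a list and count its inversions (pairs i < j with a[i] > a[j]) in O(n log n).
def sort_and_count(a):
    if len(a) <= 1:
        return a, 0
    mid = len(a) // 2
    left, inv_left = sort_and_count(a[:mid])
    right, inv_right = sort_and_count(a[mid:])
    merged, i, j, cross = [], 0, 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:                                  # right[j] jumps ahead of every element
            merged.append(right[j])            # still unmerged in the left half
            j += 1
            cross += len(left) - i
    merged += left[i:] + right[j:]
    return merged, inv_left + inv_right + cross

print(sort_and_count([2, 4, 1, 3, 5]))  # ([1, 2, 3, 4, 5], 3)
```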
Section 5.4: Finding the Closest Pair of Points
In section 5.4, the authors discuss finding the closest pair of points in a plane. To determine the pair of points closest together, the points, sorted by x-coordinate, are partitioned in half and the closest pair is found in each half. The value d = min(q, r), where q is the minimum distance found in one half of the plane and r is the minimum distance found in the other half. Since there could exist a pair of points straddling the partition whose distance is less than d, the algorithm must also consider the points that are within distance d of the dividing line; examining these points takes O(n) time. The recurrence therefore has the same form as MergeSort's, and the algorithm has a run time of O(n log n).
Section 5.5: Integer Multiplication
Another application of divide and conquer the authors discuss is integer multiplication. The algorithm typically taught to children is O(n^2), but it can be improved using recursion and the divide and conquer principle. Dividing the problem into four subproblems of half the size still yields a quadratic run time; if the problem is instead reduced to three subproblems of half the size, the overall run time drops to O(n^(log_2 3)), roughly O(n^1.59). I found the reading to be interesting, 6/10.
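A minimal sketch of the three-subproblem (Karatsuba) multiplication, splitting each number into high and low halves; the digit-count splitting rule and example factors are hypothetical choices:

```python
# Karatsuba multiplication: three half-size multiplications instead of four.
def karatsuba(x, y):
    if x < 10 or y < 10:                      # base case: a single-digit factor
        return x * y
    half = max(len(str(x)), len(str(y))) // 2
    base = 10 ** half
    x_hi, x_lo = divmod(x, base)
    y_hi, y_lo = divmod(y, base)
    hi = karatsuba(x_hi, y_hi)                                # subproblem 1
    lo = karatsuba(x_lo, y_lo)                                # subproblem 2
    mid = karatsuba(x_hi + x_lo, y_hi + y_lo) - hi - lo       # subproblem 3
    return hi * base * base + mid * base + lo

print(karatsuba(1234, 5678), 1234 * 5678)  # both 7006652
```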
Chapter 6: Dynamic Programming
Dynamic programming consists of decomposing a problem into simpler subproblems and either iterating over those subproblems to build up a solution or solving the problem recursively while using a memoization array to store the subproblem values.
Section 6.1: Weighted Interval Scheduling: A Recursive Procedure
The authors introduce dynamic programming with the Weighted Interval Scheduling problem. A variation of the Interval Scheduling problem, its goal is to find the set of non-conflicting jobs with the maximum total weight. Using dynamic programming, the algorithm considers, for each job j, whether j is in the optimal solution or not: it takes the maximum of the value of including j (j's weight plus the optimal solution over the jobs compatible with j) and the value of excluding j (the optimal solution over the first j-1 jobs). The algorithm recurs to explore the remaining jobs, using memoization to reduce the running time from exponential to polynomial. The algorithm for the problem is on page 256. To recover the actual solution from the memoization array, an algorithm simply backtracks through the stored values. The algorithm for finding the solution is on page 258.
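A minimal sketch of the memoized recurrence, assuming jobs are hypothetical (start, finish, value) triples and using bisect to compute the "last compatible job" indices:

```python
# Weighted Interval Scheduling: OPT(j) = max(v_j + OPT(p(j)), OPT(j-1)).
import bisect

def weighted_interval_scheduling(jobs):
    jobs = sorted(jobs, key=lambda job: job[1])        # order by finish time
    finishes = [f for _, f, _ in jobs]
    # p[j] = number of jobs that finish no later than job j starts
    p = [bisect.bisect_right(finishes, s) for s, _, _ in jobs]
    memo = {0: 0}                                      # OPT(0) = 0 (no jobs)

    def opt(j):                                        # OPT over the first j jobs
        if j not in memo:
            value = jobs[j - 1][2]
            memo[j] = max(value + opt(p[j - 1]),       # include job j
                          opt(j - 1))                  # or exclude it
        return memo[j]

    return opt(len(jobs))

jobs = [(0, 3, 2), (1, 5, 4), (4, 6, 4), (5, 8, 7)]
print(weighted_interval_scheduling(jobs))  # 11: take (1, 5, 4) and (5, 8, 7)
```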
Section 6.2: Principles of Dynamic Programming: Memoization or Iteration over Subproblems
Dynamic programming can also use iteration, rather than recursion, to construct an optimal solution. The algorithm simply iterates through the subproblems and stores each result at the corresponding index of a memoization array. In order for a problem to be solvable by dynamic programming, it must satisfy the following criteria: (1) there are only a polynomial number of subproblems; (2) the solution to the original problem can be easily computed from the solutions to the subproblems; and (3) there is a natural ordering of the subproblems and an easy-to-compute recurrence that allows the solution to a subproblem to be determined from smaller subproblems.
Section 6.3: Segmented Least Squares: Multi-Way Choices
In section 6.3, the authors introduce the Segmented Least Squares problem to illustrate that the choices in a dynamic program are not always binary. In the Segmented Least Squares problem, the goal is to fit a sequence of line segments through an ordered set of points, balancing the total fitting error against the number of segments used (each segment incurs a fixed penalty). Since the choice is not binary, the subproblem considers every possible start i of the last segment: the error of the segment fit through points i through j, plus the constant penalty, plus the optimal solution for the points before i. The subproblem can be found on page 265. The run time of the algorithm is O(n^2) and the algorithm can be found on page 266.
Section 6.4: Subset Sums and Knapsacks: Adding a Variable
In section 6.4, the authors discuss the Subset Sum problem and a variation of it called the Knapsack problem. The goal of Subset Sum is to select a set of jobs, each with a duration w_i, whose total duration is as large as possible without exceeding the maximum time W. The Knapsack problem is a little more complex, since the goal is to find the subset of items with the highest possible total value whose combined weight is at most the maximum weight W. In contrast to the previous section, the subproblems are binary decisions, either the job or item is in the optimal solution or it is not, but a new variable tracking the remaining weight must be added to the subproblems. The subproblem and algorithm can be found on page 269. The run time of the algorithm is O(nW).
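A minimal sketch of the O(nW) Subset Sum dynamic program, with hypothetical durations:

```python
# Subset Sum DP: opt[i][w] is the largest total duration achievable using
# only the first i jobs within the bound w.
def subset_sum(durations, W):
    n = len(durations)
    opt = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        w_i = durations[i - 1]
        for w in range(W + 1):
            opt[i][w] = opt[i - 1][w]                      # leave job i out
            if w_i <= w:                                   # or include it, if it fits
                opt[i][w] = max(opt[i][w], w_i + opt[i - 1][w - w_i])
    return opt[n][W]

print(subset_sum([2, 3, 7], 10))  # 10 (choose durations 3 and 7)
```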
Chapter 7: Network Flow
Network flow is a class of problems in which each edge of a directed graph has a capacity and carries flow, and each node can receive and send flow. Network flow problems are particularly broad and include problems such as the Bipartite Matching Problem and the Maximum-Flow Problem.
Section 7.1: The Maximum-Flow Problem and the Ford-Fulkerson Algorithm
The Maximum-Flow Problem is concerned with finding the maximum amount of flow that can be sent from the source to the sink of a given directed graph without exceeding any edge capacity. The maximum flow of any given graph equals the minimum capacity over all partitions, or "cuts", of the graph. The Ford-Fulkerson Algorithm obtains the maximum flow of a graph. It works with a "residual graph": each edge with underutilized capacity appears as a forward edge whose capacity is the difference between its capacity and its current flow, while the flow already placed on an edge of the original graph appears as a backward edge in the residual graph. Repeatedly finding a source-to-sink path in the residual graph and pushing the bottleneck amount of flow along it is known as augmenting a path. Each augmentation takes O(m) time, and with integer capacities there are at most C augmentations, where C is the total capacity of the edges leaving the source, so the Ford-Fulkerson Algorithm has a run time of O(mC). The algorithm and proof are on page 344.
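A minimal sketch of the Ford-Fulkerson method using a depth-first search to find augmenting paths; the dict-of-dicts capacity format and the example network are hypothetical:

```python
# Ford-Fulkerson: repeatedly find an augmenting s-t path in the residual
# graph and push the bottleneck flow along it.
def max_flow(capacity, s, t):
    # residual[u][v] starts at the capacity and shrinks as flow is pushed;
    # residual[v][u] is the backward edge that lets flow be undone.
    residual = {u: dict(edges) for u, edges in capacity.items()}
    for u, edges in capacity.items():
        for v in edges:
            residual.setdefault(v, {}).setdefault(u, 0)

    def find_path():
        stack, parent = [s], {s: None}
        while stack:
            u = stack.pop()
            if u == t:                        # reconstruct the path back to s
                path = []
                while parent[u] is not None:
                    path.append((parent[u], u))
                    u = parent[u]
                return path[::-1]
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    stack.append(v)
        return None

    flow = 0
    path = find_path()
    while path:
        bottleneck = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= bottleneck      # use up forward capacity
            residual[v][u] += bottleneck      # add backward (undo) capacity
        flow += bottleneck
        path = find_path()
    return flow

capacity = {"s": {"a": 3, "b": 2}, "a": {"b": 1, "t": 2}, "b": {"t": 3}, "t": {}}
print(max_flow(capacity, "s", "t"))  # 5
```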
Section 7.2: Maximum Flows and Minimum Cuts in a Network
In section 7.2, the authors further analyze the Ford-Fulkerson algorithm. They conclude that the algorithm is guaranteed to return the maximum flow, whose value equals the capacity of a minimum cut. This follows from the property that the value of a flow f, v(f), equals the total flow leaving a cut minus the total flow entering it; therefore, v(f) is bounded above by the capacity of any cut.
Section 7.5: A First Application: The Bipartite Matching Problem
The Ford-Fulkerson algorithm not only solves the Maximum-Flow Problem but also the Bipartite Matching Problem. The Bipartite Matching Problem asks for the largest subset of edges of a bipartite graph such that each node appears in at most one selected edge. By adding a source connected to one side and a sink connected to the other, with all capacities set to 1, the value of the maximum flow returned by the Ford-Fulkerson Algorithm equals the size of the largest matching in the graph. The run time of the algorithm for the Bipartite Matching Problem is O(mn), and the explanation is on page 370. To determine whether a graph has a perfect matching, one checks that there is no subset of nodes A on one side whose set of neighbors is smaller than A. The proof is on page 372.
Section 7.7: Extensions to the Maximum-Flow Problem
In section 7.7, the authors discuss extensions to the Maximum-Flow problem. In one of these extensions, the Circulation Problem, there is no single source or sink; instead, each node v has a demand d_v: if d_v is positive, the node is a consumer that requires d_v units of incoming flow, and if d_v is negative, the node is a supplier. The goal is to find a "circulation" that satisfies all the demands on the given graph. A circulation must satisfy the following conditions: (i) the flow on every edge must be nonnegative and no more than the edge's capacity, and (ii) for each node, the total flow entering the node minus the total flow leaving it must equal the node's demand d_v. The problem is solved with a construction reminiscent of the Bipartite Matching Problem, attaching a source to the supplier nodes and connecting the consumer nodes to a sink. The proof is on page 381. The problem can be further extended to handle lower bounds on the amount of flow on edges. The proofs are on pages 383-384.