Table of Contents
Chapter 4
This chapter, as its descriptive title indicates, talks about greedy algorithm. The introduction of the chapter first points out the philosophy behind the different greedy algorithms: using a local decision to optimal small steps and constructing a overall optimal solution at the same time. Then the introduction summarizes the context of the chapter: listing out examples to let us get a feeling of why greedy algorithms are greedy.
Section 1: Interval Scheduling: The Greedy Algorithm Stays Ahead
This section illustrates the first example of greedy algorithms in details: the interval scheduling. The core inside this algorithm is to find out the criterion of being greedy to rule each small step. There are three seemingly appealing criterions that do not work: earliest starting time, smallest interval of time, and most compatible. We did not manage to come up with a counter example in class to disapprove the most compatible case. The book gives a neat example, so that is good. The actual working criterion is the earliest finishing time. To show that this actually works, we compare the solution A resulted from this earliest finishing time criterion with any optimal solution O. It is not hard to conceive that for an optimal solution composed by k pieces “for all indices r⇐k, finishing time of rth piece from A is earlier or equal to that of O”. Hence it follows that the greedy algorithm indeed returns the optimal solution. The running time of the algorithm is O(nlogn) (coming from sorting, the rest costs O(n)).
The book then introduces some extensions of the interval scheduling problem that I think are very interesting, such as choosing intervals without knowledge of future output (like a philosophy problem from Socrates) and intervals with different values.
A related problem, scheduling all intervals, is discussed next in the section. Obviously, the minimum resources we need to use is the depth of the set of intervals (maximum overlap). Then the book shows a greedy algorithm that guarantees that the number of resources equal to the depth. Basically, we also use the criterion as in the original interval scheduling problem-the earliest finishing time, but instead of abandoning the ones overlapping with the chosen ones, we put it in a new resource. There would not be a d+1 resource used. Because of the way we arrange the intervals, there has to be 1 resource such that the last interval's finishing time being earlier than the current one we are trying to find position for. The running time is still O(nlogn), coming mainly from sorting.
I like this section's reading. Since it is my first time learning about greedy algorithm, the logic behind it attracts my attention. I'd like to see how the “choosing intervals without knowledge of future output works”.
Readability is 9.
Section 2: Scheduling to Minimize Lateness: an Exchange Argument
The section introduces the second example that makes use of the idea of being greedy. It first states the problem: we have some projects with deadlines and lengths and we need to schedule all of them such that the maximum lateness is minimized.
There are some appealing criterions that do not really meet our goal: the length of the jobs and the slack time of the jobs. The criterion that works is the Earliest Deadline First. The algorithm is written based on this Criterion and the running time is O(nlogn). We can easily realize that there should be no idle time, that is, contagiously working. The book defines an inversion to be finishing a job with later deadline first. The book proves that all schedules with no inversions and no idle time have the same maximum lateness, which is easy to see (different schedules will only be due to there exist two jobs having same deadlines). Then the book proves that there is an optimal schedule that has no inversions and no idle time. The hardest part of the proof, which is presented separately, is that if we swap an inversion, then the new schedule is still optimal.
The extension of the minimizing lateness problem can be adding a starting time to each job. This makes the original proof no longer valid, because we can no longer swap any inversions.
This section involves a challenging proof, which is worth more attention. So the readability is 8.
Section 4: Shortest Paths in a Graph
This section introduces the third example of greedy algorithm: finding the shortest paths.
The problem is just as what its name describes, given a directed graph, assigned a node s, how to find the path from s to every other node that is the “lightest”. Dijkstra came up with an algorithm to solve the problem, namely, the Dijkstra algorithm, which is presented in the book. If we want to output the detailed information of each shortest path, just keep track of the added edge at each iteration. To prove the validity of the algorithm, we show that the solution selected by the algorithm “stays ahead” from other possible solutions using inductions. Note that the Dijkstra algorithm only works when all edges are positive numbers; otherwise the solution does not necessarily stay ahead. The book also provides anther insight of the algorithm: BFS traversing the graph. This intuitively makes sense to me: because each iteration we look at the next level-one distance away from the set S. The book uses an analogy of water expand, but I personally did not find it too helpful, though the insight is nice to know.
After the discussing the theory aspects of the Dijkstra algorithm, the section further talks about the best way to implement the algorithm, using priority queue. Indeed, in class, I felt that we do not need all operations from PQ-the insert part will be linear even just treating it as a list. For each iteration, we need to add an node v to the set S, to select the right v-based on d(v)-the ExtractMin is used; to update d(v) for all nodes involved in the iteration, we use changeKey. ExtractMin is used at most n time and each time it takes o(logn); ChangeKey is used at most for all edges, m , times and each time it also takes o(logn). Hence, overall, the running ime is o(mlogn).
The algorithm is greedy because every iteration it only picks the node that a shortest path between it and s is the smallest under all circumstance. It is very smart.
The readability is 8.
Section 5: The Minimum Spanning Tree
This section talks about how to obtain the minimum spanning tree from a connected graph through the 3 different greedy algorithms.
First the definition of minimum spanning trees is introduced: a spanning tree, that is an acyclic path connecting all the nodes, that has minimum weight. There are three greedy algorithms to solve the problems, namely: 1, The Kruskal's Algorithm that inserts edges from an ascending order if the edge does not form a cycle. 2, Prim's Algorithm that starts with a node and greedily grow outward using the analogy of Dijkstra's Algorithm. 3, the Reverse-Delete Algorithm that deletes edges from an descending order if deleting the edge does not disconnect the graph.
Then the book continues to analyze the algorithms. Before getting into each algorithm, it first introduces two important properties (actually the second is introduced after the analysis of Kruskal and Prim's algorithm, just for simplicity, I group them together): 1, the cut property: if all edges are of distinct values, the minimum-weighted edge that connecting two non-empty, disconnected subsets need to be in the minimum spanning trees. 2, the cycle property: for a cycle, the maximum-weighted edge in the cycle is not in the minimum spanning tree.
Use the cut property we can prove Kruskal and Prim's algorithms easily, especially the Prim one, very straightforward. Use both properties to prove the Reverse-Delete algorithm: when deleting an edge is possible, use circle property; when not, cut property. In general, deleting is justified by Circle property, and inserting by Cut property, no matter what kind of greedy algorithm we used, which also explains the variety of greedy algorithms.
Before, we assume that all edges are of distinct values. The method to get rid of this assumption is the perturb the ties with very small value epsilon, that is, forcing an order, and then use those algorithms.
After analyzing the algorithm, naturally follow the implementations of those algorithms, except for the Reverse-delete algorithm when the running time is hard to reach O(mlogn). This section only introduces the implementation of the Prim's Algorithm through priority queue like in Dijkstra's Algorithm.
However, in practice, minimum spanning tree is not enough for a smoothly-run network. There are problems such as for individual pairs, the client may not be willing to have an edge that is heavy-weighted, only having one connection between each client makes the network insecure, the impossibility to evaluate the costs of each edge perfectly before hand.
The readability is 7.
Section 6: The Implementing Kruskal's Algorithm: the Union-Find Data Structure
The main goal of this section is to introduce the Union-Find data structure, which will allow us to quickly find if two nodes are connected, which is exactly needed in the Kruskal's Algorithm, because we always want to check if the nodes connected by an edge is connected by another path.
There are three operations in the data structure, namely MakeUnionFind,Find and Union. MakeUnionFind(S) separates each element in S to be an individual set. Find(u) finds out which set u is contained. Union(S1,S2) merges two sets together. From this, we can see that Union-Find data structure is handy when we hope to combine sets, but definitely not split a set in a desired way other than individual element as a set.
If we use an array to implement the Union-Find, we use the location of each element of the array to represent each node and fill in the number to represent the set the node belongs in. Therefore, we can easily see that the running time for Find is O(1), and MakeUnionFind(S) O(n). Any sequence of k Union operations takes at most O(klogk). Because at most 2k elements are involved in any Union at all and the process happens at most log2(2k) times.
There is actually a better data structure than arrays to implement Union-Find, especially for the Union operation. The key is using a pointer. MakeUnionFind(S) does similar things as above and has each set pointing to the element inside. When union two sets, we simply have the smaller set's pointer points at the larger set's pointer. Obviously, MakeUinonFind takes O(n) time and Union constant time. The Find(v) ultimately does is try to find what element is the set containing v points to. Since the set containing has at least 1 elements, at most n, and its size doubles most logn time, the running time is O(logn).
We can even further improve the algorithm to make Find(v) constant. However, setting up the data structure takes over more time, because we need to go though the pointing path for each element in the sets as we did in the Find above.
After looking into the data structure, it is not hard to implement Kruskal's Algorithm in O(mlogn) time.
This section talks in details of the construction of Union-Find, interesting. The readability is 9.
Section 7: Clustering
This section talks about another problem-Clustering-and how to solve it with greedy algorithm. Clustering (of maximum spacing) can be precisely described as partitioning a collection of objects U such that the maximum possible spacing s achieved (That is, the minimum distance between any pair of points lying in different clusters is minimized).
The algorithm is exactly the Kruskal's algorithm except that we stop it just before adding the last k-1th edge, hence its running time is O(mlogn). Basically, it is constructing the minimum spanning tree without the k-1 most weighted edges. We can prove it using a contradiction.
Practical applications can be finding out the clusters and using low-cost network within the cluster and using more robust network between each cluster.
This section is quite short but the application of the minimum spanning tree is interesting.
Section 8: Huffman Code and Data Compression
This section introduces us the Huffman Code and how to use greedy algorithm to achieve.
First, the book gradually introduces the origin of Huffman coding from using 1,0 bits representing letters, the different frequency of usage of letters in English, the Morse code which is similar but easier with the pausing, Prefix codes and the way to calculate the optimal prefix codes (minimize the sum of frequency*bit-length).
To begin design the algorithm, intuitively the Binary Tree comes to mind because if we only use the leaves to represent letters then it is gonna be prefix code since no parent is involved. It is not hard to see that the optimal prefix code corresponds with some full binary tree. Now we attempt to construct the algorithm to build such a full binary tree. The Top-Down Approach as in the S-F code is the first try, unfortunately the example used earlier is already a counter example. To perfectly modify this approach needs to borrow knowledge from dynamic programming, which has not come up yet. However, if given a tree structure (that is, all the frequencies of the letters), we can use the two lowest-frequency letters as leaves (then add up and regard as another leave) to plant the tree.This is the Huffman coding. The greedy part is to merge the two lowest-frequency leaves together in every recursive call, which is not hard to prove to be the optimal one. Priority Queue is the best data structure to implement the Huffman Coding, because the two lowest ones are easy to be extracted and push back the combined element to the queue it will automatically be in position.
Compression is not always as easy as the Huffman coding. Tow major challenges are the “fractional bit representation” and, as mentioned before, the variation of text. Also, there are occasions when compression is not necessary such as all letters appear at similar frequency. Doing the compression in turn adds more work.
I like the structure of this section because it tends to “guide” us to seek the right solution. Also, it stirred in some interesting stories along with the Huffman coding.
Readability is 8.