Table of Contents

The Front Matter

Summary

This section is an introduction to Greedy Algorithms. A greedy algorithm builds a solution by making step-by-step, local decisions that optimize some underlying criterion. There are two types of proofs: Greedy Stays Ahead and the Exchange Argument. A Greedy Stays Ahead proof measures the progress of the algorithm and shows that it does at least as well as any other solution, including one a brute-force search would find. An exchange argument is when you consider an optimal solution and gradually transform it into the solution yielded by the greedy algorithm without jeopardizing its optimality at any point.

Notes
Additional Information

There wasn't anything that I was unclear about or had to reread to understand after class or after my first run-through of this section.

In terms of readability, this section was a 10 because it was just an introduction section. It was basically definitions, and they were super easy to understand because they were clearly explained. Also, this section was only a page long, so it was a super quick read.

4.1 Interval Scheduling

Summary

In this section, there are two problems to solve: Interval Scheduling and Interval Partitioning. In Interval Scheduling, the greedy aspect of the problem is to select intervals by finish time (earliest first); this allows us to maximize the time left to complete other tasks. We prove the algorithm's optimality under this greedy criterion using the greedy stays ahead proof. In Interval Partitioning, the greedy aspect is that we select intervals by earliest start time. We prove this greedy algorithm optimal using a structural argument: the number of resources needed for the intervals cannot be less than the depth of the set of intervals.

The Problem
Designing a Greedy Algorithm

Algorithm

   Initially let R be the set of all requests, and let A be empty
   While R is not yet empty
     Choose a request i in R that has the smallest finishing time
     Add request i to A
     Delete all requests from R that are not compatible with request i
   Endwhile
   Return the set A as the set of accepted requests
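
To check my understanding, here is a minimal Python sketch of the same earliest-finish-time rule (the function name and the (start, finish) tuple representation are my own, not the book's). Instead of deleting incompatible requests from R on each pass, it sorts by finish time once, which is the standard O(n log n) way to implement the same idea:

   def interval_scheduling(requests):
       """Greedy earliest-finish-time rule; requests are (start, finish) pairs."""
       accepted = []
       last_finish = float("-inf")
       # Sorting by finish time lets us accept each compatible request greedily.
       for start, finish in sorted(requests, key=lambda r: r[1]):
           if start >= last_finish:  # compatible with everything accepted so far
               accepted.append((start, finish))
               last_finish = finish
       return accepted

   # Four overlapping requests; the greedy rule accepts two of them.
   print(interval_scheduling([(0, 6), (1, 4), (5, 7), (3, 8)]))  # [(1, 4), (5, 7)]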
Analyzing the Algorithm
Scheduling All Intervals

Algorithm

   Sort the intervals by their start times, breaking ties arbitrarily
   Let I1, I2, …, In denote the intervals in this order
   For j = 1, 2, 3, …, n:
     For each interval Ii that precedes Ij in sorted order and overlaps it:
       Exclude the label of Ii from consideration for Ij
     Endfor
     If there is any label from {1, 2, …, d} that has not been excluded then:
       Assign a nonexcluded label to Ij
     Else
       Leave Ij unlabeled
     Endif
   Endfor
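
To test myself, here is a rough Python sketch of the same idea (the names are mine; and instead of tracking excluded labels explicitly, it keeps a min-heap of (finish time, label) pairs and reuses the earliest-freed label, which is an equivalent O(n log n) way to pick a nonexcluded label):

   import heapq

   def interval_partitioning(intervals):
       """Label each (start, finish) interval with a resource number."""
       free_at = []  # min-heap of (finish_time, label) for in-use resources
       next_label = 1
       assignment = []
       for start, finish in sorted(intervals):  # earliest start time first
           if free_at and free_at[0][0] <= start:
               _, label = heapq.heappop(free_at)  # reuse a freed resource's label
           else:
               label = next_label  # depth increased: open a new resource
               next_label += 1
           assignment.append(((start, finish), label))
           heapq.heappush(free_at, (finish, label))
       return assignment

   # Three mutually overlapping intervals force depth 3; the fourth reuses label 1.
   print(interval_partitioning([(0, 3), (1, 4), (2, 5), (3, 6)]))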
Additional Information

In class, I understood what the structural argument was in the specific Interval Partitioning problem: the number of resources in any solution cannot be less than the depth of the set of intervals. However, I was kind of confused by what it was in a broader sense. Luckily, after reading this section, it became much clearer. You just have to prove a structural bound asserting that every possible solution must have at least a certain value. Then, you have to show that the greedy algorithm yields a result that achieves exactly that value.

In terms of readability, this section is a solid 9. Because I read this section after hearing the lecture in class, it was super easy to comprehend, and the proofs were written very clearly. The only proof that was a little difficult to understand, in terms of the way the book explained it, was the greedy stays ahead proof. Luckily, it wasn't too difficult to get because we learned about it in class.

4.2 Scheduling to Minimize Lateness

Summary

We want to schedule all of the tasks, without overlapping, so as to minimize the maximum lateness. Scheduling jobs in order of increasing length, or in order of smallest slack time, does not always yield an optimal solution. The greedy criterion for this problem is to sort the jobs in increasing order of deadline and then schedule them in that order, with no idle time. To prove optimality for this greedy algorithm, we use an exchange argument: we gradually modify an optimal solution, ensuring it remains optimal at each step, until we have transformed it into the solution that our greedy algorithm would produce. In this case, we modify an optimal solution with inversions until it has no inversions, which proves the greedy algorithm's optimality.

The Problem
Designing the Algorithm

Algorithm

   Order the jobs in order of their deadlines
   Assume that d1 <= … <= dn (sorted deadlines)
   Initially, f = s
   Consider jobs i = 1, …, n in this order:
     Assign job i to the time interval from s(i) = f to f(i) = f + ti
     Let f = f + ti
   End
   Return the set of scheduled intervals [s(i), f(i)] for i = 1, …, n
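
Here is a small Python sketch of this earliest-deadline-first schedule (my own naming; jobs are (t_i, d_i) pairs and s is the common start time), which also computes the maximum lateness so I can compare schedules:

   def minimize_lateness(jobs, s=0):
       """jobs: (length t_i, deadline d_i) pairs; schedule them with no idle time."""
       f = s
       schedule = []
       max_lateness = 0
       for t, d in sorted(jobs, key=lambda job: job[1]):  # increasing deadline
           start, finish = f, f + t
           schedule.append((start, finish))
           max_lateness = max(max_lateness, finish - d)  # this job's lateness
           f = finish
       return schedule, max_lateness

   # The job with deadline 2 runs first; the deadline-4 job finishes 1 late.
   print(minimize_lateness([(3, 4), (2, 2)]))  # ([(0, 2), (2, 5)], 1)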
Analyzing the Algorithm
Additional Information

When first reading through the section, it was hard for me to understand the algorithm and its proof because of the notation that the book invented for it. The book abbreviated full words to letters, so when I first read the section I must not have registered where the terms were defined. However, once I went back and reread the section, the notation became clearer. For instance, s is the start time, f is the finish time, d is the deadline, and t is the time length of the job. I can't type some of the symbols because they have bars over them; the letters with the bars refer to the schedule after the swap, and the ones without refer to the schedule before it.

In terms of readability, on a scale from 1 to 10, this section was an 8. The section was explained very well overall. Unfortunately, the proof of optimality of the algorithm for this section had some confusing notation in it. However, once I understood the notation, understanding the proof got exponentially easier.

4.4 Shortest Paths in a Graph

Summary

Our goal is to find the shortest paths in a graph. To solve this, we will use Dijkstra's algorithm. Dijkstra's algorithm is greedy because we always form the shortest new s-v path we can make from an already-computed path followed by a single edge. However, all of the edges must have nonnegative lengths; the algorithm breaks down if they do not. We will prove this algorithm correct using the greedy stays ahead technique. The runtime of the algorithm is O(m log n) if you implement it with a priority queue.

The Problem
Designing the Algorithm

Dijkstra's Algorithm

   Dijkstra’s Algorithm (G, l)
        Let S be the set of explored nodes
        For each u in S, store a distance d(u)
        Initially, S = {s} and d(s) = 0
        While S != V:
             Select a node v not in S with at least one edge from S for which
             d’(v) = min over edges e = (u, v) with u in S of d(u) + l_e is as small as possible
             Add v to S and define d(v) = d’(v)
        Endwhile
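
Since the notation tripped me up, I wrote out a rough Python sketch of the algorithm using a heap of candidate distances d'(v) (the graph representation, a dict from node to (neighbor, length) pairs, is my own choice, not the book's):

   import heapq

   def dijkstra(graph, s):
       """graph: node -> list of (neighbor, edge_length); lengths must be >= 0."""
       d = {s: 0}
       heap = [(0, s)]  # candidate distances d'(v)
       while heap:
           dist, u = heapq.heappop(heap)
           if dist > d.get(u, float("inf")):
               continue  # stale entry; u already has a smaller final d(u)
           for v, le in graph[u]:
               if dist + le < d.get(v, float("inf")):
                   d[v] = dist + le  # a smaller d'(v) via the edge e = (u, v)
                   heapq.heappush(heap, (d[v], v))
       return d

   graph = {"s": [("u", 1), ("v", 4)], "u": [("v", 2)], "v": []}
   print(dijkstra(graph, "s"))  # {'s': 0, 'u': 1, 'v': 3}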
Analyzing the Algorithm
Additional Information

The notation was pretty hard for me to understand in the first run-through of the section, specifically d'(v) = min over e = (u, v) with u in S of d(u) + l_e. After reading the section again, the notation became clearer. The minimum is taken over all edges e = (u, v) that leave the explored set S; d(u) is the already-computed distance to u, l_e is the length of the edge e, and d'(v) is the resulting candidate distance to v. Therefore, I now better understand the algorithm overall.

In terms of readability, on a scale of 1 to 10, this section was a 7. It wasn't necessarily that this section was difficult to understand; it was just that some of the notation was confusing, especially in the written-out algorithm. However, once I got the notation down, the section was not too difficult. I thought that the proofs were explained fairly clearly.

4.5 The Minimum Spanning Tree Problem

Summary

We want to connect a graph with weighted edges as cheaply as possible. The output will be a tree, because there should be no cycles: a cycle would be a waste of cost, since deleting the most expensive edge on the cycle would not disconnect the graph. This output tree is called a minimum spanning tree. There are three algorithms that output an optimal solution for this problem: Prim's Algorithm, Kruskal's Algorithm, and the Reverse-Delete Algorithm. We also learned about the cut property, which is used to prove Prim's Algorithm and Kruskal's Algorithm correct, and the cycle property, which is used to prove the Reverse-Delete Algorithm correct. The cut property tells you when you must include an edge in the minimum spanning tree, and the cycle property tells you when you can delete an edge and still obtain a minimum spanning tree. Finally, the implementation of Prim's Algorithm is very similar to the implementation of Dijkstra's Algorithm.

The Problem
Designing the Algorithm
Analyzing the Algorithms
Implementing Prim’s Algorithm
Extensions
Additional Information

The cycle and cut properties were hard for me to comprehend on the first day that we talked about them in class and the first time I read about them in the section. However, the cut property merely tells you when you can include an edge in a minimum spanning tree. It says that the minimum-cost edge with one end in the explored set and the other in the unexplored set must be included in the minimum spanning tree (with neither the explored set nor the unexplored set being empty). The cycle property tells you when you can delete edges: the most expensive edge in a cycle does not belong to any minimum spanning tree. Prim's and Kruskal's Algorithms are proved correct using the cut property, and the Reverse-Delete Algorithm is proved correct using the cycle property.
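
To see the Dijkstra-like implementation the summary mentions, here is a rough Python sketch of Prim's Algorithm (the representation and names are my own): the heap always surfaces the cheapest edge crossing the cut between the explored set and everything else, which is exactly the edge the cut property says is safe to add.

   import heapq

   def prim(graph, root):
       """graph: node -> list of (neighbor, weight); assumes the graph is connected."""
       in_tree = {root}
       mst_edges = []
       heap = [(w, root, v) for v, w in graph[root]]  # edges leaving the cut
       heapq.heapify(heap)
       while heap:
           w, u, v = heapq.heappop(heap)
           if v in in_tree:
               continue  # both ends explored; edge no longer crosses the cut
           in_tree.add(v)
           mst_edges.append((u, v, w))  # cut property: safe to include
           for x, wx in graph[v]:
               if x not in in_tree:
                   heapq.heappush(heap, (wx, v, x))
       return mst_edges

   graph = {"a": [("b", 1), ("c", 3)],
            "b": [("a", 1), ("c", 2)],
            "c": [("a", 3), ("b", 2)]}
   print(prim(graph, "a"))  # [('a', 'b', 1), ('b', 'c', 2)]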

On a scale of 1 to 10, the readability of this section was a 7. The descriptions of the algorithms were very straightforward. However, the proofs of the algorithms using the cycle and cut properties were a little hard to comprehend at first. After going to class and reading over the proofs again, they became easier to understand.

4.6 Implementing Kruskal's Algorithm: Union-Find

Summary

When an edge is added to the graph, we don't want to have to recompute the connected components from scratch with each iteration. The Union-Find structure stores a representation of the components in a way that supports rapid searching and updating, so adding edges does not force us to recompute connected components many times over. The Union-Find data structure supports three operations: MakeUnionFind(S), Find(u), and Union(A, B). We first try an array implementation, but that is not as efficient as we wish it to be. The more efficient implementation is pointer-based. For example, in the pointer-based implementation, if you are performing Union(Find(u), Find(v)) and |Find(u)| < |Find(v)|, you merely change the pointer of u's set to point to v's set; you do not update the pointers of any of the other nodes in u's set. To make this even more efficient, each time a Find operation is performed we can also compress the path by resetting all pointers along it to point to the current name of the set. This results in the Union operation running in O(1) time and the Find operation running in O(log n) time at most. The implementation of Kruskal's Algorithm using the pointer-based Union-Find data structure is O(m log n).

The Problem
A Simple Data Structure Union-Find
A Better Data Structure for Union-Find
Further Improvements
Implementing Kruskal’s Algorithm
Additional Information

The pointer implementation of the Union-Find data structure was a little confusing to me at first. We didn't really go over how the Union-Find structure was implemented in class, so the first time I read the pointer implementation, I did not fully comprehend what it did. Luckily, after reading the implementation again, it became clearer. In the pointer implementation, to take the union of the set containing u and the set containing v, with the union named after v's set, you merely update the pointer of the representative of u's set to point to v's representative. This makes the Union operation O(1), and the Find operation, with path compression updating pointers to point directly to the set's representative on each Find, is O(log n).
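
Writing the pointer implementation out as code is what finally made it click for me. This is a minimal Python sketch (my own class and method names) with union by size and path compression:

   class UnionFind:
       def __init__(self, elements):  # MakeUnionFind(S)
           self.parent = {x: x for x in elements}
           self.size = {x: 1 for x in elements}

       def find(self, u):  # Find(u): follow pointers to the set's representative
           root = u
           while self.parent[root] != root:
               root = self.parent[root]
           while self.parent[u] != root:  # path compression: repoint to the root
               self.parent[u], u = root, self.parent[u]
           return root

       def union(self, a, b):  # Union: point the smaller set at the larger one
           a, b = self.find(a), self.find(b)
           if a == b:
               return
           if self.size[a] > self.size[b]:
               a, b = b, a
           self.parent[a] = b  # only one pointer changes, so Union is O(1)
           self.size[b] += self.size[a]

   uf = UnionFind("uvwx")
   uf.union("u", "v")
   uf.union("w", "x")
   uf.union("v", "x")
   print(uf.find("u") == uf.find("w"))  # True: all four are in one set now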

On a scale of 1 to 10, the readability of this section was a solid 7. The different Union-Find implementations were a little hard to understand (and they are definitely hard things to explain). Nonetheless, I thought the book did a fair job doing just that. It just took a couple of rereadings to fully comprehend the Union-Find data structure using pointers.

4.7 Clustering

Summary

Minimum spanning trees play a role in clustering. We are trying to find the k-clustering (a partition of U into k nonempty sets) with the maximum possible spacing (we want to group together the elements that are closest to each other). We define a distance function d(pi, pj) on the objects with these conditions: d(pi, pi) = 0; d(pi, pj) > 0 when pi and pj are distinct; and d(pi, pj) = d(pj, pi). The algorithm to find this clustering is basically Kruskal's Algorithm, stopped just before it adds its last k - 1 edges. Equivalently, the connected components formed by deleting the k - 1 most expensive edges of the minimum spanning tree T form a k-clustering of maximum spacing.
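
Here is a short Python sketch of that procedure as I understand it (the function name, the inline find helper, and the example points are mine): it runs Kruskal's Algorithm over all pairs and stops as soon as only k components remain, returning the spacing, which is the next cross-cluster distance Kruskal would have added.

   def k_clustering_spacing(points, d, k):
       """Run Kruskal's Algorithm, stopping before the last k - 1 merges."""
       parent = {p: p for p in points}

       def find(u):  # plain find, no compression, to keep the sketch short
           while parent[u] != u:
               u = parent[u]
           return u

       edges = sorted((d(p, q), p, q)
                      for i, p in enumerate(points) for q in points[i + 1:])
       components = len(points)
       for dist, p, q in edges:
           rp, rq = find(p), find(q)
           if rp == rq:
               continue  # same cluster; Kruskal would skip this edge too
           if components == k:
               return dist  # first cross-cluster distance = the spacing
           parent[rp] = rq
           components -= 1

   # Points on a line cluster into {0, 1, 2}, {10, 11}, {25}; spacing is 8.
   print(k_clustering_spacing([0, 1, 2, 10, 11, 25], lambda p, q: abs(p - q), 3))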

The Problem
Designing the Algorithm
Additional Information

The thing that confused me in this section was the proof of the algorithm's correctness; the notation in the proof was super confusing to me. The proof rests on the fact that Kruskal's Algorithm adds the smallest-distance edges first and leaves the largest for last, never adding them. Consider any other k-clustering that deviates from the one our algorithm produces: there must be two points that our algorithm puts in the same cluster but the other clustering separates. The path between those two points uses only edges our algorithm added, each of distance at most our clustering's spacing, and one of those edges must cross between two of the other clustering's clusters. So the other clustering's spacing is less than or equal to ours, and Kruskal's produces the optimal solution.

On a scale of 1 to 10, the readability of this section was a 9. The book did a really great job explaining what a clustering was, and the description of the algorithm was super clear. The only thing that confused me a little was the proof of the algorithm we used. But, still, I feel like I at least got the gist of the proof.

4.8 Huffman Codes

Summary

We want to map encodings to letters so that no encoding is a prefix of another, in order to avoid ambiguity. A prefix code that minimizes the average number of bits per letter is optimal; in other words, more frequently used letters should have shorter encodings in order to compress the data. A brute-force algorithm is infeasible because the search space is so large and variable. We use a full binary tree to represent prefix codes: all encoded letters are leaf nodes, which guarantees no encoding is the prefix of another, and meta-letters are interior nodes (parents of letters). The greedy aspect of Huffman's algorithm is that you merge the two lowest-frequency letters at each step. This bottom-up approach is the optimal approach. The runtime of the algorithm is O(k log k).

The Problem
Designing the Algorithm
Huffman’s Algorithm
   To construct a prefix code for an alphabet S, with given frequencies:
   If S has two letters:
        Encode one letter using 0 and the other using 1
   Else: 
        Let y* and z* be the two lowest-frequency letters
        Form a new alphabet S’ by deleting y* and z* and replacing them with a new letter w of frequency fy* + fz*
        Recursively construct a prefix code y’ for S’, with tree T’
        Define a prefix code for S as follows:
             Start with T’
             Take the leaf labeled w and add two children below it labeled y* and z*
   Endif
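
Here is a compact Python sketch of this recursion done iteratively with a priority queue (my own naming; each heap entry carries the codes built so far for its letters, and merging the two cheapest entries is exactly forming the meta-letter w):

   import heapq
   from itertools import count

   def huffman(frequencies):
       """frequencies: letter -> frequency; returns letter -> bit string."""
       tiebreak = count()  # keeps heap entries comparable when frequencies tie
       heap = [(f, next(tiebreak), {letter: ""})
               for letter, f in frequencies.items()]
       heapq.heapify(heap)
       while len(heap) > 1:
           fy, _, codes_y = heapq.heappop(heap)  # y*: lowest frequency
           fz, _, codes_z = heapq.heappop(heap)  # z*: second lowest
           merged = {letter: "0" + code for letter, code in codes_y.items()}
           merged.update({letter: "1" + code for letter, code in codes_z.items()})
           heapq.heappush(heap, (fy + fz, next(tiebreak), merged))  # meta-letter w
       return heap[0][2]

   print(huffman({"a": 0.45, "b": 0.25, "c": 0.15, "d": 0.15}))
   # {'a': '0', 'b': '10', 'c': '110', 'd': '111'}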
Analyzing the Algorithm
Extensions
Additional Information

A concept that was troubling me in the beginning stages of understanding Huffman codes was the meta-letter that is the parent of leaf nodes. As I read the section after class, it made more sense. The meta-letter is there because a letter node cannot be: a letter's encoding cannot be an interior node, because that would make it the prefix of another encoding, and the whole point of Huffman codes is that no encoding is the prefix of another. Moreover, meta-letters make more sense in Huffman's optimal bottom-up approach anyway: the first meta-letter joins the two lowest-frequency letters, acting like a single "super letter," so the alphabet now has one fewer letter. This explanation of the purpose of the meta-letter also makes it clear why we would implement the algorithm with a priority queue.

On a scale of 1 to 10, the readability of this section is a 9. The book explained the notation and the concepts behind Huffman codes really well. Also, the main part of the algorithm involved binary trees, which we've already seen multiple times in this class, so it was really building on knowledge we already had.