Chapter 4

4.1 Interval Scheduling

  • A greedy algorithm is one that involves no backtracking or other global optimization–it builds the solution by a simple metric and does not rethink what it has already done. Interval scheduling is easily solved greedily: repeatedly add the earliest-finishing task that does not conflict with any task already selected, until no such task remains (see the sketch after this list). It can be proven optimal by “greedy stays ahead”–its ith task must finish no later than the ith task of any other valid solution, by a simple induction based on the selection criterion. A related algorithm (interval partitioning) can schedule all tasks across an arbitrary number of processors–take the tasks by start time, assigning each to an available processor and opening a new one only when none is available. This can be proven optimal very directly–it always uses exactly as many processors as the depth of the input (the maximum number of tasks overlapping at any single moment), which is obviously the fewest possible. As before, running time is O(n log n), coming from the initial sort.
  • No insights.
  • No questions.
  • No other complaints.
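
A minimal Python sketch of the earliest-finish-time rule described above (my own illustration, not the book's pseudocode); tasks are assumed to be (start, finish) pairs:

<code python>
def schedule_intervals(tasks):
    """Greedy interval scheduling by earliest finish time."""
    selected = []
    last_finish = float("-inf")
    # Sort by finish time (the O(n log n) step), then take each task
    # that starts no earlier than the last selected task finished.
    for start, finish in sorted(tasks, key=lambda t: t[1]):
        if start >= last_finish:
            selected.append((start, finish))
            last_finish = finish
    return selected

print(schedule_intervals([(1, 4), (3, 5), (4, 7), (6, 8)]))  # [(1, 4), (4, 7)]
</code>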

4.2 Scheduling to Minimize Lateness

  • A related problem is scheduling tasks of fixed duration but flexible start time so as to minimize the maximum lateness. This can be solved by a very simple greedy algorithm: sort the tasks by deadline and schedule them in that order with no gaps (sketched after this list). This can be shown optimal by exchange; if an optimal solution has an inversion, where an earlier job has a later deadline than a later job, swapping the two will reduce the maximum lateness if the later job (in the first schedule) was the latest, and otherwise will not affect the maximum lateness; therefore, an optimal solution can be transformed into the greedy solution without increasing maximum lateness. Likewise, removing idle time weakly decreases maximum lateness. Therefore, the greedy solution is optimal.
  • I was glad to see formal treatment of the possibility of jobs with identical deadlines (4.8)–I do not recall such a clause in the proof in class, and while not a weakness in the algorithm, it does seem a weakness in the proof. At the same time, I prefer the proof by contradiction (let S be the schedule with the fewest inversions; if S has any inversions, there is an optimal schedule with fewer inversions) to their proof by induction, that a series of exchanges can transform an arbitrary optimal schedule into an inversionless optimal schedule.
  • No questions.
  • The complaint against the induction in the proof aside, fairly well done.
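
A minimal Python sketch of the earliest-deadline-first rule above (again my own illustration); jobs are assumed to be (duration, deadline) pairs, all available at time 0:

<code python>
def minimize_lateness(jobs):
    """Schedule jobs back to back by increasing deadline; return the
    (start, finish) times and the maximum lateness achieved."""
    t = 0
    max_late = 0
    schedule = []
    for duration, deadline in sorted(jobs, key=lambda j: j[1]):
        start, t = t, t + duration
        schedule.append((start, t))
        max_late = max(max_late, t - deadline)
    return schedule, max_late

print(minimize_lateness([(3, 6), (2, 8), (1, 9)]))  # ([(0, 3), (3, 5), (5, 6)], 0)
</code>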

4.4 Shortest Paths in a Graph

  • A common problem in graphs is finding the shortest route between two points. When edges are not equally weighted, mere BFS does not suffice, so we use Dijkstra's greedy algorithm: repeatedly explore the unexplored node closest to the start, maintaining a priority queue of the shortest discovered route to each unexplored node, and recording the distance to each explored node. Since edge weights are nonnegative, all untraversed distances are at least as large as all traversed distances, so committed shortest paths are never revised (and optimality follows from this: if there were a path shorter than the one taken, it would have been chosen first). Proper caching of route lengths and use of the priority queue keep the runtime to O(m log n), driven by the possibility of a log n priority-queue update for each of the m edges. A sketch appears after this list.
  • No insights.
  • No questions.
  • Clear and comprehensive. I appreciated the relating of Dijkstra's algorithm to BFS coming after the presentation and proof of optimality, since while I think the comparison useful they are sufficiently distinct that having BFS in mind could confuse the presentation of Dijkstra's.
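
A minimal Python sketch of Dijkstra's algorithm as summarized above, using the standard heapq module as the priority queue; the graph representation ({node: [(neighbor, weight), ...]}) and names are my own choices, and weights are assumed nonnegative:

<code python>
import heapq

def dijkstra(graph, source):
    """Return the shortest distance from source to every reachable node."""
    dist = {source: 0}
    pq = [(0, source)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry; u was already settled with a shorter path
        for v, w in graph.get(u, []):
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(pq, (d + w, v))
    return dist

g = {"s": [("a", 1), ("b", 4)], "a": [("b", 2), ("t", 6)], "b": [("t", 3)]}
print(dijkstra(g, "s"))  # {'s': 0, 'a': 1, 'b': 3, 't': 6}
</code>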

4.5 Minimum Spanning Trees

  • A minimum spanning tree is the minimum-total-weight set of edges that connects every node. Such a subgraph must be a tree (a subgraph containing a cycle is not minimum, since an edge of the cycle could be removed without disconnecting anything). Interestingly, many natural greedy algorithms work. One is a modification of Dijkstra's algorithm (Prim's), taking not the shortest total path but the cheapest edge from a discovered node to an undiscovered node; the other two are sort-and-test algorithms: Reverse-Delete goes by decreasing edge weight, removing each edge whose removal would not cause a disconnection, and Kruskal's goes by increasing edge weight, adding each edge that would not create a cycle. Prim's and Kruskal's are optimal since they only include edges that satisfy the cut property (each is the least-cost edge connecting some subset of nodes to the remainder), and Reverse-Delete is optimal since it only eliminates edges that are the highest-cost edge in some cycle (and thus cannot be part of an optimal solution). Both Prim's and Kruskal's run in O(m log n) time. A sketch of Prim's appears after this list.
  • There seem to be two primary sorts of greedy algorithm–those that sort and take a pass through the sorted list (Kruskal's, Reverse-Delete, Interval Scheduling, etc.) and those that take from a PQ built along the way (Prim's, Dijkstra's). Is this comprehensive, or an artifact of limited exposure (or simply mistaken)?
  • No questions (except that last, I suppose).
  • Fairly clear, and decent handling of the three very different algorithms.
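
A minimal Python sketch of Prim's algorithm as summarized above (the Dijkstra-like variant); the adjacency-list representation and names are my own, and the graph is assumed connected and undirected:

<code python>
import heapq

def prim(graph, root):
    """Grow a minimum spanning tree from root by repeatedly taking the
    cheapest edge from the tree to a node not yet in it."""
    in_tree = {root}
    tree_edges = []
    pq = [(w, root, v) for v, w in graph[root]]
    heapq.heapify(pq)
    while pq and len(in_tree) < len(graph):
        w, u, v = heapq.heappop(pq)
        if v in in_tree:
            continue  # this edge now leads back into the tree; skip it
        in_tree.add(v)
        tree_edges.append((u, v, w))
        for x, wx in graph[v]:
            if x not in in_tree:
                heapq.heappush(pq, (wx, v, x))
    return tree_edges

g = {"a": [("b", 1), ("c", 3)], "b": [("a", 1), ("c", 2)], "c": [("a", 3), ("b", 2)]}
print(prim(g, "a"))  # [('a', 'b', 1), ('b', 'c', 2)]
</code>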

4.6 Union-Find Data Structure

  • Kruskal's algorithm requires the ability to efficiently tell whether adding an edge would create a cycle. One way to do this is to track the connected subgraphs created by the edges already added; if both ends of an edge are in the same subgraph, adding it would create a cycle. At its simplest, a UFDS could be an array giving the name of the set to which each node belongs, but then the Union operation would be linear. By keeping the name of the larger set and maintaining lists of the nodes belonging to each set, Find takes O(1) and any sequence of k Union operations takes O(k log k) time. But this can be improved by using a reverse tree, where each node holds a pointer to its parent. By again always keeping the name (root) of the larger set, these inverse trees can be kept shallow, so that Union is O(1) and Find is O(log n). Further non-asymptotic improvements can be achieved by path compression: on each completed Find, update the pointers of the nodes traversed to point directly at the root, quickening future operations. A sketch appears after this list.
  • Nice to see this finally explained.
  • No questions.
  • No complaints.
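
A minimal Python sketch of the pointer-based structure described above, with union by size and path compression; the class and method names are my own:

<code python>
class UnionFind:
    def __init__(self, elements):
        self.parent = {x: x for x in elements}
        self.size = {x: 1 for x in elements}

    def find(self, x):
        """Return the root naming x's set, compressing the path on the way out."""
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path straight at the root.
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, x, y):
        """Merge the sets containing x and y; return False if already merged."""
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False  # same set already; a connecting edge would close a cycle
        if self.size[rx] < self.size[ry]:
            rx, ry = ry, rx  # keep the larger set's root as the name
        self.parent[ry] = rx
        self.size[rx] += self.size[ry]
        return True

uf = UnionFind("abcd")
uf.union("a", "b"); uf.union("c", "d")
print(uf.find("a") == uf.find("b"), uf.find("a") == uf.find("c"))  # True False
</code>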

4.7 Clustering

  • A problem related to finding MSTs is finding a k-clustering of a graph: the grouping of the nodes into k clusters that maximizes the spacing, the minimum distance between nodes in different clusters. This can be achieved with only slight adaptation of Kruskal's algorithm for finding MSTs: rather than stopping when all nodes are connected, stop when there are as many connected sets in the UFDS as clusters desired (sketched after this list). This produces a maximum-spacing clustering: its spacing equals the length of the first edge that Kruskal's algorithm would have added after it was stopped, and any other clustering must separate some pair of nodes that this algorithm joined before that edge; some edge on the path joining that pair crosses between two of the other clustering's groups, and since Kruskal's algorithm goes by increasing edge length, that edge is no longer than this algorithm's spacing, so the other clustering's spacing cannot exceed it.
  • I am somewhat surprised to see no mention of runtime, even if it is fairly obviously the same as Kruskal's.
  • No questions.
  • No complaints.
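
A minimal Python sketch of the stopped-Kruskal procedure above; it assumes a UnionFind class like the one sketched under 4.6, and the edge representation ((weight, u, v) triples) is my own choice:

<code python>
def k_clustering(nodes, edges, k):
    """Run Kruskal's algorithm until only k components remain; return the
    clusters and the spacing (the next edge that would cross clusters).
    Assumes the UnionFind class sketched in section 4.6 is in scope."""
    uf = UnionFind(nodes)
    components = len(nodes)
    spacing = None
    for w, u, v in sorted(edges):
        if uf.find(u) == uf.find(v):
            continue  # both ends already in the same cluster; would form a cycle
        if components == k:
            spacing = w  # shortest edge between two different clusters
            break
        uf.union(u, v)
        components -= 1
    clusters = {}
    for x in nodes:
        clusters.setdefault(uf.find(x), []).append(x)
    return list(clusters.values()), spacing

nodes = "abcd"
edges = [(1, "a", "b"), (2, "c", "d"), (5, "b", "c")]
print(k_clustering(nodes, edges, 2))  # ([['a', 'b'], ['c', 'd']], 5)
</code>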

4.8 Huffman Codes and Data Compression

  • Computers encode characters as bits, and so some easy compression can come from finding an encoding that minimizes average bit-length. Since over sufficiently large samples of text of a given language character frequencies are fairly predictable, some compression can come from adopting an encoding tailored to certain uses, as Morse code was. But Morse code was not optimized for the same constraints as computer encodings are; humans are accustomed to the clear dead-air letter and word spacing, but that forces Morse code to be effectively ternary, with dits, dahs, and spaces; since some letter codes are prefixes of others, omitting the spaces would lead to ambiguity. Therefore, computer encodings should be prefix codes, where no code is a prefix of another. One can represent a prefix code as a binary tree, where each node represents a single bit of a code and only leaves carry characters (the code then being the sequence of bits at the nodes on the path from the root down to the leaf); an optimal prefix code corresponds to a full tree, one in which every internal node has two children. Given a tree, characters should obviously be assigned to leaves at increasing depth by decreasing probability, so that the most common characters correspond to the leaves closest to the root. One can find the tree for the optimal code by proceeding through the characters by increasing frequency: if there are only two characters to be encoded, encode one as 0 and the other as 1 arbitrarily, and if there are more, group the least frequent two into a meta-character whose frequency is the sum of their frequencies and recurse on the reduced alphabet. Lastly, expand the meta-characters by assigning one child to 0 and the other to 1, recursively expanding any meta-characters among those children. Optimality can be established by induction on the size of the alphabet: there is an optimal tree in which the two least frequent characters are sibling leaves at the lowest level, and an optimal tree for the reduced alphabet (with those two merged) extends to an optimal tree for the full alphabet. Using a priority queue, the algorithm runs in O(k log k), as k iterations of a log k operation. A sketch appears after this list.
  • No insights.
  • No questions.
  • I dislike presentation of the wrong solution first, except when giving a history of the development of the algorithm; I am wary of making someone's first association be a faulty algorithm. But otherwise well written.
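
A minimal Python sketch of Huffman's algorithm as summarized above, using heapq as the priority queue; the tree representation (a character, or a (left, right) pair of subtrees) is my own choice:

<code python>
import heapq

def huffman(freq):
    """Build an optimal prefix code for a {character: frequency} map."""
    # Heap entries are (frequency, tie-breaker, tree); the counter breaks ties
    # so that trees themselves are never compared.
    heap = [(f, i, ch) for i, (ch, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent (meta-)characters into one.
        f1, _, t1 = heapq.heappop(heap)
        f2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, counter, (t1, t2)))
        counter += 1
    codes = {}
    def assign(tree, prefix):
        if isinstance(tree, tuple):
            assign(tree[0], prefix + "0")
            assign(tree[1], prefix + "1")
        else:
            codes[tree] = prefix or "0"  # single-character corner case
    assign(heap[0][2], "")
    return codes

print(huffman({"a": 0.45, "b": 0.25, "c": 0.2, "d": 0.1}))
# e.g. {'a': '0', 'b': '10', 'd': '110', 'c': '111'}
</code>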