Chapter 4 – Greedy Algorithms

My notes on the assigned sections of Chapter 4 of Algorithm Design by Jon Kleinberg and Éva Tardos. This chapter details greedy algorithms. A greedy algorithm is an algorithm that “builds up a solution in small steps, choosing a decision at each step myopically to optimize some underlying criterion.”

4.1 – Interval Scheduling: The Greedy Algorithm Stays Ahead

The main challenge in designing a good greedy algorithm is deciding which rule to use for the selection process. For interval scheduling, rules that might seem good are choosing the request with the earliest start time or the shortest interval, but the best rule is to choose the request that finishes first. This way, we maximize the time left over to service the other requests. To prove that our greedy solution, A, is optimal, we must show that A contains the same number of intervals as an optimal solution, O. We do this by comparing partial solutions, showing step by step that A is always doing at least as well as O.

Interval Scheduling Algorithm:

Sort jobs by finish times so that f1 ≤ f2 ≤ ... ≤ fn
G = {}
for j = 1 to n
    if job j compatible with G
        G = G ∪ {j}
return G

This algorithm runs in O(n log n) time: we first sort the requests, taking O(n log n) time, then construct an array of finish times in O(n) time, and finally make one pass through the n intervals, spending constant time per interval. So overall the algorithm is O(n log n).
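Here's a short Python sketch of the earliest-finish-time rule above (my own illustration of the pseudocode, assuming jobs are given as (start, finish) pairs):

```python
def interval_scheduling(jobs):
    """jobs: list of (start, finish) pairs. Returns a maximum-size
    compatible subset, chosen greedily by earliest finish time."""
    selected = []
    last_finish = float("-inf")
    for start, finish in sorted(jobs, key=lambda j: j[1]):  # O(n log n) sort
        if start >= last_finish:  # compatible with everything chosen so far
            selected.append((start, finish))
            last_finish = finish
    return selected
```

For example, on the jobs (0,3), (2,5), (4,7), (1,8) it selects (0,3) and (4,7), rejecting the two that overlap them.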

A similar problem is the Interval Partitioning Problem, in which we must schedule all intervals across multiple resources (e.g., scheduling classes in classrooms). In any instance of Interval Partitioning, the number of resources needed is at least the depth of the set of intervals (the maximum number of intervals that overlap at any single point in time).

Interval Partitioning Algorithm:

Sort intervals by starting time so that s1 ≤ s2 ≤ ... ≤ sn
d = 0
for j = 1 to n
    if lecture j is compatible with some classroom k
        schedule lecture j in classroom k
    else
        allocate a new classroom d + 1
        schedule lecture j in classroom d + 1
        d = d + 1

This algorithm runs in O(n log n) time: for each classroom k, we maintain the finish time of the last job added, and we keep the classrooms in a priority queue keyed by that finish time.
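The priority-queue idea can be sketched in a few lines of Python using heapq, where the heap holds one finish time per open classroom (my own sketch, counting rooms rather than recording which lecture goes where):

```python
import heapq

def interval_partitioning(lectures):
    """lectures: list of (start, finish) pairs. Returns the number of
    classrooms used, keeping each room keyed by its last finish time."""
    rooms = []  # min-heap of finish times, one entry per classroom
    for start, finish in sorted(lectures):  # sort by start time
        if rooms and rooms[0] <= start:     # the earliest-freed room is free
            heapq.heapreplace(rooms, finish)  # reuse that room
        else:
            heapq.heappush(rooms, finish)     # open a new room
    return len(rooms)
```

On lectures (0,3), (1,4), (2,5), (3,6), three rooms are open at once and the fourth lecture reuses the room freed at time 3, so the answer is 3, matching the depth.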

I'd give this section an 8/10 on both readability and interestingness.

4.2 – Scheduling to Minimize Lateness: An Exchange Argument

In this problem, we have a single resource and a set of n requests to use this resource. However, instead of a start and finish time, each request has a deadline, d, and a required interval length, t, and it can be scheduled at any point before its deadline. The goal is to minimize the maximum lateness of any request (its finish time minus its deadline, when it finishes after the deadline). The correct greedy approach is to sort the jobs in increasing order of their deadlines and then schedule them in that order. This way, we make sure that jobs with earlier deadlines get completed earlier.

Minimize Lateness:

Sort n jobs by deadline so that d1 ≤ d2 ≤ … ≤ dn
t = 0
for j = 1 to n
    Assign job j to interval [t, t + tj]
    sj = t
    fj = t + tj
    t = t + tj
output intervals [sj, fj]
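A quick Python version of this earliest-deadline-first schedule (my sketch; jobs given as (length, deadline) pairs, returning the maximum lateness along with the intervals):

```python
def minimize_lateness(jobs):
    """jobs: list of (length, deadline) pairs. Schedules jobs in
    deadline order; returns (schedule, max_lateness) where schedule
    holds (start, finish) intervals in the order scheduled."""
    t = 0
    schedule, max_late = [], 0
    for length, deadline in sorted(jobs, key=lambda j: j[1]):
        s, f = t, t + length
        schedule.append((s, f))
        max_late = max(max_late, f - deadline)  # lateness is 0 if f <= d
        t = f
    return schedule, max_late
```

For jobs (3, 2) and (1, 5), the deadline-2 job runs first and finishes 1 unit late, and no ordering does better.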

I'd give this section a 10/10 on readability and a 7/10 on interestingness.

4.4 – Shortest Paths in a Graph

This section examines a greedy algorithm that calculates the shortest paths with a designated starting node, s, and the assumption that s has a path to every other node in the graph. Such an algorithm was created by Edsger Dijkstra in 1959, and the algorithm became known as Dijkstra's Algorithm.

Dijkstra's Algorithm:

Dijkstra's Algorithm (G, l):
Let S be the set of explored nodes
    For each u in S, we store a distance d(u)
Initially S = {s} and d(s) = 0
While S != V
    Select a node v not in S with at least one edge from S for which
        d'(v) = min over edges e = (u, v) with u in S of (d(u) + le) is as small as possible
    Add v to S and define d(v) = d'(v)
EndWhile

Dijkstra's Algorithm is greedy because “we always form the shortest new s-v path we can make from a path in S followed by a single edge”. Its correctness can be proved using a Greedy Stays Ahead proof: each time the algorithm selects a path to a node v, that path is at least as short as any other possible path to v. At first glance the runtime looks like it would be O(mn), where m is the number of edges and n is the number of nodes, but using the right data structures drastically improves efficiency. We keep the nodes of V - S in a priority queue with d'(v) as their key, so the algorithm runs in O(m) time, “plus the time for n ExtractMin and m ChangeKey operations”, for O(m log n) overall.
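Here's a small Python sketch of the priority-queue version (my own illustration; instead of ChangeKey it pushes duplicate entries and skips nodes already in S, which keeps the same O(m log n) bound):

```python
import heapq

def dijkstra(graph, s):
    """graph: dict mapping node -> list of (neighbor, edge_length).
    Returns a dict d of shortest-path distances from s to every
    node reachable from s."""
    d = {}            # the explored set S, with final distances
    pq = [(0, s)]     # priority queue keyed by tentative distance d'(v)
    while pq:
        dist, v = heapq.heappop(pq)   # ExtractMin
        if v in d:                    # v already added to S; stale entry
            continue
        d[v] = dist                   # add v to S with d(v) = d'(v)
        for u, length in graph.get(v, []):
            if u not in d:
                heapq.heappush(pq, (dist + length, u))
    return d
```

On the graph s→a (length 1), s→b (length 4), a→b (length 2), it finds d(b) = 3 via a rather than the direct edge of length 4.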

This section was very readable; I'd give it an 8/10 on readability and a 7/10 on interestingness.

4.5 – The Minimum Spanning Tree Problem

The goal of the minimum spanning tree problem is to find a tree that connects all the nodes of a graph with the minimum total edge cost. This can be done with three different greedy algorithms. One, called Kruskal's Algorithm, starts without any edges and builds a spanning tree by successively inserting edges in order of increasing cost; an edge is inserted as long as it doesn't create a cycle. Another greedy algorithm is Prim's Algorithm, which is essentially a modified version of Dijkstra's Algorithm: you start at a root node and greedily grow a tree outward, each time attaching the node that can be added as cheaply as possible. Using a priority queue, Prim's Algorithm achieves a runtime of O(m log n), because on a graph with n nodes and m edges the algorithm runs in O(m) time, plus the n ExtractMin and m ChangeKey operations. The third greedy algorithm is a backward version of Kruskal's Algorithm: we start with the full graph and delete edges in order of decreasing cost, skipping any deletion that would disconnect the graph.
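Prim's Algorithm looks almost identical to the Dijkstra sketch above, except the priority-queue key is the cost of the single attaching edge rather than a path length. A minimal Python version (my illustration, returning just the total MST cost for an undirected graph given as an adjacency dict):

```python
import heapq

def prim_mst_cost(graph, root):
    """graph: dict node -> list of (neighbor, cost), undirected.
    Grows a tree from root, always attaching the cheapest node;
    returns the total cost of the minimum spanning tree."""
    in_tree = set()
    pq = [(0, root)]   # (cost of attaching edge, node)
    total = 0
    while pq:
        cost, v = heapq.heappop(pq)
        if v in in_tree:
            continue
        in_tree.add(v)
        total += cost
        for u, c in graph[v]:
            if u not in in_tree:
                heapq.heappush(pq, (c, u))
    return total
```

On a triangle with costs a-b = 1, b-c = 2, a-c = 3, it picks the two cheap edges for a total of 3.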

Overall this section was very readable and pretty interesting. 9/10 on readability and 8/10 on interestingness.

4.6 – Implementing Kruskal's Algorithm: The Union-Find Data Structure

A Union-Find data structure is one created specifically for Kruskal's Algorithm; it stores a representation of the connected components in a way that supports rapid searching and updating. The data structure can test whether two nodes are in the same set by comparing the results of the Find operation on both. It can also merge two sets into a single set with the Union operation. The goal is for both Find and Union to run in O(log n) time. With an array implementation, Find takes O(1), MakeUnionFind takes O(n), and any sequence of k Union operations takes O(k log k) time. A better choice, however, is a pointer-based representation: Union runs in O(1), MakeUnionFind in O(n), and Find in O(log n).
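A small Python sketch of the pointer-based representation (my own version; it uses union by size, which is what keeps Find paths at most O(log n) long):

```python
class UnionFind:
    """Pointer-based Union-Find: Find follows parent pointers to the
    set's root; Union attaches the smaller set's root under the
    larger one (union by size), so Find runs in O(log n)."""
    def __init__(self, elements):          # MakeUnionFind: O(n)
        self.parent = {x: x for x in elements}
        self.size = {x: 1 for x in elements}

    def find(self, x):                     # O(log n) with union by size
        while self.parent[x] != x:
            x = self.parent[x]
        return x

    def union(self, a, b):                 # O(1) beyond the two Finds
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.size[ra] < self.size[rb]:  # keep ra as the larger root
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
```

Kruskal's Algorithm then inserts an edge (u, v) only when find(u) != find(v), calling union(u, v) when it does.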

Overall this section was pretty readable and interesting. 7/10 for both.

4.7 – Clustering

Clustering arises when you're trying to classify a collection of objects into distinct, coherent groups based on a distance function between objects. There are exponentially many ways to partition a set into k clusters; the issue is finding the clustering with the maximum spacing (the minimum distance between points in different clusters). The procedure is the same as Kruskal's minimum spanning tree algorithm, except we stop as soon as we obtain k connected components. This is equivalent to deleting the k - 1 most expensive edges from the minimum spanning tree and returning the resulting components.
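The "Kruskal, but stop at k components" idea can be sketched directly (my illustration, with a bare-bones union-find inlined; it returns the spacing, i.e. the first distance that would have merged two of the final clusters):

```python
def k_clustering_spacing(nodes, edges, k):
    """nodes: list of node names. edges: list of (distance, u, v).
    Runs Kruskal-style merging in order of increasing distance,
    stops at k components, and returns the resulting spacing."""
    parent = {x: x for x in nodes}
    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x
    components = len(nodes)
    for dist, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru == rv:
            continue              # would create a cycle; skip
        if components == k:
            return dist           # cheapest cross-cluster distance
        parent[ru] = rv           # merge the two clusters
        components -= 1
```

With points a, b close together, c, d close together, and a big gap between the pairs, asking for k = 2 stops before bridging the gap, so the spacing is the gap's distance.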

Overall this section was very interesting and very readable to me: 10/10.

4.8 – Huffman Codes and Data Compression

The goal of Huffman codes is lossless data compression. The original top-down approach, created by Shannon and Fano, isn't guaranteed to always produce an optimal code, so Huffman came up with a bottom-up approach that is. The idea is to give the most frequently used characters the shortest codes in order to optimize the overall compression. Furthermore, you don't want one character's code to be a prefix of another's, or there will be multiple ways to interpret the data. Formally, we must minimize the average number of bits per letter, ABL(y) = Σ (x in S) fx · |y(x)|, where fx is the frequency of letter x and |y(x)| is the length of its encoding.

Using a priority queue, we can get Huffman's algorithm to run in O(k log k) time, where k is the number of letters in the alphabet:

if S has two letters:
  Encode one letter as 0 and the other letter as 1
else:
  Let y* and z* be the two lowest-frequency letters
  Form a new alphabet S’ by deleting y* and z* and replacing
    them with a new letter w of frequency fy* + fz*
  Recursively construct a prefix code y’ for S’ with tree T’
  Define a prefix code for S as follows:
    Start with T’
    Take the leaf labeled w and add two children below it
      labeled y* and z*
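The pseudocode is recursive, but the same bottom-up merging can be written iteratively with a priority queue (my own Python sketch; each heap entry carries the partial code table for its subtree, and merging prepends a 0 or 1 bit):

```python
import heapq
from itertools import count

def huffman(freqs):
    """freqs: dict letter -> frequency. Returns dict letter -> code,
    repeatedly merging the two lowest-frequency subtrees."""
    tie = count()  # tiebreaker so the heap never compares dicts
    heap = [(f, next(tie), {letter: ""}) for letter, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, codes1 = heapq.heappop(heap)   # lowest frequency (y*)
        f2, _, codes2 = heapq.heappop(heap)   # second lowest (z*)
        merged = {x: "0" + c for x, c in codes1.items()}
        merged.update({x: "1" + c for x, c in codes2.items()})
        heapq.heappush(heap, (f1 + f2, next(tie), merged))  # letter w
    return heap[0][2]
```

With frequencies a: 0.5, b: 0.3, c: 0.2, the two rare letters merge first, so a gets a 1-bit code while b and c get 2-bit codes, and no code is a prefix of another.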