Chapter 4

Greedy Algorithms

“Greed… is good. Greed is right. Greed works.” - Gordon Gekko, played by Michael Douglas in “Wall Street”, 1987

A greedy algorithm builds up a solution in small steps, making a locally optimal decision at each step.

4.1 Interval Scheduling: The Greedy Algorithm Stays Ahead

Here we consider the same problem as in Chapter 1: given a set of requests, each with a start time and a finish time, find the optimal (largest) set of compatible requests to schedule on a single resource.

Designing a Greedy Algorithm

The basic rule is to select a request, reject all requests that are not compatible with it, and then continue the process with the remaining requests. Natural orderings for the selection include: earliest start time, smallest interval, fewest conflicts, and earliest finish time. Only selecting by earliest finish time yields an optimal schedule, as sketched below.
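A minimal sketch of the earliest-finish-time rule, with each request given as a (start, finish) pair (the tuple format and function name are my own):

    # Earliest-finish-time greedy for interval scheduling.
    # Returns a maximum-size set of mutually compatible requests.
    def schedule_intervals(requests):
        accepted = []
        last_finish = float("-inf")
        for start, finish in sorted(requests, key=lambda r: r[1]):
            if start >= last_finish:       # compatible with all accepted so far
                accepted.append((start, finish))
                last_finish = finish
        return accepted

For example, schedule_intervals([(0, 6), (1, 4), (3, 5), (5, 7), (6, 8)]) returns [(1, 4), (5, 7)].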

Analyzing the Algorithm

Runtime: O(n log n). (Proof on page 121)

Extensions

There are problems that might arise in a practical situation:

Interval Partitioning Problem: We have multiple resources and must schedule all requests using as few resources as possible. (Pages 122, 123) “In any instance of Interval Partitioning, the number of resources needed is at least the depth of the set of intervals.” (Proof on pages 123, 124)

Interval Partitioning Problem Algorithm: Page 124

If we use the greedy algorithm as provided, every request will be assigned to a resource and no two overlapping requests will be assigned to the same resource. (Proof on pages 124, 125)

The algorithm schedules every request on a resource, using a number of resources equal to the depth of the set of intervals. This is the optimal number of resources needed.
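A minimal sketch of the greedy partitioning rule (process intervals in order of start time, reusing a freed resource when possible and opening a new one only when all current resources are busy); the heap bookkeeping is my own choice:

    import heapq

    # Greedy interval partitioning: the number of labels opened
    # equals the depth of the interval set.
    def partition_intervals(intervals):
        assignment = []     # (interval, resource label) pairs
        free = []           # heap of reusable labels
        busy = []           # heap of (finish_time, label) for resources in use
        next_label = 0
        for start, finish in sorted(intervals):
            while busy and busy[0][0] <= start:
                _, label = heapq.heappop(busy)   # that resource is free again
                heapq.heappush(free, label)
            if free:
                label = heapq.heappop(free)
            else:
                label, next_label = next_label, next_label + 1
            assignment.append(((start, finish), label))
            heapq.heappush(busy, (finish, label))
        return assignment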

4.2 Scheduling to Minimize Lateness: An Exchange Argument

The Problem

The Minimize Lateness problem, like the Interval Scheduling problem, has a single resource, but differs in that each request has a length and a deadline and may be scheduled at any time. A request is late by the amount its finish time exceeds its deadline, and the goal is to minimize the maximum lateness.

Designing the Algorithm
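The book's rule is Earliest Deadline First: sort the jobs by deadline and run them back to back from time 0 with no idle time, ignoring their lengths when ordering. A minimal sketch, with each job given as a (length, deadline) pair (my own format):

    # Earliest Deadline First: schedule jobs in order of deadline.
    # Returns the schedule as (start, finish, deadline) triples
    # together with the maximum lateness it achieves.
    def minimize_lateness(jobs):
        t, max_late = 0, 0
        schedule = []
        for length, deadline in sorted(jobs, key=lambda j: j[1]):
            start, finish = t, t + length
            schedule.append((start, finish, deadline))
            max_late = max(max_late, finish - deadline)
            t = finish
        return schedule, max_late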

Analyzing the Algorithm

The analysis is an exchange argument: there is an optimal schedule with no idle time, and swapping an adjacent pair of jobs scheduled out of deadline order (an inversion) never increases the maximum lateness. Repeatedly removing inversions turns any optimal schedule into the deadline-ordered one, so the greedy schedule is optimal.

Extensions

A much harder version of the problem arises when each request also has an earliest possible starting time, or release time.

4.4 Shortest Paths in a Graph

The Problem

“Given a start node s, what is the shortest path from s to each other node?”

Designing the Algorithm

Dijkstra proposed a very simple greedy algorithm for the shortest-path problem. Starting from the source node s, we maintain a set of explored nodes whose shortest-path distances from s are already known, and we repeatedly add the unexplored node that can be reached most cheaply via an explored path, recording that path's length as its distance. A sketch appears below. (Algorithm on page 138)
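A minimal heap-based sketch, assuming the graph is a dict mapping each node to a list of (neighbor, length) pairs (my own representation):

    import heapq

    # Dijkstra's algorithm with a binary heap. A popped entry whose
    # distance is stale (larger than the recorded one) is skipped,
    # which plays the role of the explored set.
    def dijkstra(graph, s):
        dist = {s: 0}
        heap = [(0, s)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist[u]:
                continue                   # stale entry; u already settled
            for v, length in graph.get(u, []):
                nd = d + length
                if nd < dist.get(v, float("inf")):
                    dist[v] = nd
                    heapq.heappush(heap, (nd, v))
        return dist                        # distance from s to each reachable node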

Analyzing the Algorithm

For each path from the source, s, to any other node, u, already explored, the path is the shortest s-u path. (Proof on page 139, 140)

This algorithm, however, does not work if any edge length is negative. Dijkstra's algorithm can also be seen as a continuous version of breadth-first search, expanding outward from s in order of total distance rather than number of hops.

Runtime

Total runtime: O(m log n) on a graph with n nodes and m edges. (Explanation on pages 141, 142)

4.5 The Minimum Spanning Tree Problem

The Problem

We need to connect a set of locations into a communication network: we must choose a subset of the edges so that the resulting graph is connected and the total cost is as small as possible.

The minimum-cost solution to the problem is a tree. (Proof on page 142, 143)

Designing Algorithms

Some natural greedy algorithms, all of which actually work, are: Kruskal's algorithm (add edges in order of increasing cost, skipping any edge that would create a cycle), Prim's algorithm (grow a tree outward from a root node, always adding the cheapest edge with exactly one end in the tree), and the Reverse-Delete algorithm (delete edges in order of decreasing cost, skipping any deletion that would disconnect the graph).

Analyzing the Algorithms

Cut Property: “Assume that all edge costs are distinct. Let S be any subset of nodes that is neither empty nor equal to all of V, and let edge e = (v, w) be the minimum-cost edge with one end in S and the other in V - S. Then every minimum spanning tree contains the edge e.” (Proof on page 145)

Cycle Property: “Assume that all edge costs are distinct. Let C be any cycle in G, and let edge e = (v, w) be the most expensive edge belonging to C. Then e does not belong to any minimum spanning tree of G.” (Proof on page 147, 148)

All three of the algorithms above produce a minimum spanning tree of G. (Proofs on pages 146-149)

Eliminating the Assumption that All Edge Costs Are Distinct (please see the question section below): How do we know that the algorithms we discussed still work if the edge costs are not distinct? (The book's fix is to break ties by perturbing equal costs by tiny, distinct amounts; a spanning tree that is minimum for the perturbed costs is also minimum for the original costs.)

Implementing Prim's Algorithm

Prim's algorithm runs in O(m log n) time on a graph with n nodes and m edges, using n ExtractMin and m ChangeKey operations, as before. (Explanation on page 150)
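A heap-based sketch in the same adjacency-list format as the Dijkstra sketch above, using lazy deletion of stale heap entries in place of the ChangeKey operation:

    import heapq

    # Prim's algorithm: grow a tree from `root`, repeatedly attaching
    # the outside node reachable by the cheapest edge. Assumes `graph`
    # maps every node to its (neighbor, cost) list and is connected.
    def prim(graph, root):
        in_tree = {root}
        tree_edges = []
        heap = [(cost, root, v) for v, cost in graph[root]]
        heapq.heapify(heap)
        while heap and len(in_tree) < len(graph):
            cost, u, v = heapq.heappop(heap)
            if v in in_tree:
                continue                   # stale entry; v was attached earlier
            in_tree.add(v)
            tree_edges.append((u, v, cost))
            for w, c in graph[v]:
                if w not in in_tree:
                    heapq.heappush(heap, (c, v, w))
        return tree_edges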

Extension

4.6 Implementing Kruskal's Algorithm: The Union-Find Data Structure

The Problem

Using the Union-Find data structure, we can determine whether two nodes, u and v, belong to the same set or not. It supports the following three operations: MakeUnionFind(S), which initializes each element of S as its own singleton set; Find(u), which returns the name of the set containing u; and Union(A, B), which merges the sets named A and B into a single set.

A Simple Data Structure for Union-Find

We can represent a Union-Find structure for a set S of size n with an array Component: Component[s] is the name of the set currently containing element s. This makes Find(u) run in O(1) time and Union(A, B) run in O(n) time. We add two optimizations: keep an explicit list of the elements in each set, and, when merging, keep the name of the larger set so that only the elements of the smaller set are relabeled.

Even with these optimizations, the worst case for a single Union is still O(n), when both sets A and B are very large and contain a constant fraction of all the elements. However, this does not happen very often.

“The Find operation takes O(1) time, MakeUnionFind(S) takes O(n) time, and any sequence of k Union operations takes at most O(k log k) time.” (Proof on page 153, 154)
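A minimal sketch of this array-based structure with both optimizations, assuming the elements are 0 .. n-1 (class and method names are my own):

    # Array-based Union-Find: Find is O(1), Union relabels the
    # smaller of the two sets.
    class ArrayUnionFind:
        def __init__(self, n):
            self.component = list(range(n))            # component[s] = name of s's set
            self.members = {i: [i] for i in range(n)}  # elements in each named set

        def find(self, u):
            return self.component[u]

        def union(self, a, b):
            if len(self.members[a]) < len(self.members[b]):
                a, b = b, a                # keep the name of the larger set
            for u in self.members[b]:
                self.component[u] = a      # relabel only the smaller set
            self.members[a].extend(self.members.pop(b))

An element is relabeled only when its set at least doubles in size, which is where the O(k log k) bound on k Union operations comes from.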

A Better Data Structure for Union-Find

We can also implement Union-Find using pointers, with each element pointing toward the name of its set. This implementation performs Union in O(1) time, since we only have to update one pointer, but Find(u) may now take up to O(n) time as we follow pointers to the set's name. To prevent this, we again use the optimization mentioned above: the merged set keeps the name of the larger set.

Using the above information, “A Union operation takes O(1) time, MakeUnionFind(S) takes O(n) time, and a Find operation takes O(log n) time.” (Proof on page 155)

Further Improvements

In an improved implementation, while performing Find(v) we update every pointer on the path so that each one points directly to the current name of the set (path compression). A Find still takes at most O(log n) time, and long sequences of operations become much cheaper in total.
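A sketch of the pointer-based structure with union by size and the path-compression improvement just described; union assumes its arguments are set names (roots), as in the text:

    # Pointer-based Union-Find with union by size and path compression.
    class UnionFind:
        def __init__(self, elements):
            self.parent = {u: u for u in elements}     # each element is its own set
            self.size = {u: 1 for u in elements}

        def find(self, u):
            root = u
            while self.parent[root] != root:
                root = self.parent[root]
            while self.parent[u] != root:              # path compression: repoint
                self.parent[u], u = root, self.parent[u]   # the whole path at the root
            return root

        def union(self, a, b):
            if self.size[a] < self.size[b]:
                a, b = b, a                # keep the name of the larger set
            self.parent[b] = a
            self.size[a] += self.size[b]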

Implementing Kruskal's Algorithm

We first sort the edges by cost, which takes O(m log n) time, and then use the Union-Find data structure to maintain the connected components as edges are added. For each edge (u, v) in sorted order, we check whether Find(u) and Find(v) are equal; if they differ, we add the edge and call Union(Find(u), Find(v)).

At most, we perform 2m Find operations and n - 1 Union operations. Thus “Kruskal's algorithm can be implemented on a graph with n nodes and m edges to run in O(m log n) time.” (Page 157)
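Putting the pieces together, a sketch of Kruskal's algorithm using the UnionFind class above, with edges given as (cost, u, v) triples (my own format):

    # Kruskal's algorithm: scan edges in increasing cost order, keeping
    # an edge exactly when its endpoints lie in different components.
    def kruskal(nodes, edges):
        uf = UnionFind(nodes)
        tree = []
        for cost, u, v in sorted(edges):   # O(m log m) = O(m log n)
            ru, rv = uf.find(u), uf.find(v)
            if ru != rv:
                uf.union(ru, rv)
                tree.append((u, v, cost))
        return tree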

4.7 Clustering

The Problem

Clustering refers to problems in which we are given distances between objects and must classify the objects into coherent groups according to how close they are to one another.

Clusterings of Maximum Spacing: Suppose we have n objects and wish to divide them into a given number k of groups. The spacing of a k-clustering is the minimum distance between any pair of points lying in different clusters. Our goal is to find a k-clustering of maximum spacing.

Designing the Algorithm

We run Kruskal's algorithm: repeatedly add the minimum-cost remaining edge, except that if an edge joins two nodes already in the same cluster, we skip it (adding it would not change the clustering). We stop as soon as we have exactly k clusters, as sketched below. (See page 160 for an example.)
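Since this is Kruskal's algorithm stopped early, a sketch can reuse the UnionFind class from Section 4.6 (edge format as in the Kruskal sketch above):

    # k-clustering of maximum spacing: run Kruskal's algorithm until
    # exactly k components remain (single-linkage clustering).
    def max_spacing_clusters(nodes, edges, k):
        uf = UnionFind(nodes)
        components = len(nodes)
        for cost, u, v in sorted(edges):
            if components == k:
                break                      # stop once we have k clusters
            ru, rv = uf.find(u), uf.find(v)
            if ru != rv:                   # an edge inside a cluster is skipped
                uf.union(ru, rv)
                components -= 1
        return {u: uf.find(u) for u in nodes}   # cluster name per node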

Analyzing the Algorithm

“The components C1, C2, … Ck formed by deleting the k - 1 most expensive edges of the minimum spanning tree T constitute a k-clustering of maximum spacing.” (Proof on page 160 and 161)

4.8 Huffman Codes and Data Compression

The Problem

Encoding Symbols Using Bits: We use sequences of bits to represent each letter of the English alphabet in a computer. Since there are 26 different letters, a fixed-length code needs at least 5 bits per letter (4 bits give only 2^4 = 16 codewords, while 2^5 = 32 >= 26). However, we can do better by using fewer bits for the most frequently occurring letters, like e, t, a, o, i, and n, and more bits for the least frequently occurring ones, like q, j, x, and z. Storing data this way saves a great deal of space. This is an example of data compression: we take advantage of the nonuniform frequencies of the letters.

Variable-Length Encoding Schemes: Samuel Morse developed Morse code to translate each letter into a sequence of dots (short pulses) and dashes (long pulses) for sending telegraphs. For now, let's write a dot as 0 and a dash as 1. Morse consulted local printing presses for frequency estimates of the letters in English, and using these, Morse code maps e to 0, t to 1, a to 01, and so on. With only this information, 0101 could represent eta, aa, etet, or aet. To make encoded messages readable, Morse code includes pauses between letters, which means it really uses three symbols: 0, 1, and pause. Thus, if we want to use only 0 and 1, we need a different method of coding the letters.

Prefix Codes: The problem with Morse code is that there exist pairs of letters where the bit string that encodes one letter is a prefix of the bit string that encodes another. The following prefix code for S = {a, b, c, d, e} (the book's example) eliminates this problem:

a -> 11, b -> 01, c -> 001, d -> 10, e -> 000

No letter's encoding above is a prefix of another letter's encoding.

Optimal Prefix Codes: With the above code and the frequencies fa = .32, fb = .25, fc = .20, fd = .18, fe = .05, the average number of bits per letter is:

.32 * 2 + .25 * 2 + .20 * 3 + .18 * 2 + .05 * 3 = 2.25

But the above code is still not optimal. (See pages 165 and 166 for details.) We now need to find a prefix code that is as efficient as possible.

Designing the Algorithm

Representing Prefix Codes Using Binary Trees: We represent a prefix code by a binary tree T whose leaves correspond to the letters, so the number of leaves equals the number of distinct letters in the alphabet S.

“The encoding of S constructed from T is a prefix code.” (Proof on page 166, 167)

To read the code off the tree, follow the path from the root down to a letter's leaf: each time you go right the next bit is 1, and each time you go left it is 0. For example, we can build a tree whose root has two children, the right child being the most frequently occurring letter (so its code is simply 1) and the left child being the parent of two further subtrees. (Picture on page 168)

“The binary tree corresponding to the optimal prefix code is full.” (Proof on page 168, 169)

A First Attempt: The Top-Down Approach: We want to pack the leaves of the tree as tightly as possible. There are different ways to do this, and schemes called Shannon-Fano codes make it precise by repeatedly splitting the alphabet into two halves of roughly equal total frequency. The resulting code is the one given above, which we already know is not optimal. Huffman, a graduate student in Fano's class, solved the problem a few years later.

What if We Knew the Tree Structure of the Optimal Prefix Code?: If we already have the tree but not the labeling of the leaves, it is very easy to create the codes.

“Suppose that u and v are two different leaves of the optimal tree T* such that depth of u is less than depth of v. Also, suppose that in a labeling of T* corresponding to an optimal prefix code, leaf u is labeled with y and leaf v is labeled with z, both belonging to S. Then fy >= fz.” (Proof on page 170)

Suppose that v is the deepest node in T* and its parent is u. Since the tree is a full binary tree, u has another child, w. “w is a leaf of T*.” (Proof on page 171)

“There is an optimal prefix code, with corresponding tree T*, in which the two lowest-frequency letters are assigned to leaves that are siblings in T*.”

The algorithm to construct such a tree is given on pages 172 and 173, with an explanation; a sketch follows below.
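A heap-based sketch of Huffman's greedy rule (repeatedly merge the two lowest-frequency trees into siblings); the tiebreaking counter is my own detail, added to keep the heap comparisons well defined:

    import heapq
    from itertools import count

    # Huffman's algorithm. Takes {letter: frequency} and returns
    # {letter: bitstring}.
    def huffman_code(freqs):
        tiebreak = count()                 # keeps comparisons off the trees
        heap = [(f, next(tiebreak), letter) for letter, f in freqs.items()]
        heapq.heapify(heap)
        while len(heap) > 1:
            f1, _, t1 = heapq.heappop(heap)    # the two least frequent trees...
            f2, _, t2 = heapq.heappop(heap)
            heapq.heappush(heap, (f1 + f2, next(tiebreak), (t1, t2)))  # ...become siblings
        code = {}
        def walk(tree, prefix):
            if isinstance(tree, tuple):    # internal node: (left, right)
                walk(tree[0], prefix + "0")
                walk(tree[1], prefix + "1")
            else:
                code[tree] = prefix or "0" # lone-letter alphabet edge case
        walk(heap[0][2], "")
        return code

For the frequencies above, huffman_code({"a": .32, "b": .25, "c": .20, "d": .18, "e": .05}) produces a code averaging .32 * 2 + .25 * 2 + .20 * 2 + .18 * 3 + .05 * 3 = 2.23 bits per letter, beating the 2.25 of the earlier code.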

Analyzing the Algorithm

“The Huffman code for a given alphabet achieves the minimum average number of bits per letter of any prefix code.” (Proof on pages 174 and 175, statements (4.32) and (4.33))

Implementation and Running Time: Using a priority queue, the running time of the algorithm is O(k log k), where k is the size of the alphabet.

Extensions

A black-and-white picture of 1000 x 1000 pixels is a case where symbol-by-symbol encoding cannot help: there are only two symbols, black and white, so each pixel already requires at least one bit, and a prefix code over single pixels cannot compress the image at all. Compression is still possible, but it must encode longer patterns of pixels rather than one pixel at a time.

If a and b occur throughout the first half of a text with no c or d at all, and c and d occur throughout the second half with no a or b (in a four-letter alphabet), we might want to change the code halfway through the text, so that the average number of bits per letter stays low in both halves instead of being a compromise over the whole text.

Questions and Comments

Questions for Feb 29th

As for questions, there are two things I don't understand:

Most of the rest we discussed in class, and I have no problems with it.

Questions for Mar 7th

No questions and no problems; we dealt with almost everything in class. The chapter was readable, probably because of the class. Interesting things: