Chapter 4: Greedy Algorithms

A greedy algorithm is one that builds up a solution in small steps, at each step choosing the option that looks best by some local criterion. We will use greedy algorithms for many problems, including the Shortest Path problem and the Minimum Spanning Tree problem. There are two main techniques for proving that a greedy algorithm is optimal: the "greedy stays ahead" argument and the exchange argument.

Section 1: Interval Scheduling: The Greedy Algorithm Stays Ahead

We can use a greedy algorithm to solve the Interval Scheduling Problem. Given n intervals (each with a start and finish time), we want to find a largest compatible (non-overlapping) set of these intervals. We can do this by repeatedly choosing the interval with the earliest finish time that does not conflict with the intervals already accepted. We can prove this works with a "greedy stays ahead" inductive proof: after each step, the greedy solution's last finish time is at least as early as that of any other compatible set of the same size. The algorithm runs in O(n log n) time. (p 118)
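A minimal Python sketch of this earliest-finish-time rule (the (start, finish) tuple representation and the function name are my own choices, not the book's):

<code python>
def schedule_intervals(intervals):
    """Greedy interval scheduling: repeatedly accept the interval with
    the earliest finish time that is compatible with those already accepted.
    `intervals` is a list of (start, finish) pairs."""
    accepted = []
    last_finish = float("-inf")
    # Sorting by finish time is what gives the O(n log n) running time.
    for start, finish in sorted(intervals, key=lambda iv: iv[1]):
        if start >= last_finish:  # compatible with every accepted interval
            accepted.append((start, finish))
            last_finish = finish
    return accepted

# Example: accepts [(0, 3), (4, 5), (6, 8)].
print(schedule_intervals([(0, 3), (2, 5), (4, 5), (1, 7), (6, 8)]))
</code>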

A related problem is that of scheduling all of the intervals using the fewest number of rooms (the Interval Partitioning Problem). In this problem, the optimal number of rooms equals the depth of the set of intervals, i.e., the maximum number of intervals that overlap at any single point in time. We can achieve an optimal schedule with a greedy algorithm that orders the intervals by start time and assigns each one to a room in which it causes no conflict. (p 124)
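A sketch of the partitioning algorithm; the min-heap of room finish times is my own implementation choice for finding a conflict-free room quickly, not something the book prescribes:

<code python>
import heapq

def partition_intervals(intervals):
    """Greedy interval partitioning: process intervals in order of start
    time, reusing the room that frees up earliest when there is no conflict.
    Returns a dict mapping each interval to a room number, plus the room count."""
    free = []            # min-heap of (finish_time, room_number)
    num_rooms = 0
    assignment = {}
    for start, finish in sorted(intervals):       # order by start time
        if free and free[0][0] <= start:
            _, room = heapq.heappop(free)         # earliest-freed room has no conflict
        else:
            num_rooms += 1                        # no free room: open a new one
            room = num_rooms
        assignment[(start, finish)] = room
        heapq.heappush(free, (finish, room))
    return assignment, num_rooms

# The depth of this set is 2, so two rooms suffice.
print(partition_intervals([(0, 2), (1, 3), (2, 4), (3, 5)]))
</code>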

Readability: 7

Section 2: Scheduling to Minimize Lateness: An Exchange Argument

We can also use a greedy algorithm to solve the Minimum Lateness Problem. In this problem, we have jobs, each with a length and a deadline, and we want to schedule them on a single resource so that the maximum lateness is minimized. Processing the jobs in order of earliest deadline first accomplishes this. To prove this works, we use an exchange argument. First we note that there is an optimal schedule with no idle time, and that any two schedules with no idle time and no inversions (an inversion is a pair of jobs in which the job with the later deadline is scheduled first) have the same maximum lateness. Then we show that swapping out the inversions in some optimal schedule one at a time preserves its optimality, and the result is a schedule with the same maximum lateness as ours.
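A sketch of the earliest-deadline-first rule in Python (representing jobs as (length, deadline) pairs is my own assumption):

<code python>
def minimize_lateness(jobs):
    """Earliest Deadline First: schedule jobs back to back (no idle time)
    in order of increasing deadline. `jobs` is a list of (length, deadline)
    pairs; returns (start, finish, deadline) triples and the max lateness."""
    t = 0
    schedule = []
    max_lateness = 0
    for length, deadline in sorted(jobs, key=lambda j: j[1]):
        start, finish = t, t + length            # no idle time between jobs
        schedule.append((start, finish, deadline))
        max_lateness = max(max_lateness, finish - deadline)
        t = finish
    return schedule, max_lateness

# EDF order here yields maximum lateness 1 (the job with deadline 4 ends at 5).
print(minimize_lateness([(3, 4), (2, 2), (1, 6)]))
</code>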

Readability: 7

Section 4: Shortest Paths in a Graph

Our next problem is to find the shortest path from a node s to every other node in a directed graph with varying (nonnegative) edge lengths. We can use Dijkstra's algorithm to solve this problem. In this algorithm, we begin with an explored set that contains only s. At each pass, for each unexplored node, we determine the shortest path that can be formed by traveling through the explored part and then following a single edge to that unexplored node. We add the unexplored node with the shortest such path to the explored set (p 138).

We can prove Dijkstra's algorithm works using an inductive "greedy stays ahead" proof built on the invariant that, at any point in the algorithm's execution, the path recorded for each node in the explored set is a shortest path to that node. By using a priority queue, we can implement this algorithm in O(m log n) time.
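A priority-queue sketch of Dijkstra's algorithm; the adjacency-list format is my own assumption, and stale heap entries are skipped rather than decrease-keyed (a standard workaround for Python's heapq):

<code python>
import heapq

def dijkstra(graph, s):
    """Dijkstra's algorithm with a binary heap: O(m log n).
    `graph` maps each node to a list of (neighbor, length) pairs;
    all edge lengths must be nonnegative. Returns distances from s."""
    dist = {s: 0}
    explored = set()
    heap = [(0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in explored:
            continue                 # stale entry for an already-final node
        explored.add(u)              # greedy stays ahead: d is now final
        for v, length in graph.get(u, []):
            if v not in explored and d + length < dist.get(v, float("inf")):
                dist[v] = d + length
                heapq.heappush(heap, (dist[v], v))
    return dist

graph = {"s": [("a", 1), ("b", 4)], "a": [("b", 2), ("c", 6)], "b": [("c", 3)]}
print(dijkstra(graph, "s"))          # {'s': 0, 'a': 1, 'b': 3, 'c': 6}
</code>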

Readability: 6

Section 5: The Minimum Spanning Tree Problem

In the Minimum Spanning Tree Problem, we want to find a subset of the edges of a graph that keeps the graph connected and has the smallest total edge cost; assuming positive edge costs, such a subset is always a spanning tree. There are three classic greedy algorithms that accomplish this.

Kruskal's Algorithm: Add edges in order of increasing cost as long as a cycle is not created.

Prim's Algorithm: Choose a root node and grow a tree outward from it, at each step adding the node that can be attached to the current partial tree most cheaply.

Reverse-Delete Algorithm: Remove edges in order of descending cost, as long as the graph is not disconnected.

To prove that these algorithms work, we first need two properties (stated assuming all edge costs are distinct): the Cut Property and the Cycle Property. The Cut Property says that if S is a nonempty proper subset of the nodes V and e is the minimum-cost edge with one end in S and the other in V\S, then e must be in the minimum spanning tree. The Cycle Property says that the most expensive edge in any cycle cannot be in a minimum spanning tree. The correctness of Kruskal's and Prim's algorithms follows from the Cut Property; the correctness of Reverse-Delete follows from the Cycle Property.

Like Dijkstra's algorithm, Prim's algorithm can be implemented with a priority queue in O(m log n) time, as sketched below. We will see the implementation of Kruskal's in the next section.
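A heap-based sketch of Prim's algorithm; representing the undirected graph as adjacency lists with every edge stored in both directions is my own assumption:

<code python>
import heapq

def prim(graph, root):
    """Prim's algorithm with a priority queue: O(m log n).
    `graph` maps each node to a list of (neighbor, cost) pairs,
    with every edge listed in both directions. Returns the MST edges."""
    in_tree = {root}
    tree_edges = []
    heap = [(cost, root, v) for v, cost in graph[root]]
    heapq.heapify(heap)
    while heap and len(in_tree) < len(graph):
        cost, u, v = heapq.heappop(heap)   # cheapest edge leaving the tree
        if v in in_tree:
            continue                       # both ends already in the tree
        in_tree.add(v)
        tree_edges.append((u, v, cost))
        for w, c in graph[v]:
            if w not in in_tree:
                heapq.heappush(heap, (c, v, w))
    return tree_edges

g = {"a": [("b", 1), ("c", 4)], "b": [("a", 1), ("c", 2)], "c": [("a", 4), ("b", 2)]}
print(prim(g, "a"))                        # [('a', 'b', 1), ('b', 'c', 2)]
</code>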

Readability: 8

Section 6: Union-Find Data Structure

We need a data structure to help us implement Kruskal's algorithm efficiently. The Union-Find data structure helps by letting us maintain information about, and perform operations on, a collection of disjoint sets of nodes. The three operations supported by Union-Find are:

1. MakeUnionFind(S), which returns a Union-Find structure where every element is in its own set,
2. Find(u), which returns the name of the set containing u, and
3. Union(A, B), which merges the sets A and B into a single set.

One implementation of Union-Find uses an array Component to store the name of the set containing each element. When two sets are merged, the name of the larger set is kept. In this implementation, Find is O(1), MakeUnionFind is O(n), and any sequence of k Union operations takes at most O(k log k) time, since an element's set name is rewritten only when the size of its set at least doubles.

We can improve the Union operation by using a pointer-based structure in which each node points toward the name of the set containing it. When two sets are merged, the smaller set's pointer is repointed at the larger set. In this implementation, Union is O(1), MakeUnionFind is O(n), and Find is O(log n).

We can further improve the Union-Find data structure with path compression: after each Find, repoint every node visited directly at the root. This makes a sequence of n Find operations run in time almost linear in n.
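A sketch combining union by size with path compression (the class and method names are my own; the book describes the structure abstractly):

<code python>
class UnionFind:
    """Pointer-based Union-Find with union by size and path compression."""

    def __init__(self, elements):          # MakeUnionFind(S): O(n)
        self.parent = {x: x for x in elements}
        self.size = {x: 1 for x in elements}

    def find(self, u):                     # Find(u) with path compression
        root = u
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[u] != root:      # second pass: repoint to the root
            self.parent[u], u = root, self.parent[u]
        return root

    def union(self, a, b):                 # Union: keep the larger set's name
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False                   # already in the same set
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        return True
</code>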

Union-Find allows us to implement Kruskal's algorithm in O(m log n) time, as in the sketch below.
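A sketch of Kruskal's algorithm reusing the UnionFind class above; the (cost, u, v) edge-list format is my own assumption:

<code python>
def kruskal(nodes, edges):
    """Kruskal's algorithm: O(m log n), dominated by sorting the edges.
    `edges` is a list of (cost, u, v) triples."""
    uf = UnionFind(nodes)
    tree = []
    for cost, u, v in sorted(edges):       # consider edges in increasing cost
        if uf.union(u, v):                 # accepting (u, v) creates no cycle
            tree.append((u, v, cost))
    return tree

edges = [(1, "a", "b"), (2, "b", "c"), (4, "a", "c")]
print(kruskal(["a", "b", "c"], edges))     # [('a', 'b', 1), ('b', 'c', 2)]
</code>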

Readability: 8

Section 7: Clustering

Section 8: Huffman Codes and Data Compression
