CS211: Design and Analysis of Algorithms

This is an old revision of the document!

Chapter 4: Greedy Algorithms

4.1 - Interval Scheduling: The Greedy Algorithm Stays Ahead

A subset of requests is compatible if no two of them overlap in time, accepting the largest compatible subset possible. Compatible sets of maximum size are optimal.

Rules for selection of requests can vary but not all are desirable: selecting by earliest start time, shortest interval, and least amount of conflicts all have counterexamples. The best choice is selecting based on earliest finish time

A is a compatible set of requests. The goal is to show that the set A is equivalent to the optimal solution, making sure that each element in A finishes at a time less than or equal to the element from the optimal set. Through this process, we conclude the greedy algorithm returns an optimal set A. The algorithm runs in O(n log n) time.

A Related Problem: Scheduling All Intervals

Another problem may ask for all intervals to be satisfied using as few resources as possible. In any instance of Interval Partitioning, the number of resources needed is at least the depth of the sets of intervals. Altering our greedy algorithm to include a depth label allows every interval to be assigned a label. No two overlapping intervals will receive the same label. This schedules every interval on a resource, using a number of resources equal to the depth of the sets of intervals. This is the optimal number of resources needed.

Discussion

I found the most interesting part of this section was the lecture where we analyzed the algorithms with counterexamples. It gave the most clarity to the problem solving method used here. In addition, I was surprised to see that soling for all intervals was as simple as adding a depth tracker. I expected to implement new rules for obtaining the optimal solution.

4.2 - Scheduling to Minimize Lateness: An Exchange Argument

Reexamine the problem where the goal is to minimize lateness. Each task is then assigned a deadline. The optimal solution is found when arranging tasks in increasing order of deadlines (Earliest Deadline First). The schedule has no idle time, that is time between tasks. An inversion in the schedule occurs when a job is scheduled with a deadline later than another unselected job. All schedules with no inversions and no idle time have the same maximum lateness. Therefore, there is an optimal schedule that has no inversions and no idle time. A schedule produced by this greedy algorithm, with Earliest Deadline First sorting, will have a maximum lateness of L.

4.4 - Shortest Paths in a Graph

The goal is to determine the shortest path from starting node s to every other node in a graph. To do this, determine the length of shortest paths from s and store them as you travel to mark the explored portion of the graph. A node is only stored as the shortest path if it is properly so, staying ahead. Consider the set S at any point in the algorithm’s execution. For each node in S, the stored path is a shortest path to that node. Known as Dijkstra’s Algorithm.

The algorithm doesn’t find shortest paths where edges have negative values. Also, the algorithm is very similar to breadth-first search. Using a priority queue, the algorithm can be implemented on a graph with n nodes and m edges in O(m) time. Using heap-based priority queue can result in O(log n) time. Overall implementation: O(m log n).

4.5 - The Minimum Spanning Tree Problem

The goal is to build a path between every pair of nodes as cheaply as possible (or with least distance). A minimum-cost solution will not contain cycles. A minimum spanning tree can be found using several different greedy algorithms. It is possible to begin with no edges and add them in order of increasing cost. Another way is similar to Dijkstra’s Algorithm, begin at a root node and add the cheapest node possible. Another is to begin with a complete graph and delete edges in order of decreasing cost.

Assuming all edges are distinct, let S be any subset of nodes that is neither empty nor equal to all of V, and let edge e=(v,w) be the minimum-cost edge with one end in S and the other in V-S. Then every minimum spanning tree contains the edge e. Let C be any cycle in tree G, and let edge e=(v,w) be the most expensive edge belonging to C. Then e does not belong to any minimum spanning tree of G.

Both Prim’s and Kruskal’s Algorithms can be implemented in O(m log n) time.

4.6 - Implementing Kruskal’s Algorithm: The Union-Find Data Structure

Consider a graph with a fixed number of nodes but grows over time by the addition of edges. We want to keep track of the connected components of the graph. The Union-Find data structure maintains disjoint sets. Given node u, Find(u) will return the name of the set containing u. This can be used to test if u and v are in the same set by checking Find(u)=Find(v). Union(A,B) takes two sets and merges them into a single set. This method only supports addition of edges, not deletion.

In an array implementation of the Union-Find data structure for some set S of size n, where unions keep the name of the larger set, the Find operation takes O(1) time, MakeUnionFind(S) takes O(n) time, and any sequence of k Union operations takes at most O(k log k) time.

In a pointer-based implementtion of the data structure, a Union operation takes O(1) time, MakeUnionfind(S) takes O(n) time, and a Find operation takes O(log n) time.

Kruskal’s Algorithm can be implemented on a graph with n nodes and m edges to run in O(m log n) time.

Discussion

As the algorithms we explore begin to become more complicated, it is more important than ever to work through proofs slowly. It is much harder to fully understand optimal solutions upon first glance when the problem sets become large and complex. I found the similarities of the BFS and Dijkstra’s Algorithm interesting. I don’t see much of a difference, just an ordering of the edges added (not very complicated).

Table of Contents

Chapter 4: Greedy Algorithms

4.1 - Interval Scheduling: The Greedy Algorithm Stays Ahead

A Related Problem: Scheduling All Intervals

Discussion

4.2 - Scheduling to Minimize Lateness: An Exchange Argument

4.4 - Shortest Paths in a Graph

4.5 - The Minimum Spanning Tree Problem

4.6 - Implementing Kruskal’s Algorithm: The Union-Find Data Structure

Discussion