This is an old revision of the document!
Section 4.1 Interval Scheduling: Greedy Stays Ahead
This section examines the Interval Scheduling Problem, which aims to find the largest subset of intervals that do not overlap one another in time. We must come upon a rule around which to design the greedy algorithm, and a few are suggested.
One possible rule is selecting the earliest starting times, but this will not work because the interval that begins first may end last, so none of the other intervals will be completed. In this example, as long as there are two or more other intervals that do not overlap, using the aforementioned rule will not yield an optimal solution.
Another rule that could be used is selecting the intervals in order of size – smallest to largest. This rule could lead us to select only one interval which, while small, overlaps two non-overlapping intervals and so it yields a subset of size 1 which is not the optimal solution.
A third solution is choosing intervals in order of fewest conflicts. This one will not work either because it does not take into account end times, which are important in anchoring this solution.
The rule that will work is selection of intervals based solely on which intervals end first. It can be proven that this solution is optimal by proving that the greedy algorithm will provide a solution which contains the same number of intervals as the optimal solution. The greedy algorithm will start by choosing the first interval to end, and then by choosing an interval that ends at the same time as the one in the same index position in the optimal set or earlier. The greedy algorithm stays ahead here by choosing the same interval as the one in the optimal set as its latest possible interval for that index position. Thus, the worst set that the greedy can pick is an optimal set and by definition it is also the best set the algorithm can pick. This means that the greedy algorithm always pick the optimal set.
It runs in O(n log n) time. The algorithm must sort the inputs based on ending time, an O(n log n) operation. It then creates an array in linear time and then selects the first interval and iterates through the rest of the intervals, selecting intervals whose start is greater than the previously selected interval's finish.
If we wanted to schedule all intervals using the fewest resources possible, the algorithm would only need a slight modification. This modification involves assigning a different label to each interval that overlaps and thus each label corresponds to a different “layer” or resource that the interval will occupy for the specified time.
