Kelly's Wiki

This is an old revision of the document!

Chapter 4: Greedy Algorithms

Preface

A greedy algorithm optimizes some criterion by taking the most optimization on a step by step basis without direct regard to future iterations. For a given problem, there can be many possible greedy algorithms. The challenge is finding greedy algorithms which are effective and proving their quality.

Section 4.1: Interval Scheduling: The Greedy Algorithm Stays Ahead

Interval scheduling involves an effort to schedule the highest possible number of activities in a set amount of time without allowing them to overlap. Some activities are longer than others. We attempt to conceive a greedy algorithm to maximize the number of compatible activities which can be scheduled in any given situation. Simply choosing the activities which start the soonest for our available time does not work as this approach does not account for differing duration. Further, we cannot simply select the activities in the order of their individual running times. One reason for this is because one short activity may overlap with multiple other activities,making it better to choose them instead. For our optimal greedy stays ahead solution,we select the activities with the least number of conflicts with other activities in the order which they start. This approach greedily attempts to minimize the number of conflicts for each step. We can prove that this is the optimal solution by showing that it returns an optimal set A. This involves a simple contradiction. In short, if A is not optimal then it must contain at-least one fewer activities than produced by another solution. The missing activity could not have overlapped the other activities in A given the nature of the algorithm. Thus, the fact that it is not contained within A is a contradiction. This algorithm runs in O(n log(n)) time. The longest part is creating the sorted list of activities at the beginning(all other parts take at most O(n) time). The above scenario may be oversimplified in two key ways: we assume that we know (or the algorithm knows) the set of activities and it is possible that each activity has a different value of priority. This is an example of interval scheduling. Another related problem is interval partitioning. This also involves activities spanning different lengths of time. However, now we must include all activities and separate subsets of activities into different “rooms” such that none overlap. The depth of activities is represented by the maximum number which occur at the same time. Our goal is to achieve this using the smallest number of rooms possible, something which can be achieved with a greedy algorithm. The optimal algorithm sorts activities by start times and creates a new room each time there is an additional depth of overlap. Thus, the number of “rooms” aka “layers” aka “resources” is always equal to depth. No algorithm can produce a better result. I found this section very draining to read and believe that these concepts were summarized more clearly in class. Still, it was fairly logical. This section was a 7/10.

Section 4.2: Scheduling to Minimize Lateness: An Exchange Argument:

Here we look at a slightly different scenario: one where activities require x time to complete but may start and finish any time before their deadlines.Our greedy algorithm for this case seeks to minimize the maximum time that an activity completes after it's deadline. Simply scheduling the jobs in order of their time needed from start to finish is wrong because it does not account for each job's deadline.Another flawed approach is scheduling activities by slack (deadline-time needed), scheduling the ones with those that have the least slack first. Surprisingly, the correct approach is very simple: just order activities by deadline with soonest deadlines first(a common study tactic). This makes sense given that MAXIMUM lateness is related to deadlines alone. To prove that this algorithm always produces an optimal schedule, we use a proof by exchange. That is, we consider an optimal schedule and transform it step by step without changing its optimality until it resembles some schedule produced by the algorithm. First, we define an inversion as when two activities are not ordered in the schedule sequentially by deadline. Idle time is time during which no activity is occurring in a schedule. Clearly, all schedules with neither inversions nor idle time have the same maximum lateness(orders may differ only in cases of jobs with same deadline). In summary, the core of our proof is showing that a schedule with idle time and inversions could not be more optimal than one without. Since an inversion swaps two activities, putting the one with the closest due date later, there is either no change in lateness or a positive change overall. Idle time is a waste in this scenario, directly increasing lateness. Thus, the optimal solution will be one with neither idle time nor inversions, with the same efficiency as that created by greedy. Thus, our greedy algorithm produces the optimum solution.

Table of Contents

Chapter 4: Greedy Algorithms

Preface

Section 4.1: Interval Scheduling: The Greedy Algorithm Stays Ahead

Section 4.2: Scheduling to Minimize Lateness: An Exchange Argument: