Section 6.1: Weighted Interval Scheduling: A Recursive Procedure

Weighted Interval Scheduling

  • Job j starts at sj, finishes at fj, and has weight or value vj
  • Two jobs are compatible if they don't overlap
  • Goal: find maximum weight subset of mutually compatible jobs
  • Notation: Label jobs by finishing time: f1≤ f2 ≤ … ≤ fn
  • Definition: p(j) = largest index i < j such that job i is compatible with j
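
Because the jobs are labeled by finish time, each p(j) can be found with binary search over the finish times. A minimal sketch, using a hypothetical instance (the sample starts/finishes below are illustrative, not from the text):

```python
from bisect import bisect_right

# Hypothetical jobs 1..5, already sorted by finish time.
starts = [0, 1, 3, 5, 6]
finishes = [3, 5, 6, 8, 9]

def compute_p(starts, finishes):
    """p[j] = largest index i < j (1-based) with finishes[i-1] <= starts[j-1]."""
    p = [0] * (len(starts) + 1)  # p[0] is unused
    for j in range(1, len(starts) + 1):
        # bisect_right counts how many of the first j-1 finish times
        # are <= the start of job j; that count is exactly p(j).
        p[j] = bisect_right(finishes, starts[j - 1], 0, j - 1)
    return p

print(compute_p(starts, finishes))  # → [0, 0, 0, 1, 2, 3]
```

With the sort done up front, this computes all p(j) values in O(n log n) total.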

The recursive algorithm below computes the value of the optimal schedule, now with a weight capturing the importance of each request/job:

Compute-Opt(j)
    If j = 0 then
        Return 0
    Else
        Return max(v<sub>j</sub> + Compute-Opt(p(j)), Compute-Opt(j - 1))
    Endif

The correctness of the algorithm follows directly by induction on j: Compute-Opt(j) correctly computes OPT(j) for each j = 1, 2, …, n. But, unfortunately, if we really implemented the algorithm Compute-Opt as just written, it would take exponential time to run in the worst case, because the same subproblems are recomputed over and over. To come up with a better algorithm we apply memoization, a technique that stores values that have already been computed so that each subproblem is solved only once. Below is the new algorithm, which runs in O(n) time in the worst case.

M-Compute-Opt(j)
    If j = 0 then
        Return 0
    Else if M[j] is not empty then
        Return M[j]
    Else
        Define M[j] = max(v<sub>j</sub> + M-Compute-Opt(p(j)), M-Compute-Opt(j - 1))
        Return M[j]
    Endif

So far we have simply computed the value of an optimal solution; but presumably we want a full optimal set of intervals as well. It would be easy to extend M-Compute-Opt so as to keep track of an optimal solution in addition to its value. This extension of the algorithm is shown below.

Find-Solution(j)
    If j = 0 then
        Output nothing
    Else
        If v<sub>j</sub> + M[p(j)] ≥ M[j - 1] then
            Output j together with the result of Find-Solution(p(j))
        Else
            Output the result of Find-Solution(j - 1)
        Endif
    Endif

Since Find-Solution calls itself recursively only on strictly smaller values, it makes a total of O(n) recursive calls; and since it spends constant time per call, Find-Solution returns an optimal solution in O(n) time, given the filled-in array M.
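
A runnable sketch of M-Compute-Opt together with Find-Solution, on a small hypothetical instance (the values v and pointers p below are illustrative; jobs are 1-indexed, with index 0 as a placeholder):

```python
# Hypothetical instance: 5 jobs, labeled by finish time.
v = [0, 2, 4, 4, 7, 2]   # v[j] = value of job j
p = [0, 0, 0, 1, 2, 3]   # p[j] = last job before j compatible with j
n = 5
M = [None] * (n + 1)     # memo table; M[j] holds OPT(j) once computed

def m_compute_opt(j):
    """Memoized recursion: each M[j] is filled in at most once, so O(n) total."""
    if j == 0:
        return 0
    if M[j] is None:
        M[j] = max(v[j] + m_compute_opt(p[j]), m_compute_opt(j - 1))
    return M[j]

def find_solution(j):
    """Trace back through M to recover an optimal set of job indices."""
    if j == 0:
        return []
    if v[j] + m_compute_opt(p[j]) >= m_compute_opt(j - 1):
        return find_solution(p[j]) + [j]
    return find_solution(j - 1)

print(m_compute_opt(n))   # → 11
print(find_solution(n))   # → [2, 4]
```

Note that find_solution only compares values already stored in M; it never recomputes a subproblem, which is why the traceback takes O(n) time.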

Section 6.2: Principles of Dynamic Programming: Memoization or Iteration over Subproblems

Once we have the array M, the problem is solved: M[n] contains the value of the optimal solution on the full instance, and Find-Solution can be used to trace back through M efficiently and return an optimal solution itself. We can directly compute the entries in M by an iterative algorithm, rather than using memoized recursion. This algorithm is below:

Iterative-Compute-Opt
    M[0] = 0
    For j = 1, 2, ..., n
        M[j] = max(v<sub>j</sub> + M[p(j)], M[j - 1])
    Endfor

The running time of Iterative-Compute-Opt is clearly O(n), since it explicitly runs for n iterations and spends constant time in each.
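
The iterative version is a direct transcription of the recurrence into a single loop; a sketch on the same kind of hypothetical instance used above:

```python
def iterative_compute_opt(v, p, n):
    """One pass over j = 1..n, constant work per iteration: O(n) total."""
    M = [0] * (n + 1)
    for j in range(1, n + 1):
        M[j] = max(v[j] + M[p[j]], M[j - 1])
    return M

# Hypothetical instance: values v[j] and pointers p(j), 1-indexed.
M = iterative_compute_opt([0, 2, 4, 4, 7, 2], [0, 0, 0, 1, 2, 3], 5)
print(M[5])  # → 11
```

Because M[p[j]] and M[j - 1] both have indices smaller than j, every entry the loop reads has already been written, so no recursion is needed.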

To set about developing an algorithm based on dynamic programming, one needs a collection of subproblems derived from the original problem that satisfies a few basic properties:

  • There are only a polynomial number of subproblems
  • The solution to the original problem can be easily computed from the solutions to the subproblems
  • There is a natural ordering on subproblems from “smallest” to “largest” together with an easy-to-compute recurrence that allows one to determine the solution to a subproblem from the solutions to some number of smaller subproblems

Section 6.3: Segmented Least Squares: Multi-way Choices

In the problem we consider here, the recurrence will involve what might be called “multi-way choices”: at each step, we have a polynomial number of possibilities to consider for the structure of the optimal solution. The problem involves finding lines of best fit for a collection of points in the plane. Fitting a single line is the classic regression problem from statistics, but in this case the points do not fall along one straight line. Any single line through such a set of points would have a terrible error; if we use two lines, however, we could achieve quite a small error. So we could try formulating a new problem as follows: rather than seek a single line of best fit, we are allowed to pass an arbitrary set of lines through the points, and we seek a set of lines that minimizes the error. To keep this from being trivial, we need a problem formulation that requires us to fit the points well while using as few lines as possible. We now formulate a problem, the Segmented Least Squares Problem, that captures these issues quite cleanly.

Suppose we let OPT(i) denote the optimum solution for the points p<sub>1</sub>, …, p<sub>i</sub>, and we let e<sub>i, j</sub> denote the minimum error of any line with respect to p<sub>i</sub>, p<sub>i + 1</sub>, …, p<sub>j</sub>. Then our observation above says that if the last segment of the optimal partition is p<sub>i</sub>, …, p<sub>n</sub>, then the value of the optimal solution is OPT(n) = e<sub>i, n</sub> + C + OPT(i - 1), where C is the fixed penalty charged for each segment used. The algorithm is as follows:

Segmented-Least-Squares(n)
    Array M[0... n]
    Set M[0] = 0
    For all pairs i < j
        Compute the least squares error e<sub>i, j</sub> for the segment p<sub>i</sub>, ..., p<sub>j</sub>
    Endfor
    For j = 1, 2, ..., n
        Use the recurrence (6.7) to compute M[j]
    Endfor
    Return M[n]
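
A runnable sketch of the whole procedure, assuming the points are given sorted by x-coordinate and C is the per-segment penalty (both the helper name and the sample input below are illustrative). The slope and intercept for each e<sub>i, j</sub> come from the standard closed-form simple-linear-regression formulas:

```python
def segmented_least_squares(points, C):
    """Return the minimum total cost (fit errors plus C per segment)."""
    n = len(points)
    x = [0.0] + [px for px, py in points]   # 1-indexed coordinates
    y = [0.0] + [py for px, py in points]

    # e[i][j] = least-squares error of the best single line through p_i..p_j.
    e = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(i, n + 1):
            m = j - i + 1
            sx, sy = sum(x[i:j + 1]), sum(y[i:j + 1])
            sxx = sum(xk * xk for xk in x[i:j + 1])
            sxy = sum(x[k] * y[k] for k in range(i, j + 1))
            denom = m * sxx - sx * sx
            a = (m * sxy - sx * sy) / denom if denom else 0.0
            b = (sy - a * sx) / m
            e[i][j] = sum((y[k] - a * x[k] - b) ** 2 for k in range(i, j + 1))

    # Recurrence (6.7): M[j] = min over i of e[i][j] + C + M[i - 1].
    M = [0.0] * (n + 1)
    for j in range(1, n + 1):
        M[j] = min(e[i][j] + C + M[i - 1] for i in range(1, j + 1))
    return M[n]

# Six points lying exactly on two lines: y = x, then y = -x + 6.
pts = [(1, 1), (2, 2), (3, 3), (4, 2), (5, 1), (6, 0)]
print(segmented_least_squares(pts, 1.0))
```

On this input the optimum uses two zero-error segments, so the returned cost is just the two segment penalties, 2C.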

The algorithm to compute an optimum partition is as follows:

Find-Segments(j)
    If j = 0 then
        Output nothing
    Else
        Find an i that minimizes e<sub>i, j</sub> + C + M[i - 1]
        Output the segment {p<sub>i</sub>, ..., p<sub>j</sub>} and the result of
            Find-Segments (i - 1)
    Endif

Analyzing The Run Time

There are O(n<sup>2</sup>) pairs (i, j) for which this computation is needed, and for each pair (i, j) we can use the formula given at the beginning of this section to compute e<sub>i, j</sub> in O(n) time. Thus the total running time to compute all e<sub>i, j</sub> values is O(n<sup>3</sup>). The algorithm then has n iterations, for values j = 1, …, n. For each value of j, we have to determine the minimum in the recurrence (6.7) to fill in the array entry M[j]; this takes time O(n) for each j, for a total of O(n<sup>2</sup>). Thus the running time is O(n<sup>2</sup>) once all the e<sub>i, j</sub> values have been determined.
