6.1 Weighted Interval Scheduling
Summary
Designing a Recursive Algorithm
- No natural greedy algorithm is known for this problem
- n requests labeled 1…n with each request i specifying a start time si and finish time fi; each interval i also has a value (or weight) vi; two intervals are compatible if they don’t overlap
- goal: select a subset of mutually compatible intervals to maximize the sum of the values of the selected intervals
- p(j): for an interval j, it is the largest index i<j such that intervals i and j are disjoint (i is the rightmost interval that ends before j begins); define p(j) = 0 if no such interval exists
- Optimal solution Oj on the set {1, …, j}: either j is in Oj, in which case OPT(j) = vj + OPT(p(j)), or j is not in Oj, in which case OPT(j) = OPT(j-1)
- Therefore, OPT(j) = max(vj+OPT(p(j)), OPT(j-1))
- Request j belongs to an optimal solution on the set {1, 2, …, j} if and only if vj+OPT(p(j)) >= OPT(j-1), i.e., if and only if the first of the two options above is at least as good as the second
- Recursive algorithm to compute OPT(n), assuming we have already sorted the requests by finish time and computed the values of p(j) for each j
Compute Opt Algorithm
Compute-Opt(j):
If j=0:
Return 0
Else:
Return max(vj + Compute-Opt(p(j)), Compute-Opt(j-1))
Endif
- Compute-Opt(j) correctly computes OPT(j) for each j = 1, 2, …, n
- The above algorithm for Compute-Opt as written would take exponential time in the worst case
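A minimal Python sketch of the recursion above (not part of the original notes), assuming the requests are already sorted by finish time and the p(j) values precomputed; the arrays v and p are illustrative, with index 0 unused so requests are numbered 1…n:

def compute_opt(j, v, p):
    """Return OPT(j), the optimal total value on requests 1..j."""
    if j == 0:
        return 0
    # Either request j is in an optimal solution, or it is not
    return max(v[j] + compute_opt(p[j], v, p),
               compute_opt(j - 1, v, p))

v = [0, 2, 4, 4, 7, 2, 1]  # v[0] is an unused placeholder
p = [0, 0, 0, 1, 0, 3, 3]
print(compute_opt(6, v, p))  # 8, but exponentially many calls in the worst case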
Memoizing the Recursion
- The algorithm as written takes exponential time because of redundancy: it issues many duplicate recursive calls, solving the same subproblems over and over
- Solution: store the value of Compute-Opt in a globally accessible place the first time we compute it and then simply use this precomputed value in place of all future recursive calls
- Memoization: the technique of saving values that have already been computed
- Memoized Compute-Opt would make use of an array M[0…n]; M[j] will start with the value “empty” but will hold the value of Compute-Opt(j) as soon as it is first determined
Compute Opt with Memoization
M-Compute-Opt(j):
If j=0:
Return 0
Elif M[j] is not empty:
Return M[j]
Else:
Define M[j] = max(vj + M-Compute-Opt(p(j)), M-Compute-Opt(j-1))
Return M[j]
Endif
Analyzing the Memoized Version
- The runtime of M-Compute-Opt(n) is O(n) (assuming the input intervals are sorted by their finish time)
- Since M has only n+1 entries, and each call to M-Compute-Opt either returns a stored value immediately or fills in a new entry of M before issuing two further calls, there can be at most O(n) calls to M-Compute-Opt
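The memoized version in Python, using a dictionary for M (an implementation choice) and the same illustrative v and p arrays:

def m_compute_opt(j, v, p, M):
    """Memoized OPT(j): each subproblem is solved at most once."""
    if j == 0:
        return 0
    if j not in M:
        M[j] = max(v[j] + m_compute_opt(p[j], v, p, M),
                   m_compute_opt(j - 1, v, p, M))
    return M[j]

v = [0, 2, 4, 4, 7, 2, 1]
p = [0, 0, 0, 1, 0, 3, 3]
M = {0: 0}  # seed M[0] so the traceback sketch below can read it
print(m_compute_opt(6, v, p, M))  # same optimum, with only O(n) calls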
Computing a Solution in Addition to Its Value
- Extend the memoized solution to keep track of an optimal solution in addition to its value: we could maintain an additional array S so that S[i] contains an optimal set of intervals among {1, 2, …, i}
- Avoid runtime blow-up by not explicitly computing S but rather by recovering the optimal solution from values saved in the array M after the optimum value has been computed
Retroactively Computing Optimal Solution
Find-Solution(j):
If j=0:
Output nothing
Else:
If vj+M[p(j)] >= M[j-1]:
Output j together with the result of Find-Solution(p(j))
Else:
Output the result of Find-Solution(j-1)
Endif
Endif
- Total of O(n) recursive calls
- Given the array M of the optimal values of the sub-problems, Find-Solution returns an optimal solution in O(n) time
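A Python sketch of Find-Solution, assuming v, p, and the filled table M from the memoized sketch above (with M[0] = 0 present):

def find_solution(j, v, p, M):
    """Trace back through M to recover one optimal set of requests."""
    if j == 0:
        return []
    if v[j] + M[p[j]] >= M[j - 1]:
        # Request j belongs to an optimal solution
        return find_solution(p[j], v, p, M) + [j]
    return find_solution(j - 1, v, p, M)

print(find_solution(6, v, p, M))  # [1, 3, 5] for the illustrative arrays above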
Additional Information
6.2 Principles of Dynamic Programming
Summary
Designing the Algorithm
- Basic principles of dynamic programming: iterating over subproblems rather than computing solutions recursively
- Can directly compute the entries in M by an iterative algorithm rather than memoized recursion
Weighted Intervals Iterative Solution
Iterative-Compute-Opt:
M[0] = 0
For j = 1, 2, …, n:
M[j] = max(vj+M[p(j)], M[j-1])
Endfor
Analyzing the Algorithm
- Iterative-Compute-Opt is O(n) since it explicitly runs for n iterations and spends constant time in each
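The iterative version translates directly into Python; this sketch returns the whole array M so that Find-Solution can use it:

def iterative_compute_opt(n, v, p):
    """Fill M[0..n] bottom-up; M[n] is the optimal total value."""
    M = [0] * (n + 1)
    for j in range(1, n + 1):
        M[j] = max(v[j] + M[p[j]], M[j - 1])
    return M

v = [0, 2, 4, 4, 7, 2, 1]
p = [0, 0, 0, 1, 0, 3, 3]
print(iterative_compute_opt(6, v, p)[6])  # same optimum as the recursive versions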
A Basic Outline of Dynamic Programming
- To set about developing an algorithm based on dynamic programming, one needs a collection of subproblems derived from the original problem that satisfies a few basic properties
- Polynomial number of subproblems
- Solution to original problem easily computed from the solutions to subproblems
- Natural ordering on subproblems from “smallest” to “largest”, together with an easy-to-compute recurrence that allows one to determine the solution to a subproblem from the solutions to some number of smaller subproblems
- Never clear that a collection of subproblems will be useful until one finds a recurrence linking them together; but it can be difficult to think about recurrences in the absence of the “smallest” subproblems that they build on
Additional Information
6.3 Segmented Least Squares
The Problem
- Suppose our data consists of a set P of n points in the plane denoted (x1, y1) … (xn, yn); suppose x1<x2<…<xn
- Given a line L defined by the equation y = ax + b, we say that the error of L with respect to P is the sum of its squared “distances” to the points in P
- A natural goal is to find the line with minimum error
- Rather than seeking a single line of best fit, we are allowed to pass an arbitrary set of lines through the points and seek a set of lines that minimizes the error
- Need a problem formulation that requires us to fit the points well, using as few lines as possible
- This is called the Segmented Least Squares Problem
- Change detection: given a sequence of data points, we want to identify a few points in the sequence at which a discrete change occurs
- Must first partition P into some number of segments (a segment is a subset of P that represents a contiguous set of x-coordinates)
- For each segment S in P, we compute the line minimizing the error with respect to the points in S (a sketch of this single-segment computation appears after this list)
- Penalty of a partition is defined to be the sum of the following terms:
- Number of segments into which we partition P, times a fixed, given multiplier c>0
- For each segment, the error value of the optimal line through that segment
- Goal: find partition of minimum penalty
- Exponentially many possible partitions of P
- Dynamic programming can be used to find a partition of minimum penalty in time polynomial in n
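A Python sketch of the single-segment computation referenced in the list above: fit the least-squares line y = ax + b through a contiguous range of points and return its error. It uses the standard closed-form solution for a and b; the function name and point representation are assumptions, not from the original notes:

def segment_error(points, i, j):
    """Sum of squared residuals of the best-fit line through points[i..j] (0-indexed)."""
    pts = points[i:j + 1]
    n = len(pts)
    sx = sum(x for x, _ in pts)
    sy = sum(y for _, y in pts)
    sxx = sum(x * x for x, _ in pts)
    sxy = sum(x * y for x, y in pts)
    denom = n * sxx - sx * sx
    if denom == 0:  # a single point: some line passes through it exactly
        return 0.0
    a = (n * sxy - sx * sy) / denom  # slope of the best-fit line
    b = (sy - a * sx) / n            # intercept
    return sum((y - a * x - b) ** 2 for x, y in pts)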
Designing the Algorithm
- Last point pn belongs to a single segment in the optimal partition and that segment begins at some earlier point pi
- If we knew the identity of the last segment pi, …, pn, then we could remove those points from consideration and recursively solve the problem on the remaining points p1, …, pi-1
- OPT(i) denotes the value of the optimum solution for the points p1, …, pi, and ei,j denotes the minimum error of any line with respect to pi, pi+1, …, pj
- If the last segment of the optimal partition is pi, …, pn, then the value of the optimal solution is OPT(n) = ei,n + c + OPT(i-1)
- For the subproblem on the points p1, …, pj, OPT(j) = min over 1<=i<=j of (ei,j + c + OPT(i-1)), and the segment pi, …, pj is used in an optimal solution for the subproblem if and only if the minimum is obtained using index i
Segmented Least Squares Algorithm
Segmented-Least-Squares(n):
Array M[0..n]
Set M[0] = 0
For all pairs i<=j:
Compute the least squares error ei,j for the segment pi,…,pj
Endfor
For j=1, 2, …, n:
Use the recurrence M[j] = min over 1<=i<=j of (ei,j + c + M[i-1]) to compute M[j]
Endfor
Return M[n]
- Trace back through array M to find the optimum partition
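A Python sketch of the algorithm above, built on the segment_error helper from the previous sketch; the points list is 0-indexed while M follows the 1-indexed recurrence:

def segmented_least_squares(points, c):
    """Return (M, err): optimal penalties M[0..n] and per-segment errors."""
    n = len(points)
    # err[i][j] = least-squares error of one segment over points i..j (0-indexed)
    err = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            err[i][j] = segment_error(points, i, j)
    M = [0.0] * (n + 1)
    for j in range(1, n + 1):
        # i ranges over possible starts of the last segment (1-indexed)
        M[j] = min(err[i - 1][j - 1] + c + M[i - 1] for i in range(1, j + 1))
    return M, err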
Retroactively Compute the Optimal Solution
Find-Segments(j):
If j=0:
Output nothing
Else:
Find an i that minimizes ei,j + c + M[i-1]
Output the segment {pi,…,pj} and the result of Find-Segments(i-1)
Endif
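A matching Python sketch of the traceback, assuming M and err come from the segmented_least_squares sketch above:

def find_segments(j, M, err, c):
    """Return an optimal partition of points 1..j as a list of (i, j) pairs."""
    if j == 0:
        return []
    # Recompute an index i that achieves the minimum for M[j]
    i = min(range(1, j + 1), key=lambda i: err[i - 1][j - 1] + c + M[i - 1])
    return find_segments(i - 1, M, err, c) + [(i, j)]

# Usage: M, err = segmented_least_squares(points, c)
#        find_segments(len(points), M, err, c)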
Analyzing the Algorithm
- Segmented-Least-Squares: there are O(n^2) pairs (i, j) for which this computation is needed, and for each pair (i, j) we can compute ei,j in O(n) time
- Total runtime to compute all ei,j values is therefore O(n^3), though it is possible to do this in O(n^2) time
- The algorithm then has n iterations; determining the minimum in the recurrence OPT(j) = min over 1<=i<=j of (ei,j + c + OPT(i-1)) takes O(n) time per iteration
- Total runtime of O(n^2) for this step
- Therefore, the algorithm has an overall runtime of O(n^2) once the ei,j values have been determined
