courses:cs211:winter2018:journals:lesherr:home:chapter6 (current revision 2018/03/26 14:32, lesherr)
====== Chapter 6: Dynamic Programming ======
===== Section 1: Weighted Interval Scheduling - A Recursive Procedure =====
In previous chapters, we discovered that we can use a greedy approach to find an optimal solution to the generic Interval Scheduling Problem. We will now look at the Weighted Interval Scheduling Problem, where each interval is assigned a particular value or weight, and we are looking to maximize the total value scheduled. The original Interval Scheduling Problem is the special case where all intervals have weight 1, and its greedy solution does not carry over to the situation where the weights differ.

A greedy approach to this problem will not produce an optimal solution, which is why we will use dynamic programming, starting with a recursive algorithm. The setup of the original Interval Scheduling Problem carries over to the weighted version: we have a set of n intervals 1, 2, ..., n that each have a start time and a finish time, but now each interval also has a weight associated with it. Two intervals are compatible if their times do not overlap, and the goal is to maximize the total value of a selected set of mutually compatible intervals. To start, we sort the intervals in order of nondecreasing finish time, so that an interval i comes before an interval j if i < j. We then define p(j), for an interval j, to be the largest index i < j such that i and j are compatible; p(j) = 0 if there is no interval i < j that is compatible with j. We then make a simple but key observation about the optimal schedule: for each job, the optimal solution will either include it or not include it. If we choose to include job n, then no job between p(n) and n is included in the optimal solution, because it would overlap job n; thus, the optimal solution must also contain an optimal solution to the subproblem on intervals 1, 2, ..., p(n). If job n is not included, the problem reduces to the subproblem on intervals 1, 2, ..., n-1. Combining these two options, the optimal value on the first j intervals is **OPT(j) = max(vj + OPT(p(j)), OPT(j-1))**. Thus, job j is included in an optimal solution IF AND ONLY IF vj + OPT(p(j)) >= OPT(j-1). This recurrence shows the basic component of dynamic programming: an equation that expresses the optimal solution in terms of the optimal solutions to smaller subproblems.
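The recurrence can be written directly as a recursive procedure. Below is a minimal Python sketch (not the book's pseudocode verbatim), assuming the intervals are already sorted by finish time and the p(j) values have been precomputed; the instance at the bottom is hypothetical, chosen just to exercise the recurrence.

```python
def compute_opt(j, v, p):
    """Naive recursion for OPT(j) = max(v[j] + OPT(p[j]), OPT(j-1)).

    v[1..n] are the interval values and p[1..n] the precomputed p(j)
    indices; v[0] and p[0] are unused sentinels. Without memoization this
    is exponential, like the naive recursion for the Fibonacci numbers.
    """
    if j == 0:
        return 0
    return max(v[j] + compute_opt(p[j], v, p),
               compute_opt(j - 1, v, p))

# Hypothetical instance: 3 intervals with values 2, 4, 4, where only
# intervals 1 and 3 are compatible (p(1)=0, p(2)=0, p(3)=1).
v = [0, 2, 4, 4]
p = [0, 0, 0, 1]
print(compute_opt(3, v, p))  # optimal value: intervals 1 and 3 -> 6
```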
Using naive recursion, the running time would be exponential, similar to the recursive computation of the Fibonacci numbers, because the same subproblems are recomputed many times. However, we can improve it using a technique known as //memoization//: we store already computed values that we know we might need again, so that instead of recomputing them we compute each one only once and then access it in constant time from storage. For our algorithm, we maintain an array M[0...n], where M[j] starts with the value 'empty' but holds the value of Compute-Opt(j) as soon as we have computed it. Using this implementation, the running time of M-Compute-Opt(n) is O(n). At this point, we have only computed the //value// of the optimal solution, not the jobs included in it. To recover the jobs, we look at the values saved in the memoized array to determine which intervals were included. We know that an interval j belongs to an optimal solution if and only if vj + OPT(p(j)) >= OPT(j-1). Using this, we can trace back through the array of values: we check this comparison starting at the nth job, and then, based on whether n was or was not included, we move to p(n) or n-1, respectively, and repeat. Using this algorithm, Find-Solution, we can return the optimal solution in O(n) time. This algorithm was easy to follow after having discussed it in class; I would give it a 9/10.
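The memoized version and the traceback can be sketched as follows; again this is a Python sketch under the same assumptions (sorted intervals, precomputed p), and the example instance is hypothetical.

```python
def m_compute_opt(j, v, p, M):
    """Memoized Compute-Opt: each entry M[j] is filled at most once,
    so the total running time is O(n)."""
    if M[j] is None:
        if j == 0:
            M[j] = 0
        else:
            M[j] = max(v[j] + m_compute_opt(p[j], v, p, M),
                       m_compute_opt(j - 1, v, p, M))
    return M[j]

def find_solution(j, v, p, M):
    """Trace back through M: interval j is in an optimal solution
    iff v[j] + M[p[j]] >= M[j-1]. Runs in O(n)."""
    if j == 0:
        return []
    if v[j] + M[p[j]] >= M[j - 1]:
        return find_solution(p[j], v, p, M) + [j]
    return find_solution(j - 1, v, p, M)

# Hypothetical instance: values 2, 4, 4 with p(1)=0, p(2)=0, p(3)=1.
v = [0, 2, 4, 4]
p = [0, 0, 0, 1]
M = [None] * 4
print(m_compute_opt(3, v, p, M))   # 6
print(find_solution(3, v, p, M))   # [1, 3]
```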
===== Section 2: Principles of Dynamic Programming - Memoization or Iteration over Subproblems =====
Having solved the Weighted Interval Scheduling Problem using a recursive approach, we will now look at it from a different perspective, using an iterative approach. The recursive approach returns an optimal solution in polynomial time. In our iterative approach, we again use an array M to keep track of the values, and when computing an entry we can look at the values that come earlier in the array. Once we have the full array M, we have the value of the solution, and can run Find-Solution to return the actual optimal solution itself. We can compute the entries of M directly through an iterative for loop, instead of using memoized recursion: we start with M[0] = 0 and increment j, at each step determining the optimal value M[j] from the recurrence. The running time of this algorithm is O(n), since we are just running through a single for loop from 1 to n. Thus, we now have two different ways to solve the Weighted Interval Scheduling Problem: a recursive approach and an iterative one. These approaches help us form a pattern for developing an algorithm based on dynamic programming: find a collection of subproblems derived from the original problem that satisfies a few basic properties. These properties are: 1) there are only a polynomial number of subproblems; 2) the solution to the original problem can be easily computed from the solutions to the subproblems; 3) there is a natural ordering on the subproblems from smallest to largest, together with an easy-to-compute recurrence that allows one to determine the solution to a subproblem from the solutions to some number of smaller subproblems. Sometimes it is easier to first define a recurrence by reasoning about the structure of an optimal solution, and sometimes it is easier to go the other way around; it depends on the problem. This section was easy to read after having discussed it well in class. I would give it a 9/10.
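The iterative version amounts to filling the array M bottom-up with a single loop. A minimal Python sketch, assuming (as before) sorted intervals and precomputed p values; the instance is hypothetical:

```python
def iterative_compute_opt(v, p):
    """Bottom-up table fill: M[j] = max(v[j] + M[p[j]], M[j-1]),
    starting from M[0] = 0. A single for loop, so O(n)."""
    n = len(v) - 1
    M = [0] * (n + 1)
    for j in range(1, n + 1):
        M[j] = max(v[j] + M[p[j]], M[j - 1])
    return M

print(iterative_compute_opt([0, 2, 4, 4], [0, 0, 0, 1]))  # [0, 2, 4, 6]
```

Each M[j] only consults entries that come earlier in the array, which is exactly why the loop order from 1 to n works.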
===== Section 3: Segmented Least Squares - Multi-way Choices =====
In the previous section, we looked at a recurrence based on a fundamentally binary choice: either include an option or don't include it. The problem in this section involves //multi-way// choices: at each step, we have a polynomial number of possibilities to consider for the structure of the optimal solution. The basis of this problem lies in trying to find the line of best fit for a set of statistical data points, minimizing the 'error' of the fit, which is calculated as the sum of the squared distances between the line and the points. Using calculus, we can derive a closed-form solution for the line minimizing this error. Usually we fit a single line to the points. But what if we wanted to use two lines to reduce the error further? We could continue adding lines until we had n total lines and an error of 0; however, that solution is trivial. We could also insist on using ONLY two lines, but this hard-codes a choice that should depend on the data. What we want is to adapt dynamically, finding the best number of lines to fit the data while keeping the error small. This problem is called the Segmented Least Squares Problem. It is a fundamental instance of an issue in statistics known as //change detection//: given a sequence of data points, we want to identify a few points in the sequence at which a discrete change occurs, or in this case, a change between linear approximations. To formulate the problem, we are given a set of points P. We first partition P into a number of segments, where each segment is a subset of P that represents a contiguous set of x-coordinates. Then, for each segment S, we compute the line minimizing the error with respect to the points in S.
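The closed-form solution mentioned above can be sketched in Python. These are the standard least-squares formulas for the slope and intercept of the best-fit line; the example points are hypothetical.

```python
def best_fit_line(points):
    """Closed-form least-squares line y = a*x + b for a list of (x, y)
    points, plus the resulting sum of squared errors."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    denom = n * sxx - sx * sx          # zero only if all x's coincide
    a = (n * sxy - sx * sy) / denom if denom != 0 else 0.0
    b = (sy - a * sx) / n
    error = sum((y - a * x - b) ** 2 for x, y in points)
    return a, b, error

# Collinear points are fit exactly: slope 2, intercept 0, error 0.
print(best_fit_line([(1, 2), (2, 4), (3, 6)]))  # (2.0, 0.0, 0.0)
```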
The penalty of a partition is defined as the sum of the following terms: 1) the number of segments into which we partition P, times a fixed, given multiplier C > 0; 2) for each segment, the error value of the optimal line through that segment. The goal is to find a partition with the minimum penalty; the multiplier C determines how strongly we are penalized for using additional lines. In designing the algorithm, we want a polynomial number of subproblems, which we can build up using a recurrence. The last point pn belongs to a single segment in the optimal partition, which begins at some earlier point pi. If we knew the identity of this last segment, we could remove those points from consideration and look at the smaller problem: if the last segment of the optimal partition is pi, ..., pn, then the value of the optimal solution is OPT(n) = ei,n + C + OPT(i-1). Thus, we have the following recurrence: **For the subproblem on the points p1, ..., pj, OPT(j) = min[1<=i<=j](ei,j + C + OPT(i-1)), and the segment pi, ..., pj is used in an optimum solution for the subproblem IF AND ONLY IF the minimum is obtained using index i.** As with the Weighted Interval Scheduling Problem, the correctness of the algorithm can be proved using induction. Additionally, we can trace back through the array M to compute an optimum partition. Computing the values ei,j for all pairs (i, j) takes O(n^3), since there are O(n^2) pairs and computing each error takes O(n). Once all the error values have been calculated, it takes O(n) to compute each OPT(j) (a minimum over at most j choices), so a total run time of O(n^2) once the ei,j values have been determined. This section was harder to follow, but the pictures made it easier to understand. The running time of the algorithm wasn't well explained in class or in the book. I would give this section a 5/10 overall.
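The whole algorithm, precomputing the errors ei,j and then filling the OPT array with the multi-way recurrence, can be sketched in Python. This is a sketch under the section's assumptions (points sorted by x-coordinate, penalty C per segment), with a small least-squares helper inlined; the test points are hypothetical.

```python
def line_error(points):
    """Squared error of the best-fit line through a list of (x, y) points,
    via the standard closed-form least-squares formulas."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    denom = n * sxx - sx * sx
    a = (n * sxy - sx * sy) / denom if denom != 0 else 0.0
    b = (sy - a * sx) / n
    return sum((y - a * x - b) ** 2 for x, y in points)

def segmented_least_squares(points, C):
    """OPT(j) = min over 1<=i<=j of e(i,j) + C + OPT(i-1); returns OPT(n).

    points must be sorted by x-coordinate. Precomputing all e(i,j) takes
    O(n^3); filling the OPT array afterwards takes O(n^2).
    """
    n = len(points)
    e = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(i, n + 1):
            e[i][j] = line_error(points[i - 1:j])   # points i..j, 1-indexed
    OPT = [0.0] * (n + 1)
    for j in range(1, n + 1):
        OPT[j] = min(e[i][j] + C + OPT[i - 1] for i in range(1, j + 1))
    return OPT[n]

# Collinear points fit a single line exactly, so the minimum penalty is
# one segment's cost C plus zero error:
print(segmented_least_squares([(1, 2), (2, 4), (3, 6), (4, 8)], 1.0))  # 1.0
```

A traceback analogous to Find-Solution (remembering which index i achieved each minimum) would recover the actual partition.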