The weighted interval scheduling problem is a generalization of the interval scheduling problem in which each interval has a value and we try to maximize the total value of the set we choose. To build this maximum-value set, we first design a recursive solution. We sort the intervals in order of nondecreasing finish time and define p(j) to be the largest index i < j such that interval i is compatible with interval j. For each interval there are two choices: it is either in the optimal solution or it is not. If interval j is in the optimal solution, then no interval with an index between p(j) and j is included, and the rest of the solution is the optimal solution on intervals 1 through p(j). If interval j is not in the optimal solution, then the optimal solution is simply the optimal solution on intervals 1 through j - 1. To find the optimal value, we take the max of these two options (page 254 of the text). However, this plain recursion runs in O(2^n) time, which is not efficient, so we memoize it. Memoizing is the process of saving values that we have already computed in an array; instead of recomputing optimal values over and over, we can just look them up. The algorithm with the memoized array (page 256 in the text) runs in O(n) time if the input is already sorted by finish times. These algorithms only compute the numerical value of the solution, but we want the set of intervals as well. To get it, after computing the optimal value we trace back through the array and recover the intervals that produced it. This algorithm (on page 258 of the text) also runs in O(n) time.
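To make the memoization step concrete for myself, here is a rough Python sketch (my own code and names, not the book's pseudocode) of the memoized recursion. Intervals are (start, finish, value) tuples assumed sorted by finish time, and M caches the answer to each subproblem so it is computed only once:

```python
import bisect

def compute_p(intervals):
    """p[j] = largest index i < j (1-based, 0 if none) such that
    interval i is compatible with interval j."""
    finishes = [f for (_, f, _) in intervals]
    p = [0] * (len(intervals) + 1)
    for j, (s, _, _) in enumerate(intervals, start=1):
        # number of intervals finishing at or before the start of interval j
        p[j] = bisect.bisect_right(finishes, s)
    return p

def m_compute_opt(j, values, p, M):
    """Memoized recursion: each M[j] is computed at most once,
    so the total work over all calls is O(n)."""
    if M[j] is None:
        # either take interval j (and jump to p[j]) or skip it
        M[j] = max(values[j - 1] + m_compute_opt(p[j], values, p, M),
                   m_compute_opt(j - 1, values, p, M))
    return M[j]

intervals = [(0, 3, 2), (1, 5, 4), (4, 6, 4), (5, 8, 7)]  # sorted by finish
values = [v for (_, _, v) in intervals]
p = compute_p(intervals)
M = [None] * (len(intervals) + 1)
M[0] = 0  # base case: the empty set of intervals has value 0
print(m_compute_opt(len(intervals), values, p, M))  # -> 11 (intervals 2 and 4)
```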
I'm still a little confused about how exactly to memoize an algorithm and how to recover the solution once we have the value. How is Find-Solution able to find the intervals in O(n) time?
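My best attempt at answering my own question: Find-Solution never recomputes anything. The table M is already filled in, so at each index j it does one O(1) comparison to decide whether interval j was taken, then jumps to either p[j] or j - 1. Since j strictly decreases at every step, there are at most n steps. A rough sketch, reusing values, p, and the filled-in M from the code above:

```python
def find_solution(j, values, p, M):
    """Recover one optimal set of interval indices (1-based) by
    retracing the max-choices already stored in M; O(n) total."""
    chosen = []
    while j > 0:
        if values[j - 1] + M[p[j]] >= M[j - 1]:
            chosen.append(j)   # interval j was part of the optimum
            j = p[j]           # skip everything it conflicts with
        else:
            j = j - 1          # interval j was not taken
    chosen.reverse()
    return chosen
```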
This section was wordy, but our discussion in class helped me understand how the algorithm works. I would give it an 8/10.
This section breaks down the basic principles of dynamic programming after going through the example in Section 1. First, in order to compute the value of the optimal solution, we decide for each subproblem whether its last element is in the optimal solution or not. Then we store these values in a memoized array, which lets the weighted interval scheduling algorithm run in O(n) time. The basic idea of dynamic programming is to divide the problem into subproblems such that there are only a polynomial number of them, the solution to the original problem can be computed from them, and they have a natural ordering from smallest to largest. Often, when creating dynamic programming algorithms, we first define a recursive solution, then unwind the recurrence into an iterative solution that runs in less time, as in the sketch below.
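To see what "unwinding the recurrence" looks like, here is a tiny illustration of the pattern using the Fibonacci numbers (my own example, not one from the text). The two functions compute the same recurrence, but the second fills in a table of subproblem answers from smallest to largest so nothing is ever recomputed:

```python
def fib_recursive(n):
    """Plain recursion: exponential time, because the same
    subproblems are recomputed over and over."""
    if n <= 1:
        return n
    return fib_recursive(n - 1) + fib_recursive(n - 2)

def fib_iterative(n):
    """The same recurrence unwound into a table, filled from the
    smallest subproblem to the largest: O(n) time."""
    table = [0] * (n + 2)  # n + 2 avoids an edge case when n = 0
    table[1] = 1
    for i in range(2, n + 1):
        table[i] = table[i - 1] + table[i - 2]
    return table[n]
```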
The motivation behind dynamic programming is to solve problems that have no natural greedy solution and that cannot easily be solved by divide and conquer algorithms.
This section helped to clarify some of my questions about the first problem we solved. It was short and provided a solid map for other types of dynamic programming problems. I would give this section an 8/10.
This section discusses the problem of finding lines of best fit to represent a set of points. Often, when a set of points is not linear, it cannot be represented by one line without a large amount of error, so we use a dynamic programming solution to figure out the optimal number of line segments as well as which points belong to each segment. We measure the error of a segment by the sum of squared distances from its points to its best-fit line, and we want to identify where there are discrete changes in the points (i.e., where a new segment should begin). We define the penalty of a partition to be Lc + E, where L is the number of segments, c is the cost of a segment, and E is the total error of the segments; we want to find a partition of minimum penalty. Starting at the last point, we know that this point belongs to exactly one segment in the optimal solution and that this segment starts at some earlier point. We then develop the algorithm on page 265 of the text: it computes the least-squares errors of consecutive runs of points and, for each prefix of the points, chooses the split that minimizes the total penalty. Then, to find the actual partition, we trace back and output the segments that achieve the minimum. This algorithm runs in O(n^3) time, as there are O(n^2) pairs of points and O(n) time is needed to compute each error value.
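Here is a rough Python sketch of how I understand the algorithm (my own code and names; the closed-form slope and intercept formulas are the standard least-squares ones):

```python
def segment_error(points, i, j):
    """Sum of squared vertical distances from points[i..j] (0-based,
    inclusive) to their least-squares best-fit line."""
    pts = points[i:j + 1]
    n = len(pts)
    sx = sum(x for x, _ in pts)
    sy = sum(y for _, y in pts)
    sxx = sum(x * x for x, _ in pts)
    sxy = sum(x * y for x, y in pts)
    denom = n * sxx - sx * sx
    if denom == 0:  # all x equal (or a single point): fall back to mean y
        return sum((y - sy / n) ** 2 for _, y in pts)
    a = (n * sxy - sx * sy) / denom  # slope
    b = (sy - a * sx) / n            # intercept
    return sum((y - (a * x + b)) ** 2 for x, y in pts)

def segmented_least_squares(points, c):
    """OPT[j] = min over i of segment_error(i, j) + c + OPT[i - 1].
    Returns the minimum penalty and the segments as 1-based (i, j) pairs.
    O(n^2) pairs, O(n) per error computation -> O(n^3) overall."""
    points = sorted(points)  # order by x-coordinate
    n = len(points)
    OPT = [0.0] * (n + 1)
    cut = [0] * (n + 1)      # cut[j] = where the last segment starts
    for j in range(1, n + 1):
        best = float("inf")
        for i in range(1, j + 1):
            cost = segment_error(points, i - 1, j - 1) + c + OPT[i - 1]
            if cost < best:
                best, cut[j] = cost, i
        OPT[j] = best
    segments, j = [], n      # trace back to recover the partition
    while j > 0:
        segments.append((cut[j], j))
        j = cut[j] - 1
    segments.reverse()
    return OPT[n], segments
```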
This is important in modeling data sets.
I would give this section a 7/10, as it was a little confusing at first to understand what kind of subproblem we needed to divide this problem into.
This section discusses the problem where we have a container with a fixed capacity and a number of items with fixed weights and values, and we want to fill the container with the items that maximize its total value. This problem cannot be solved with the “obvious” set of subproblems, so we create more subproblems: we build a table with increasing capacity across the top and an increasing number of items down the side. This way we are able to compute the maximum value at any given capacity, and then, given a capacity, we can determine which items give us that maximum value. The algorithm, given on page 269 of the text, outlines how to generate this table and fill it in. It runs in O(nW) time, which is pseudo-polynomial: the running time depends not just on the input size n but also on the weight capacity W, since we iterate through all possible capacities. A common version of this problem is the knapsack problem, which we discussed in class. I thought some of the discussion in class ended up being more confusing than helpful, as I think some students were getting themselves lost. However, the book helped make everything very clear, as did running through the complete example in class.
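To check my understanding, here is a rough Python sketch of the table-filling algorithm plus a traceback to recover the chosen items (my own code and names, not the book's pseudocode):

```python
def knapsack(weights, values, W):
    """OPT[i][w] = max value achievable using the first i items
    with capacity w. O(nW) time -- pseudo-polynomial, since W is a
    number in the input rather than the input's size."""
    n = len(weights)
    OPT = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for w in range(W + 1):
            if weights[i - 1] > w:      # item i doesn't fit
                OPT[i][w] = OPT[i - 1][w]
            else:                       # skip item i, or take it
                OPT[i][w] = max(OPT[i - 1][w],
                                values[i - 1] + OPT[i - 1][w - weights[i - 1]])
    chosen, w = [], W                   # trace back through the table
    for i in range(n, 0, -1):
        if OPT[i][w] != OPT[i - 1][w]:  # value changed, so item i was taken
            chosen.append(i - 1)        # 0-based item index
            w -= weights[i - 1]
    return OPT[n][W], sorted(chosen)

print(knapsack([2, 3, 4], [3, 4, 5], 5))  # -> (7, [0, 1])
```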
I would give this section an 8/10 because I think this is a cool problem, and I can think of real-world applications for the algorithm, which makes working through it that much easier!