Differences

courses:cs211:winter2018:journals:boyese:chapter6 [2018/03/27 22:59] – [Section 6.3: Segmented Least Squares: Multi-way Choices] boyese
courses:cs211:winter2018:journals:boyese:chapter6 [2018/03/27 23:00] (current) – [Section 6.3: Segmented Least Squares: Multi-way Choices] boyese
Line 74: Line 74:
 In the problem we consider here, the recurrence will involve what might be called "multi-way choices": at each step, we have a polynomial number of possibilities to consider for the structure of the optimal solution. This problem involves finding a line of best fit for a collection of points on a graph. This process is used frequently in statistics and called regression, but in this case the points do not fall along a straight line. Any single line through the points on a polynomial graph would have a terrible error; but if we use two lines, we could achieve quite a small error. So we could try formulating a new problem as follows: Rather than seek a single line of best fit, we are allowed to pass an arbitrary set of lines through the points, and we seek a set of lines that minimizes the error. We need a problem formulation that requires us to fit the points well, using as few lines as possible. We now formulate a problem--the Segmented Least Squares Problem--that captures these issues quite cleanly.
  
-Suppose we let OPT(i) denote the optimum solution for the points p<sub>1</sub>, ..., p<sub>i</sub> and we let e<sub>i, j</sub> denote the minimum error of any line with respect to p<sub>i</sub>, p<sub>i + 1</sub>, ..., <sub>j</sub>. Then our observation above says that if  the last segment of the optimal partition is p<sub>i</sub>, ..., p<sub>n</sub>, then the value of the optimal solution is OPT(n) = e<sub>i, n</sub> + C + OPT(i - 1). The algorithm is as follows:
+Suppose we let OPT(i) denote the optimum solution for the points p<sub>1</sub>, ..., p<sub>i</sub> and we let e<sub>i, j</sub> denote the minimum error of any line with respect to p<sub>i</sub>, p<sub>i + 1</sub>, ..., p<sub>j</sub>. Then our observation above says that if  the last segment of the optimal partition is p<sub>i</sub>, ..., p<sub>n</sub>, then the value of the optimal solution is OPT(n) = e<sub>i, n</sub> + C + OPT(i - 1). The algorithm is as follows:
  
 <code> <code>
Line 102: Line 102:
  
 **Analyzing The Run Time**
+
 There are O(n<sup>2</sup>) pairs (i, j) for which the error e<sub>i, j</sub> must be computed, and for each pair (i, j) we can use the formula given at the beginning of this section to compute e<sub>i, j</sub> in O(n) time. Thus the total running time to compute all e<sub>i, j</sub> values is O(n<sup>3</sup>). The algorithm has n iterations, for values j = 1, ..., n. For each value of j, we have to determine the minimum in the recurrence (6.7) to fill in the array entry M[j]; this takes time O(n) for each j, for a total of O(n<sup>2</sup>). Thus the running time is O(n<sup>2</sup>) once all the e<sub>i, j</sub> values have been determined.
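The two phases counted above can be sketched in Python. This is a minimal illustration, not the textbook's pseudocode: the function names `segment_errors` and `segmented_least_squares`, the 1-indexed `(i, j)` segment tuples, and the fallback to a horizontal line when all x-values in a segment coincide are assumptions made for this example. Each e<sub>i, j</sub> is computed with the standard closed-form least-squares line, and M[j] is filled in using OPT(j) = min over 1 <= i <= j of (e<sub>i, j</sub> + C + OPT(i - 1)).

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def segment_errors(pts: Sequence[Point]) -> List[List[float]]:
    """Return e where e[i][j] is the least-squares error of the best single
    line through pts[i..j] (0-indexed, inclusive). O(n^3) overall: O(n^2)
    pairs, O(n) work per pair."""
    n = len(pts)
    e = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            seg = pts[i:j + 1]
            m = len(seg)
            sx = sum(x for x, _ in seg)
            sy = sum(y for _, y in seg)
            sxx = sum(x * x for x, _ in seg)
            sxy = sum(x * y for x, y in seg)
            denom = m * sxx - sx * sx
            # Assumption for this sketch: if every x in the segment is the
            # same, fall back to a horizontal line through the mean of y.
            a = (m * sxy - sx * sy) / denom if denom != 0 else 0.0
            b = (sy - a * sx) / m
            e[i][j] = sum((y - a * x - b) ** 2 for x, y in seg)
    return e


def segmented_least_squares(
    points: Sequence[Point], C: float
) -> Tuple[float, List[Tuple[int, int]]]:
    """Return (optimal cost, list of 1-indexed (i, j) segments)."""
    pts = sorted(points)          # order the points by x-coordinate
    n = len(pts)
    e = segment_errors(pts)

    M = [0.0] * (n + 1)           # M[j] = OPT(j), with M[0] = 0
    first = [0] * (n + 1)         # first[j] = start index i of the last segment
    for j in range(1, n + 1):     # n iterations, O(n) work each
        best_i = min(
            range(1, j + 1),
            key=lambda i: e[i - 1][j - 1] + C + M[i - 1],
        )
        M[j] = e[best_i - 1][j - 1] + C + M[best_i - 1]
        first[j] = best_i

    # Trace back through the stored choices to recover the segments.
    segments = []
    j = n
    while j > 0:
        i = first[j]
        segments.append((i, j))
        j = i - 1
    segments.reverse()
    return M[n], segments
```

For example, on the six points (1,1), (2,2), (3,3), (4,3), (5,2), (6,1) with C = 0.5, the sketch splits the points into the segments (1, 3) and (4, 6); with a much larger penalty such as C = 10, a single segment (1, 6) becomes cheaper.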
courses/cs211/winter2018/journals/boyese/chapter6.1522191548.txt.gz · Last modified: by boyese
CC Attribution-Noncommercial-Share Alike 4.0 International