Differences

This shows you the differences between two versions of the page.

--- courses:cs211:winter2018:journals:beckg:ch6 [2018/03/26 21:20] – created beckg
+++ courses:cs211:winter2018:journals:beckg:ch6 [2018/03/27 03:50] (current) – beckg
@@ Line 64: / Line 64: @@
 ===== 6.3: Segmented Least Squares: Multi-way Choices =====
+For this next problem, we introduce another variable to the decision making, rather than a simple binary "yes" or "no" a given interval is in the optimal solution.
+==== The Problem ====
+Essentially, we have the equation for a best fit line of a given set of points in an x-y plane. However, there are many groupings of points for which using one, two, three, or any number of lines for groupings of contiguous segments of those points that lead to a //far// better fit. Wanting to minimize the numbers of partitions we divide the full set into, we define a //penalty// of a partition to be the sum of
+  * the number of segments into which we partition //P// (the full set of points) multiplied by a fixed given multiplier //C>0//
+  * for each segment, the error value of the optimal line through it.
+So, the goal of this Segmented Least Squares Problem is to find a partition of minimal penalty. Note that varying the value of //C// and such can yield entirely different solutions.
+As in the previous section, we can begin with a simple observation: that point //p<sub>n</sub>// belongs to a line segment that begins at some earlier point in the optimal solution. This leads toward an understanding of subproblems--if we know that last segment, then we simply work the same optimality on the remaining points. Denoting the error of a line segment from point //i// to point //n// as //e<sub>i,j</sub>//, we have that if the last segment of the optimal partition is from point //i// to //n//, then the value of the optimal solution is
+  * OPT(n) = //e<sub>i,j</sub> + C + OPT(i-1)//.
+Further, we have the following recurrence for any point //j//:
+  * OPT(j) = min<sub>//1=<i=<j//</sub>(//e<sub>i,j</sub> + C + OPT(j-1))//.
+So, the algorithm that follows is thus:
+Segmented-Least-Squares(n)
+  * Array M[0...n]
+  * Set M[0]=0
+  * For all pairs i=<j:
+    * Compute the least squares error //e<sub>i,j</sub>// for the segment from point //i// to point //j//
+  * For j=1,2,...,n
+    * Use the recurrence above to compute M[j]
+  * Return M[n]
+The correctness of this algorithm can be proved with induction. We can also trace back through the array M to find the optimal partition:
+Find-Segments(j)
+  * If j=0:
+    * output nothing
+  * Else:
+    * Find an i that minimizes //e<sub>i,j</sub> + C + M[i-1]//
+    * Output the segment from point //i// to point //j// and the result of Find-Segments(i-1)
+The total running time of computing all of the //e<sub>i,j</sub>// values for all the pairs is //O(n<sup>3</sup>)//. After this, though, the total running time is simply //O(n<sup>2</sup>)//.
+Another very good section. They went over everything extremely well and explained the nuance thoroughly. 9/10.