====== 6.3. Segmented Least Squares: Multi-way Choices ======
With this problem, at each step we have a polynomial number of possibilities to consider the structure of the optimal solution.\\
\\
===== The Problem =====
>>>>>>>>>>>>>>>>>> Suppose our data consists of a set P of n points in the plane,:\\
>>>>>>>>>>>>>>>>>> (x1y1),(x2y2),...,(xnyn), where x1 < x2,...,< xn\\
>>>>>>>>>>>>>>>>>> Given a line L with equation y = ax + b, we say that an "error" of L with respect to P is the sum of all of its squared distances to the points in P:\\
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Error(L,P) = ∑ from i =1 to n (yi - axi- b) 2\\
>>>>>>>>>>>>>>>>>> Thus naturally, we are bound to finding the line with minimum error.\\
>>>>>>>>>>>>>>>>>> The solution turns out to be a line y = ax + b, where:\\
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> a = [(n∑i xiyi) - (∑ixi)(∑i yi)]/[(n∑i xi2) - (∑i xi)2]\\
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> And b = (∑i yi- a∑i xi)/n\\
\\
>>>>>>>>>>>>>>>>>> However, our problem is different from the above mentioned which these (above) formulas solve.\\
** Formulating our problem**
\\
>>>>>>>>>>>>>>>>>> Given a sequence of data points, we need to identify a few points in the sequence at which a discrete change occurs.--> In our specific case, a change from one linear approximation to another.\\
>>>>>>>>>>>>>>>>>> So, we have a set of points P = {(x1,y1),(x2,y2),...,(xn,yn)} with x1 < x2,..., xn.\\
>>>>>>>>>>>>>>>>>> pi denotes the point (xi,yi)\\
>>>>>>>>>>>>>>>>>> We must first partition P into some number of segments.\\
>>>>>>>>>>>>>>>>>> Each segment is a subset of P that represents a contiguous set of x-coordinates of the form {pi,pi+1,...,pj-1,pj} for some indices i≤j.\\
>>>>>>>>>>>>>>>>>> For each segment S in our partition P, we compute the line minimizing the error with respect to the points in S using the formula we found above(where we wrote the value of a and b in the line y = ax + b).\\
>>>>>>>>>>>>>>>>>> The penalty of a partition is defined to be the sum of:\\
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The number of segments into which we partition P, times a fixed, given multiplier C>0.\\
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> For each segment, the error value of the optimal line through that segment.\\
>>>>>>>>>>>>>>>>>> Our goal is to find the a partition of minimum penalty.\\
===== Designing the Algorithm =====
>>>>>>>>>>>>>>>>>> INPUT: n, p1,...,pN, C\\
>>>>>>>>>>>>>>>>>> Segmented-Least-Squares()\\
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> M[0] = 0\\
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> e[0][0] = 0\\
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> for j = 1 to n\\
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> for i = 1 to j\\
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> e[i][j] = least square error for the segment pi,...,pj\\
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> for j = 1 to n\\
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> M[j] = min 1 ≤ i ≤ j (e[i][j] + c + M[i-1])\\
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> return M[n]\\
\\
Just like the Weighted Interval Scheduling Problem, we need to find an algorithm that gives us the solution itself.\\
** Algorithm to give for the solution itself**
\\
>>>>>>>>>>>>>>>>>> FindSegments(j):\\
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> if j = 0:\\
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> output nothing\\
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> else:\\
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Find an i that minimizes ei,j + C + M[i-1]\\
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Output the segment {pi, …, pj} and the result of \\
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> FindSegments(i-1)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> End if
===== Algorithm Analysis =====
\\
The running time of the algorithm is O(n2) since filling in the array M[j] takes us O(n) time for each j.\\
I give this section 7/10.