Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revisionPrevious revision
Next revision
Previous revision
courses:cs211:winter2012:journals:jeanpaul:chaptersixsectioniii [2012/03/28 01:48] – [The Problem] mugabejcourses:cs211:winter2012:journals:jeanpaul:chaptersixsectioniii [2012/03/28 02:39] (current) – [Algorithm Analysis] mugabej
Line 8: Line 8:
 >>>>>>>>>>>>>>>>>> (x<sub>1</sub>y<sub>1</sub>),(x<sub>2</sub>y<sub>2</sub>),...,(x<sub>n</sub>y<sub>n</sub>), where x<sub>1</sub> < x<sub>2</sub>,...,< x<sub>n</sub>\\ >>>>>>>>>>>>>>>>>> (x<sub>1</sub>y<sub>1</sub>),(x<sub>2</sub>y<sub>2</sub>),...,(x<sub>n</sub>y<sub>n</sub>), where x<sub>1</sub> < x<sub>2</sub>,...,< x<sub>n</sub>\\
 >>>>>>>>>>>>>>>>>> Given a line L with equation y = ax + b, we say that an "error" of L with respect to P is the sum of all of its squared distances to the points in P:\\ >>>>>>>>>>>>>>>>>> Given a line L with equation y = ax + b, we say that an "error" of L with respect to P is the sum of all of its squared distances to the points in P:\\
->>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Error(L,P) = ∑<sub>i =1</sub><sup>n</sup> (y<sub>i</sub>ax<sub>i</sub>- b) <sup>2</sup>+>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Error(L,P) = ∑<sub> from i =1 to n</sub> (y<sub>i</sub> - ax<sub>i</sub>- b) <sup>2</sup>\\ 
 +>>>>>>>>>>>>>>>>>> Thus naturally, we are bound to finding the line with minimum error.\\ 
 +>>>>>>>>>>>>>>>>>> The solution turns out to be a line y = ax + b, where:\\ 
 +>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> a = [(n∑<sub>i</sub> x<sub>i</sub>y<sub>i</sub>) - (∑<sub>i</sub>x<sub>i</sub>)(∑<sub>i</sub> y<sub>i</sub>)]/[(n∑<sub>i</sub> x<sub>i</sub><sup>2</sup>) - (∑<sub>i</sub> x<sub>i</sub>)<sup>2</sup>]\\ 
 +>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> And b = (∑<sub>i</sub> y<sub>i</sub>a∑<sub>i</sub> x<sub>i</sub>)/n\\ 
 +\\ 
 +>>>>>>>>>>>>>>>>>> However, our problem is different from the above mentioned which these (above) formulas solve.\\ 
 +** Formulating our problem** 
 +\\ 
 +>>>>>>>>>>>>>>>>>> Given a sequence of data points, we need to identify a few points in the sequence at which a discrete change occurs.--> In our specific case, a change from one linear approximation to another.\\ 
 +>>>>>>>>>>>>>>>>>> So, we have a set of points P = {(x<sub>1</sub>,y<sub>1</sub>),(x<sub>2</sub>,y<sub>2</sub>),...,(x<sub>n</sub>,y<sub>n</sub>)} with x<sub>1</sub> < x<sub>2</sub>,..., x<sub>n</sub>.\\ 
 +>>>>>>>>>>>>>>>>>> p<sub>i</sub> denotes the point (x<sub>i</sub>,y<sub>i</sub>)\\ 
 +>>>>>>>>>>>>>>>>>> We must first partition P into some number of segments.\\ 
 +>>>>>>>>>>>>>>>>>> Each segment is a subset of P that represents a contiguous set of x-coordinates of the form {p<sub>i</sub>,p<sub>i+1</sub>,...,p<sub>j-1</sub>,p<sub>j</sub>} for some indices i≤j.\\ 
 +>>>>>>>>>>>>>>>>>> For each segment S in our partition P, we compute the line minimizing the error with respect to the points in S using the formula we found above(where we wrote the value of a and b in the line y = ax + b).\\ 
 +>>>>>>>>>>>>>>>>>> The penalty of a partition is defined to be the sum of:\\ 
 +>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The number of segments into which we partition P, times a fixed, given multiplier C>0.\\ 
 +>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> For each segment, the error value of the optimal line through that segment.\\ 
 +>>>>>>>>>>>>>>>>>> Our goal is to find the a partition of minimum penalty.\\ 
 +=====  Designing the Algorithm ===== 
 + 
 +>>>>>>>>>>>>>>>>>> INPUT: n, p<sub>1</sub>,...,p<sub>N</sub>, C\\ 
 +>>>>>>>>>>>>>>>>>> Segmented-Least-Squares()\\ 
 +>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> M[0] = 0\\ 
 +>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> e[0][0] = 0\\ 
 +>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> for j = 1 to n\\ 
 +>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> for i = 1 to j\\ 
 +>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> e[i][j] = least square error for the segment p<sub>i</sub>,...,p<sub>j</sub>\\ 
 +>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> for j = 1 to n\\ 
 +>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> M[j] = min <sub>1 ≤ i ≤ j</sub> (e[i][j] + c + M[i-1])\\ 
 +>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> return M[n]\\ 
 +\\  
 +Just like the Weighted Interval Scheduling Problem, we need to find an algorithm that gives us the solution itself.\\ 
 +** Algorithm to give for the solution itself** 
 +\\ 
 +>>>>>>>>>>>>>>>>>> FindSegments(j):\\ 
 +>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> if j = 0:\\ 
 +>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> output nothing\\ 
 +>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> else:\\ 
 +>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Find an i that minimizes e<sub>i,j</sub> + C + M[i-1]\\ 
 +>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Output the segment {p<sub>i</sub>, …, p<sub>j</sub>} and the result of \\ 
 +>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> FindSegments(i-1) 
 +>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> End if 
 + 
 +=====  Algorithm Analysis ===== 
 +\\ 
 +The running time of the algorithm is O(n<sup>2</sup>) since filling in the array M[j] takes us O(n) time for each j.\\ 
 + 
 +I give this section 7/10. 
 + 
 + 
courses/cs211/winter2012/journals/jeanpaul/chaptersixsectioniii.1332899330.txt.gz · Last modified: 2012/03/28 01:48 by mugabej
CC Attribution-Noncommercial-Share Alike 4.0 International
Driven by DokuWiki Recent changes RSS feed Valid CSS Valid XHTML 1.0