What we will call the error of a line of best fit is the sum of each point's distance from the line. It is relatively easy to place such a line and calculate its error when all the points seem to fit around one line, but sometimes the points in a data set can look like they fit on a sequence of two or more lines. This makes us want to find a way to allow the algorithm to fit a set of lines to the data points. We cannot allow it to fit an arbitrarily large set of lines, because then our error will be 0. We also cannot hard-code in any number of lines because the points could better fit on a different number of lines.
This dilemma leads us to explore the issue of change detection, which is helps us identify a few points at which a change occurs. Such a technique will help to tell us how many linear approximations to use. To assist us in the solving of this problem we will define a penalty of a partition as the sum of the number of segments times a given multiplier and the error of the optimal line through a segment of points. We want to minimize the penalty. As we increase the number of segments into which the data is partitioned, we decrease the error of each segment but increase the penalty in terms of number of segments.
To design the algorithm, we can begin by thinking that the final point in the set of points will only be in one segment and that segment must begin at a point. If we can figure out what these two points are, then we will have found the composition of the final segment and we can remove those points from the data. Letting ei, n be the error for this last segment, we can say that the optimal solution for nth point is the error of segment i to n plus our penalty for adding a new segment plus the optimal solution for all the points up to point i.
To be completely honest, I am not completely sure what statement 6.7 in the book is saying, so I will look back over that at a later time..
The algorithm uses O(n3) time to calculate the error for each pair of points, however once this is calculated it runs in O(N2) time.