Differences
This shows you the differences between two versions of the page.
Both sides previous revisionPrevious revisionNext revision | Previous revisionNext revisionBoth sides next revision | ||
courses:cs211:winter2011:journals:charles:chapter6 [2011/03/30 02:40] – [6.6 Sequence Alignment] gouldc | courses:cs211:winter2011:journals:charles:chapter6 [2011/04/05 05:26] – gouldc | ||
---|---|---|---|
Line 39: | Line 39: | ||
See Figure 6.16 for an illustration of building up the subproblems. The solution we want is OPT(1,n). The running time of the algorithm is bounded by O(n^3). | See Figure 6.16 for an illustration of building up the subproblems. The solution we want is OPT(1,n). The running time of the algorithm is bounded by O(n^3). | ||
===== 6.6 Sequence Alignment ===== | ===== 6.6 Sequence Alignment ===== | ||
- | Problem outline: Need a measure by which to compare strings for similarity. | + | Problem outline: Need a measure by which to compare strings for similarity. |
- | '' | + | Solution outline: Find a minimum-cost alignment between the two strings. Alignments are composed of gaps and mismatches. The total cost of all gaps and mismatches is a measure of how dissimilar two strings are. So to see how similar two strings are, we need the minimum-cost alignment between them. We do this with the following recurrence relation: |
- | ~tops'' | + | |
- | This alignment may be written {(2,1),(3,2),(4,3)} -- the second character in ' | + | (6.16) //OPT(i,j) = min(mismatch_cost(xi,yi) + OPT(i-1,j-1), gap_cost + OPT(i-1,j), gap_cost + OPT(i,j-1))// |
- | Solution outline: | + | This is because there are three possibilities for the mth letter in string 1 and the nth letter in string 2: they form a pair in the optimal |
===== 6.7 Sequence Alignment in Linear Space via Divide and Conquer ===== | ===== 6.7 Sequence Alignment in Linear Space via Divide and Conquer ===== | ||
+ | This section shows us an improved algorithm for sequence alignment. It still runs in O(mn) time but it decreases the space requirement from O(mn) to O(m+n). We realize that computing a single column in the memoized array in the last problem only requires us to know about the current column and the previous column. Therefore it is possible to arrive at the solution while only maintaining a m-by-2 array instead of a m-by-n array. This idea will work nicely for building up to the solution //value// (i.e., the total cost of all gaps and mismatches)... but we also want to know the optimal // | ||
+ | We do this by incorporating divide-and-conquer principles into our algorithm. I'm slightly confused by the divide-and-conquer component to this problem. In class we did it with the graph in Figure 6.19. We started in the top left and found the shortest path to some midpoint, then we did the same thing from the bottom right. I will need to look in more depth at the pseudo-code on page 288 to get a better grasp of the concept. | ||
===== 6.8 Shortest Paths in a Graph ===== | ===== 6.8 Shortest Paths in a Graph ===== | ||
+ | Problem outline: We want to find the shortest path between two nodes in a directed graph. We cannot use Dijkstra' | ||
+ | |||
+ | Here we assume no negative cycles because otherwise the optimal path would follow that cycle infinitely many times and so reduce the cost of the path to negative infinity. But that would be stupid. We saw from class that like the problem above we can fill any column by referring only to the column itself and the column that came before it. | ||
+ | |||
+ | (6.25) says that the shortest path algorithm can run in O(mn) time. | ||
+ | |||
+ | ===== 6.9 Shortest Paths and Distance Vector Protocols ===== |