Differences

This shows you the differences between two versions of the page.

--- courses:cs211:winter2011:journals:charles:chapter6 [2011/03/30 03:20] – [6.6 Sequence Alignment] gouldc
+++ courses:cs211:winter2011:journals:charles:chapter6 [2011/04/05 05:52] (current) – [6.9 Shortest Paths and Distance Vector Protocols] gouldc
@@ Line 45: / Line 45: @@
 (6.16) //OPT(i,j) = min(mismatch_cost(xi,yi) + OPT(i-1,j-1), gap_cost + OPT(i-1,j), gap_cost + OPT(i,j-1))//
-This is because there are three possibilities for the mth letter in string 1 and the nth letter in string 2: they form a pair in the optimal alignment, the mth letter in string 1 is not in any pair in the optimal alignment, or the nth letter in string 2 is not in any pair in the optimal alignment. The dynamically programmed algorithm for this problem runs in O(mn) time because every pair (i,j) must be considered from 1...m and 1...n, and each of these takes at most constant time.
+This is because there are three possibilities for the mth letter in string 1 and the nth letter in string 2: they form a pair in the optimal alignment, the mth letter in string 1 is not in any pair in the optimal alignment, or the nth letter in string 2 is not in any pair in the optimal alignment. The dynamically programmed algorithm for this problem runs in O(mn) time because every pair (i,j) must be considered from 1...m and 1...n, and each of these takes at most constant time. It also takes O(mn) space.
 ===== 6.7 Sequence Alignment in Linear Space via Divide and Conquer =====
+This section shows us an improved algorithm for sequence alignment. It still runs in O(mn) time but it decreases the space requirement from O(mn) to O(m+n). We realize that computing a single column in the memoized array in the last problem only requires us to know about the current column and the previous column. Therefore it is possible to arrive at the solution while only maintaining a m-by-2 array instead of a m-by-n array. This idea will work nicely for building up to the solution //value// (i.e., the total cost of all gaps and mismatches)... but we also want to know the optimal //sequence// itself.
+We do this by incorporating divide-and-conquer principles into our algorithm. I'm slightly confused by the divide-and-conquer component to this problem. In class we did it with the graph in Figure 6.19. We started in the top left and found the shortest path to some midpoint, then we did the same thing from the bottom right. I will need to look in more depth at the pseudo-code on page 288 to get a better grasp of the concept.
 ===== 6.8 Shortest Paths in a Graph =====
+Problem outline: We want to find the shortest path between two nodes in a directed graph. We cannot use Dijkstra's Algorithm because there can be negative weights. (We won't know whether we have grown the blob in the right way because there could be a negative edge in the future that provides a shorter path to some node already in the blob.) Therefore we want to build up partial solutions. The first partial solution is the cost of s->t that uses at most one edge; if s doesn't have an edge to t, then the value we put in the memoized array is infinity. (6.22) says that the longest possible optimal path has n-1 edges. So at most we would do n-1 iterations of this.
+Here we assume no negative cycles because otherwise the optimal path would follow that cycle infinitely many times and so reduce the cost of the path to negative infinity. But that would be stupid. We saw from class that like the problem above we can fill any column by referring only to the column itself and the column that came before it.
+(6.25) says that the shortest path algorithm can run in O(mn) time.
+===== 6.9 Shortest Paths and Distance Vector Protocols =====
+Outline: An instance of the shortest-path problem on an undirected graph for which all edge costs are positive. Dijkstra's is possible, but we don't want to use it (we can think of something better) because we are talking about routers here and we prefer to work with local information. The BF Algorithm from 6.8 has a local property that can be utilized in a dynamic programming solution. If we maintained each node's value in a memoized array, we could update it by simply looking at the values of all of its neighbors and computing the total cost of that value and an edge to that node. This section proposes a change in the BF Algorithm from a "pull-based" implementation to a "push-based" implementation: this means we will only update the values in the memoized array for nodes that have been affected by a recent change, saving a lot of computation time.
+Problems: Assumes edge costs will remain constant during runtime, which for many reasons will probably not be the case in practice. Also, there is the potential "problem of counting to infinity" which we talked about in class - two nodes constantly update one another and tell the other that they have been updated, which leads to an infinite number of operations.