Differences
This shows you the differences between two versions of the page.
Both sides previous revisionPrevious revisionNext revision | Previous revision | ||
courses:cs211:winter2011:journals:chen:chapter_6 [2011/03/16 06:14] – [6.3 Segmented Least Squares: Multi-way Choices] zhongc | courses:cs211:winter2011:journals:chen:chapter_6 [2011/04/06 16:41] (current) – [6.9 Shortest Paths and Distance Vector Protocols] zhongc | ||
---|---|---|---|
Line 104: | Line 104: | ||
interesting/ | interesting/ | ||
+ | |||
+ | |||
+ | ====== 6.4 Subset Sums and Knapsacks: Adding a Variabl ====== | ||
+ | Summary | ||
+ | Problem: | ||
+ | Given n objects and a " | ||
+ | Item i weighs wi > 0 kilograms and has value vi > 0 | ||
+ | |||
+ | 1d-dynamical programming won't work. | ||
+ | |||
+ | Recurrence: | ||
+ | Either select item i or not | ||
+ | |||
+ | Case 1: OPT does not select item i | ||
+ | • OPT selects best of { 1, 2, …, i-1 } | ||
+ | |||
+ | Case 2: OPT selects item i | ||
+ | • Accepting item i does not immediately imply that | ||
+ | we will have to reject other items | ||
+ | No known conflicts | ||
+ | • **Without knowing what other items were selected | ||
+ | before i, we don't even know if we have enough | ||
+ | room for i** | ||
+ | Needs more memorization power! | ||
+ | |||
+ | Add a variable. | ||
+ | |||
+ | Def. OPT(i, **w**) = max profit subset of items 1, | ||
+ | …, i with weight limit w | ||
+ | Case 1: OPT does not select item i | ||
+ | • OPT selects best of { 1, 2, …, i-1 } using weight | ||
+ | limit w | ||
+ | Case 2: OPT selects item i | ||
+ | • new weight limit = w – wi | ||
+ | • OPT selects best of { 1, 2, …, i–1 } using new | ||
+ | weight limit, w – wi | ||
+ | |||
+ | |||
+ | |||
+ | iterate through the n-w array. | ||
+ | |||
+ | Runtime: O(nw) | ||
+ | |||
+ | Interesting/ | ||
+ | |||
+ | |||
+ | ====== 6.5 RNA SECONDARY STRUCTURE ====== | ||
+ | |||
+ | |||
+ | Summary | ||
+ | |||
+ | |||
+ | Problem: | ||
+ | RNA. String B = b1b2…bn over alphabet { A, C, G, U } | ||
+ | • Secondary structure. RNA is single-stranded so it | ||
+ | |||
+ | A set of pairs S = { (bi, bj) } that satisfy: | ||
+ | Constraints: | ||
+ | |||
+ | |||
+ | [Watson-Crick] S is a matching and each pair in | ||
+ | S is a Watson-Crick complement: A-U, U-A, CG, | ||
+ | or G-C | ||
+ | [No sharp turns] The ends of each pair are | ||
+ | separated by at least 4 intervening bases. If (bi, | ||
+ | bj) ∈ S, then i < j - 4 | ||
+ | [Non-crossing] If (bi, bj) and (bk, bl) are two | ||
+ | pairs in S, then we cannot have i < k < j < l | ||
+ | • Free energy. Usual hypothesis is that an RNA | ||
+ | molecule will form the secondary structure with the | ||
+ | optimum total free energy. | ||
+ | • Goal. Given an RNA molecule B = b1b2…bn, find a | ||
+ | secondary structure S that maximizes the number | ||
+ | of base pairs | ||
+ | |||
+ | |||
+ | |||
+ | OPT(i, j) = maximum number of base pairs in | ||
+ | a secondary structure of the substring | ||
+ | bibi+1…bj | ||
+ | Case 1. If i ≥ j - 4 | ||
+ | • OPT(i, j) = 0 by no-sharp turns condition | ||
+ | Case 2. Base bj is not involved in a pair | ||
+ | • OPT(i, j) = OPT(i, j-1) | ||
+ | Case 3. Base bj pairs with bt for some i ≤ t < j - 4 | ||
+ | • non-crossing constraint decouples resulting subproblems | ||
+ | • OPT(i, j) = 1 + maxt { OPT(i, t-1) + OPT(t+1, j-1) } | ||
+ | |||
+ | |||
+ | |||
+ | Runtime: O(N3) two for loops and a max(). | ||
+ | |||
+ | Interesting/ | ||
+ | |||
+ | |||
+ | |||
+ | ====== 6.6-6.7 SEQUENCE ALIGNMENT ====== | ||
+ | Some general context of the problem: | ||
+ | |||
+ | Google search and dictionaries have functions that recognizes what the user probably mean when typing in a word | ||
+ | that does not exist in the dataset. | ||
+ | In order to do | ||
+ | **How should we define similarity between two words or strings?** | ||
+ | |||
+ | Dissimilar by gaps and mismatches. | ||
+ | |||
+ | |||
+ | Formalized: | ||
+ | |||
+ | Sequence Alignment | ||
+ | • Goal: Given two strings X = x1 x2 . . . xm and | ||
+ | Y = y1 y2 . . . yn find alignment of minimum | ||
+ | cost | ||
+ | • An alignment M is a set of ordered pairs xi-yj | ||
+ | such that each item occurs in at most one | ||
+ | pair and no crossings | ||
+ | • The pair xi-yj and xi' | ||
+ | |||
+ | |||
+ | different cases | ||
+ | |||
+ | Case 1: xM and yN are aligned | ||
+ | • Pay mismatch for xM-yN + min cost of aligning rest of | ||
+ | strings | ||
+ | • OPT(M, N) = αXmYn + OPT(M-1, N-1) | ||
+ | Case 2: xM is not matched | ||
+ | • Pay gap for xM + min cost of aligning rest of strings | ||
+ | • OPT(M, N) = δ + OPT(M-1, N) | ||
+ | Case 3: yN is not matched | ||
+ | • Pay gap for yN + min cost of aligning rest of strings | ||
+ | • OPT(M, N) = δ + OPT(M, N-1) | ||
+ | |||
+ | |||
+ | Similar to KnapSack problem in the data structure. | ||
+ | |||
+ | space O(mn) | ||
+ | |||
+ | Can do this in linear space: | ||
+ | Collapse into an m x 2 array | ||
+ | M[i,0] represents previous row; M[i,1] -- current | ||
+ | |||
+ | |||
+ | T(m, n) <= cmn + T(q, n/2) + T(m - q, n/2) | ||
+ | = cmn + kqn/2 + k(m - q)n/2 | ||
+ | = cmn + kqn/2 + kmn/2 - kqn/2 | ||
+ | = (c + k/2)mn. | ||
+ | |||
+ | combination of divide-and-conquer and dynamic programming | ||
+ | |||
+ | we can map the problem to shortest path problem on a weighted graph. | ||
+ | |||
+ | |||
+ | Divide: find index q that minimizes f(q, n/2) + g(q, n/2) using DP | ||
+ | Conquer: recursively compute optimal alignment in each piece. | ||
+ | |||
+ | Interesting/ | ||
+ | |||
+ | |||
+ | |||
+ | ====== 6.8 Shortest Paths in a Graph ====== | ||
+ | |||
+ | Negative weights would change everything that made the Greedy appropriate. Thus we cannot | ||
+ | make decision only relying on local inforamtion at each step b/c a big negative edge that | ||
+ | come later may drastically change the picture. | ||
+ | |||
+ | Some constraints: | ||
+ | No negative weight cycle | ||
+ | If some path from s to t contains a negative | ||
+ | cost cycle, there does not exist a shortest s-t | ||
+ | path. | ||
+ | we can loop forever and get ever smaller. | ||
+ | |||
+ | |||
+ | Reccurence: | ||
+ | |||
+ | OPT(i,v): minimum cost of a v-t path P using | ||
+ | at most i edges | ||
+ | This formulation eases later discussion | ||
+ | • Original problem is OPT(n-1, s) | ||
+ | |||
+ | |||
+ | DP | ||
+ | |||
+ | Case 1: P uses at most i-1 edges | ||
+ | • OPT(i, v) = OPT(i-1, v) | ||
+ | Case 2: P uses exactly i edges | ||
+ | • if (v, w) is first edge, then OPT uses (v, w), and | ||
+ | then selects best w-t path using at most i-1 edges | ||
+ | • Cost: cost of chosen edge | ||
+ | |||
+ | Implementation | ||
+ | for every edge number, | ||
+ | for possible node v | ||
+ | for each edge incident to v | ||
+ | find out foreach edge (v, w) ∈ E | ||
+ | M[i, v] = min(M[i, v], M[i-1, w] + cvw ) | ||
+ | |||
+ | |||
+ | O(mn) space. | ||
+ | |||
+ | General process of DP | ||
+ | |||
+ | Review: Dynamic Programming Process | ||
+ | 1. Determine the optimal substructure of the | ||
+ | problem define the recurrence relation | ||
+ | 2. Define the algorithm to find the value of the | ||
+ | optimal solution | ||
+ | 3. Optionally, change the algorithm to an | ||
+ | iterative rather than recursive solution | ||
+ | 4. Define algorithm to find the optimal | ||
+ | solution | ||
+ | 5. Analyze running time of algorithms | ||
+ | |||
+ | |||
+ | Interesting/ | ||
+ | |||
+ | |||
+ | |||
+ | ====== 6.9 Shortest Paths and Distance Vector Protocols ====== | ||
+ | |||
+ | Problem Motivation: | ||
+ | One important application of the Shortest-Path Problem is for routers in a | ||
+ | communication network to determine the most efficient path to a destination. | ||
+ | |||
+ | |||
+ | attempt: | ||
+ | Dijkstra' | ||
+ | we need to work with only local information. | ||
+ | |||
+ | |||
+ | Bellman-Ford uses only local knowledge of | ||
+ | neighboring nodes | ||
+ | Distribute algorithm: each node v maintains its | ||
+ | value M[v] | ||
+ | Updates its value after getting neighbor’s values | ||
+ | |||
+ | |||
+ | Problems with the Distance Vector Protocol | ||
+ | One of the major problems with the distributed implementation of Bellman- | ||
+ | Ford on routers (the protocol we have been discussing above) is that it’s derived | ||
+ | from an initial dynamic programming algorithm that assumes edge costs will | ||
+ | remain constant during the execution of the algorithm. | ||
+ | That is, we might get into a situation where there is infinite looping of mutual dependancy. | ||
+ | |||
+ | could fail if the other node is deleted. | ||
+ | |||
+ | |||
+ | Solution: | ||
+ | |||
+ | Each router stores entire path Not just the distance and the first hop | ||
+ | Based on Dijkstra' | ||
+ | Avoids " | ||
+ | difficulties | ||
+ | Tradeoff: requires significantly more storage | ||
+ | |||
+ | Interesting/ | ||
+ | |||