Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revisionPrevious revision
Next revision
Previous revision
courses:cs211:winter2011:journals:chen:chapter_6 [2011/03/16 06:14] – [6.3 Segmented Least Squares: Multi-way Choices] zhongccourses:cs211:winter2011:journals:chen:chapter_6 [2011/04/06 16:41] (current) – [6.9 Shortest Paths and Distance Vector Protocols] zhongc
Line 104: Line 104:
  
 interesting/readable: 7/7 interesting/readable: 7/7
 +
 +
 +====== 6.4 Subset Sums and Knapsacks: Adding a Variabl ======
 +Summary
 +Problem:
 +Given n objects and a "knapsack"
 +Item i weighs wi > 0 kilograms and has value vi > 0
 +
 +1d-dynamical programming won't work.
 +
 +Recurrence:
 +Either select item i or not
 +
 +Case 1: OPT does not select item i
 +• OPT selects best of { 1, 2, …, i-1 }
 +
 +Case 2: OPT selects item i
 +• Accepting item i does not immediately imply that
 +we will have to reject other items
 + No known conflicts
 +• **Without knowing what other items were selected
 +before i, we don't even know if we have enough
 +room for i**
 +Needs more memorization power!
 +
 +Add a variable.
 +
 +Def. OPT(i, **w**) = max profit subset of items 1,
 +…, i with weight limit w
 + Case 1: OPT does not select item i
 +• OPT selects best of { 1, 2, …, i-1 } using weight
 +limit w
 + Case 2: OPT selects item i
 +• new weight limit = w – wi
 +• OPT selects best of { 1, 2, …, i–1 } using new
 +weight limit, w – wi
 +
 +
 +
 +iterate through the n-w array.
 +
 +Runtime: O(nw)
 +
 +Interesting/readable:7/7
 +
 +
 +====== 6.5 RNA SECONDARY STRUCTURE ======
 +
 +
 +Summary
 +
 +
 +Problem:
 +RNA. String B = b1b2…bn over alphabet { A, C, G, U }
 +• Secondary structure. RNA is single-stranded so it
 +
 +A set of pairs S = { (bi, bj) } that satisfy:
 +Constraints:
 +
 +
 + [Watson-Crick] S is a matching and each pair in
 +S is a Watson-Crick complement: A-U, U-A, CG,
 +or G-C
 + [No sharp turns] The ends of each pair are
 +separated by at least 4 intervening bases. If (bi,
 +bj) ∈ S, then i < j - 4
 + [Non-crossing] If (bi, bj) and (bk, bl) are two
 +pairs in S, then we cannot have i < k < j < l
 +• Free energy. Usual hypothesis is that an RNA
 +molecule will form the secondary structure with the
 +optimum total free energy.
 +• Goal. Given an RNA molecule B = b1b2…bn, find a
 +secondary structure S that maximizes the number
 +of base pairs
 +
 +
 +
 +OPT(i, j) = maximum number of base pairs in
 +a secondary structure of the substring
 +bibi+1…bj
 + Case 1. If i ≥ j - 4
 +• OPT(i, j) = 0 by no-sharp turns condition
 + Case 2. Base bj is not involved in a pair
 +• OPT(i, j) = OPT(i, j-1)
 + Case 3. Base bj pairs with bt for some i ≤ t < j - 4
 +• non-crossing constraint decouples resulting subproblems
 +• OPT(i, j) = 1 + maxt { OPT(i, t-1) + OPT(t+1, j-1) }
 +
 +
 +
 +Runtime: O(N3)    two for loops and a max().
 +
 +Interesting/readable:7/7
 +
 +
 +
 +====== 6.6-6.7 SEQUENCE ALIGNMENT ======
 +Some general context of the problem:
 +
 +Google search and dictionaries have functions that recognizes what the user probably mean when typing in a word
 +that does not exist in the dataset.
 +In order to do 
 +**How should we define similarity between two words or strings?**
 +
 +Dissimilar by gaps and mismatches.
 +
 +
 +Formalized:
 +
 +Sequence Alignment
 +• Goal: Given two strings X = x1 x2 . . . xm and
 +Y = y1 y2 . . . yn find alignment of minimum
 +cost
 +• An alignment M is a set of ordered pairs xi-yj
 +such that each item occurs in at most one
 +pair and no crossings
 +• The pair xi-yj and xi'-yj' cross if i < i', but j > j’.
 +
 +
 +different cases 
 +
 +Case 1: xM and yN are aligned
 +• Pay mismatch for xM-yN + min cost of aligning rest of
 +strings
 +• OPT(M, N) = αXmYn + OPT(M-1, N-1)
 + Case 2: xM is not matched
 +• Pay gap for xM + min cost of aligning rest of strings
 +• OPT(M, N) = δ + OPT(M-1, N)
 + Case 3: yN is not matched
 +• Pay gap for yN + min cost of aligning rest of strings
 +• OPT(M, N) = δ + OPT(M, N-1)
 +
 +
 +Similar to KnapSack problem in the data structure.
 +
 +space O(mn)
 +
 +Can do this in linear space:
 +Collapse into an m x 2 array
 + M[i,0] represents previous row; M[i,1] -- current
 +
 +
 +T(m, n) <= cmn + T(q, n/2) + T(m - q, n/2)
 += cmn + kqn/2 + k(m - q)n/2
 += cmn + kqn/2 + kmn/2 - kqn/2
 += (c + k/2)mn.
 +
 +combination of divide-and-conquer and dynamic programming
 +
 +we can map the problem to shortest path problem on a weighted graph.
 +
 +
 +Divide: find index q that minimizes f(q, n/2) + g(q, n/2) using DP
 +Conquer: recursively compute optimal alignment in each piece.
 +
 +Interesting/readable:7/7
 +
 +
 +
 +====== 6.8 Shortest Paths in a Graph ======
 +
 +Negative weights would change everything that made the Greedy appropriate. Thus we cannot 
 +make decision only relying on local inforamtion at each step b/c a big negative edge that 
 +come later may drastically change the picture.
 +
 +Some constraints:
 +No negative weight cycle
 +If some path from s to t contains a negative
 +cost cycle, there does not exist a shortest s-t
 +path.
 +we can loop forever and get ever smaller.
 +
 +
 +Reccurence:
 +
 +OPT(i,v): minimum cost of a v-t path P using
 +at most i edges
 + This formulation eases later discussion
 +• Original problem is OPT(n-1, s)
 +
 +
 +DP
 +
 + Case 1: P uses at most i-1 edges
 +• OPT(i, v) = OPT(i-1, v)
 + Case 2: P uses exactly i edges
 +• if (v, w) is first edge, then OPT uses (v, w), and
 +then selects best w-t path using at most i-1 edges
 +• Cost: cost of chosen edge
 +
 +Implementation
 +for every edge number,
 +for possible node v
 +for each edge incident to v 
 +find out foreach edge (v, w) ∈ E
 +M[i, v] = min(M[i, v], M[i-1, w] + cvw )
 +
 +
 +O(mn) space.
 +
 +General process of DP
 +
 +Review: Dynamic Programming Process
 +1. Determine the optimal substructure of the
 +problem  define the recurrence relation
 +2. Define the algorithm to find the value of the
 +optimal solution
 +3. Optionally, change the algorithm to an
 +iterative rather than recursive solution
 +4. Define algorithm to find the optimal
 +solution
 +5. Analyze running time of algorithms
 +
 +
 +Interesting/readable: 8/8
 +
 +
 +
 +====== 6.9 Shortest Paths and Distance Vector Protocols ======
 +
 +Problem Motivation:
 +One important application of the Shortest-Path Problem is for routers in a
 +communication network to determine the most efficient path to a destination.
 +
 +
 +attempt:
 +Dijkstra's algorithm requires global information of network- unrealistic
 +we need to work with only local information.
 +
 +
 +Bellman-Ford uses only local knowledge of
 +neighboring nodes
 + Distribute algorithm: each node v maintains its
 +value M[v]
 + Updates its value after getting neighbor’s values
 +
 +
 +Problems with the Distance Vector Protocol
 +One of the major problems with the distributed implementation of Bellman-
 +Ford on routers (the protocol we have been discussing above) is that it’s derived
 +from an initial dynamic programming algorithm that assumes edge costs will
 +remain constant during the execution of the algorithm.
 +That is, we might get into a situation where there is infinite looping of mutual dependancy.
 +
 +could fail if the other node is deleted.
 +
 +
 +Solution:
 +
 + Each router stores entire path Not just the distance and the first hop
 + Based on Dijkstra's algorithm
 + Avoids "counting-to-infinity" problem and related
 +difficulties
 + Tradeoff: requires significantly more storage
 +
 +Interesting/readable: 5/5
 +
  
courses/cs211/winter2011/journals/chen/chapter_6.1300256065.txt.gz · Last modified: 2011/03/16 06:14 by zhongc
CC Attribution-Noncommercial-Share Alike 4.0 International
Driven by DokuWiki Recent changes RSS feed Valid CSS Valid XHTML 1.0