Differences

This shows you the differences between two versions of the page.

--- courses:cs211:winter2014:journals:eric:home [2014/03/25 20:03] – [Eric's Journal] utomoe
+++ courses:cs211:winter2014:journals:eric:home [2014/03/31 14:36] (current) – [Eric's Journal] utomoe
@@ Line 1: / Line 1: @@
 ====== Eric's Journal ======
+Interest: 7/10. this is back to bit too abstract for me. maybe (in fact probably) this week's reading will help out if I take big data class this fall. it seems network flow would be huge part of big data since it's about partitioning too big a set to compute with x resources into smaller sets with x*y resources
+===== Week 11 Reading =====
+====7.1 The Max-Flow Prob and the Ford-Fulkerson Alg====
+>the prob: graphs through nodes and switches act as transportation networks with traffic of certain capacities traveling through, and having source nodes and sink nodes for that data
+-flow networks
+  * flow: abstract entity gen by source nodes
+  * flow network: directed graph with following features:
+  * capacity, a nonnegative number ce
+  * source node s in V
+  * sink node t in V
+  * internal nodes in between
+-defining flow: 0<=f(e)<=ce and other than s t flow into node matches flow out of node
+-the max-flow prob: arrange traffic in a flow network as efficiently as poss (max the flow value)
+  * by cutting a graph, we can know the max possible flow value
+  * minimum cut: max-flow value equals the minimum capacity of any such division
+>defining the alg: there's no general alg to do it, so to design one think in terms of pushing forward on edges with leftover capa, and backward on edges already carrying flow; this is a residual graph
+-the residual graph: node set of Gf is same as G, ce-f(e) leftover units of capacity on edges to push forward to, and backward edges e'=(v,u)  to undo flow
+-aug paths in a res graph
+aug(f,P)
+ let b=bottleneck(P,f)
+ for each edge in P
+  if edge is forward: increase f(e) in G by b
+  else: decrease f(e) in G by b
+   * 7.1 f' is a flow in G
+   * proof: 0<=f(e)<=f'(e)=f(e)+ bottleneck(p,f)<=f(e)+(ce-fe) = ce for forward edges and reverse the <= to >= and subtract bottleneck for backward edges
+Max-Flow
+ all e in G to 0
+ while there's path s-t in Gf
+  let P be that simple plath
+  f'=augment(f,P)
+  update f to be f'
+  update Gf to be Gf'
+ return f
+  * the above is the FF alg who dev in 1956
+>ana the alg: termination and RT
+-7.2: at every intermediate stage of FFA, flow values{fe} and the resi caps in Gf are ints. Proof: suppose true after the while loop j iterations, then bn(P,f) will be an it and f' will have ints so will the new res graph
+-7.3: let f be a flow in G, and P a simple st path in G. Then vf'=vf+bn(P,f); and since bn(P,f)>0 we have v' > vf. Proof: e must be forward since path is simple, so if we increase bn and not change any incident edge to s, f' exceeds f by bn
+-7.4: suppose as above that all caps in G are ints. Then the FFA terminates i nat most c iterations of the while loop. Proof: by 7.3 flow in FFA increases each iter, and by 7.2 at least by 1. it starts with 0 and not higher than c, so that's max c iters
+-7.5 suppose that all caps in G are ints, then FFA can be implemented to run in mC time. Proof: most work done in one iter is f. have two linked lists for each node in Gf, then to find path st we can use breadfirst or depthfirst in )(m+n). then for each e we can construct forward and backward edges in Gf' in Om time
+====7.2 max flows and min cuts in a network====
+>ana the alg: flows and cuts: structure is important because the max flow is determined by any point where it's just one point connecting two separate cut sets
+-7.6: let f be any st flow and AB and set cut. then vf=fout(A) - fin(A). Proof: vf = foutv-finv for all v in A. Then that equals feoutA - feinA = foutA-finA
+-7.7 let f be any st flow and AB any st cut then vf=finB-foutB
+-7.8: vf<c(A,B) (the value of every flow is upper bounded by the cap of every cut
+>ana the alg: max-flow equals min-cut
+-7.9: if f is an st flow such that there is no st in Gf, then there's st (A*,B*) in G for which vf=c(A*,B*). conseq, f has max val of any flow in G and A*B* has the min cap of any st cut in G. Proof: vf=foutA*-finA* = feoutA* - feinA* = ce-0 out of A* = c(A8,B*)
+-7.10:the flow f_ ret by FFA is a max flow
+-7.11: given f of max val we can compute st of min cap in m time. proof: 79 then bfs or dfs to find A*, then B*=v-A* and return cut(A*,B*)
+-7.12: in every flow network there f and cut so that vf=c(A,B)
+-7.13: in every flow network the max val of st flow is equal to min cap of st cut
+-further analysis: integer valued flows
+--7.14: if all caps in flow network are ints then there is max flow f for which every flow val fe is an its
+--real #s as caps? all the theorems here were assumed that caps had to be ints. but what if they're real? then we have no guarantee that the while loop in the FFA ever terminates since each increment can be inf small; with pathological choices for the aug path, the FFA can run forever. next section is about aug paths to avoid bad behavior of the alg (using reals I think)
+====7.3 choosing good aug paths====
+>the upper bound can be very bad if each bottleneck is the same in graph as in res graph, and if that bottleneck were potentially one. then to get flow value of 200, for example, we'd have to do 200 augmentations
+>Des a faster flow alg
+Scaling Max-Flow
+ set all e in G to 0
+ set D to be largest power of 2 no larger than max cap out of s
+ while D>=1
+  while there's st path in GfD
+   let P be a simple st path in GfD
+   f'=aug(f,P)
+   update f to be f' and GfD
+  D = D/2
+ return f
+>ana the alg
+-7.15: if the caps are int vals then throughout the SMFA the flow and res caps are ints, implying when D=1, GfD is same as Gf and hence when the alg terminates the flow, f is of max value
+-7.16: the number of iters of the outer while loop is at most 1+logC
+-7.17: during the Dscaling phase, each aug increases the flow val by at least D
+-7.18 let f be the flow at the end of the Dscaling phase. There is st cut AB in G for which CAB<=vf+mD where m is the number of edges in G. conseq, the max flow in the network has val at most vf+mD. Proof: vf = feoutA - feinA >= (ce-D outA) - DeinA = ceoutA - DoutA - DinA >= c(A,B) - mD
+-7.19 the # of augs in scaling phase at most 2m
+-7.20 the SMFA in graph with m edges and its is at most 2m(1+logc) augs and runs in at most m^2logc time
+>extensions: strongly polynomial algs: polynomial number of bits, solved by dinitz, edmonds, karp shows a way to do those in Omn iters; algs to do mnlogn, n^3, and min(n^2/2, m^.5)mlognlogU
+====7.5 a first app: the bipartite matching problem====
+>the prob: bipartite graph unidirected whose node set can be partitioned. a matching is a subset M such that each node apppears in at most one edge in M
+>des the alg: direct all X to Y, then add node s and sx to each X then to and Y to t, then each edge a cap of 1
+>ana the alg
+-7.34: M' contains k edges cuz in the cut set, first is cardinality of M' while second is 0 since nothing enters A, so M' contains k edges
+-7.35: each X is the tail of at most one edge in M': if tail of 2, 2 units of flow leave from x, so 2 must have come into X but the cap is 1 so that can't be
+-7.36: each Y is the head of at most one edge in M'
+-7.37: the size of max match in G is equal to val of max flow in G' and the edges in such a matching G are the edges that carry flow from X to Y in G'
+-bounding the running time
+  * 7.38: the FFA can be used to find a max matching in a bip graph in O(mn) time
+  * better bounds from 7.3 would not give better RT oddly enough because these bounds were for all instances of C where as before C had to be small enough
+  * aug paths are also called alternating paths in the context of finding max matching
+-Extensions: the structure of bip graphs with no perfect matching
+-what if the graph doesn't have a perfect matching? 7.39: if bib G with two sides XY has perfmat, then for all A in X we must have GamA>=|A|
+-7.40: assume bip G with XY has |X|=|Y|. then G has either perfmat or there's subset AinX that GamA<|A|. a perf mat or appropriate subset can be found in mn time
+-proof 7.40: show val of maxflow in G' is n; by mfmct we know A' contains s from XY. assume by 7.11 A'B' can be found by running FFA, then moving from B' to A' on y, yt now crosses the cut. but there was at least one edge xy with x in A since y in GamA so all edges from A and y used to cross the cut and can't anymore. so now the cap of the cut can't be increased, and is |A intersect B'| + |Y intersect A'| so n-A + gamA <= c(A'B') < n
+====7.7 extensions to the max flow prob====
+>the prob: circulations with demands: consider fixed supply and fixed demand values and shift flow to meet those demands (like highway or railway of shipping routes). So now G has dv>0 and supply of -dv. set S to ann nodes with negative demand T to nodes with pos demands
+) cap conditions: for each e in E we have 0<=fe<=ce
+) demand conditions: for each v we have v finv-foutv = dv
+-feasibility prob: does circulation meeting 1 and 2 exist?
+-7.49: if there exists feas circ with demands dv then sum of vs in dv = 0
+-proof of 7.49: dv must = finv - foutv so e is counted twice and cancel out. so dv = 0
+>des and ana an alg for circs: attach super source s* and super sink t* to each node in T, then add those to G. for each v in T add edge vt* with cap dv and each u in S add s*u with capacity -du then carry over rest of G to G'
+-7.50: there's feas circ with dv in G iff max s*-t* in G' has value D. if all caps and demands in G are ints and there's feas circ, then there's feas circ that is int
+-7.51: the graph G has fc with dv iff for all cuts AB, dv<=c(AB)
+>the prob: circs with demands and lower bounds
+-lower bounds are used to force/ensure each edge has a min flow through them, useful so that wasted paths are not there
+) cap conditions: each e in E, le<=fe<=ce
+) dem cond: each v in V finv-foutv = dv
+>des and ana an alg with lower bounds
+-consider f0inv - f0outv = leeinv - leeoutv then we have ce-le units to work with in remainder, so in G' the capacities of the edges will be ce-le and demand of node v dv-Lv
+-7.52: there's fc in G iff there's fc in G'. if all demands, caps, and lower bounds in G are ints, and there's fc, then there's fc that is int
+-proof of 7.52: finv-foutv = einv(le+f'e) - eoutf(le+f'e) = Lv + (dv-Lv) = dv so G stipulations are met
+and conversely f'inv - f'outv = (fe-le)einv - (fe-le)eoutv = dv-Lv so G' stipulations are met
 Interest: 8/10. partly because now we're dealing with non-binary, but multifaceted problems, and largely because it's week11! semester almost over. but not quite.
 ====CHAPTER 6: DYNAMIC PROGRAMMING====
-===6.1 weighted int scheduling: a rec procedure===
+====6.1 weighted int scheduling: a rec procedure====
 -des a rec alg
   * greedy algs can't do anything if the values of the things aren't all 1
@@ Line 34: / Line 146: @@
   * 6.5: given M of opt values for subprobs findSol returns opt sol in On
-===6.2 principles of dynamic prog: memoization or iteration over subprobs===
+====6.2 principles of dynamic prog: memoization or iteration over subprobs====
 -des the alg
 IterativeCompOpt
@@ Line 46: / Line 158: @@
 There's a chicken and egg problem underlying dyprog because it's never clear that a collection of subprobs will be useful until we find the recurrence linking them, but that is hard to find without smaller subprobs to build on in the first place
-===6.3 segmented least squares: multiway choices===
+====6.3 segmented least squares: multiway choices====
 -we're no longer in binary mode (ooh wow)
 -the problem
@@ Line 74: / Line 186: @@
 -analyzing the alg: it's n2 because there is a pair to check for n entries. and then to find ei,j is also linear so its n3, but we can work it down to n2 by using the ingredients of calc for ei,j-1 to find ei,j in constant time. so RT: n^2
-===6.4: subset sums and knapsacks: adding a variable===
+====6.4: subset sums and knapsacks: adding a variable====
 -the problem
   * we have a single machine with requests, and only a certain time period to them in. so now we have a cut off time and need to decide which jobs to do and which to ignore.