          * This way, we end up sorting the entire list as well as finding inversions.
          * We can see how we are able to recursively count inversions this way.
          * The recurrence relation is T(n) ≤ T(n/2) + T(n/2) + O(n)
          * We can see that the running time is O(n log n)
          * Pretty obvious seeing how similar it is to merge sort (a small sketch follows below)
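Below is a minimal Python sketch of the sort-and-count idea (my own illustration, not the book's exact pseudocode). It follows the merge step of merge sort and, whenever an element from the right half is placed before remaining elements of the left half, counts one inversion per skipped element.

<code python>
def sort_and_count(a):
    """Return (sorted copy of a, number of inversions in a)."""
    if len(a) <= 1:
        return list(a), 0
    mid = len(a) // 2
    left, inv_left = sort_and_count(a[:mid])
    right, inv_right = sort_and_count(a[mid:])
    merged, inv_split = [], 0
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            # right[j] jumps ahead of everything still waiting in `left`,
            # so it forms an inversion with each of those elements
            inv_split += len(left) - i
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, inv_left + inv_right + inv_split

print(sort_and_count([2, 4, 1, 3, 5]))  # ([1, 2, 3, 4, 5], 3)
</code>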
  * Section 5.4 - Finding Closest Pair of Points
      * Summary of Algorithm is on Page 230
          * We are done.
          * We can see clearly how this can be done recursively
          * We can get an awesome efficiency of O(n log n)
              * This part is really confusing
          * The recurrence relation is T(n) ≤ 2T(n/2) + O(n log n) for the algorithm on page 230
          * We divide each sub box into two new sub boxes each time
          * This explains the 2T(n/2)
          * The O(n log n) combination cost comes from the fact that we must sort the points along the delta strip around the dividing line
          * That recurrence only gives us T(n) = O(n log^2 n)
          * We can do better and get T(n) = O(n log n)
          * In the recursive step, each subproblem coming back up the chain returns a sublist that is already sorted by y. We just need to merge these sorted lists instead of sorting from scratch, so the combination cost drops to O(n).
          * This gives us T(n) = O(n log n) (a small sketch of the simpler version follows below)
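Here is a small Python sketch of the closest-pair recursion as I understand it (my own code, not the book's page 230 pseudocode). For simplicity it re-sorts the strip by y on every call, so it is the O(n log^2 n) version; the book's version carries the y-sorted lists through the recursion to get O(n log n).

<code python>
import math

def closest_pair(points):
    """Smallest pairwise distance among >= 2 points, divide and conquer."""
    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    def solve(px):  # px is sorted by x-coordinate
        n = len(px)
        if n <= 3:  # brute force the tiny cases
            return min(dist(px[i], px[j])
                       for i in range(n) for j in range(i + 1, n))
        mid = n // 2
        x_line = px[mid][0]
        delta = min(solve(px[:mid]), solve(px[mid:]))
        # combine: only points within delta of the dividing line can beat delta
        strip = sorted((p for p in px if abs(p[0] - x_line) < delta),
                       key=lambda p: p[1])
        best = delta
        for i, p in enumerate(strip):
            for q in strip[i + 1:]:
                if q[1] - p[1] >= best:   # too far apart in y, stop early
                    break
                best = min(best, dist(p, q))
        return best

    return solve(sorted(points))

print(closest_pair([(0, 0), (3, 4), (1, 1), (5, 2)]))  # ~1.414 for (0,0)-(1,1)
</code>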
  * Section 5.5 - Integer Multiplication
      * Assume we have two n digit numbers
          * xy=(x_1*y_1*(10^(n/2))*(10^(n/2)))+(x_1*y_0+x_0*y_1)*10^(n/2)+(x_0*y_0)
          * xy=(x_1*y_1*10^n)+(x_1*y_0+x_0*y_1)*10^(n/2)+(x_0*y_0)
      * This leads us to a method called Karatsuba Multiplication
      * Algorithm is on page 233
          * Recursively multiply on x and y
          * Divide into two parts like above
          * compute x_1+x_0 and y_1+y_0
          * Recursive step on these two sums
          * set p to this value
          * set x_1*y_1 = Recur on x_1 and y_1
          * set x_0*y_0 = Recur on x_0 and y_0
          * return (x_1*y_1*10^n)+(p-x_1*y_1-x_0*y_0)*10^(n/2)+(x_0*y_0)
          * (I know this notation is very terrible, but the formal proof is in the book. I wrote it this way so I could understand it better)
      * This has a recurrence relation of T(n) ≤ 3T(n/2) + cn
      * This is because we divide the problem into three subproblems: p, x_1*y_1, and x_0*y_0
      * They are all of size n/2
      * The cn combination cost is just the additions and the digit shifts (a small sketch follows below)
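A minimal Python sketch of the Karatsuba recursion described above (my own code and names). I split on half of the longer decimal length and use 10^(2*half) instead of 10^n so that odd digit counts also work.

<code python>
def karatsuba(x, y):
    """Multiply non-negative ints with three recursive half-size multiplications."""
    if x < 10 or y < 10:             # base case: a single-digit factor
        return x * y
    half = max(len(str(x)), len(str(y))) // 2
    x1, x0 = divmod(x, 10 ** half)   # x = x1*10^half + x0
    y1, y0 = divmod(y, 10 ** half)   # y = y1*10^half + y0
    p  = karatsuba(x1 + x0, y1 + y0)  # one recursive call on the sums
    hi = karatsuba(x1, y1)            # x_1 * y_1
    lo = karatsuba(x0, y0)            # x_0 * y_0
    # the middle term (x_1*y_0 + x_0*y_1) is recovered as p - hi - lo
    return hi * 10 ** (2 * half) + (p - hi - lo) * 10 ** half + lo

print(karatsuba(1234, 5678), 1234 * 5678)  # both print 7006652
</code>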

====== Chapter 6 ======
  * Section 6.1 - Weighted Interval Scheduling: A Recursive Procedure
      * Key thing: MEMOIZATION. It's pretty awesome. 
          * Something I find very interesting is that memoizing seems very intensive on I/O actions. It makes the program a lot more efficient in terms of run time, but it really hits the I/O hard because of how much reading and writing we are doing. This really makes me wonder if memoizing is that great after the Andy Danner talk. 
      * Example Problem: Fib Seq
          * Can be easily solved with tail recursion.
          * The idea is to write things down that you have already computed. (A tiny sketch follows below.)
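A tiny Python sketch of memoizing the Fibonacci sequence (my own illustration): the dictionary remembers every value already computed, so each fib(k) is computed only once.

<code python>
memo = {0: 0, 1: 1}

def fib(n):
    """n-th Fibonacci number, memoized so each value is computed only once."""
    if n not in memo:
        memo[n] = fib(n - 1) + fib(n - 2)
    return memo[n]

print(fib(50))  # 12586269025, instantly, instead of exponentially many recursive calls
</code>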
      * Example Problem: Weighted Interval Scheduling
          * Job j starts at s_j, finishes at f_j, and has weight or value v_j
          * Two jobs are compatible if they don't overlap
          * We want to find the group of jobs that are compatible with each other and also has the largest possible total weight/value. 
              * Greedy algorithm doesn't work because it does not take into account weight/value.
          * Informal Algorithm Explanation:
              * Label jobs by their end times, e.g. if a job finishes before any other job ends, then it is job 1.
              * Define p(j) to be the largest index i such that i<j and job i is compatible with job j
              * Consider some optimal solution
              * Let OPT(j) be the group of jobs selected by the optimal solution consisting of jobs 1 through j
              * For job j, either j is in OPT(j) or it is not
              * Case 1: j is in OPT(j)
                  * Then the jobs incompatible with j cannot be in OPT(j). They are p(j)+1, p(j)+2, ... , j-1
                  * Must include OPT(p(j)), the optimal solution on the jobs that are still compatible with j
              * Case 2: j is not in OPT(j)
                  * Must include OPT(j-1)
              * This is pretty much what we do. It's a divide and conquer type of thing. It's recursive. The dynamic programming part comes in where we memoize the values for OPT(whatever)
              * The concept is very similar to the idea of tail recursion on the Fib Seq. Pretty much the same thing.
          * Non-Dynamic Algorithm: Exponential and Very Slow (Page 254)
              * Sort jobs by finish times so that f_1 ≤ f_2 ≤ ... ≤ f_n #this is the labeling step from above
              * Compute p(1), p(2), ..., p(n) #compute the compatible jobs. This has to be done. No way around it. 
              * Compute-Opt(n)
              * 
              * Compute-Opt(j):
                  * if j = 0
                      * return 0
                  * else
                      * #Max determines which one the optimal solution would pick since the optimal solution is trying to get the max weight
                      * #             picks j                does not pick j
                      * return max(v_j + Compute-Opt(p(j)), Compute-Opt(j-1))
          * Having to constantly recompute the same Compute-Opt values is stupid. Let's not do that and just compute each one once. 
          * Dynamic Algorithm: Fast (Page 256)
              * Sort jobs by finish times so that f_1 ≤ f_2 ≤ ... ≤ f_n 
              * Compute p(1), p(2), ..., p(n) 
              * 
              * for j=1 to n
                  * M[j]=Empty #This is a global array we use to store all the memoized values for M-Compute-Opt so we only compute if we have to
              * M[0]=0
              * 
              * M-Compute-Opt(n)
              * 
              * M-Compute-Opt(j):
                  * if j = 0
                      * return 0
                  * else if M[j] is not Empty #already memoized, just look it up
                      * return M[j]
                  * else #        v_j is the value/weight of j
                      * M[j] = max(v_j + M-Compute-Opt(p(j)), M-Compute-Opt(j-1))
                      * return M[j]
          * Efficiency:
              * Sorting jobs by finish time is O(n log n)
              * Computing p(whatever) is O(n log n)
                  * I don't really see how this is not O(n^2)
                  * I think it is O(n^2) because we have n elements, so we must compute p(_) n times. 
                  * p(_) is linear right? don't we have to go through all n elements to find one that is compatible with it?
                  * (Answer: since the jobs are already sorted by finish time, each p(j) can be found by binary searching the finish times with s_j, which is O(log n) per job, so O(n log n) total.)
              * M-Compute-Opt(j) is O(n)
                  * This is awesome. 
                  * Everything we do is pretty much just constant accessing actions. 
                  * The thing that is not is recursively calling M-Compute-Opt.
                  * The thing is, we only do it once for each value since we memoized everything.
                  * So, we only do it n times. 
                  * When we actually compute something, all we really do is look up a weight/value and look up other stuff.
              * So, this takes O(n log n)
          * This only gives us the total value/weight that is the max achievable. It does not tell us what intervals were chosen.
          * We can figure this out after the fact. 
          * We still have our M[] array, which is nice.
          * Informal Algorithm Explanation:
              * We essentially go through M and find the intervals that would have been chosen using this inequality
                  * v_j + M[p(j)] > M[j-1]
                  * This means that if the value of the current interval plus the best value achievable from the intervals compatible with it (M[p(j)]) is greater than the best value without it (M[j-1]), then this one would have been chosen.
          * Algorithm:
              * M-Compute-Opt(n) # compute M[]
              * Find-Solution(n)
              * 
              * Find-Solution(j):
                  * if j = 0:
                      * output nothing 
                  * else if v_j + M[p(j)] > M[j-1]:
                      * print j
                      * Find-Solution(p(j)) 
                  * else:
                      * Find-Solution(j-1)
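Putting Section 6.1 together, here is a small self-contained Python sketch of the memoized algorithm plus Find-Solution (my own code, not the book's pseudocode). Jobs are (start, finish, value) triples, p(j) is found by binary search as noted above, and my Find-Solution returns the chosen jobs instead of printing their indices.

<code python>
import bisect

def weighted_interval_scheduling(jobs):
    """jobs: list of (start, finish, value). Returns (max value, chosen jobs)."""
    jobs = sorted(jobs, key=lambda j: j[1])          # sort by finish time
    finishes = [f for _, f, _ in jobs]
    n = len(jobs)
    # p[j] = largest index i < j (1-indexed) whose finish time <= start of job j
    p = [0] * (n + 1)
    for j in range(1, n + 1):
        p[j] = bisect.bisect_right(finishes, jobs[j - 1][0])

    M = [None] * (n + 1)
    M[0] = 0

    def m_compute_opt(j):
        if M[j] is None:          # only compute each value once (memoization)
            v = jobs[j - 1][2]
            M[j] = max(v + m_compute_opt(p[j]), m_compute_opt(j - 1))
        return M[j]

    m_compute_opt(n)

    def find_solution(j):
        if j == 0:
            return []
        if jobs[j - 1][2] + M[p[j]] > M[j - 1]:      # job j was chosen
            return find_solution(p[j]) + [jobs[j - 1]]
        return find_solution(j - 1)

    return M[n], find_solution(n)

jobs = [(1, 4, 5), (3, 5, 1), (0, 6, 8), (4, 7, 4),
        (3, 8, 6), (5, 9, 3), (6, 10, 2), (8, 11, 4)]
print(weighted_interval_scheduling(jobs))
# (13, [(1, 4, 5), (4, 7, 4), (8, 11, 4)])
</code>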
  * Section 6.2 - Principles of Dynamic Programming: Memoization or Iteration over Subproblems
      * The book first goes over what makes the previous algorithm dynamic and efficient
          * If I were the author, I would have put that in the previous section. It has no business being here.
      * It says nothing else useful.
  * Section 6.3 - Segmented Least Squares: Multi-way Choices
      * The problem we are focusing on in this section is the one where we have a bunch of points and want to know what line (if it even is a line) best fits the data (you know what I mean)
      * If it is not a line, then we do not try to fit a curve, instead we form a sequence of line segments. 
      * How do we do this?
          * Consider the last point p_j.
          * It can only belong to one line segment, and that segment starts at some earlier point p_i
          * We define the cost of a segment as c (a fixed cost for creating a segment) + the error of that segment
          * Let OPT(j) = minimum cost for points p_1, p_2 , … , p_j (the min cost of all the points up through p_j)
          * Let e(i, j) = minimum sum of squares for points p_i, p_i+1 , … , p_j
          * How to compute OPT(j)?
              * If the last segment is p_i, … , p_j, then the cost is e(i, j) + c + OPT(i-1).
              * OPT(j):
                  * if j=0
                      * return 0
                  * else
                      * return min {e(i,j)+c+OPT(i-1) : 1<=i<=j }
          * Algorithm: page 265
              * INPUT: n, p_1,…,p_n , c
              * Segmented-Least-Squares()
                  * M[0] = 0
                  * e[0][0] = 0
                  * for j = 1 to n
                      * for i = 1 to j
                          * e[i][j] = least square error for the segment p_i, …, p_j
                  * 
                  * for j = 1 to n
                      * M[j] = min over 1 ≤ i ≤ j of (e[i][j] + c + M[i-1])
                  * 
                  * return M[n]
          * The magic really lies in the math behind the least square error. That's how we determine which segment something belongs to.
          * Efficiency: 
              * The first group of nested for loops is O(n^3).
              * This can be reduced to quadratic if we pre-compute running sums so that each least square error e[i][j] can be looked up in constant time
                  * YAY MEMOIZING
              * The last for loop is actually quadratic even though it looks linear: it is a single for loop, but the min inside it is a linear-time operation
          * For some reason the book likes to do things in this not-so-straightforward way of computing all the data we need with one algorithm and then later using that data to extract the answer we really want.
          * So, since we really need to know how to split the points up into a sequence of segments, we walk back through the M[] array we just filled in 
          * Algorithm: page 266
              * FindSegments(j):
                  * if j = 0:
                      * output nothing
                  * else:
                      * Find an i that minimizes e[i][j] + c + M[i-1]
                      * Output the segment {p_i, …, p_j}
                      * FindSegments(i-1)
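A compact Python sketch of Section 6.3 (my own code, not the book's pseudocode). The least-squares error is computed directly from the standard closed-form slope/intercept formulas rather than from precomputed running sums, so this is the O(n^3) version.

<code python>
def segmented_least_squares(points, c):
    """points: list of (x, y) sorted by x; c: cost per segment.
    Returns (min total cost, segments as 1-based (i, j) index pairs)."""
    n = len(points)

    def error(i, j):
        # least-squares error of fitting one line to points i..j (1-based, inclusive)
        pts = points[i - 1:j]
        m = len(pts)
        sx = sum(x for x, _ in pts); sy = sum(y for _, y in pts)
        sxx = sum(x * x for x, _ in pts); sxy = sum(x * y for x, y in pts)
        denom = m * sxx - sx * sx
        a = (m * sxy - sx * sy) / denom if denom else 0.0   # slope
        b = (sy - a * sx) / m                               # intercept
        return sum((y - a * x - b) ** 2 for x, y in pts)

    e = {(i, j): error(i, j) for j in range(1, n + 1) for i in range(1, j + 1)}
    M = [0.0] * (n + 1)
    for j in range(1, n + 1):
        M[j] = min(e[i, j] + c + M[i - 1] for i in range(1, j + 1))

    # walk back through M to recover the segments (the FindSegments step)
    segments, j = [], n
    while j > 0:
        i = min(range(1, j + 1), key=lambda i: e[i, j] + c + M[i - 1])
        segments.append((i, j))
        j = i - 1
    return M[n], segments[::-1]

pts = [(0, 0), (1, 1), (2, 2), (3, 2), (4, 2), (5, 2)]
print(segmented_least_squares(pts, 1.0))
# (2.0, [(1, 2), (3, 6)]): two segments, split where the data bends
</code>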
  * Section 6.4 - Subset Sums and Knapsacks: Adding a Variable
      * We're focusing on the knapsack problem
      * We have n things and a knapsack
      * item i has weight w_i (always positive) and also has a value v_i
      * We can only carry so much weight (the capacity W)
      * We want to maximize the value (more bang for the buck type of deal)
      * So, for each item, we either pick it or we don't pick it
      * OPT(i) = max profit subset of items 1,…, i
      * How to compute OPT(i)?
          * if it does not select object i
              * OPT selects best of { 1, 2, …, i-1 }
          * if it does
              * That does not mean we can't pick any items lower than i; we just have less room left for them
              * So, we don't know what to do. 
              * MORE SUBPROBLEMS!!! w0000ooooooo...
              * So, we set a new weight limit, our current available weight minus what we just added, i.e. w – w_i
              * We let OPT select the best of the items 1, …, i-1 under this new limit
              * (This is the "adding a variable" part: the subproblems become OPT(i, w), the best value using items 1, …, i with weight limit w)
      * That's it! That's the algorithm
      * Algoriddim be on Page 269 mahn
          * Input: N, W, w_1,…,w_N, v_1,…,v_N
          * for w = 0 to W
              * M[0, w] = 0
          * for i = 1 to N
              * for w = 1 to W
                  * if w_i > w :
                      * M[i, w] = M[i-1, w]
                  * else
                      * M[i, w] = max{ M[i-1, w], v_i + M[i-1, w-w_i] }
          * return M[N, W]
      * The reason why this works is surprisingly simple once we realize the importance of just updating the amount of weight left over and over again
      * Efficiency:
          * This is a strange one. 
          * It's Θ(nW)
          * So it depends on both the knapsack capacity and the number of objects
          * They call it pseudo-polynomial. 
          * Weird stuff. Never thought algorithms like this existed. (A small sketch of the table-filling follows below.)
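A small Python sketch of the 0/1 knapsack table from page 269 (my own code; items are (weight, value) pairs and the table is indexed [i][w]).

<code python>
def knapsack(items, W):
    """items: list of (weight, value); W: capacity. Returns the max total value."""
    N = len(items)
    # M[i][w] = best value using items 1..i with weight limit w
    M = [[0] * (W + 1) for _ in range(N + 1)]
    for i in range(1, N + 1):
        w_i, v_i = items[i - 1]
        for w in range(1, W + 1):
            if w_i > w:                      # item i doesn't fit at all
                M[i][w] = M[i - 1][w]
            else:                            # skip it, or take it and shrink the limit
                M[i][w] = max(M[i - 1][w], v_i + M[i - 1][w - w_i])
    return M[N][W]

print(knapsack([(1, 1), (2, 6), (5, 18), (6, 22), (7, 28)], 11))  # 40 (weights 5 and 6)
</code>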

====== Chapter 7 ======
  * Section 7.1 - The Maximum-Flow Problem and the Ford-Fulkerson Algorithm
      * Problem
          * You have a directed graph
          * Each edge has a capacity
          * Have a start node s and an end node t
          * Greedy
              * Find an s-t path with highest capacity
              * do it!
              * keep doing this until you can't do it any more
              * Locally optimal, but not globally optimal (it's very easy to think of a counterexample; one is on page 339 in the book)
          * Awesome Solution: Residual Graphs
              * Edges have a capacity, but also a flow
              * We don't have to use all of it at once
              * A residual edge is one where you have some capacity left going one way (forward) and some flow you could undo going the other way (backward)
              * The Ford-Fulkerson Algorithm uses residual edges and works awesomely (optimal)
                  * page 342 - 344
                  * Informal explanation
                      * Keep finding s-t paths in the residual graph (augmenting paths)
                      * keep updating the residual edges using the following rule:
                          * The bottleneck is the smallest residual capacity along the path. If a path edge is a forward edge, increase the flow on it by the bottleneck
                          * Otherwise (it is a backward edge), decrease the flow on the corresponding original edge by the bottleneck (a small sketch follows below)
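Here is a compact Python sketch of Ford-Fulkerson with residual capacities (my own code; it uses BFS to find augmenting paths, which is the Edmonds-Karp variant, but the residual bookkeeping is the same idea described above).

<code python>
from collections import deque

def max_flow(capacity, s, t):
    """capacity: dict {u: {v: cap}}. Returns the value of a maximum s-t flow."""
    # residual[u][v] = remaining capacity on u->v (reverse edges start at 0)
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u, nbrs in capacity.items():
        for v in nbrs:
            residual.setdefault(v, {}).setdefault(u, 0)

    def bfs_path():
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    if v == t:
                        return parent
                    queue.append(v)
        return None

    flow = 0
    while (parent := bfs_path()) is not None:
        # collect the augmenting path and its bottleneck (smallest residual capacity)
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= bottleneck   # use up forward capacity
            residual[v][u] += bottleneck   # allow undoing it later
        flow += bottleneck
    return flow

graph = {'s': {'a': 10, 'b': 5}, 'a': {'b': 15, 't': 5}, 'b': {'t': 10}, 't': {}}
print(max_flow(graph, 's', 't'))  # 15
</code>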
  * Section 7.2 - Maximum Flows and Minimum Cuts in a Network
      * Cut: an s-t cut is a partition (A, B) of the nodes where s is in A and t is in B
      * Problem: Find an s-t cut of minimum capacity
      * The capacity of a cut is the sum of the capacities of the edges that go from A to B
      * Flow Value Lemma: Let f be any flow, and let (A, B) be any s-t cut. Then the value of the flow equals f_out(A) – f_in(A).
      * Weak Duality: Let f be any flow and let (A, B) be any s-t cut. Then the value of the flow is at most the cut's capacity.
      * Corollary: Let f be any flow, and let (A, B) be any cut. If v(f) = cap(A, B), then f is a max flow and (A, B) is a min cut
      * Augmenting path theorem: Flow f is a max flow iff there are no augmenting paths
      * Max-flow min-cut theorem: The value of the max flow is equal to the value of the min cut.
          * More explanation on page 350
          * Boom and we are done. Just use the same stuff as in the previous section and we have all we need. 
  * Section 7.5 - A First Application: Bipartite Matching Problem
      * Given an undirected bipartite graph
      * A matching is a set of edges such that each node appears in at most one of them
      * How do we find a matching with the max number of edges? This is our problem.
      * Here's how we do it. We got a bunch of red and blue nodes. Add nodes s and t, which will be the start and end nodes. Have the start node s have a unit edge (edge of size 1) connect it to each of the red nodes. Have all the blue nodes have a unit edge connect them to the end node t. Still have the same edges in the middle, directed from red to blue. 
      * Now, use the same algorithm as in the first section. BOOOM!!! Right? Yep. It's pretty intuitive; these later sections are more like practical uses rather than actual algorithms. I guess this is kind of good since pretty much all computer science problems go back to graph theory. (A small sketch follows below.)
      * Extensions: The structure of bipartite graphs with no perfect matching
          * Discusses how to tell if it is impossible to find a perfect matching in a bipartite graph. 
          * Can easily find an algorithm based on the following theorem:
              * Assume that the bipartite graph G=(V,E) has two sides X and Y such that |X| = |Y|. Then the graph G either has a perfect matching or there is a subset A of X such that |Γ(A)| < |A| (where Γ(A) is the set of all neighbors of A). A perfect matching or an appropriate subset A can be found in O(mn) time. 
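A small Python sketch of the matching idea (my own code). Instead of building the full flow network, it runs the augmenting-path search directly on the bipartite graph, which is exactly what Ford-Fulkerson on the unit-capacity network described above boils down to.

<code python>
def max_bipartite_matching(adj, reds):
    """adj: dict red node -> list of blue neighbors. Returns dict blue -> matched red."""
    match = {}  # blue node -> red node currently matched to it

    def try_augment(r, visited):
        # look for an augmenting path starting at red node r
        for b in adj.get(r, []):
            if b in visited:
                continue
            visited.add(b)
            # b is free, or the red node currently holding b can be shifted elsewhere
            if b not in match or try_augment(match[b], visited):
                match[b] = r
                return True
        return False

    for r in reds:
        try_augment(r, set())
    return match

adj = {'r1': ['b1', 'b2'], 'r2': ['b1'], 'r3': ['b2', 'b3']}
print(max_bipartite_matching(adj, ['r1', 'r2', 'r3']))
# {'b1': 'r2', 'b2': 'r1', 'b3': 'r3'}, a perfect matching of size 3
</code>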
  * Section 7.7 - Extensions to the Max-Flow Problem
      * Circulations with Demands
          * We have a directed graph
          * edges have capacities
          * nodes have supplies and demands
          * A circulation is a function that satisfies the following
              * the flow on a particular edge is always at most the capacity (makes sense and is very intuitive)
              * the stuff coming in minus the stuff going out at a node equals that node's demand d(v) (so a node with d(v) > 0 soaks up flow and a node with d(v) < 0 is a supply)
          * To keep the world from exploding, the sum of supplies == sum of demands
          * Algorithm: more details on pg 381 and 382
              * Add a new source s and sink t
              * For each v with d(v) < 0, add edge (s, v) with capacity -d(v)
              * For each v with d(v) > 0, add edge (v, t) with capacity d(v)
          * MAGIC: G has a circulation iff G' has a max flow of value D (the sum of the positive demands)
          * MORE MAGIC: Given (V, E, c, d), there does not exist a circulation iff there exists a node partition (A, B) such that the sum of the demands of the nodes in B is greater than cap(A, B)
          * These two pretty much solve the problem for us. More details are on page 382 if you get confused rereading this sometime in the future. (A small sketch of the construction follows below.)
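A tiny Python sketch of the reduction on pages 381-382 (my own code). It only builds G' and the target value D, and assumes some max_flow(capacity, s, t) routine like the one sketched for Section 7.1; the node names 's*' and 't*' are assumed not to collide with existing nodes.

<code python>
def has_circulation(capacity, demand, max_flow):
    """capacity: dict {u: {v: cap}}; demand: dict {v: d(v)} summing to zero.
    Returns True iff a feasible circulation exists."""
    # build G': copy the edges, then hook supplies to a new source and demands to a new sink
    g = {u: dict(nbrs) for u, nbrs in capacity.items()}
    g.setdefault('s*', {})
    g.setdefault('t*', {})
    D = 0
    for v, d in demand.items():
        g.setdefault(v, {})
        if d < 0:                    # supply node: edge s* -> v with capacity -d(v)
            g['s*'][v] = -d
        elif d > 0:                  # demand node: edge v -> t* with capacity d(v)
            g[v]['t*'] = d
            D += d
    # a circulation exists iff the max s*-t* flow saturates all the demand edges
    return max_flow(g, 's*', 't*') == D

# usage, reusing the max_flow sketch from Section 7.1:
# caps = {'a': {'c': 3, 'd': 1}, 'b': {'c': 1, 'd': 3}, 'c': {}, 'd': {}}
# demands = {'a': -3, 'b': -3, 'c': 2, 'd': 4}
# print(has_circulation(caps, demands, max_flow))  # True
</code>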
      * Circulation with Demands and Lower Bounds
          * Same givens as the last problem
          * except!!!! we have lower bounds on each edge
          * algorithm on page 384
          * Here's how we do it
          * we set the flow along each edge to be its lower bound, then we update the demands on both nodes attached to it and reduce that edge's capacity by the lower bound. 
          * the rest is self-explanatory: just solve the resulting circulation-with-demands problem like above (a small sketch follows below)
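A matching sketch of the lower-bound transformation on page 384 (my own code; it produces the reduced capacities and adjusted demands that can then be fed to has_circulation above). The sign convention follows the one used earlier: forcing l units out of u makes u "owe" l more, and makes v "need" l less.

<code python>
def remove_lower_bounds(capacity, lower, demand):
    """capacity, lower: dicts {u: {v: value}}; demand: {v: d(v)}.
    Returns (reduced capacities, adjusted demands) with no lower bounds left."""
    cap2 = {u: dict(nbrs) for u, nbrs in capacity.items()}
    dem2 = dict(demand)
    for u, nbrs in lower.items():
        for v, l in nbrs.items():
            cap2[u][v] -= l                 # l units are forced along (u, v) no matter what
            dem2[u] = dem2.get(u, 0) + l    # u already sends l out...
            dem2[v] = dem2.get(v, 0) - l    # ...and v already receives l
    return cap2, dem2

# then: cap2, dem2 = remove_lower_bounds(caps, lows, demands)
# and feed (cap2, dem2) to has_circulation from the previous sketch
</code>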