In section 4.8, the authors discuss Huffman's Algorithm and its use in data compression. Before data is transmitted, it is compressed so that it is easier to transmit. For the encoding to be efficient, the most frequent letters in some alphabet S should have shorter encodings than the less frequent letters. For further efficiency, all encodings should form a prefix code: a set of codewords in which no codeword is a prefix of any other. Huffman's Algorithm meets these requirements optimally by using a bottom-up approach. At every iteration, the algorithm takes the two letters with the lowest frequencies, z and y, and creates a parent node x such that x's frequency is the sum of the frequencies of z and y. It then recurs until only one node is left; the resulting tree defines the prefix code for the entire input alphabet. The algorithm is on page 172 and the proof is on page 174.
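
A minimal Python sketch of this bottom-up merging, using a heap to pull out the two lowest frequencies (the sample alphabet, variable names, and tie-breaking counter are illustrative, not from the book):

<code python>
import heapq

def huffman(freqs):
    """Build a prefix code from a {symbol: frequency} dict, Huffman-style."""
    # Heap entries are (frequency, tiebreak, tree); a tree is a symbol or a pair.
    heap = [(f, i, s) for i, (s, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        fz, _, z = heapq.heappop(heap)                  # lowest frequency
        fy, _, y = heapq.heappop(heap)                  # second lowest
        heapq.heappush(heap, (fz + fy, count, (z, y)))  # parent x of z and y
        count += 1
    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):                     # internal node
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:                                           # leaf: a symbol
            codes[tree] = prefix or "0"
    walk(heap[0][2], "")
    return codes

print(huffman({"a": 0.32, "b": 0.25, "c": 0.20, "d": 0.18, "e": 0.05}))
</code>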
  
===== Chapter 5: Divide and Conquer =====
Divide and Conquer algorithms divide a problem into q smaller sub-problems and recursively solve those sub-problems. To analyze the efficiency of such algorithms, it is necessary to solve a recurrence relation. The three methods for solving recurrence relations are unrolling the recursion, substitution, and partial substitution.
  
==== Section 5.1: A First Recurrence: The MergeSort Algorithm ====
  
In section 5.1, the authors solve the recurrence relation of MergeSort as an example. MergeSort divides the problem into two sub-problems and recursively solves each of them; the cost of recombining the sub-solutions is O(n). Therefore, MergeSort's recurrence relation is: T(n) <= 2T(n/2) + cn when n > 2, and T(2) <= c. The steps to unroll the recursion, substitute, and perform partial substitution are found on pages 212-213. The algorithm is bounded by O(n log n).
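
For reference, a small Python sketch of MergeSort itself, showing the two recursive calls and the O(n) merge that produce the recurrence (a generic implementation, not the book's pseudocode):

<code python>
def merge_sort(a):
    """Split the list in half, sort each half recursively, merge in O(n)."""
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]   # one side may have leftovers

print(merge_sort([5, 2, 9, 1, 3]))         # [1, 2, 3, 5, 9]
</code>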

==== Section 5.2: Further Recurrence Relations ====

In section 5.2, the authors generalize the recurrence relation to obtain the run-times of recursive algorithms that divide the problem into q sub-problems of size n/2: one sub-problem, two sub-problems, or more than two. If q == 1, then the run-time is O(n); if q == 2, then the run-time is O(n log n); if q > 2, then the run-time is O(n<sup>log<sub>2</sub>q</sup>).
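
One quick way to see the three regimes is to evaluate the recurrence T(n) = qT(n/2) + cn numerically and compare it against each claimed bound; this small Python check is illustrative only:

<code python>
import math

def T(n, q, c=1):
    """Evaluate T(n) = q*T(n/2) + c*n with T(2) = c, for n a power of two."""
    return c if n <= 2 else q * T(n // 2, q, c) + c * n

n = 2 ** 12
print(T(n, 1), 2 * n)                 # q == 1: tracks O(n)
print(T(n, 2), n * math.log2(n))      # q == 2: tracks O(n log n)
print(T(n, 3), n ** math.log2(3))     # q  > 2: tracks O(n^{log2 q})
</code>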

==== Section 5.3: Counting Inversions ====
In section 5.3, the authors use a divide and conquer algorithm, a variant of MergeSort, to optimally count the number of inversions in a list. Knowing the number of inversions is useful for determining how similar two rankings are. The authors develop an algorithm, Sort-and-Count, that divides a list and, while sorting each sub-list, also counts the inversions within it; during the merge operation, the inversions between the two sub-lists are counted. The run-time of the algorithm is O(n log n).
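
A Python sketch of Sort-and-Count (the function and variable names are mine); the only change from MergeSort is the counter updated during the merge:

<code python>
def sort_and_count(a):
    """Return (sorted copy of a, number of inversions in a)."""
    if len(a) <= 1:
        return a, 0
    mid = len(a) // 2
    left, inv_left = sort_and_count(a[:mid])
    right, inv_right = sort_and_count(a[mid:])
    merged, i, j = [], 0, 0
    inversions = inv_left + inv_right
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
            inversions += len(left) - i   # right[j] inverts every remaining left element
    return merged + left[i:] + right[j:], inversions

print(sort_and_count([2, 4, 1, 3, 5]))    # ([1, 2, 3, 4, 5], 3)
</code>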


==== Section 5.4: Finding the Closest Pair of Points ====
In section 5.4, the authors discuss finding the closest pair of points in a plane. To determine the pair of points closest together, the points, sorted by x-coordinate, are partitioned in half, and the closest pair is found in each half.
The value d = min(q, r) is the smaller of the two halves' answers, where q is the minimum distance in one half of the plane and r is the minimum distance in the other half. Since there could exist an inter-partition pair of points whose distance a is less than d, the algorithm must also consider the points that are at most distance d from the partition; this combining step is O(n). With O(log n) levels of recursion, the algorithm has a run-time of O(n log n).
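
A Python sketch of the scheme, assuming distinct points given as (x, y) tuples; the helper names are illustrative, and the strip scan relies on the fact that only a constant number of later neighbors (in y-order) can be within distance d:

<code python>
import math

def closest_pair(points):
    """Divide-and-conquer closest-pair distance; points are distinct tuples."""
    px = sorted(points)                            # sorted by x, once
    py = sorted(points, key=lambda p: p[1])        # sorted by y, once
    return _closest(px, py)

def _dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def _closest(px, py):
    n = len(px)
    if n <= 3:                                     # brute-force base case
        return min(_dist(p, q) for i, p in enumerate(px) for q in px[i + 1:])
    mid = n // 2
    qx, rx = px[:mid], px[mid:]                    # halves by x-coordinate
    qset = set(qx)
    qy = [p for p in py if p in qset]              # keep each half y-sorted
    ry = [p for p in py if p not in qset]
    d = min(_closest(qx, qy), _closest(rx, ry))    # d = min(q, r)
    # Only points within d of the dividing line can beat d.
    x_mid = px[mid - 1][0]
    strip = [p for p in py if abs(p[0] - x_mid) < d]
    best = d
    for i, p in enumerate(strip):
        for q in strip[i + 1:i + 16]:              # constant window suffices
            best = min(best, _dist(p, q))
    return best

print(closest_pair([(0, 0), (3, 4), (1, 1), (5, 2)]))   # 1.414..., from (0,0)-(1,1)
</code>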


==== Section 5.5: Integer Multiplication ====
Another application of divide and conquer the authors discuss is integer multiplication. The algorithm typically taught to children is O(n<sup>2</sup>), but it can be improved using recursion and the divide and conquer principle. Dividing the problem into 4 sub-problems still yields a quadratic run-time; if the problem is instead divided into 3 sub-problems, the overall run-time drops to O(n<sup>log<sub>2</sub>3</sup>). I found the reading to be interesting, 6/10.
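
A Python sketch of the three-subproblem idea (Karatsuba's algorithm), splitting each nonnegative integer at a bit boundary:

<code python>
def karatsuba(x, y):
    """Multiply nonnegative ints with 3 recursive subproblems instead of 4."""
    if x < 10 or y < 10:                          # small enough: multiply directly
        return x * y
    m = max(x.bit_length(), y.bit_length()) // 2
    hi_x, lo_x = x >> m, x & ((1 << m) - 1)       # x = hi_x * 2^m + lo_x
    hi_y, lo_y = y >> m, y & ((1 << m) - 1)
    a = karatsuba(hi_x, hi_y)                     # high parts
    b = karatsuba(lo_x, lo_y)                     # low parts
    c = karatsuba(hi_x + lo_x, hi_y + lo_y)       # the third subproblem
    return (a << (2 * m)) + ((c - a - b) << m) + b

print(karatsuba(1234, 5678), 1234 * 5678)         # both print 7006652
</code>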



===== Chapter 6: Dynamic Programming =====
Dynamic Programming consists of decomposing a problem into simpler subproblems and either iterating over those subproblems to construct a solution or recursively solving the problem using a memoization array to store values.

==== Section 6.1: Weighted Interval Scheduling: A Recursive Procedure ====
The authors introduce dynamic programming with the Weighted Interval Scheduling problem. A variation of the Interval Scheduling problem, its goal is to find the subset of compatible jobs with the maximum total weight. Approaching the problem with dynamic programming, the algorithm considers whether some job j is in the optimal solution or not: it takes the maximum of the value of including j (j's weight plus the optimal value over the jobs compatible with j) and the value of excluding j. The algorithm then recurs to explore the remaining jobs, using memoization to reduce the run-time from exponential to polynomial. The algorithm for the problem is on page 256. To compute the actual solution from the memoization array, an algorithm simply backtracks through the values; the algorithm for finding the solution is on page 258.
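
A Python sketch of the memoized recursion (the function name, job tuples, and p computation are my own, not the book's notation):

<code python>
from bisect import bisect_right

def schedule_value(jobs):
    """Best total weight for jobs given as (start, finish, weight) tuples."""
    jobs = sorted(jobs, key=lambda j: j[1])            # order by finish time
    finishes = [j[1] for j in jobs]
    # p[j]: how many jobs finish no later than job j+1 starts (1-indexed value)
    p = [bisect_right(finishes, jobs[j][0]) for j in range(len(jobs))]
    memo = {0: 0}
    def opt(j):                                        # best value over jobs 1..j
        if j not in memo:
            include = jobs[j - 1][2] + opt(p[j - 1])   # take job j, jump to p(j)
            exclude = opt(j - 1)                       # skip job j
            memo[j] = max(include, exclude)
        return memo[j]
    return opt(len(jobs))

print(schedule_value([(0, 3, 2), (1, 5, 4), (4, 6, 4), (5, 8, 7)]))  # 4 + 7 = 11
</code>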

==== Section 6.2: Principles of Dynamic Programming: Memoization or Iteration over Subproblems ====
Dynamic programming can also use iteration, rather than recursion, to construct an optimal solution: the algorithm simply iterates through the subproblems and stores each result at the corresponding index of a memoization array (a sketch follows the list below). For a problem to be solvable with dynamic programming, it must satisfy the following criteria:
  - There are only a polynomial number of subproblems.
  - The solution to the original problem is easily computable from the solutions to the subproblems.
  - There is a natural ordering of the subproblems and an easy-to-compute recurrence that allows the solution to a subproblem to be determined from the solutions to smaller subproblems.
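
The iterative sketch referenced above, reusing the conventions of the Section 6.1 sketch (jobs sorted by finish time, p precomputed the same way):

<code python>
def schedule_value_iterative(jobs, p):
    """Fill the memoization array M in order of increasing subproblem size."""
    n = len(jobs)
    M = [0] * (n + 1)                              # M[j] = best value over jobs 1..j
    for j in range(1, n + 1):
        M[j] = max(jobs[j - 1][2] + M[p[j - 1]],   # include job j
                   M[j - 1])                       # exclude job j
    return M[n]
</code>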

==== Section 6.3: Segmented Least Squares: Multi-Way Choices ====
In section 6.3, the authors introduce the Segmented Least Squares problem to illustrate that the choices in dynamic programming subproblems are not always binary. The goal is to fit line segments through a set of points using as few segments as possible while keeping the total squared error as small as possible. Since the choice is not binary (the last segment can start at any earlier point), the recurrence takes the minimum, over all possible final segments, of the segment's error value, plus a constant per-segment penalty, plus the optimal solution for the remaining points. The recurrence can be found on page 265. The run-time of the algorithm is O(n<sup>2</sup>) and the analysis can be found on page 266.
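
A Python sketch of the multi-way recurrence, assuming the points are sorted by x-coordinate; the closed-form least-squares error is standard, and the function name and penalty argument C are illustrative:

<code python>
def segmented_least_squares(points, C):
    """Minimum total cost: squared error of each segment plus C per segment."""
    n = len(points)
    # e[i][j]: least-squares error of one line through points i..j (1-indexed)
    e = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        sx = sy = sxx = sxy = syy = 0.0
        for j in range(i, n + 1):
            x, y = points[j - 1]
            k = j - i + 1
            sx += x; sy += y; sxx += x * x; sxy += x * y; syy += y * y
            denom = k * sxx - sx * sx
            a = (k * sxy - sx * sy) / denom if denom else 0.0
            b = (sy - a * sx) / k
            err = syy - 2*a*sxy - 2*b*sy + a*a*sxx + 2*a*b*sx + k*b*b
            e[i][j] = max(err, 0.0)                # clamp float round-off
    # M[j]: optimal cost for the first j points; try every possible last segment i..j
    M = [0.0] * (n + 1)
    for j in range(1, n + 1):
        M[j] = min(e[i][j] + C + M[i - 1] for i in range(1, j + 1))
    return M[n]

# Two linear runs of points: the optimum uses two segments, cost about 2*C.
print(segmented_least_squares([(0, 0), (1, 1), (2, 2), (3, 0), (4, -2)], C=1))
</code>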

==== Section 6.4: Subset Sums and Knapsacks: Adding a Variable ====
In section 6.4, the authors discuss the Subset Sum problem and a variation of it called the Knapsack problem. The goal of Subset Sum is to schedule jobs, each with a duration w<sub>i</sub>, on a machine so that as much of the available time as possible is used without going over the maximum time W. The Knapsack problem is a little more complex, since the goal is to find the subset of items with the highest possible total value whose combined weight is at most the maximum weight W. In contrast to the previous section, the decisions are binary (either the job or item is in the optimal solution or it is not), but a second variable tracking the remaining capacity must be added to the subproblems. The subproblems and algorithm can be found on page 269. The run-time of the algorithm is O(nW).
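
A Python sketch of the O(nW) table for the Knapsack version (the item tuples and names are mine):

<code python>
def knapsack(items, W):
    """Best total value of (weight, value) items with total weight <= W."""
    n = len(items)
    # M[i][w]: best value using the first i items with capacity w
    M = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        wi, vi = items[i - 1]
        for w in range(W + 1):
            M[i][w] = M[i - 1][w]                  # exclude item i
            if wi <= w:                            # or include it, if it fits
                M[i][w] = max(M[i][w], vi + M[i - 1][w - wi])
    return M[n][W]

print(knapsack([(2, 3), (3, 4), (4, 5)], 5))       # items 1 and 2: value 7
</code>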



===== Chapter 7: Network Flow =====
Network Flow is a class of problems in which each edge of a directed graph has a capacity and each node can receive and send "flow". Network Flow problems are particularly broad and include problems such as the Bipartite Matching Problem and the Maximum-Flow Problem.

==== Section 7.1: The Maximum-Flow Problem and the Ford-Fulkerson Algorithm ====
The Maximum-Flow Problem is concerned with finding the maximum amount of flow that can be sent through a given directed graph without exceeding any edge's capacity. The maximum flow of any given graph is actually equal to the minimum capacity over every partition, or "cut", of the graph. The Ford-Fulkerson Algorithm obtains the maximum flow by repeatedly "augmenting" paths in a residual graph: an edge with underutilized capacity contributes a forward edge whose capacity is the difference between its capacity and its flow, and the flow already on an edge becomes a backward edge that allows the algorithm to undo earlier decisions and eliminate bottlenecks. Each augmentation takes O(m) time, and with integer capacities there are at most C augmentations, where C is the total capacity leaving the source, so the run-time is O(mC). The algorithm and proof are on page 344.
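
A Python sketch of Ford-Fulkerson over a residual graph stored as nested dicts (the graph representation and names are my choices, not the book's):

<code python>
def max_flow(capacity, s, t):
    """Max s-t flow; capacity is {u: {v: c}}, mutated into the residual graph."""
    # Ensure every edge has a reverse entry to hold backward (residual) capacity.
    for u in list(capacity):
        for v in list(capacity[u]):
            capacity.setdefault(v, {}).setdefault(u, 0)
    flow = 0
    while True:
        # Depth-first search for an s-t path with leftover capacity.
        stack, parent = [s], {s: None}
        while stack:
            u = stack.pop()
            if u == t:
                break
            for v, c in capacity[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    stack.append(v)
        if t not in parent:
            return flow                    # no augmenting path remains
        # Find the bottleneck along the path, then augment.
        bottleneck, v = float("inf"), t
        while parent[v] is not None:
            bottleneck = min(bottleneck, capacity[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:
            u = parent[v]
            capacity[u][v] -= bottleneck   # use up forward capacity
            capacity[v][u] += bottleneck   # open up backward capacity
            v = u
        flow += bottleneck

g = {"s": {"a": 3, "b": 2}, "a": {"b": 1, "t": 2}, "b": {"t": 3}}
print(max_flow(g, "s", "t"))               # 5, matching the cut around "s"
</code>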

==== Section 7.2: Maximum Flows and Minimum Cuts in a Network ====
In section 7.2, the authors further analyze the Ford-Fulkerson algorithm. They conclude that the algorithm is guaranteed to return the maximum flow, whose value equals the capacity of the minimum cut. This is due to the property that the value of some flow f, v(f), is equal to the difference between the flow leaving and the flow entering any cut. Therefore, v(f) is guaranteed to be bounded by the capacity of every cut.

==== Section 7.5: A First Application: The Bipartite Matching Problem ====
The Ford-Fulkerson algorithm not only solves the Maximum-Flow Problem but also the Bipartite Matching Problem: find the largest subset of edges of some bipartite graph such that each node appears in at most one edge of the matching. The value of the maximum flow returned by the Ford-Fulkerson Algorithm equals the size of the largest matching in a given graph. The run-time of the algorithm on the Bipartite Matching Problem is O(mn) and the explanation is on page 370. A bipartite graph with sides of equal size has a perfect matching if and only if every subset of nodes A on one side has a neighbor set Γ(A) at least as large, |Γ(A)| >= |A|. The proof is on page 372.
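
Assuming the max_flow sketch from Section 7.1 above is in scope, the reduction is just a matter of wiring up unit capacities (the node names here are hypothetical):

<code python>
# source -> each left node, each left-right edge, each right node -> sink: capacity 1
left = {"x1": ["y1", "y2"], "x2": ["y1"], "x3": ["y2"]}
g = {"s": {x: 1 for x in left}}
for x, neighbors in left.items():
    g[x] = {y: 1 for y in neighbors}
    for y in neighbors:
        g.setdefault(y, {})["t"] = 1
print(max_flow(g, "s", "t"))   # 2: the maximum matching size (only two right nodes)
</code>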

==== Section 7.7: Extensions to the Maximum-Flow Problem ====
In section 7.7, the authors discuss extensions to the Maximum-Flow problem. In one of these extensions, each node v carries a demand d<sub>v</sub>: if d<sub>v</sub> is positive, the node is a consumer that must receive d<sub>v</sub> more flow than it sends, and if d<sub>v</sub> is negative, the node is a supplier that sends out more than it receives. The goal is to find a "circulation" that satisfies every demand in the graph; this is known as the Circulation Problem. A circulation must satisfy the following conditions: (i) the flow on every edge must be nonnegative and at most the edge's capacity, and (ii) at every node, the difference between the flow entering and the flow leaving must equal the node's demand. The problem reduces to a Maximum-Flow problem on a graph much like the one constructed for the Bipartite Matching Problem: a source node is connected to the suppliers, and the consumers are connected to a sink node. The proof is on page 381. The problem can be further extended to handle lower bounds on the amount of flow on edges. The proof is on pages 383-384.
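
Again reusing the max_flow sketch from Section 7.1, a hypothetical feasible_circulation helper can illustrate the reduction with a super-source and super-sink:

<code python>
def feasible_circulation(capacity, demand):
    """True iff a circulation meeting all demands exists (no lower bounds).

    capacity: {u: {v: c}}; demand: {v: d_v}, suppliers negative, consumers positive.
    """
    g = {u: dict(edges) for u, edges in capacity.items()}   # don't mutate the input
    D = 0
    for v, d in demand.items():
        if d < 0:                               # supplier: super-source edge of -d_v
            g.setdefault("S*", {})[v] = -d
        elif d > 0:                             # consumer: super-sink edge of d_v
            g.setdefault(v, {})["T*"] = d
            D += d
    # Feasible iff the max flow saturates every demand edge, i.e. equals D.
    return max_flow(g, "S*", "T*") == D

demand = {"u": -3, "v": -3, "w": 2, "x": 4}
capacity = {"u": {"w": 3}, "v": {"w": 1, "x": 4}, "w": {"x": 2}}
print(feasible_circulation(capacity, demand))   # True
</code>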