===== 4.6 Summary =====
We will develop a data structure called the Union-Find structure, which will store a representation of the components in a way that supports rapid searching and updating. This kind of data structure is needed to implement Kruskal's Algorithm efficiently. As each edge (v, w) is considered, we need to find the identities of the connected components containing v and w. If these components are different, there isn't yet a path from v to w, so the edge should be included; conversely, if the components are the same, there is already a path between v and w, so adding the edge would create a cycle and it should be excluded. If the edge e is included, the data structure should support the efficient merging of the components of the nodes it connects into a single new component. The Union-Find data structure supports three main operations: 1) MakeUnionFind(S), which for a set S returns a Union-Find data structure on S in which every element is in its own separate set; this should run in O(n) time. 2) Find(u), which returns the name of the set containing a particular node u; this should run in O(log n) time. 3) Union(A, B), which merges two sets A and B into a single set; this should run in O(log n) time. Now, consider the array implementation of the Union-Find data structure for some set S of size n, where unions keep the name of the larger set. The Find operation takes O(1) time, MakeUnionFind(S) takes O(n) time, and any sequence of k Union operations takes at most O(k log k) time. Also, consider the pointer-based implementation of the Union-Find data structure (pages 154-155 of the book) for some set S of size n, where unions keep the name of the larger set. A Union operation takes only O(1) time, MakeUnionFind(S) takes O(n) time, and a Find operation takes O(log n) time. So, Kruskal's Algorithm can be implemented on a graph with n nodes and m edges in O(m log n) time. Would using the pointer-based structure for Kruskal's be noticeably faster than the array-based way described first? Interest: 8/10
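
A rough Python sketch of the idea (the names UnionFind and kruskal are just mine, not the book's): parent pointers with union by size, and the way Kruskal's Algorithm would use Find and Union on each edge.

<code python>
class UnionFind:
    """Union-Find with parent pointers and union by size (the 'keep the larger set' rule)."""

    def __init__(self, elements):
        # MakeUnionFind(S): every element starts as its own one-element set -- O(n).
        self.parent = {x: x for x in elements}
        self.size = {x: 1 for x in elements}

    def find(self, u):
        # Follow parent pointers to the set's representative -- O(log n) with union by size.
        while self.parent[u] != u:
            u = self.parent[u]
        return u

    def union(self, a, b):
        # Merge the sets containing a and b by pointing the smaller one at the larger.
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]


def kruskal(nodes, edges):
    # edges: list of (weight, v, w). Returns the MST edges -- O(m log n) overall,
    # dominated by the sort.
    uf = UnionFind(nodes)
    mst = []
    for weight, v, w in sorted(edges):
        if uf.find(v) != uf.find(w):      # v and w in different components: keep the edge
            uf.union(v, w)
            mst.append((v, w, weight))
    return mst
</code>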

===== 4.7 Summary =====
Clustering occurs whenever we have a collection of objects that we are trying to classify or organize into logical, coherent groups. A common approach for grouping these objects is defining a distance function on the objects, so that objects at a larger distance from one another are less similar to each other. Minimum Spanning Trees thus play an important role in this kind of grouping. If we are seeking to divide a set U of n objects into k groups, for a given parameter k, we say that a k-clustering of U is a partition of U into k nonempty sets C_1, C_2, ..., C_k. The spacing of a k-clustering is the minimum distance between any pair of points lying in different clusters. There are exponentially many different k-clusterings of a set U. To find one of maximum spacing, we grow a graph on the vertex set U. The connected components are the clusters, and we try to bring nearby points together into the same cluster as quickly as possible, so they don't end up as points in different clusters that are close together. Each time we add an edge that spans two distinct components, we have essentially merged the two corresponding clusters. This is called single-link clustering, a special case of hierarchical agglomerative clustering. (Agglomerative means that we combine clusters; single-link means we do so as soon as a single link joins them together.) The components C_1, C_2, ..., C_k formed by deleting the k-1 most expensive edges of the minimum spanning tree T constitute a k-clustering of maximum spacing. What are some other uses for the minimum spanning tree algorithms? Interest: 8/10
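
A sketch of how single-link clustering falls out of Kruskal's Algorithm, reusing the UnionFind class sketched in 4.6 and assuming the points are hashable and dist is a given distance function:

<code python>
def k_clustering(points, k, dist):
    # Single-link clustering: run Kruskal's Algorithm on the pairwise distances and
    # stop adding edges once exactly k components remain; the next useful edge's
    # length is the spacing of the resulting k-clustering.
    uf = UnionFind(points)
    edges = [(dist(p, q), p, q)
             for i, p in enumerate(points) for q in points[i + 1:]]
    edges.sort(key=lambda e: e[0])
    components, spacing = len(points), None
    for d, p, q in edges:
        if uf.find(p) != uf.find(q):
            if components == k:
                spacing = d               # cheapest edge between two different clusters
                break
            uf.union(p, q)
            components -= 1
    clusters = {}
    for p in points:
        clusters.setdefault(uf.find(p), []).append(p)
    return list(clusters.values()), spacing
</code>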

===== 4.8 Summary =====
Data compression is an area that forms part of the foundations for digital communications. We can take text written in richer alphabets (such as the alphabets used for human languages) and convert it into long strings of bits by using a fixed number of bits for each symbol in the alphabet and then concatenating the bit strings for each symbol to form the text. However, it's a huge waste to translate every letter into the same number of bits; we could use a smaller number of bits for more frequent letters and a larger number of bits for less frequent ones, and perhaps use fewer than five bits per letter on average (five is what a fixed-length encoding of a 32-letter alphabet would need). Reducing the average number of bits per letter is a fundamental problem in the area of data compression, and it is exactly what compression algorithms aim to do. An earlier example of this kind of variable-length encoding is Morse code, where each letter is translated into a sequence of dots and dashes. The ambiguity problem in Morse code happens because there are pairs of letters where the string that encodes one letter is a prefix of the string that encodes another. We can eliminate this problem by mapping letters to bit strings in such a way that no encoding is a prefix of any other. There is an optimal prefix code, with corresponding tree T*, in which the two lowest-frequency letters are assigned to leaves that are siblings in T*. The Huffman code for a given alphabet achieves the minimum average number of bits per letter of any prefix code. Did the people who invented computer codes model their binary ideas after Morse code? Interest: 7/10
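
A compact way to see Huffman's greedy rule in code; this sketch (the function name and dictionary representation are my own, not the book's) repeatedly merges the two lowest-frequency "letters" using a heap:

<code python>
import heapq

def huffman_code(freq):
    # freq: dict mapping each letter to its frequency.
    # Returns a dict mapping each letter to a prefix-free bit string.
    heap = [(f, i, {letter: ""}) for i, (letter, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)                      # tiebreaker so tuples never compare the dicts
    while len(heap) > 1:
        f1, _, codes1 = heapq.heappop(heap)  # the two lowest-frequency "letters" ...
        f2, _, codes2 = heapq.heappop(heap)
        merged = {c: "0" + bits for c, bits in codes1.items()}   # ... become siblings
        merged.update({c: "1" + bits for c, bits in codes2.items()})
        heapq.heappush(heap, (f1 + f2, counter, merged))
        counter += 1
    return heap[0][2]

# The book's example alphabet: huffman_code({'a': 0.32, 'b': 0.25, 'c': 0.20,
# 'd': 0.18, 'e': 0.05}) gives frequent letters short codes ('a' gets 2 bits)
# and rare ones longer codes ('e' gets 3 bits).
</code>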

===== 5.1 Summary =====
The mergesort algorithm is a very basic example of a divide-and-conquer algorithm. To analyze its running time, we abstract its behavior into a template that describes many common divide-and-conquer algorithms: divide the input into two pieces of equal size; solve the two subproblems on these pieces separately by recursion; and then combine the two results into an overall solution, spending only linear time for the initial division and final recombining. The running time T(n) satisfies the following recurrence relation: for some constant c, T(n) ≤ 2T(n/2) + cn when n > 2, and T(2) ≤ c. There are two ways one can go about solving a recurrence: 1) "Unroll" the recursion, accounting for the running time across the first few levels, and identify a pattern that can be continued as the recursion expands; then sum the running times over all levels of the recursion, thereby arriving at a total running time. 2) Start with a guess for the solution, substitute it into the recurrence relation, and check that it works; this can be justified by induction on n. For mergesort this recurrence solves to O(n log n). Is mergesort the most efficient sorting algorithm? Interest: 6/10
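
A short mergesort sketch that makes the recurrence visible: two recursive calls on the halves plus a linear-time merge.

<code python>
def mergesort(a):
    # Divide into halves, sort each recursively, then merge in linear time:
    # T(n) <= 2*T(n/2) + c*n, which unrolls to O(n log n).
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    left, right = mergesort(a[:mid]), mergesort(a[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):     # the linear-time "combine" step
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]
</code>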

===== 5.2 Summary =====
A more general class of algorithms is obtained by considering divide-and-conquer algorithms that create recursive calls on q subproblems of size n/2 each and then combine the results in O(n) time. That is: 1) for some constant c, T(n) ≤ qT(n/2) + cn when n > 2, and T(2) ≤ c. Any function T(*) satisfying 1) with q > 2 is bounded by O(n^(log_2 q)). So the running time is more than linear, since log_2 q > 1, but still polynomial in n. Any function T(*) satisfying 1) with q = 1 is bounded by O(n). Furthermore, for some constant c, any function satisfying T(n) ≤ 2T(n/2) + cn^2 when n > 2 and T(2) ≤ c is bounded by O(n^2). What are some algorithms that we may use regularly that use these recurrence relations? Interest: 4/10
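
Purely as intuition (not a proof, and assuming n is a power of 2), the unrolling technique can be mimicked numerically to see which levels of the recursion dominate for different values of q:

<code python>
def unrolled_total(q, c, n):
    # Unroll T(n) = q*T(n/2) + c*n one level at a time: level j has q**j subproblems
    # of size n / 2**j, contributing q**j * c * (n / 2**j) = c*n*(q/2)**j work in total.
    total, size, count = 0, n, 1
    while size > 2:
        total += count * c * size
        count *= q
        size //= 2
    return total + count * c      # each base case T(2) costs at most c

# For q = 2 every level contributes the same c*n (the O(n log n) mergesort case);
# for q > 2 the per-level work grows geometrically, the bottom level dominates,
# and the sum is O(n**log2(q)); for q = 1 it shrinks geometrically, giving O(n).
</code>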

===== 5.3 Summary =====
Another problem that can be solved with divide-and-conquer algorithms arises in the analysis of rankings. For example, many sites on the Web make use of a technique known as collaborative filtering, where they try to match a person's preferences with those of other people on the Internet. They use these similarities in interest to recommend new things that the other people have liked. Another application is in meta-search tools on the Web. To compare two rankings, we count the number of inversions between them: pairs of elements that appear in one order in the first ranking and in the opposite order in the second. This count can be computed in O(n log n) time. The most crucial routine in the process of counting inversions is the Merge-and-Count procedure, which counts the "split" inversions while merging two sorted halves. Because the Merge-and-Count procedure takes O(n) time, the Sort-and-Count algorithm correctly sorts the input list and counts the number of inversions; it runs in O(n log n) time for a list with n elements. Question: Do Facebook, Twitter, and other modern search engines still use a similar algorithm in their recommendations? Interest: 6/10
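
A sketch of Sort-and-Count with the Merge-and-Count step folded in (the names mirror the book's procedures, but the code is my own rendering):

<code python>
def sort_and_count(a):
    # Sort-and-Count: count inversions in each half recursively, then count the
    # "split" inversions during the merge (Merge-and-Count) -- O(n log n) in total.
    if len(a) <= 1:
        return a, 0
    mid = len(a) // 2
    left, inv_left = sort_and_count(a[:mid])
    right, inv_right = sort_and_count(a[mid:])
    merged, i, j, split = [], 0, 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
            split += len(left) - i      # everything still in left is inverted with right[j]
    merged += left[i:] + right[j:]
    return merged, inv_left + inv_right + split

# sort_and_count([2, 4, 1, 3, 5]) == ([1, 2, 3, 4, 5], 3):
# the inversions are (2,1), (4,1) and (4,3).
</code>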

===== 5.4 Summary =====
Finding the closest pair of points means that, given n points in the plane, we want to find the pair that is closest together. This problem comes from computational geometry, and is useful in graphics, computer vision, geographic information systems, molecular modeling, etc. Given points P = {p_1, ..., p_n}, our goal is to find a pair of points p_i, p_j that minimizes the distance d(p_i, p_j). This is solved with another divide-and-conquer algorithm, like Mergesort: find the closest pair among the points in the "left half" of P and the closest pair among the points in the "right half" of P, then use this information to get the overall solution in linear time. Overall, this gives O(n log n) running time, and it correctly outputs a closest pair of points in P. Question: How exactly DO we recombine at the end in linear time? Interest: 3/10
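
To partially answer my own question, here is a simplified sketch of the combining step: after recursing on the two halves, only points within the current best distance of the dividing line need checking, and each only against a constant number of neighbors above it in y order. Re-sorting the strip at every level makes this version O(n log^2 n) rather than the book's O(n log n), which pre-sorts by y once.

<code python>
from math import hypot

def closest_pair(points):
    # points: a list of at least two (x, y) tuples.
    pts = sorted(points)                          # sort by x once

    def solve(p):
        if len(p) <= 3:                           # brute force the tiny cases
            return min((hypot(a[0] - b[0], a[1] - b[1]), a, b)
                       for i, a in enumerate(p) for b in p[i + 1:])
        mid = len(p) // 2
        x_mid = p[mid][0]
        best = min(solve(p[:mid]), solve(p[mid:]))
        # Combine: only points within best-distance of the dividing line matter, and
        # each needs comparing with only a constant number of neighbors above it in y.
        strip = sorted((q for q in p if abs(q[0] - x_mid) < best[0]),
                       key=lambda q: q[1])
        for i, a in enumerate(strip):
            for b in strip[i + 1:i + 8]:
                best = min(best, (hypot(a[0] - b[0], a[1] - b[1]), a, b))
        return best

    d, a, b = solve(pts)
    return a, b, d
</code>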

===== 5.5 Summary =====
Integer multiplication can be done with a different divide-and-conquer algorithm, in which the "default" quadratic algorithm is improved by means of a different recurrence. Here, we break the product of the two integers up into partial sums. Writing the numbers in base 2 reduces the problem of solving a single n-bit instance to the problem of solving four n/2-bit instances, but four subproblems lead right back to quadratic time; the trick is to combine the pieces so that only three recursive n/2-bit multiplications are needed. Recursively computing those three results and combining them gives the Recursive-Multiply algorithm. The running time of Recursive-Multiply on two n-bit factors is O(n^(log_2 3)) = O(n^1.59). Question: Is this algorithm used behind the scenes for each integer multiplication that we ask of our computers? Interest: 8/10
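
A sketch of the three-multiplication idea (Karatsuba-style); the cut-off for falling back to the built-in multiply is an arbitrary choice of mine:

<code python>
def recursive_multiply(x, y):
    # Split each factor into high and low halves and use only three recursive
    # products: T(n) = 3*T(n/2) + c*n, which solves to O(n**log2(3)) ~ O(n**1.59).
    if x < 2 ** 32 or y < 2 ** 32:        # small enough: use the hardware multiply
        return x * y
    half = max(x.bit_length(), y.bit_length()) // 2
    x_hi, x_lo = x >> half, x & ((1 << half) - 1)
    y_hi, y_lo = y >> half, y & ((1 << half) - 1)
    a = recursive_multiply(x_hi, y_hi)
    b = recursive_multiply(x_lo, y_lo)
    c = recursive_multiply(x_hi + x_lo, y_hi + y_lo)
    # The middle term x_hi*y_lo + x_lo*y_hi equals c - a - b, so no fourth product is needed.
    return (a << (2 * half)) + ((c - a - b) << half) + b
</code>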

===== 6.1 Summary =====
The Interval Scheduling Problem can be solved optimally with a particular greedy algorithm, but the Weighted Interval Scheduling Problem requires a more general approach, where each interval has a value or weight, and we want to accept a set of maximum value. With weighted intervals, no natural greedy algorithm is known; instead, we switch to dynamic programming. In particular, a recursive algorithm is used to solve this problem, in which request j belongs to an optimal solution on requests 1 through j if and only if v_j + OPT(p(j)) >= OPT(j-1), where OPT(x) is the value of an optimal solution on the first x requests and p(j) is the last request that ends before request j begins. In the process of using this recursion to calculate the optimal solution, we save the values that have already been computed, which is called memoization. The running time of the complete algorithm with memoization is O(n), once the intervals are sorted by finishing time. Question: How does this algorithm avoid delving through all possibilities, and thus being as inefficient as brute force? Interest: 4/10
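
A memoized sketch of OPT(j) = max(v_j + OPT(p(j)), OPT(j-1)); the interval representation and names are my own, and the O(n) bound treats the sort and the p(j) computation as preprocessing:

<code python>
from bisect import bisect_right

def weighted_interval_scheduling(intervals):
    # intervals: list of (start, finish, value) tuples.
    intervals = sorted(intervals, key=lambda iv: iv[1])       # sort by finishing time
    finishes = [iv[1] for iv in intervals]
    # p[j-1] = p(j): the number of intervals that finish before interval j starts.
    p = [bisect_right(finishes, iv[0]) for iv in intervals]

    memo = {0: 0}
    def opt(j):
        # OPT(j) = max(v_j + OPT(p(j)), OPT(j-1)), computed only once per j (memoization).
        if j not in memo:
            start, finish, value = intervals[j - 1]
            memo[j] = max(value + opt(p[j - 1]), opt(j - 1))
        return memo[j]

    return opt(len(intervals))
</code>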

===== 6.2 Summary =====
The key to the efficient algorithm for the Weighted Interval Scheduling Problem is the array M. This array is built up by computing the value of the optimal solution to the subproblem on intervals 1 through j for each j, using the recurrence from the previous section to define each value M[j] in terms of values that come earlier in the array. Once we have the array M, the problem is essentially solved: M contains the value of the optimal solution on the full instance, and our Find-Solution algorithm can backtrack through M and return the actual optimal set of intervals. Dynamic programming algorithms can thus be developed by iteratively building up subproblems; this approach is often a simpler way to express the algorithms. To develop an algorithm based on dynamic programming, one needs a collection of subproblems derived from the original problem that satisfies the following properties: 1) there are only a polynomial number of subproblems; 2) the solution to the original problem can be computed easily from the solutions to the subproblems; 3) there is a natural ordering on subproblems from "smallest" to "largest," together with an easy-to-compute recurrence that allows one to determine the solution to a subproblem from the solutions to some number of smaller subproblems. Question: Is dynamic programming only used for those problems that cannot be solved with more efficient algorithms, just as a better alternative to brute force? Interest: 4/10
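
The iterative counterpart of the 6.1 sketch, with the Find-Solution style traceback at the end (same illustrative interval representation as before):

<code python>
from bisect import bisect_right

def iterative_opt_with_solution(intervals):
    # Build the array M iteratively, then trace back through it (Find-Solution)
    # to recover the actual set of intervals in an optimal solution.
    intervals = sorted(intervals, key=lambda iv: iv[1])
    finishes = [iv[1] for iv in intervals]
    p = [bisect_right(finishes, iv[0]) for iv in intervals]
    n = len(intervals)

    M = [0] * (n + 1)
    for j in range(1, n + 1):                                 # smallest to largest subproblem
        M[j] = max(intervals[j - 1][2] + M[p[j - 1]], M[j - 1])

    solution, j = [], n
    while j > 0:                                              # Find-Solution backtracking
        if intervals[j - 1][2] + M[p[j - 1]] >= M[j - 1]:
            solution.append(intervals[j - 1])
            j = p[j - 1]
        else:
            j -= 1
    return M[n], list(reversed(solution))
</code>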

===== 6.3 Summary =====
In the Segmented Least Squares problem, the recurrence we use involves what could be called "multi-way choices": at each step, we have a polynomial number of possibilities to consider for the structure of the optimal solution. Essentially, we are trying to fit the points with as few line segments as possible while still keeping each segment's line close to the points it covers. For each possible segment, we compute the line minimizing the error with respect to the points in that segment. The penalty of a partition is defined to be a sum of 1) the number of segments into which we partition P, times a fixed, given multiplier C > 0, and 2) for each segment, the error value of the optimal line through that segment. We want to find a partition of minimum penalty. For the subproblem on the points p_1 through p_j, OPT(j) = min over 1 <= i <= j of (e_(i,j) + C + OPT(i-1)), and the segment p_i through p_j is used in an optimum solution for the subproblem if and only if the minimum is obtained using index i. In total, the algorithm for this problem has a running time of O(n^2) once all the e_(i,j) values have been determined. Question: What would we do if there was a graph with several very-stray points; would it just be impossible to compute a good best-fit line? Interest: 6/10
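
A sketch of the recurrence's structure; note that computing the e_(i,j) values this naively costs more than the O(n^2) the book achieves with running sums, so this is only meant to show the shape of the dynamic program, not its best running time:

<code python>
def segmented_least_squares(points, C):
    # points: list of (x, y) pairs already sorted by x; C: per-segment penalty.
    n = len(points)
    xs, ys = [p[0] for p in points], [p[1] for p in points]

    def error(i, j):
        # Squared error of the least-squares line through points i..j (1-indexed).
        px, py = xs[i - 1:j], ys[i - 1:j]
        m = len(px)
        sx, sy = sum(px), sum(py)
        sxx = sum(x * x for x in px)
        sxy = sum(x * y for x, y in zip(px, py))
        denom = m * sxx - sx * sx
        a = (m * sxy - sx * sy) / denom if denom else 0.0     # slope
        b = (sy - a * sx) / m                                  # intercept
        return sum((y - a * x - b) ** 2 for x, y in zip(px, py))

    # e[i][j]: error of the best single line through p_i..p_j.
    e = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(i, n + 1):
            e[i][j] = error(i, j)

    # OPT(j) = min over i of e[i][j] + C + OPT(i-1): the multi-way choice.
    OPT = [0.0] * (n + 1)
    for j in range(1, n + 1):
        OPT[j] = min(e[i][j] + C + OPT[i - 1] for i in range(1, j + 1))
    return OPT[n]
</code>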

===== 6.4 Summary =====
In the Subset Sum and Knapsack problems, the "obvious" set of subproblems turns out not to be enough, so we end up creating a richer collection of subproblems by adding a new variable to the recurrence underlying the dynamic program. Here we have a single machine that can process jobs between time 0 and time W, for some W. Our goal is to process jobs so as to keep the machine as busy as possible up to the W "cut-off." Given n items that each have a nonnegative weight, and a bound W, we want to select a subset S of the items whose total weight is at most W and, subject to that bound, as large as possible. This is called the Subset Sum Problem, and it is a natural special case of the more general Knapsack Problem, in which each request i has both a value and a weight and we maximize total value subject to the weight bound. Dynamic programming can be used to solve this problem. The Subset-Sum(n, W) algorithm correctly computes the optimal value of the problem and runs in O(nW) time. Given a table M of the optimal values of the subproblems, the optimal set S can be found in O(n) time. The Knapsack Problem can also be solved in O(nW) time. Question: Is there another algorithm (not using dynamic programming) that can find an equally efficient solution/running time? Interest: 7/10
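
A sketch of Subset-Sum(n, W) with the O(n)-time traceback to recover the set S (table layout and names are my own rendering):

<code python>
def subset_sum(weights, W):
    # M[i][w] = largest total weight <= w achievable with the first i items -- O(nW).
    n = len(weights)
    M = [[0] * (W + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        wi = weights[i - 1]
        for w in range(W + 1):
            M[i][w] = M[i - 1][w]                               # leave item i out
            if wi <= w:
                M[i][w] = max(M[i][w], wi + M[i - 1][w - wi])   # or put it in
    # Trace back through the table to recover the chosen set S in O(n) time.
    S, w = [], W
    for i in range(n, 0, -1):
        if M[i][w] != M[i - 1][w]:
            S.append(i - 1)                   # item i (0-indexed here) is in the optimum
            w -= weights[i - 1]
    return M[n][W], S
</code>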

===== 7.1 Summary =====
Graphs can often be used to model transportation networks, whose edges carry "traffic" and whose nodes act as "switches" passing traffic between different edges. Network models have: capacities on the edges, source nodes which generate traffic, sink (destination) nodes which absorb it, and the traffic itself (the flow) which is transmitted across the edges. To find a maximum flow in a network, we must answer the Maximum-Flow Problem, with an approach that combines ideas from greedy and dynamic programming. We can push forward on edges with leftover capacity and push backward on edges that are already carrying flow, to divert it in a different direction. A residual graph provides a systematic way to search for forward-backward operations like this. The algorithm built on these ideas is known as the Ford-Fulkerson Algorithm, which runs in O(mC) time, where C is the total capacity leaving the source. Question: Which parts of this are dynamic programming, and which are greedy? Interest: 6/10
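
A bare-bones sketch of the Ford-Fulkerson idea, with the residual graph kept as a dictionary of leftover capacities (the representation and names are mine, and the path search is a plain depth-first search); it also returns the final residual capacities so the cut computation in the next section can reuse them:

<code python>
def ford_fulkerson(capacity, s, t):
    # capacity: dict {(u, v): c} with nonnegative integer capacities.
    # Repeatedly find an s-t path in the residual graph and push the bottleneck
    # amount of flow along it; with integer capacities this terminates after at
    # most C augmentations, for O(mC) time overall.
    residual = dict(capacity)
    for (u, v) in capacity:
        residual.setdefault((v, u), 0)        # backward edges start with no residual capacity

    def find_augmenting_path():
        # Depth-first search for an s-t path that only uses edges with leftover capacity.
        stack, parent = [s], {s: None}
        while stack:
            u = stack.pop()
            if u == t:
                path = []
                while parent[u] is not None:
                    path.append((parent[u], u))
                    u = parent[u]
                return list(reversed(path))
            for (a, b), cap in residual.items():
                if a == u and cap > 0 and b not in parent:
                    parent[b] = u
                    stack.append(b)
        return None

    value = 0
    path = find_augmenting_path()
    while path is not None:
        bottleneck = min(residual[e] for e in path)
        for (u, v) in path:
            residual[(u, v)] -= bottleneck    # push flow forward ...
            residual[(v, u)] += bottleneck    # ... and allow it to be undone later
        value += bottleneck
        path = find_augmenting_path()
    return value, residual
</code>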

===== 7.2 Summary =====
This section is about the analysis of the Ford-Fulkerson Algorithm, and how it provides insight into the Maximum-Flow Problem. Here we see and prove that the flow returned by the Ford-Fulkerson Algorithm has the maximum possible value of any flow in G. By measuring the amount of flow that any s-t flow f sends across a cut, we can bound the flow's value: let f be any s-t flow and (A, B) any s-t cut; then v(f) ≤ c(A, B). If f is an s-t flow such that there is no s-t path in the residual graph G_f, then there is an s-t cut (A*, B*) in G for which v(f) = c(A*, B*). Consequently, f has the maximum value of any flow in G, and (A*, B*) has the minimum capacity of any s-t cut in G. Given a flow f of maximum value, we can compute an s-t cut of minimum capacity in O(m) time. In every flow network, the maximum value of an s-t flow is equal to the minimum capacity of an s-t cut. If all capacities in the flow network are integers, then there is a maximum flow f for which every flow value f(e) is an integer. Question: Because this algorithm could potentially run forever, is it really always efficient? Interest: 6/10
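
Following the proof in this section, a minimum cut can be read off the final residual graph by taking everything reachable from s; this small sketch assumes the residual dictionary returned by the 7.1 sketch:

<code python>
def min_cut_side(residual, s):
    # A* = everything reachable from s using edges with leftover residual capacity;
    # (A*, the rest) is then an s-t cut whose capacity equals the maximum flow value.
    reachable, stack = {s}, [s]
    while stack:
        u = stack.pop()
        for (a, b), cap in residual.items():
            if a == u and cap > 0 and b not in reachable:
                reachable.add(b)
                stack.append(b)
    return reachable
</code>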

===== 7.5 Summary =====
One of the main applications of the Maximum-Flow Problem is solving the Bipartite Matching Problem. A bipartite graph G = (V, E) is an undirected graph whose node set can be partitioned into two sets X and Y so that every edge has one end in X and the other in Y. A matching M in G is a subset of the edges such that each node appears in at most one edge of M. The Bipartite Matching Problem is to find a matching in G of the largest possible size. The size of the maximum matching in G is equal to the value of the maximum flow in the corresponding flow network G', and the edges in such a matching in G are the edges that carry flow from X to Y in G'. The Ford-Fulkerson Algorithm can be used to find a maximum matching in a bipartite graph in O(mn) time. Assume that the bipartite graph G = (V, E) has two sides X and Y such that |X| = |Y|. Then G either has a perfect matching or there is a subset A ⊆ X such that |Γ(A)| < |A|; a perfect matching or an appropriate subset A can be found in O(mn) time. Question: What real-world applications do bipartite graphs have? Interest: 4/10
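
A sketch of the reduction from bipartite matching to maximum flow, reusing the ford_fulkerson sketch from 7.1 and assuming the placeholder source/sink names don't collide with real node names:

<code python>
def maximum_matching_size(X, Y, edges):
    # Build G': unit-capacity edges source->X, X->Y (the original edges), Y->sink.
    # The value of the maximum flow in G' equals the size of the maximum matching in G.
    s, t = "_source", "_sink"                 # placeholder names, assumed not to clash
    capacity = {(s, x): 1 for x in X}
    capacity.update({(y, t): 1 for y in Y})
    capacity.update({(x, y): 1 for (x, y) in edges})
    value, _ = ford_fulkerson(capacity, s, t)
    return value
</code>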

===== 7.7 Summary =====
In the Maximum-Flow Problem we had only a single source s and a single sink t. Now we can have a set S of sources and a set T of sinks. Instead of maximizing the flow value, we have sources with fixed supply values and sinks with fixed demand values, and we want to ship flow from the nodes with supplies to the nodes with demands. There is a feasible circulation with demands {d_v} in G if and only if the maximum s*-t* flow in the augmented graph G' has value D, where D is the total of the positive demands. If all capacities and demands in G are integers, and there is a feasible circulation, then there is a feasible circulation that is integer-valued. For the version with lower bounds on the edges, there is a feasible circulation in G if and only if there is a feasible circulation in the transformed graph G'; and if all demands, capacities, and lower bounds in G are integers, and there is a feasible circulation, then there is one that is integer-valued. Question: Is this algorithm just a slightly tweaked version of the Ford-Fulkerson Algorithm? Interest: 6/10
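
A sketch of the reduction from feasible circulation with demands to maximum flow (again reusing the 7.1 ford_fulkerson sketch; lower bounds are not handled here):

<code python>
def has_feasible_circulation(capacity, demands):
    # demands: dict node -> d_v (negative = supply, positive = demand).
    # Add s* with an edge of capacity -d_v to every supply node, and an edge of
    # capacity d_v from every demand node to t*; a feasible circulation exists
    # exactly when the maximum s*-t* flow has value D, the total positive demand.
    s_star, t_star = "_s*", "_t*"             # placeholder names, assumed not to clash
    cap = dict(capacity)
    D = 0
    for v, d in demands.items():
        if d < 0:
            cap[(s_star, v)] = -d
        elif d > 0:
            cap[(v, t_star)] = d
            D += d
    value, _ = ford_fulkerson(cap, s_star, t_star)
    return value == D
</code>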