===== Section 2: Further Recurrence Relations =====
The Mergesort algorithm serves as a very good example of recursion that we can use to generalize to a much larger class of divide-and-conquer algorithms. These algorithms follow the same pattern of splitting a problem of size n into smaller subproblems of size n/2, but now we want to look at cases where the number of subproblems q is greater than two. Analyzing the first few levels, we see that the total work per level increases as we proceed through the recursion. Identifying a pattern, we see that the total work performed at level k is q^k * (cn/2^k), which is equivalent to (q/2)^k * cn. Summing this over all levels of the recursion, we see that the total run time for an algorithm of this setup is O(n^(log_2 q)). Any algorithm that follows the recurrence relation T(n) = qT(n/2) + O(n) with q > 2 will have a run time of O(n^(log_2 q)). Considering an algorithm of that form with q = 1, we see that the work per level actually decreases as we proceed through the recursion. The cost at level l is cn/2^l, so the total work is a converging geometric sum and any function of that form is bounded by O(n). Thus from this analysis, we see that the value of q has a major effect on the total run time: when q = 1 the run time is linear, when q = 2 the run time is O(n log n), and when q > 2 the run time is O(n^(log_2 q)).

A related recurrence is T(n) = 2T(n/2) + O(n^2). This recurrence follows the structure: divide the input into two pieces of equal size; solve the two subproblems on these pieces separately by recursion; and then combine the two results into an overall solution, spending quadratic time for the initial division and final recombining. Analyzing the first few levels, we see that the total amount of work per level decreases, as opposed to mergesort where the total work stays the same at each level. Identifying a pattern, we see the total work at level l is cn^2/2^l. Summing over all levels, we find that the total run time is O(n^2). The beginning of this section was easy to follow having studied it in class and worked on the problem set. The second part, on the related recurrence, was a bit more confusing. I would give this section a 6/10.
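As a sanity check on these regimes, here is a minimal Python sketch (my own illustration, not from the book; the function name T and the numeric comparison are my choices) that evaluates the recurrences directly. If the analysis is right, each ratio settles toward a constant as n grows.

import math

def T(n, q, c=1.0, work=lambda m: m):
    # Evaluate T(n) = q*T(n/2) + c*work(n), with T(1) = c.
    if n <= 1:
        return c
    return q * T(n // 2, q, c, work) + c * work(n)

n = 2 ** 16
# Predicted bounds for T(n) = q*T(n/2) + c*n at q = 1, 2, 3:
for q, bound in [(1, n), (2, n * math.log2(n)), (3, n ** math.log2(3))]:
    print(f"q={q}: T(n)/bound = {T(n, q) / bound:.3f}")
# The related recurrence T(n) = 2*T(n/2) + c*n^2 stays within a constant of n^2:
print(f"quadratic: T(n)/n^2 = {T(n, 2, work=lambda m: m * m) / n ** 2:.3f}")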
===== Section 3: Counting Inversions =====
In this section, we consider a problem that arises in the analysis of rankings. An example of this is collaborative filtering, where sites try to match your preferences with those of other people with similar tastes, so that they can recommend new things that you might like. A major issue in such applications is comparing two rankings to find how similar two people's preferences are. A natural way to quantify how different two rankings are is to count the number of inversions between them, that is, the number of pairs of elements that are out of order. For a sequence in ascending order there are no inversions, but for a sequence in descending order there are n choose 2 = n(n-1)/2 inversions, since every pair is an inversion. For example, the sequence 2, 4, 1, 3, 5 has three inversions: (2,1), (4,1), and (4,3). A brute-force algorithm that looks at every pair of elements takes O(n^2) time; a sketch of that baseline appears below, and the Merge-and-Count and Sort-and-Count routines that follow bring the calculation down to O(n log n) time.
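Here is a minimal brute-force version (my own sketch; the name count_inversions_brute is not from the book):

def count_inversions_brute(a):
    # Examine every pair (i, j) with i < j: O(n^2) comparisons.
    n = len(a)
    return sum(1 for i in range(n) for j in range(i + 1, n) if a[i] > a[j])

print(count_inversions_brute([2, 4, 1, 3, 5]))  # 3: the pairs (2,1), (4,1), (4,3)

The divide-and-conquer approach below avoids examining all pairs explicitly.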
Merge-and-Count(A,B)
     Maintain a Current pointer into each list, initialized to point to the front elements
     Maintain a variable Count for the number of inversions, initialized to 0
     While both lists are nonempty
          Let ai and bj be the elements pointed to by the Current pointers
          Append the smaller of these two to the output list
          If bj is the smaller element
               Increment Count by the number of elements remaining in A
          Advance the Current pointer in the list from which the smaller element was selected
     Once one list is empty, append the remainder of the other list to the output
     Return Count and the merged list

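As a concrete rendering of this pseudocode, here is a small Python version; the function name merge_and_count and the tuple return are my own choices, a sketch rather than the book's code.

def merge_and_count(a, b):
    # Merge two sorted lists, counting inversions: pairs (x in a, y in b) with x > y.
    merged, count = [], 0
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            merged.append(a[i])
            i += 1
        else:
            # b[j] is smaller than every element remaining in a,
            # so it forms an inversion with each of them.
            count += len(a) - i
            merged.append(b[j])
            j += 1
    # One list is exhausted; append the remainder of the other.
    merged.extend(a[i:])
    merged.extend(b[j:])
    return count, merged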
Sort-and-Count(L)
     If the list has one element then there are no inversions
     Else
          Divide the list into two halves:
               A contains the first ⌈n/2⌉ elements
               B contains the remaining ⌊n/2⌋ elements
          (rA, A) = Sort-and-Count(A)
          (rB, B) = Sort-and-Count(B)
          (r, L) = Merge-and-Count(A,B)
     Return r = rA + rB + r, and the sorted list L

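Continuing the sketch above, a Python version of Sort-and-Count (again my own rendering, reusing the merge_and_count function from the previous block) might look like this:

def sort_and_count(lst):
    # Base case: a list of 0 or 1 elements has no inversions.
    if len(lst) <= 1:
        return 0, lst
    mid = len(lst) // 2
    ra, a = sort_and_count(lst[:mid])   # inversions within the first half
    rb, b = sort_and_count(lst[mid:])   # inversions within the second half
    r, merged = merge_and_count(a, b)   # inversions between the halves
    return ra + rb + r, merged

count, sorted_list = sort_and_count([2, 4, 1, 3, 5])
print(count, sorted_list)  # 3 [1, 2, 3, 4, 5]

Since Merge-and-Count does linear work, this satisfies T(n) = 2T(n/2) + O(n), the same recurrence as Mergesort from Section 2, which gives the O(n log n) bound.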
The book's result: "The Sort-and-Count algorithm correctly sorts the input list and counts the number of inversions; it runs in O(n log n) time for a list with n elements." This section was more difficult to understand, and hopefully it will be made clearer in class. I would give it a 5/10.
  
  
  