Divide and Conquer

This group of algorithmic techniques hinges on breaking the input into smaller pieces, solving each piece recursively, and combining the results. Analyzing running times then involves solving a recurrence relation that bounds the time on an input recursively in terms of the running time on smaller instances.

5.1: First Recurrence--The Mergesort Algorithm

We know Mergesort very well at this point in our CS studies. To analyze it, we abstract its behavior into a template: divide the input into two pieces of equal size, solve each piece recursively, and then combine the two partial solutions into one, spending only linear time on the initial dividing and final recombining.

The nice thing about such an abstraction is that we see many other problems fit this pattern, and that slight changes to it match still others.
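As a concrete instance of this template, here is a minimal Mergesort sketch in Python (the function and variable names are my own, not the book's):

```python
def mergesort(a):
    """Sort a list by the divide-and-conquer template:
    split into halves, recurse on each, then merge in linear time."""
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    left, right = mergesort(a[:mid]), mergesort(a[mid:])
    # Linear-time combine: repeatedly take the smaller front element.
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]
```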

So, we define the recurrence relation by letting T(n) denote the worst-case running time on inputs of size n. Supposing n is even, we see that the algorithm spends O(n) time dividing the input, time T(n/2) on each half, and O(n) time combining. So, the recurrence relation is, for some constant c: T(n) ≤ 2T(n/2) + cn for n > 2, with T(2) ≤ c.

There are two primary methods of solving the recurrence: “unrolling it”, and induction.

Unrolling the Mergesort Recurrence

As we did in class, we first analyze the first few levels. The first level takes time cn, and the two subproblems below it each take time c(n/2) because the input size is halved at each level. So, we identify and generalize the pattern: the jth level of the recursion consists of 2^j subproblems of size n/2^j, and thus contributes at most 2^j · c(n/2^j) = cn to the total running time. Lastly, we sum this over all the levels of recursion. Reducing the input size from n down to 2 takes log₂ n levels, so the sum is cn·log₂ n: a running time of O(n log n).
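To sanity-check the unrolling, one can evaluate the recurrence directly for powers of two and compare against cn·log₂ n (a quick sketch, with c = 1 assumed for concreteness):

```python
import math

def T(n, c=1):
    # The mergesort recurrence: T(n) = 2*T(n/2) + c*n, base T(2) = c.
    # n is assumed to be a power of two.
    if n <= 2:
        return c
    return 2 * T(n // 2, c) + c * n

# Each level contributes at most c*n, over log2(n) levels,
# so T(n) never exceeds n * log2(n) here.
for n in (4, 16, 256, 4096):
    assert T(n) <= n * math.log2(n)
```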

Substitute Solution In; Induction

Here, we take a “guess” that T(n) ≤ cn·log₂ n for all n ≥ 2. We then make the inductive assumption that the inequality holds for all m < n, and prove that it holds for n.
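Writing out the inductive step explicitly (applying the recurrence, then the inductive hypothesis at m = n/2):

```latex
T(n) \le 2T(n/2) + cn
     \le 2 \cdot c(n/2)\log_2(n/2) + cn
     = cn(\log_2 n - 1) + cn
     = cn\log_2 n.
```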

A weaker method, partial substitution, can also be used to figure out certain constants when we only have a guess at the form of the solution (e.g., a guess that the solution is logarithmic to some base b, so we substitute n·log_b n into the inductive argument and eventually find that b = 2 works). This seems slightly unwieldy in my opinion, and less useful given that our algorithms are still relatively easy to intuit. 9/10.

5.2: Further Recurrence Relations

To generalize the recurrence relation above, let us instead consider breaking the problem into q subproblems at each step, each still with an input of half the size. Then the recurrence T(n) ≤ q·T(n/2) + cn, where q > 2, yields a running time of O(n^(log₂ q)).
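The level-by-level sum behind this bound: level j has q^j subproblems, each of size n/2^j, so (being loose with constants) the total work is

```latex
T(n) \le \sum_{j=0}^{\log_2 n - 1} q^j \cdot c\,\frac{n}{2^j}
      = cn \sum_{j=0}^{\log_2 n - 1} \left(\frac{q}{2}\right)^j
      \le \frac{cn}{q/2 - 1}\left(\frac{q}{2}\right)^{\log_2 n}
      = O\!\left(n^{\log_2 q}\right),
```

where the last step uses $(q/2)^{\log_2 n} = n^{\log_2 q - 1}$, so the leading factor of $n$ brings the exponent up to $\log_2 q$.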

The main complication came in the summation step of the analysis, but fortunately we can always refer back to the book's handling of this geometric series, as well as its application of partial substitution.

Interestingly, if q = 1, the function is bounded by O(n). This is because, even though there are still about log₂ n levels of recursion, the work at level j is only cn/2^j, so the levels form a convergent geometric series: the total is at most cn(1 + 1/2 + 1/4 + …) ≤ 2cn. So, from T(n) ≤ T(n/2) + cn, we have that T(n) = O(n).

Another similar recurrence relation, building from the q = 2 case, asks what happens if the cost at each call is O(n²) instead of O(n), i.e., T(n) ≤ 2T(n/2) + cn². We follow an analysis similar to the initial one in Section 5.1, but find that the top-level n² term dominates, since the per-level costs now form a decreasing geometric series as the subproblem sizes shrink. We get a net bound of O(n²).
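Unrolling T(n) ≤ 2T(n/2) + cn² level by level makes the geometric decrease explicit: level j has 2^j subproblems each costing c(n/2^j)², so

```latex
T(n) \le \sum_{j=0}^{\log_2 n - 1} 2^j \cdot c\left(\frac{n}{2^j}\right)^2
      = cn^2 \sum_{j=0}^{\log_2 n - 1} \left(\frac{1}{2}\right)^j
      \le 2cn^2 = O(n^2).
```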

5.3: Counting Inversions

We can extend a few notions of Mergesort to problems not directly related to sorting numbers. In a collaborative filtering system, for instance, we want to compare different ranking lists. We define an inversion in a list to be a pair of elements that appear out of order, i.e., indices i < j with a_i > a_j. We then give an algorithm that counts inversions in O(n log n) time, without ever having to look at each inversion individually. It starts by dividing the list into two halves of length n/2, counts the inversions within each half recursively, and then counts those between the halves while merging.

The algorithm Merge-and-Count(A, B), for two sorted lists A and B of equal length, is:
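A Python sketch of Merge-and-Count (the exact pseudocode is not reproduced here, so this is my rendering of the idea):

```python
def merge_and_count(A, B):
    """Merge sorted lists A and B, counting pairs (a, b) with
    a in A, b in B, and a > b -- the inversions between the halves."""
    merged, inversions = [], 0
    i, j = 0, 0
    while i < len(A) and j < len(B):
        if A[i] <= B[j]:
            merged.append(A[i]); i += 1
        else:
            # B[j] is smaller than every remaining element of A,
            # so it forms an inversion with all len(A) - i of them.
            merged.append(B[j]); j += 1
            inversions += len(A) - i
    merged += A[i:] + B[j:]
    return inversions, merged
```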

Note that the running time of this is O(n): there are at most 2n iterations in total, each doing only constant work. Then, we can define Sort-and-Count(L), which sorts and counts the inversions of a list L of size n:
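A self-contained Python sketch of Sort-and-Count (again my own rendering, with the merge step inlined so the function stands alone):

```python
def sort_and_count(L):
    """Return (number of inversions in L, sorted copy of L)."""
    if len(L) <= 1:
        return 0, list(L)
    mid = len(L) // 2
    rA, A = sort_and_count(L[:mid])   # inversions within the left half
    rB, B = sort_and_count(L[mid:])   # inversions within the right half
    # Merge A and B, counting inversions that cross the split.
    merged, r = [], 0
    i, j = 0, 0
    while i < len(A) and j < len(B):
        if A[i] <= B[j]:
            merged.append(A[i]); i += 1
        else:
            merged.append(B[j]); j += 1
            r += len(A) - i  # B[j] is inverted with all remaining A elements
    merged += A[i:] + B[j:]
    return rA + rB + r, merged
```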

Because Merge-and-Count runs in linear time, Sort-and-Count satisfies the same recurrence as Mergesort, so it correctly sorts the input list and counts the number of inversions in O(n log n) time for a list with n elements.

This section was not my favorite. I think that the manner in which we discussed it in class was much better, and the slightly closer-to-real-code pseudocode used in your version of the algorithms was actually more illuminating than the book's presentation here. However, it certainly still got the point across: 8/10.