Chapter 5: Divide and Conquer

This chapter covers divide and conquer algorithms, which break a problem into several parts, solve each part recursively, and then combine the solutions to the subproblems. The idea is to reduce a brute-force algorithm, which often runs in polynomial time, to a divide and conquer algorithm with a lower polynomial running time. Analysing this running time depends on the recurrence relation, which we will learn about in this chapter. Divide and conquer algorithms have many applications, such as distance functions on sets of objects, finding the closest pair of points in a plane, multiplying integers, and smoothing a noisy signal.

5.1 A First Recurrence: The Mergesort Algorithm

This section uses the familiar mergesort algorithm to introduce recurrence relations. Mergesort clearly follows the pattern of a divide and conquer algorithm: it divides a list in half, sorts each half by recursion, and combines the results. The base case for the recursion is an input of just two numbers, which we can simply compare directly. To understand the running time and recurrence relation of this algorithm, we let T(n) be the worst-case running time, and then we want to bound T(n) with a recurrence relation. Mergesort takes O(n) time to divide the input, spends T(n/2) to solve each half, and takes O(n) to combine the solutions. This means the recurrence relation for T(n) is

T(n) ≤ 2 T(n/2) + cn.

The basic structure of this inequality consists of writing T(n) in terms of T(k), for k < n, plus the combining time. There are two ways to solve the recurrence: unroll it and look for a pattern, or substitute a guess and prove it by induction. Unrolling the mergesort recurrence and looking at the first few levels, you see that the number of subproblems doubles at each level while the size of each subproblem halves, so each level does about cn work. Summing over all log n levels gives a running time of O(n log n). Overall, we know that any T(n) satisfying the inequality above is bounded by O(n log n). If we guess the running time we want, we can verify it using substitution. This is shown on page 213.
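To make the divide, recurse, and combine steps concrete, here is a sketch of mergesort as described above (the function names and the base-case cutoff at two elements are my own reading of the section, not code from the book):

```python
def mergesort(nums):
    """Sort a list by dividing it in half, recursing, and merging."""
    if len(nums) <= 2:                 # base case: just compare directly
        return sorted(nums)
    mid = len(nums) // 2
    left = mergesort(nums[:mid])       # T(n/2) work on the first half
    right = mergesort(nums[mid:])      # T(n/2) work on the second half
    return merge(left, right)          # O(n) combine step

def merge(left, right):
    """Combine two sorted lists in linear time with two pointers."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    out.extend(left[i:])               # one side may have leftovers
    out.extend(right[j:])
    return out
```

The three steps of the code line up with the three terms of the recurrence: the two recursive calls give the 2 T(n/2), and the slicing plus merge give the cn.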

5.2 Further Recurrence Relations

This section discusses other recurrence relations similar to the one in the last section. A general class of divide and conquer algorithms creates q subproblems of size n/2 and combines the results in O(n) time. This gives

T(n) ≤ q T(n/2) + cn.

There are two remaining cases: q = 1 and q > 2. An example of q > 2 is on pages 215-216. In general, I think the unrolling method makes more sense to me than substitution. In this case, unrolling shows that level j of the recursion has q^j subproblems, each of size n/2^j. Note that the subproblems can overlap. For q > 2, T(n) is bounded by O(n^(log2 q)). You can also verify this using partial substitution. In the case of one subproblem, the amount of work decreases at each level instead of increasing as in the previous cases, and T(n) is bounded by O(n). The value of q is important because the running time is mainly affected by the work in the recursion, so q = 1 is the fastest and q > 2 is the slowest.
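Since I find the unrolling argument easier to believe with numbers, here is a quick sanity check (my own sketch, not from the book): take the recurrence with equality, c = 1, and T(2) = 2, and compare the three cases at a few powers of two.

```python
def T(n, q):
    """Unroll T(n) = q*T(n/2) + n exactly, with base case T(2) = 2."""
    if n <= 2:
        return n
    return q * T(n // 2, q) + n

# q = 1 grows linearly (it stays under 2n), q = 2 grows like n log n,
# and q = 3 grows like n^(log2 3) ~ n^1.585 -- the slowest of the three.
for n in [2**6, 2**10, 2**14]:
    print(n, T(n, 1), T(n, 2), T(n, 3))
```

For n = 64, for example, this gives 126 for q = 1 (just under 2n), 384 for q = 2 (exactly n log2 n), and 1330 for q = 3, which matches the ordering described above.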

The last recurrence relation is T(n) ≤ 2 T(n/2) + O(n^2), where it takes quadratic time to split up and combine the halves; here the work shrinks geometrically down the levels, so T(n) is bounded by O(n^2) rather than O(n^2 log n). This section was pretty confusing and abstract to me. I think concrete examples would help me a lot with these recurrence relations.
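As one such concrete example (my own, assuming the recurrence with equality and cost exactly n^2 per level): unrolling gives n^2 * (1 + 1/2 + 1/4 + ...), a geometric series, so the total should stay below 2n^2 instead of picking up a log factor.

```python
def T(n):
    """Unroll T(n) = 2*T(n/2) + n^2 exactly, with base case T(1) = 1."""
    if n <= 1:
        return 1
    return 2 * T(n // 2) + n * n

# The ratio T(n) / n^2 approaches 2 from below as n grows,
# which is the geometric series 1 + 1/2 + 1/4 + ... at work.
for n in [2**4, 2**8, 2**12]:
    print(n, T(n), T(n) / n**2)
```

For n = 16 this gives T(16) = 496, and 496/256 ≈ 1.94 < 2, so the quadratic term at the top level really does dominate everything below it.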

5.3 Counting Inversions

This problem arises when you are comparing two sets of rankings, for example to determine how close two people's movie preferences are. A natural way to quantify this is to look at the two rankings and see how “out of order” they are. This works by counting inversions: a pair of items i and j forms an inversion if one person ranks i before j but the other ranks j before i. Therefore, we want an algorithm that counts the number of inversions efficiently.

The brute-force approach takes O(n^2) time to check all pairs of numbers and see whether each is an inversion. However, we can do better than this. We can use the divide and conquer approach to split the list into two lists of n/2 values, then find the inversions in each side. The trick is to count the inversions between the two halves in linear time. If we make the recursive step sort in addition to counting, it is much easier to see the inversions between the two lists. We do this with a pointer into each list, comparing the elements at the pointers, appending the smaller one, and advancing that pointer, just as in mergesort. We can count the inversions as we go, because whenever the element at the second list's pointer is smaller, it is inverted with every element remaining in the first list. The algorithm for this is on pages 224-225 and runs in O(n log n) time, as we hoped.
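The sort-and-count idea above can be sketched like this (a sketch in my own words and names, following the merge-with-counting description rather than the book's exact pseudocode):

```python
def sort_and_count(nums):
    """Return (sorted copy of nums, number of inversions in nums)."""
    if len(nums) <= 1:
        return nums, 0
    mid = len(nums) // 2
    left, a = sort_and_count(nums[:mid])    # inversions inside first half
    right, b = sort_and_count(nums[mid:])   # inversions inside second half
    merged, cross = [], 0
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
            cross += len(left) - i          # inverted with all remaining left elements
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged, a + b + cross

# e.g. [2, 4, 1, 3, 5] has 3 inversions: (2,1), (4,1), (4,3)
```

The only change from plain mergesort is the `cross += len(left) - i` line, which is why the running time stays O(n log n).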

5.4 Finding the Closest Pair of Points

courses/cs211/winter2011/journals/anna/chapter_5.1299622921.txt.gz · Last modified: 2011/03/08 22:22 by poblettsa
CC Attribution-Noncommercial-Share Alike 4.0 International