
Chapter 5

Divide and Conquer

Comments

Sorry, the comments have to come first; can't help it. TOO EXCITED! So this is how I view Divide and Conquer:

“Breaking a bundle of sticks into two halves is hard, but it is easier to break each of the sticks separately and then bind them together.” - Every time I think of the phrase “Divide and Conquer”.

“I'd rather kill each of the guards one by one stealthily than go jump in the middle of 10 guards and fight everyone at the same time.” - My view while playing Assassin's Creed.

“I prepare notes on the even numbered chapters and you prepare the notes for the odd numbered ones, and we share at the end.” - A conversation with one of my fraternity brothers, who is in the same Finance class as me.

5.1 A First Recurrence: The Mergesort Algorithm

Intuition: Divide the data into two halves, solve each part by recursion, then combine the two results to get the overall solution.

In mergesort, we need a base case so that we know when to stop dividing our data set and start returning a combined solution of the two parts. The runtime T(n) satisfies the following recurrence relation:

   For some constant c, 
        T(n) <= 2T(n/2) + cn
   when n>2, and
        T(2) <= c
        

This can also be written as T(n) ≤ 2T(n/2) + O(n), suppressing the constant c.
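
To make the recurrence concrete, here is a minimal mergesort sketch in Python (my own illustration, not the book's pseudocode): the two recursive calls are the 2T(n/2) term, and the linear merge loop at the end is the cn term.

    def mergesort(a):
        # Base case: a list of length <= 1 is already sorted.
        if len(a) <= 1:
            return a
        mid = len(a) // 2
        left = mergesort(a[:mid])    # T(n/2)
        right = mergesort(a[mid:])   # T(n/2)
        # Combine: merge the two sorted halves in O(n) time (the cn term).
        merged, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                merged.append(left[i])
                i += 1
            else:
                merged.append(right[j])
                j += 1
        return merged + left[i:] + right[j:]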

Approaches to Solving Recurrences

Unrolling the Mergesort Recurrence

“Any function T(·) satisfying (5.1) (on page 211 of the book, or the inequalities above this section) is bounded by O(n log n), when n > 1.”

The unrolling idea: at level j of the recursion there are 2^j subproblems, each of size n/2^j and each contributing at most c(n/2^j) work, so every level costs at most cn in total; there are log2 n levels, which gives the O(n log n) bound.

Substituting a Solution into the Mergesort Recurrence

Suppose we guess that T(n) ≤ cn log2 n for all n ≥ 2, and we want to check whether this bound actually holds. It does (process given on page 213, using induction).
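
A quick sketch of that induction step (my arithmetic, following the page 213 process; assume the bound already holds for n/2):

    T(n) \le 2T(n/2) + cn
         \le 2 \cdot c(n/2)\log_2(n/2) + cn
         = cn(\log_2 n - 1) + cn
         = cn \log_2 n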

An Approach Using Partial Substitution

This is a weaker substitution process. Suppose we believe that T(n) = O(n log n), but we are not sure of the constant inside the O(·) notation. We can still write T(n) ≤ kn log_b n for some constant k and base b, and then check whether there are values of k and b for which the induction goes through. In fact, there are! (Process on page 214)

The end result (taking base b = 2) is T(n) ≤ (kn log2 n) - kn + cn ≤ kn log2 n, which goes through as long as we choose k ≥ c.

5.2 Further Recurrence Relations

For a divide-and-conquer algorithm that divides the given problem into q subproblems of size n/2:

For some constant c, T(n) ≤ qT(n/2) + cn when n > 2, and T(2) ≤ c

The Case of q > 2 Subproblems

“Any function T(·) satisfying the above case with q > 2 is bounded by O(n^(log2 q)).”
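
Sketching the unrolling to see why: level j of the recursion has q^j subproblems, each of size n/2^j, so level j costs at most q^j · c(n/2^j) = cn(q/2)^j. Since q > 2, this grows geometrically in j and the last level dominates; using (q/2)^(log2 n) = n^(log2 q - 1):

    T(n) \le cn \sum_{j=0}^{\log_2 n - 1} (q/2)^j
         = O\!\left(n \cdot (q/2)^{\log_2 n}\right)
         = O\!\left(n \cdot n^{\log_2 q - 1}\right)
         = O\!\left(n^{\log_2 q}\right)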

The Case of One Subproblem

What if q = 1?

“Any function T(.) satisfying the above case with q = 1 is bounded by O(n)”
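
The reason: level j now has a single subproblem of size n/2^j, so the per-level work shrinks geometrically and the whole sum stays within a constant factor of the top level:

    T(n) \le cn \sum_{j=0}^{\log_2 n - 1} (1/2)^j \le 2cn = O(n)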

What if the cost of combination is quadratic when q = 2?
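
Then the recurrence becomes T(n) ≤ 2T(n/2) + cn². Unrolling it (my arithmetic): level j has 2^j subproblems, each costing c(n/2^j)², so the per-level work now shrinks geometrically and the top level dominates:

    T(n) \le \sum_{j=0}^{\log_2 n - 1} 2^j \cdot c(n/2^j)^2
         = cn^2 \sum_{j=0}^{\log_2 n - 1} (1/2)^j
         \le 2cn^2 = O(n^2)

So the answer is O(n²), not O(n² log n): the combining cost is so large that the recursion is essentially free.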

5.3 Counting Inversions

The Problem

This deals with how websites recommend new things that you might be interested in, based on the sites you have visited and the sites other people of “similar interest” have visited.

In general, when given a list of n distinct numbers a1, ..., an, we need to figure out how far this list is from being in ascending order. A pair of indices i < j forms an inversion if ai > aj.

Designing and Analyzing the Algorithm

The simplest method is to count by brute force, checking every pair, which gives a total runtime of O(n²). But we will try to make it run in O(n log n) time.

The idea: do a mergesort, but instead of a plain merge, perform a Merge-and-Count. This turns out to be exactly what we need.

We've already discussed mergesort, so we only need to talk about counting the inversions during the merge. This is easy. Every time we append an element from the first (left) half to the output list C, the count does not increase, since that element is smaller than everything remaining in the second half and comes before it in the original order. But if the element appended to C comes from the second (right) half, it forms an inversion with every element remaining in the first half, so we add the number of elements remaining in the first half to the count.

Algorithm on page 224 and 225.
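
Since these notes only point at the book's pseudocode, here is a runnable sketch of the idea in Python (my own naming, modeled on the book's Sort-and-Count / Merge-and-Count):

    def sort_and_count(a):
        # Returns (sorted copy of a, number of inversions in a).
        if len(a) <= 1:
            return list(a), 0
        mid = len(a) // 2
        left, inv_left = sort_and_count(a[:mid])
        right, inv_right = sort_and_count(a[mid:])
        merged, inv_split = merge_and_count(left, right)
        return merged, inv_left + inv_right + inv_split

    def merge_and_count(left, right):
        # Merges two sorted lists, counting pairs (x in left, y in right) with x > y.
        merged, count, i, j = [], 0, 0, 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                merged.append(left[i])       # no new inversions
                i += 1
            else:
                merged.append(right[j])      # inverted with everything remaining in left
                count += len(left) - i
                j += 1
        merged.extend(left[i:])
        merged.extend(right[j:])
        return merged, count

For example, sort_and_count([2, 4, 1, 3, 5]) returns ([1, 2, 3, 4, 5], 3): the inversions are (2, 1), (4, 1), and (4, 3).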

5.4 Finding the Closest Pair of Points

The Problem

Given n points in the plane, find the pair that is closest together. M. I. Shamos and D. Hoey posed this problem in the early 1970s, and their work on it helped found the field of computational geometry. The problem looks hard, and O(n²) sounds good, but they found an algorithm that solves it in O(n log n) time.

Designing the Algorithm

We are given a set of points P = {p1, p2, p3, …, pn}, all with distinct coordinates, where pi has coordinates (xi, yi), and for two points pi, pj in P, d(pi, pj) is the distance between them. We need to find a pair of points pi and pj that minimizes d(pi, pj).

We first set up a 2-D model of the problem and divide it into two halves, solving each half recursively. The only problem left is then to check whether the closest pair has one point in each of the two halves.

We sort the points by their x-coordinates (and also keep a copy sorted by their y-coordinates) once at the start; each recursive call splits the points down the middle into two subproblems, builds the x-sorted and y-sorted lists for each half in linear time, and solves each half recursively.

The only problem left now is combining the two subproblems in linear time. For the left half Q and the right half R, let the minimum distances found in each be d(q0, q1) and d(r0, r1), and let d be the smaller of the two. Now, if there exist any q in Q and r in R such that d(q, r) < d, then each of q and r lies within distance d of L, the line that separates Q from R. (Proof on page 228)

We now consider only the set S of points (from Q and R) that lie within distance d of the line L.

“If s, s', both in S, have a property that d(s, s') < d, then s and s' are within 15 positions of each other in the sorted list Sy.” (Proof on page 229)

Thus we get an algorithm on page 230.
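
And here is a Python sketch of the whole thing (my own rendering of the page 230 algorithm, assuming the points are given as distinct (x, y) tuples); the 15-position fact above is what justifies the constant-size inner loop at the end:

    import math

    def closest_pair(points):
        # points: list of distinct (x, y) tuples; returns (distance, p, q).
        px = sorted(points)                         # sorted by x-coordinate
        py = sorted(points, key=lambda p: p[1])     # sorted by y-coordinate
        return _closest(px, py)

    def _dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    def _brute(pts):
        # Base case: check all pairs when there are at most 3 points.
        best = (float("inf"), None, None)
        for i in range(len(pts)):
            for j in range(i + 1, len(pts)):
                d = _dist(pts[i], pts[j])
                if d < best[0]:
                    best = (d, pts[i], pts[j])
        return best

    def _closest(px, py):
        n = len(px)
        if n <= 3:
            return _brute(px)
        mid = n // 2
        qx, rx = px[:mid], px[mid:]                 # left half Q, right half R
        q_set = set(qx)
        qy = [p for p in py if p in q_set]          # y-sorted lists for each half,
        ry = [p for p in py if p not in q_set]      # built in linear time
        best = min(_closest(qx, qy), _closest(rx, ry), key=lambda t: t[0])
        d = best[0]
        # S: points within distance d of the dividing line L, in y-sorted order.
        line_x = px[mid - 1][0]
        sy = [p for p in py if abs(p[0] - line_x) < d]
        # Each point of S only needs to be compared to the next 15 points in Sy.
        for i in range(len(sy)):
            for j in range(i + 1, min(i + 16, len(sy))):
                dd = _dist(sy[i], sy[j])
                if dd < best[0]:
                    best = (dd, sy[i], sy[j])
        return best

For example, closest_pair([(0, 0), (3, 4), (1, 1), (5, 2)]) returns (1.414..., (0, 0), (1, 1)).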

Analyzing the Algorithm

“The algorithm correctly outputs a closest pair of points in P.” (Proof on page 231)

“The algorithm runs in O(n log n) time.” - The initial sorting by x- and y-coordinates takes O(n log n) time, and the rest of the algorithm satisfies T(n) ≤ 2T(n/2) + cn, so it also takes O(n log n) time. Thus, the total runtime is O(n log n). (Proof on page 231)

5.5 Integer Multiplication

The Problem

Normal pen-and-paper integer multiplication takes O(n²) time: you multiply each digit of the first integer by each digit of the second, and then add the partial products up, shifting each successive one left by one digit.

Designing the Algorithm

Splitting each n-bit number into its high-order and low-order n/2 bits turns one multiplication into four n/2-bit multiplications; unfortunately, that recurrence still solves to O(n²). The trick is to get both cross terms out of a single extra product, so that only three recursive multiplications are needed (q = 3). This gives the recursive algorithm written on page 233.
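
Here is a minimal sketch of that scheme in Python (my own rendering, not the book's pseudocode verbatim), leaning on Python's arbitrary-precision integers for the shifting and masking:

    def rec_multiply(x, y):
        # Multiplies non-negative integers with three recursive multiplications (q = 3).
        if x < 2 or y < 2:
            return x * y                               # base case
        n = max(x.bit_length(), y.bit_length())
        half = n // 2
        x1, x0 = x >> half, x & ((1 << half) - 1)      # x = x1 * 2^half + x0
        y1, y0 = y >> half, y & ((1 << half) - 1)      # y = y1 * 2^half + y0
        a = rec_multiply(x1, y1)                       # subproblem 1
        b = rec_multiply(x0, y0)                       # subproblem 2
        p = rec_multiply(x1 + x0, y1 + y0)             # subproblem 3: covers both cross terms
        # xy = a * 2^(2*half) + (p - a - b) * 2^half + b
        return (a << (2 * half)) + ((p - a - b) << half) + b

A quick check: rec_multiply(1234, 5678) == 1234 * 5678 comes out True.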

Analyzing the Algorithm

Unrolling the recurrence, we get T(n) ≤ qT(n/2) + cn with q = 3. Thus, the runtime drops to O(n^(log2 q)) = O(n^(log2 3)) ≈ O(n^1.59).

Questions and Comments

As of March 7th, Wednesday

No questions. Comments are on the top of the page :) And the chapter (so far) was easily readable, of course because of the class discussions, as usual.

As of March 14th, Tuesday

I would like to go through Integer Multiplication once more. I'd probably come meet you sometime this week about this. It was kind of hard to get the designing part, both from the book and from the class, though I understood the basics.

The rest of the section was easy to understand. Class helped a lot.