This section of the reading describes mergesort as the most canonical divide-and-conquer algorithm. A divide-and-conquer algorithm is an algorithm that divides the input into multiple pieces (usually 2), solves each subproblem using recursion, and then combines the results of those recursive calls into a solution. As with all recursion, the recursive calls in a divide-and-conquer algorithm need to have a base case so that the algorithm has a point at which it stops dividing and starts conquering. Usually this “bottom out” point is when the recursive method is called on an input of some predefined constant size, such as 1.
Divide-and-conquer algorithms are a little less straightforward in computing an algorithm's time complexity, but their efficiency can be expressed as a recurrence relation. Mergesort can be measured by the number of comparisons based on the input size. This measurment can be expressed as a function that computes the number of comparisons (or “time”) based on the input size n
:
T(n) ≤ 2T(n/2) + cn | if n > 2 |
T(2) ≤ c | if n = 2 |
A generalized form of the recurrence relation equation mentioned above is:
T(n) ≤ qT(n/2) + cn | if n > 2 |
T(2) ≤ c | if n = 2 |
where q
is the number of subproblems per step in the recursive algorithm. This gives us a few cases depending on what q
is…
A good way to quantify the differences between two comparable sets of data is to determine the degrees of difference between the two. A generalized way to do this is to calculate the number of inversions between the two. Assuming the data in question is organized like two lists of the same set of elements, an inversion is the minimum number of times that two elements need to trade places in the first list to become the second list. For example, if you have the following list:
2 | 4 | 1 | 3 | 5 |
---|
and you are comparing it to:
1 | 2 | 3 | 4 | 5 |
---|
you could quantify the differences between them by saying that it would take 3 inversions to get from the first list of data to the second list of data:
2 | 4 | 1 | 3 | 5 |
---|---|---|---|---|
1 | 4 | 2 | 3 | 5 |
1 | 2 | 4 | 3 | 5 |
1 | 2 | 3 | 4 | 5 |
We can use the algorithm on Pages 224-225 to count inversions. This algorithm has a time complexity of: , meaning that that is the the time it would take to find the number of inversions and perform them.