This is an old revision of the document!


5.1 The Mergesort Algorithm

Mergesort
  • Mergesort sorts a given list of numbers by first dividing them into two equal halves, sorting each half separately by recursion, and then combining the results of these recursive calls, in the form of two sorted halves, using the linear time algorithm for merging sorted lists.
  • Template for Divide and Conquer Algorithms
    • Divide the input into two pieces of equal size; solve the two subproblems on these pieces separately by recursion; and then combine the two results into an overall solution, spending only linear time for the initial division and final recombining.
  • Base case for recursion: once the input has been reduced to size 2, we stop the recursion and sort the two elements by simply comparing them to each other
  • T(n) denotes its worst case runtime of the template above, O(n) time to divide the input into two pieces, spends T(n/2) to solve each one, O(n) to recombine the solutions from the two recursive calls; runtime T(n) solves following recurrence relation
    • T(n) ⇐ 2T(n/2) + cn when n>2; T(2) ⇐ c
  • The recurrence relation does not explicitly provide an asymptotic bound on the growth rate of the function T; rather, it specifies T(n) implicitly in terms of its values on smaller inputs.
  • To obtain an explicit bound, need to solve the recurrence relation.
Approaches to Solving a Recurrence Relation
  • Unroll the recursion: accounting for the runtime across the first few levels, and identify a pattern that can be continued as the recursion expands; then, sum the runtime once all the levels of the recursion to arrive a total runtime
  • Substituting a solution: guess for the solution, substitute it into the recurrence relation, check if it works
Unrolling the Mergesort Recurrence
  • Analyze the first few levels
    • First level: single problem of size n takes at most cn time plus subsequent recursive calls
    • Second level: two problems size n/2 each takes time cn/2 (sum of cn) plus subsequent recursive calls
    • Third level: four problems of size n/4 each takes time cn/4 (total cn)
  • Identify a pattern
    • At level j of recursion, number of subproblems doubled j times (now a total of 2^j). Each subproblem has shrunk size by factor of two j times (each size n/2^j so each takes timecn/2^j).
    • Level j total runtime: 2^j(cn/2^j) = cn
  • Sum over all levels of recursion
    • Number of times the input must be halved in order to reduce its size from n to 2 is log2(n)
    • Total runtime: O(nlogn)
Substituting a Solution into the Mergesort Recurrence
  • T(n) is bounded by O(nlogn); we have agues for the runtime that we want to verify
  • T(n) ⇐cnlog2(n) for all n>= 2
    • Base case: holds for n = 2
    • By induction, T(m) ⇐ cmlog2(m) for all values of m less than n
    • Want to establish this for T(n)
      • T(n) ⇐ 2T(n/2) + cn ⇐ 2c(n/2)log2(n/2) + cn = cn[log2(n) – 1] + cn = cnlog2(n) – cn + cn = cnlog2(n)
      • This establishes what we want for T(n)
An Approach Using Partial Substitution
  • Somewhat weaker kind of substitution one can do, in which guesses the overall form of the solution without pinning down the exact values of all the constant and other parameters at the outset
  • Method can actually be useful in working out the exact constant when one has some guesses of the general form of the solution

5.2 Further Recurrence Relations

Description
  • More general class of algorithms is obtained by considering divide-and-conquer algorithms that create recursive calls on q subproblems of size n/2 and then comibine the results in O(n) time
  • If T(n) denotes the runtime of an algorithm designed in this style, then T(n) obeys the following recurrence relation
    • T(n) ⇐ qT(n/2) + cn when n > 2; T(2) ⇐c
The Case of q>2 Subproblems
  • Unroll the recursion
  • Analyzing the first few levels
    • First level: one problem of size n, cn time
    • Second level: q problems of size n/2, each cn/2 time (total qcn/2)
    • Third level: q^2 problems of size n/4, total time (q^2/4)cn
    • Work per level is increasing as we recurse
  • Identifying a pattern
    • Level j is q^j(c/2^j) = (q/2)^jcn
  • Summing over all levels of recursion
    • Log2n levels of recursion
    • Get a geometric sum whose solution is O(n^(log2q))
  • Running time is more than linear but still polynomial in n
The Case of One Suproblem
  • q = 1
  • Unroll the recursion
  • Analyzing the first few levels
    • First level: one problem of size n, cn time
    • Second level: one problem of size n/2, cn/2 time
    • Third level: one problems of size n/4, total time cn/4
    • Work per level is decreasing as we recurse
  • Identifying a pattern
    • Level j is one instance at size n/2^j, so runtime is cn/2^j
  • Summing over all levels of recursion
    • Log2n levels of recursion
    • Get a geometric sum whose solution is T(n) ⇐ 2cn = O(n)
  • Geometric series with a decaying exponent is a powerful thing: fully half the work performed by the algorithm is being done at the top level of recursion
  • When q = 1, runtime is linear; when q = 2, runtime is O(nlogn); when q > 2, runtime is polynomial bound with an exponent larger than 1 that grows with q.
  • When q = 1, the total runtime is dominated by the top level, whereas when q > 2 runtime is dominated by the work done on constant-size subproblems at the bottom of the recursion.
  • Recurrence is based on the following divide and conquer structure:
    • Divide the input in to two pieces of equal size; solve the two subproblems on these pieces separately by recursion; and then combine the two results into an overall solution, spending quadratic time for the initial division and final recombining.
  • T(n) ⇐2T(n/2) +cn^2 when n > 2; T(2) ⇐ c
  • Unroll the Recurrence
  • Analyzing the first few levels
    • First level: one problem of size n, cn^2 time
    • Second level: two problems of size n/2, each c(n/2)^2 time (total time of cn^2/2)
    • Third level: four problems of size n/4, each c(n/4)^2 time cn/4 (total time of cn^2/4)
    • Work per level is decreasing as we recurse
  • Identifying a pattern
    • Level j is 2^j subproblems at size n/2^j, so runtime is cn^2/2^j
  • Summing over all levels of recursion
    • Similar sum to q = 1 case
    • Log2n levels of recursion
    • Get a geometric sum whose solution is T(n) ⇐ 2cn^2 = O(n^2)

5.3 Counting Inversions

courses/cs211/winter2018/journals/nasona/chapter5.1520782459.txt.gz · Last modified: by nasona
CC Attribution-Noncommercial-Share Alike 4.0 International
Driven by DokuWiki Recent changes RSS feed Valid CSS Valid XHTML 1.0