Differences

This shows you the differences between two versions of the page.

--- courses:cs211:winter2012:journals:jeanpaul:chapter_fivesection_i [2012/03/06 15:25] – mugabej
+++ courses:cs211:winter2012:journals:jeanpaul:chapter_fivesection_i [2012/03/06 22:23] (current) – mugabej
@@ Line 12: / Line 12: @@
   * It then spends T(n/2) to solve each one since T(n/2) is the worst-case running time for an input of size n/2
   * Finally, it spends O(n) time to combine the solutions from the 2 recursive calls
-  * Thus, the following ** recurrence relation ** is satisfied by the running time:
+  * Thus, the running time satisfies the following ** recurrence relation **:
     * **T(n) ≤ 2T(n/2) + cn** for n >2
     * And T(2) ≤ c
+Thus, 2 is the base case of the recurrence relation in Mergesort algorithm. We assume n is even. To solve for the overall running time of the algorithm, we need to solve the recurrences involved in the algorithm.\\
+\\
+** Approaches to Solving Recurrences **
+There are two general approaches to solving recurrences:\\
+  * **To Unroll the Recurrence **: Account for the running time across the first few levels, and identify a pattern that can be continued as the recursion expands. Then we sum up the running times over all levels of recursion to get the overall running time.
+  * ** To substitute a Solution into the Mergesort Recurrence** : Start with a guess for the solution, substitute it into the recurrence relation, and then check that it works. This approach is formally justified by induction on n.
+===== Unrolling the Mergesort Recurrence =====
+**Analyze the first few levels**:\\
+\\
+>>>>>>>>>>>>>>> At the first level of recursion, we have a single problem of size n. It takes at most cn + time spent in all subsequent recursive calls.\\
+>>>>>>>>>>>>>>> At the next level, we have two problems of size n/2. Each takes at most cn/2 + time spent on subsequent recursive calls, for a total of at most cn.\\
+>>>>>>>>>>>>>>> A the third level, we have 4 problems each of size n/4, each taking time at most cn/4, for a total of at most cn.\\
+\\
+** Identifying the pattern **:\\
+\\
+>>>>>>>>>>>>>>> At level j of the recursion, the number of subproblems has doubled j time\\
+>>>>>>>>>>>>>>> So, there are now a total of 2^j subproblems.\\
+>>>>>>>>>>>>>>> Each subproblem has in turn decreased its size by a factor of two j times\\
+>>>>>>>>>>>>>>> So,each subproblem has size n/2^j and takes time at most cn/2^j\\
+>>>>>>>>>>>>>>> Thus level j costs us at most 2^j*(cn/2^j) = cn \\
+\\
+** Summing over all levels of recursion **:\\
+\\
+>>>>>>>>>>>>>>> We know that cn is the upper bound and applies to the total amount of work performed at each level.\\
+>>>>>>>>>>>>>>> The number of times the input must be halved to reduce its size from n to 2 is log<sub>2</sub>n\\
+>>>>>>>>>>>>>>> Thus, after summing the cn work over all logn levels of recursion, we get a total running time of O(nlogn).\\
+>>>>>>>>>>>>>>> So in brief, any function T(.) for which T(n) ≤ 2T(n/2) + cn is bounded by O(nlogn) when n>1.\\
+\\
+====  Substituting into the Mergesort Recurrence ====
+>>>>>>>>>>>>>>> Let's suppose we think that T(n) ≤ cn log<sub>2</sub>n for all n ≥ 2 and we want to check whether it's true.\\
+>>>>>>>>>>>>>>> The above claims holds for n=2 since cn log<sub>2</sub>n = 2c and we know by the recurrence relation that T(2) ≤ c\\
+>>>>>>>>>>>>>>> Now, we assume by induction that T(m)≤ cmlog<sub>2</sub>m for all m < n, and we want to make prove this for T(n).\\
+>>>>>>>>>>>>>>> To do this, we just compute after appropriate substitutions:\\
+>>>>>>>>>>>>>>> T(n) ≤  2T(n/2) + cn\\
+>>>>>>>>>>>>>>>>>>>>>> ≤ (2cn/2) log<sub>2</sub>(n/2) + cn\\
+>>>>>>>>>>>>>>>>>>>>>> = cn[(log<sub>2</sub>n)-1)] + cn\\
+>>>>>>>>>>>>>>>>>>>>>> = (cn log<sub>2</sub>n) - cn + cn\\
+>>>>>>>>>>>>>>>>>>>>>> = cn log<sub>2</sub>n\\
+>>>>>>>>>>>>>>> This argument establishes the bound T(n) we want, assuming it holds for smaller values m<n, and thus completes our induction proof.\\
+\\
+** An Approach Using Partial Substitution**:\\
+\\
+With this approach:
+>>>>>>>>>>>>>>> Guess the overall form of the solution without pinning down the exact values of all the constants and other parameters.\\
+>>>>>>>>>>>>>>> Suppose we believe that T(n) = O(nlogn) but we're  not sure of the constant inside the T(.) notation. So,\\
+>>>>>>>>>>>>>>> T(n)≤ kn log<sub>b</sub>nfor some constant k and base b that we solve for.\\
+>>>>>>>>>>>>>>> We try to use the induction as follows:\\
+>>>>>>>>>>>>>>> T(n)≤ 2T(n/2) + cn ≤ 2k(n/2) log<sub>b</sub>(n/2) + cn\\
+>>>>>>>>>>>>>>> Let's now choose 2 as our base since it can help us simplify our expression.So,\\
+>>>>>>>>>>>>>>> T(n) ≤ 2k(n/2)log<sub>2</sub>(n/2) + cn\\
+>>>>>>>>>>>>>>>>>>>>>>>>>> = 2k(n/2)[(log<sub>2</sub>n)-1] + cn\\
+>>>>>>>>>>>>>>>>>>>>>>>>>> = kn[(log<sub>2</sub>n)-1] + cn\\
+>>>>>>>>>>>>>>>>>>>>>>>>>> = (knlog<sub>2</sub>n) -kn + cn\\
+>>>>>>>>>>>>>>> Finally we ask ourselves if there is a choice of k that will make this expression be bounded by knlog<sub>2</sub>n\\
+>>>>>>>>>>>>>>> From the expression,it's obvious that we just need to choose a k that is at least as large as c, which gives us:\\
+>>>>>>>>>>>>>>> T(n) ≤ (knlog<sub>2</sub>n) -kn + cn ≤ knlog<sub>2</sub>n\\
+>>>>>>>>>>>>>>> Thus our proof is completed.\\
+\\
+As useful as it is, our this section was really short and straightforward. I give a 9/10.
-T(n/2) + T