Chapter 2
2.1
The authors start by talking about how, in designing algorithms, efficiency is key. But what exactly does efficiency mean? Well, that is not an easy question. In fact, it's so difficult a question to answer that it took the authors of this book–with their super-smart brains and their Ph.Ds from M.I.T. and Eötvös–three whole tries to make up their minds. Or maybe that's just them doing what they said they'd do in the preface and approach algorithm design like a “thought process.” It reminds me a little bit of Goldilocks and the Three Bears. The first working definition of efficiency (runs fast on real inputs) is too inflexible. The second working definition (qualitatively better worse-case performance than brute-force) is too vague. And the third proposed definition is just right…well not really. The authors point out that there are of course, exceptions to the definition they settle on (running an algorithm in “polynomial time.”) And that technically n^200 shouldn't be considered efficient. But they say it usually works, so that's a good definition. Plus it's negatable.
2.2
First, the authors state that, in measuring efficiency of an algorithm, it's not really important to count the exact number of steps you have for a given input size, it's sufficient to describe it in less detail than that. Then, they introduce the idea of asymptotic upper bounds–or big O…and define it mathematically, and give an example to illustrate. They explain that you can have many upper bounds…big O isn't necessarily going to be the smallest upper bound for the algorithm.
They do a similar thing for the omega–the asymptotic lower bound–definition, illustration. And then they put the omega together with the big o dude and introduce the big theta tight bound, and explain why that's a useful thing to have because they are so precise. They show how to get it by taking a limit.
Then they run through, (and partially prove) properties of the bounding functions: transitivity, that adding two functions gives a function of the same order, that a function which is an asymptotic upper bound for another function is a tight bound for the sum of that function and itself.
Next, they run through noting properties of the different types of asymptotic bounds. For instance, the bound of polynomial functions is always determined by the highest exponent thingy in the function. They also point out that not all polynomial types are bounded by n raised to an integer power, like there can be O(n^1.59) Then, for logarithmic time they point out it doesn't really matter what the base of the log is because you can always translate between logs. O(log n) is sufficient.
This isn't true for exponentials, however. Just saying it's exponential time is sloppy because technically all exponentials are different. But that doesn't usually matter too much, because exponential is really slow anyway.
Section 2.3 Starts with the authors saying how you have to explore possible data structures in order to properly analyze an algorithm. They then overview some properties of arrays: O(1) access O(n) search if unsorted O(log n) if sorted
They then overview (doubly linked) lists–quicker to delete/insert from end/beginning (book doesn't say O(1)?) but O(n) to add. Lists are better for things that change. O(n) to convert between array and list.
So, in implementing the Gale-Shapely algorithm they choose to represent preferenced with arrays, free men with a linked list, and pretty much what we talked about in class.
Section 2.4 Starts by recommending programmers consider “search space” (how much time the brute force method would take) when they begin to conduct analysis on a problem.
Then they forget about the whole brute force thing for awhile and analyze algorithms that run in O(n) time, like computing the maximum of a list, and merging two sorted lists. They talk about the strategy they use for proving the bound on the merging as an “accounting” scheme.
Then it talks about O(n*log(n)) time, and merge sort.
Then it talks about quadratic time, which arises when you need to work with pairs of inputs, and when you see a nested for loop. Finding the closest pair of points in a set.
Then cubic time, which happens with “more elaborate nested for-loops.” Finding whether two subsets are disjoint.
Then O(n^k): finding independent sets in a graph (n^k is number of subsets)
Then “beyond polynomial time”, like O(n^2*2^n) like for finding largest independent set in a graph.
n! is scarier than 2^n, n! is like all the perfect matchings in the Stable Matching Problem, for instance. But the authors brag how they were able to solve it in O(n^2), and later, with n! matchings of bipartite matching problem they can solve it in O(n^3). They must be really smart.
Also, n! is search space for travelling salesman problem for which there is no known efficient solution.
2.5 Introduce priority queues. Priority queues can sort in O(n) priority queue operations, so if each operation is log n, then O(n*log n). Implementing such a priority queue, however, can be a heap of trouble. So, the authors introduce heaps. Heaps “combine benefits of sorted array and list.” I still don't understand why initializing the heap is O(n). Not a lot of elaboration in the text.
I have a feeling these last three sections would have been much more readable if it had been new material. It seemed to overlap almost entirely what I had already learned in the lecture. Probably it's easier to learn this material in the lecture, because you'd passively learn it from the book because it's all there laid out for you–but you get to learn it actively in the lecture because the problems are posed and your mind has to actively explore for an answer until another student utters it or it's explained by Dr. Sprenkle.