Chapter 2 – Basics of Algorithm Analysis
My notes on the assigned sections of Chapter 2 of Algorithm Design by Jon Kleinberg and Éva Tardos. This chapter covers the resource requirements of algorithms, namely the time and space they use, and later develops running-time bounds for some basic, popular algorithms.
2.1 – Computational Tractability
Section 1 of Chapter 2 attempts to define what efficiency means for an algorithm. The book's initial proposed definition is that an algorithm is efficient if, when “implemented, it runs quickly on real input instances”. This is unsatisfying: bad algorithms can run fast on small test cases, and good algorithms can run slowly if they are coded poorly. Furthermore, this definition doesn't take into account how an algorithm scales with increasing input size. So a second definition is proposed: “an algorithm is efficient if it achieves qualitatively better worst-case performance, at an analytical level, than brute-force search”. This leads to polynomial time as the working definition of efficiency: with a polynomial-time algorithm, when the input size increases by a constant factor, the running time should only slow down by some constant factor C. “If the input size increases from N to 2N, the bound on the running time increases from cN^d to c(2N)^d”, which is a slow-down by a factor of 2^d. So the third definition of efficiency is simply that polynomial time is efficient. The definition isn't perfect: with large constants or high exponents, a polynomial-time algorithm still won't run efficiently in practice.
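To convince myself of that 2^d factor, here's a tiny Python check (the constant c, the exponent d, and the sample sizes are just values I picked for illustration):

```python
# Toy check of the scaling argument: for a bound of the form c * n**d,
# doubling the input size multiplies the bound by exactly 2**d.
def poly_bound(n, c=1.0, d=3):
    return c * n ** d

for n in [100, 1000, 10000]:
    ratio = poly_bound(2 * n) / poly_bound(n)
    print(f"n={n:>6}: bound grows by a factor of {ratio:.1f} (= 2^3 = 8)")
```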
This section was very readable, and I would give it a score of 10/10 on both readability and my interest in it.
2.2 – Asymptotic Order of Growth
This section covers the ways in which we can classify an algorithm's order of growth. These classifiers are O, Ω, and Θ. O(.) defines the asymptotic upper bound for an algorithm, Ω(.) defines the asymptotic lower bound, and Θ(.) defines the asymptotically tight bound; an asymptotically tight bound means the function is both O(g) and Ω(g) for the same g. Asymptotic growth rates have many properties, including transitivity: if f = O(g) and g = O(h), then f = O(h), and if f = Ω(g) and g = Ω(h), then f = Ω(h). There are also interesting results when you add two functions: if f = O(h) and g = O(h), then f + g = O(h). For polynomials, the asymptotic rate of growth is determined by the highest-order term. Algorithms can be polynomial time even if their running time isn't written as n raised to an integer power (like O(n log n)). For logarithms, “the base of the logarithm is not important when writing the bounds in asymptotic notation”. Finally, with exponentials, the terminology gets very sloppy, but when someone refers to an algorithm as exponential, they mean that the running time grows as fast as some exponential function.
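Two of these claims are easy to sanity-check numerically. Here's a quick sketch (the polynomial 3n^2 + 50n + 7 and the sample sizes are just examples I made up):

```python
import math

# 1) log_2(n) and log_10(n) differ only by a constant factor (log_2(10) ~ 3.32),
#    which is why the base doesn't matter inside O(.).
# 2) For 3n^2 + 50n + 7, the highest-order term dominates: dividing by n^2
#    approaches the constant 3, so the polynomial is Theta(n^2).
for n in [10, 1_000, 1_000_000]:
    log_ratio = math.log2(n) / math.log10(n)
    poly_ratio = (3 * n**2 + 50 * n + 7) / n**2
    print(f"n={n:>9}: log2/log10 = {log_ratio:.3f}, (3n^2+50n+7)/n^2 = {poly_ratio:.5f}")
```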
Overall I thought this section was pretty readable and very interesting. I'd give it a 7/10 for readability and a 10/10 for how interesting I thought it was. A couple of the proofs were a bit difficult to follow, but with a slow, careful read I was able to understand them.
2.3 – Implementing the Stable Matching Algorithm Using Lists and Arrays
This section goes over how to actually program the G-S Stable Matching Algorithm with commonly used data structures, with some preprocessing of those data structures along the way. The two simple data structures that will be used are lists (linked-list implementation: O(i) access to the i-th element, O(1) insertion, and O(i) to delete the i-th element since you first have to walk to it) and arrays (O(1) access, O(1) adding at the end, though you might have to double the space in memory for a resize, and O(n) deleting). Preprocessing is easy, since you can convert between arrays and lists in O(n) time. We previously showed that the algorithm terminates after at most n^2 iterations, so for the whole thing to run in O(n^2) time we need to be able to implement the following 4 things in constant time:
- Identify a free man
- For a man m, we need to find the highest-ranked woman to whom he hasn't proposed
- For a woman w, we need to find out if she's currently engaged, and if so, ID her current partner
- For a woman w and two men m and m', we need to decide which of the two w prefers
Free men will be kept in a linked list because we will need to easily manipulate its contents (a linked implementation lends itself better to this since we don't have to deal with shifting elements). Here we can delete the first man in the list in constant time and insert another man into the list in constant time. To ID the highest-ranked woman to whom m hasn't yet proposed, we can keep an array Next[m] over all men, initialized to 1 and incremented after each proposal regardless of outcome. To decide if w is currently engaged and to get her partner, we can keep an array Current[w] of size n, which holds w's current partner (null if she doesn't have one). The naive way to determine which of m and m' w prefers is to walk through w's preference list one element at a time and compare the two, giving an O(n) runtime. However, if we initialize an n x n array Ranking at the beginning, where Ranking[w, m] contains the rank of m in the sorted order of w's preferences, we can compare m's and m''s rankings in O(1) time.
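Putting those pieces together, here's a rough Python sketch of the whole implementation (the function name, the input format, and the use of a deque for the free list are my own choices, not the book's; I also use 0-based indexing, so Next[m] starts at 0 instead of 1):

```python
from collections import deque

def stable_matching(men_prefs, women_prefs):
    """men_prefs[m] and women_prefs[w] are preference lists over 0..n-1."""
    n = len(men_prefs)

    # Ranking[w][m] = rank of man m in woman w's list, built once in O(n^2)
    # so that later "whom does w prefer?" comparisons are O(1).
    ranking = [[0] * n for _ in range(n)]
    for w in range(n):
        for rank, m in enumerate(women_prefs[w]):
            ranking[w][m] = rank

    next_proposal = [0] * n     # Next[m]: index of the next woman m proposes to
    current = [None] * n        # Current[w]: w's partner, or None if free
    free_men = deque(range(n))  # free men; O(1) to remove/insert at the ends

    while free_men:
        m = free_men[0]
        w = men_prefs[m][next_proposal[m]]  # highest-ranked woman not yet proposed to
        next_proposal[m] += 1               # count the proposal regardless of outcome
        if current[w] is None:              # w is free, so she becomes engaged to m
            current[w] = m
            free_men.popleft()
        elif ranking[w][m] < ranking[w][current[w]]:  # w prefers m to her partner
            free_men.popleft()
            free_men.append(current[w])     # her old partner becomes free again
            current[w] = m
        # otherwise w rejects m; he stays at the front of the free list
    return current                          # current[w] = man matched with woman w
```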
This section was explained very well and I'd give it a 10/10 in both readability and my interest. The only thing I'm still a tiny bit confused about is how the algorithm is still O(n^2) with the Ranking initialization. Is it because this initialization technically isn't part of the algorithm, so you don't count it in the algorithm runtime?
2.4 – A Survey of Common Running Times
Both the bound on an algorithm's running time and the size of the problem's natural search space (what a brute-force approach would have to look through) are important to keep in mind.
Some common linear-time algorithms are computing the maximum of n numbers and merging two sorted lists. In the case of computing the maximum, every element has to be examined, so it takes O(n) time. For merging two sorted lists, you could just concatenate them and run a sorting algorithm, but that would be wasteful. Instead, you treat them like two sorted decks of cards, repeatedly picking up whichever deck's top card is smaller and placing it in the output list; this takes at most 2n iterations, so O(n). O(n log n) algorithms arise any time an algorithm splits its input into two equal-sized pieces, solves each piece recursively, and then combines the solutions in linear time. An example of this is Mergesort. O(n log n) also frequently shows up when the most expensive step in an algorithm is sorting the input. Quadratic time often occurs with two nested loops, or when a search over all pairs of input items spends constant time on each pair. Cubic time also frequently comes from nested loops (3 instead of 2, or 2 with a linear operation in the innermost loop). O(n^k) algorithms occur when, for a constant k, we search over all subsets of size k.
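The merge step is simple enough to write out; here's a sketch (my own code, not the book's) of the two-decks idea, where each element is handled exactly once, so the whole merge is linear in the total number of elements:

```python
def merge_sorted(a, b):
    """Merge two already-sorted lists by repeatedly taking the smaller front element."""
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            result.append(a[i])
            i += 1
        else:
            result.append(b[j])
            j += 1
    result.extend(a[i:])  # at most one of these leftovers is non-empty
    result.extend(b[j:])
    return result

print(merge_sorted([1, 4, 9], [2, 3, 10]))  # [1, 2, 3, 4, 9, 10]
```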
Two bounds beyond polynomial time come up frequently: 2^n and n!. 2^n occurs as the running time of search algorithms that consider all subsets of n items. n! arises when the search space is all the ways to arrange n items in order (it grows even more rapidly than 2^n).
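As a small illustration (a toy example of my own) of where the 2^n comes from: each of the n items is either in or out of a subset, giving 2 · 2 · ... · 2 = 2^n subsets for a brute-force search to examine.

```python
from itertools import combinations

# Enumerate every subset of a tiny example set by size; the total count is 2^n.
items = ["a", "b", "c"]
all_subsets = []
for k in range(len(items) + 1):
    all_subsets.extend(combinations(items, k))
print(len(all_subsets), "subsets of", len(items), "items; 2 **", len(items), "=", 2 ** len(items))
```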
Overall, this section was pretty readable and interesting. I'd give it a 7/10. The one nagging question I have is why the total number of subsets of an n-element set is 2^n. This is probably a very simple question and basic answer, but for some reason I'm having a hard time visualizing it.
