Chapter 2
2.1: Computational Tractability
Summary of the Section
The two main components of computational efficiency are running time and the amount of space (memory) an algorithm uses. A good way to gauge the efficiency of an algorithm is to analyze its worst-case runtime. Additionally, an algorithm should be more efficient than a brute-force search algorithm, and it should carry some intellectual weight or flavor in its structure beyond simply trying everything. The text then settles on a definition of efficiency: an algorithm is efficient if it has a polynomial running time. Finally, we find out that problems that have polynomial-time algorithms as a solution almost always have algorithms whose running times are proportional to moderately growing polynomials.
Notes on the Section
Worst-case analysis of an algorithm does a reasonable job of capturing an algorithm's efficiency in practice
The natural "brute-force" algorithm for the Stable Matching Problem plows through all perfect matchings by enumeration, checking each one to see if it is stable
General extrapolation: "brute force" enumerates all possible solutions to the problem and checks whether each one meets the requirements of the problem
Brute-force search algorithms typically have exponential runtime
The concept of a “reasonable running time”: what defines it?
Good algorithm: if the input size increases by a constant factor, the algorithm should only slow down by some constant factor C
Polynomial running time / polynomial-time algorithm: on an input of size N, the running time is bounded by cN^d steps, for constants c and d
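For example (a quick sanity check of my own, not from the text): if T(N) = cN^d, then T(2N) = c(2N)^d = 2^d · cN^d = 2^d · T(N), so doubling the input size slows a polynomial-time algorithm down by only the constant factor 2^d.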
problems for which polynomial-time algorithms exist almost always have algorithms with running times proportional to moderately growing polynomials (n, nlogn, n^2, n^3)
Worst-case, polynomial-time bounds are only an abstraction of practical situations
An algorithm is efficient if it has a polynomial running time
This definition is more of an absolute notion than alternatives
Closely connected to the idea that each problem has an intrinsic level of computational tractability: some admit efficient solutions, others do not
Questions
There was no true definition of computational tractability given in the section. As I understand it, a computationally tractable problem would be one that can be solved by an algorithm with polynomial runtime. Is this correct?
Are there any situations in which a brute force algorithm would be better? For instance, if you knew your data size was really small.
Is there a way to tell if you have an algorithm with the best possible polynomial runtime to solve a problem?
Discuss anything that makes more sense after reading it again, after it was presented in class (or vice versa)
At first, I was confused about what the text meant by a "brute-force" search algorithm. I had an idea of what it was, but it clicked after hearing the significance of comparing your algorithm against a brute-force search algorithm and making sure yours performs better. Through that explanation I better understood what a brute-force search algorithm is: an algorithm that checks every possible solution to the problem until it finds the right one, which has exponential runtime.
How Readable the Section was
On a scale of 1 to 10, this section was a solid 7 because I thought that the book did a good job of explaining why using worst case runtime is important, and why polynomial-time algorithms are ideal. However, I found the language to be unclear at times.
2.2: Asymptotic Order of Growth
Summary of the section
We typically express the worst-case runtime of an algorithm on inputs of size n as growing at a rate that is at most proportional to some function f(n). T is asymptotically upper-bounded by f when there exist constants c > 0 and n0 >= 0 so that for all n >= n0, we have T(n) <= c·f(n). T is asymptotically lower-bounded by f when there exist constants ε > 0 and n0 >= 0 so that for all n >= n0, we have T(n) >= ε·f(n). If a runtime is bounded by both O(f(n)) and Ω(f(n)), then we say f(n) is an asymptotically tight bound for T(n). Then, the text explained the properties of asymptotic growth rates, which include transitivity and sums of functions. Finally, the section ends with an explanation of the asymptotic growth rates of polynomials (T(n) is O(n^d) where d is the degree), logarithms (log_b(n) = O(n^x) for every b > 1 and x > 0), and exponentials (n^d = O(r^n) for every r > 1 and d > 0).
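A worked example of my own (not from the text) to pin down the constants: for T(n) = 3n^2 + 5n, since 5n <= n^2 whenever n >= 5, we have T(n) <= 4n^2 for all n >= 5, so c = 4 and n0 = 5 witness T(n) = O(n^2); and T(n) >= 3n^2 for all n >= 1, so ε = 3 and n0 = 1 witness T(n) = Ω(n^2). Hence n^2 is an asymptotically tight bound for T(n).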
Notes on the Section
Questions
The book mentioned O(•) and Ω(•). What is the significance of the “•”? Is it because this type of asymptotic upper bound/lower bound is used so often?
Does the book not mention factorial runtimes in this section because they run so slowly that you would never want to keep an algorithm that runs that way? If an algorithm has factorial runtime, at that point, would a brute-force search be better, since its runtime is usually 2^n?
Discuss anything that makes more sense after reading it again, after it was presented in class (or vice versa)
In class it definitely became clearer/easier to understand how to translate all of the symbols in the section into words, which made reading back over the notes I took on the section sink in better because those symbols actually meant something to me. Moreover, I was unclear on what exactly it meant to be an asymptotic upper bound and an asymptotic lower bound. I understood that for upper bounds f(n) would be greater than T(n) and for lower bounds f(n) would be less than T(n). However, I was not sure whether, in order to be an asymptotic growth rate, the entire function f(n) would have to be greater than or less than T(n), so that they could never cross. After seeing the slide in class with a graph whose f(n) and T(n) intersect multiple times, it became clear that was not the case. There just needs to be a point from which f(n) is always greater than or always less than T(n).
How readable the section was
On a scale of 1 to 10, this section was a 6 on the first read because all of the symbols made it difficult to completely comprehend all of the information. However, after reading it again, the section became more of an 8 because once I got a hang of the notation and after going to class to hear the terms used out loud, my reading had more of a natural flow to it. Thus, the section was easier to read a second time.
2.3 Implementing the Stable Matching Algorithm
Summary
In order to implement the Gale-Shapley algorithm in O(n^2) runtime, we need to consider which data structures to use and how they will be manipulated, and, in turn, limit the number of computational steps. We only need two of the simplest data structures to implement the Gale-Shapley algorithm in O(n^2): arrays and lists. Linked lists are good for maintaining a dynamically changing set of data. However, arrays are not good for sets of data that are constantly changing because resizing results in O(n) runtime. Now, for the implementation, the set of unmatched men will be kept as a list because the set is constantly changing as the algorithm runs. Engagements will be tracked in two arrays of length n. Whom each man will propose to next will be tracked in an array of integers, indicating what position on m's list he will propose to next. Finally, the preference lists will be implemented as two 2D arrays. However, the women's array is accessed differently than the men's array in order to keep each operation within the while loop at O(1) runtime.
Outline of the Gale-Shapley Algorithm Implementation
The motivation is that we need to figure out what data structures we should use to implement the Gale-Shapley algorithm in O(n^2) runtime.
To start off, the preference lists for each man and each woman will be represented as two arrays, one for women’s preference lists and one for men’s preference lists
Each of the following four things (which would be performed inside a while loop that executes O(n^2) times) needs to be performed in constant time in order to maintain an overall O(n^2) runtime of the algorithm:
Identify a free man m.
We need to be able to identify, for m, the highest-ranked woman to whom he has not yet proposed.
We need to decide if w is currently engaged; if she is, identify her current partner.
For w, m, and m’, we need to decide which of m or m’ is preferred by w.
Selecting a free man: maintain the free men in a linked list; take the first man on the list, and when a man becomes free, insert him back in O(1)
Identifying the highest-ranked woman to whom m has not yet proposed: maintain an extra array Next indicating, for each man m, the position of the next woman on his list he will propose to, incrementing it after each proposal; O(1)
After m proposes to w, identify if w is engaged to an m'
We will maintain another array of length n where position w holds w's current partner m'. Then, use a null symbol to indicate that a woman is not currently engaged. To initialize the data structure, we assign each woman the null symbol because everyone is free at the start of the algorithm.
O(1)
Determine which of m’ and m is preferred by w
If we structure the women's preference array like the men's, deciding which of m and m' is preferred would take O(n) time per proposal, which is not good, especially because we know we can do better: we are aiming for O(n^2), and that structure would give an O(n^3) runtime overall. Instead, we should create an n x n array where position (w, m) contains the rank of m in the sorted order of w's preferences. The initial time investment to build this array is O(n) per woman, O(n^2) in total. To decide which of m' and m is preferred by w, compare the values at the positions of m and m' in w's row.
O(1)
Implementing the algorithm with these data structures allows the Gale-Shapley algorithm to run in O(n^2) time overall.
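To make the outline above concrete, here is a minimal Python sketch of my own (not from the text); the names gale_shapley, next_prop, rank, and so on are mine, it uses 0-indexed arrays instead of the book's 1-indexed ones, and a deque stands in for the book's linked list of free men.

    from collections import deque

    def gale_shapley(men_prefs, women_prefs):
        # men_prefs[m]: m's list of women, best first; women_prefs[w]: w's list of men.
        n = len(men_prefs)
        # Ranking array: rank[w][m] = position of m on w's list.
        # Built in O(n) time per woman, O(n^2) in total.
        rank = [[0] * n for _ in range(n)]
        for w in range(n):
            for pos, m in enumerate(women_prefs[w]):
                rank[w][m] = pos
        next_prop = [0] * n         # position on his list each man proposes to next
        husband = [None] * n        # husband[w]; None means w is free
        wife = [None] * n           # wife[m]; None means m is free
        free_men = deque(range(n))  # stands in for the book's linked list of free men

        while free_men:
            m = free_men[0]                  # identify a free man: O(1)
            w = men_prefs[m][next_prop[m]]   # highest-ranked woman m hasn't proposed to: O(1)
            next_prop[m] += 1
            if husband[w] is None:           # w is free: O(1)
                free_men.popleft()
                wife[m], husband[w] = w, m
            elif rank[w][m] < rank[w][husband[w]]:  # w prefers m to her partner: O(1)
                m_prime = husband[w]
                wife[m_prime] = None
                free_men.popleft()
                free_men.append(m_prime)     # m' becomes free again
                wife[m], husband[w] = w, m
            # otherwise w rejects m and he stays at the front of the free list
        return wife

Since every step inside the loop is O(1) and each of the n men proposes at most n times, the loop body executes at most n^2 times, for O(n^2) overall.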
Arrays vs. Lists
An array supports O(1) access to its ith element, but it is not good for a dynamically changing set because insertion and deletion require O(n) work. A linked list supports O(1) insertion and deletion (given a pointer to the position) and is good for a dynamically changing set, but finding the ith element takes O(i) time.
Questions
If we consider the problem that we had on Problem Set #1, how would allowing ties change our data structures? Would the women's preference array merely have to allow two men to have equal rankings? Would the men's array have to be a 3D array (an array of arrays of arrays) to accommodate ties?
Is there a way to implement the Gale-Shapley algorithm in O(n^2) runtime if we use objects to represent the men and women, with the data of preference lists, etc. inside the objects?
After discussing the implementation of the algorithm in class with the various data structures, I had a hard time comprehending the women's preference list implementation because it was the reverse of the men's. However, after reading the text, it became clear that you merely need to compare the position of man m to the position of man m' in w's row of the women's ranking array, which makes the step of determining whether w prefers m or m' constant time, O(1). The women's ranking array is merely the inverse of each woman's preference list.
The section overall, I would give a solid 9 out of 10 on the readability scale. The progression from refreshing the reader on lists and arrays to using them to implement the Gale-Shapley algorithm was extremely helpful and definitely made the implementation section of the reading easy to comprehend. Additionally, the implementation section of the reading was broken up into four parts, the main actions within the while loop, organized in the order that the steps would be visited. The only reason the section is not a 10 is because I had to re-read the women's preference list description twice to fully comprehend its implementation.
2.4 Common Runtimes
Summary
Linear time happens when an algorithm spends a constant amount of time on each piece of the input. Quadratic time arises when nested loops are involved; both loops are O(n), so you multiply the runtimes together to get O(n^2). O(nlogn) typically comes about in a divide-and-conquer situation: the algorithm splits the input into two pieces, solves each piece recursively, then merges the results together into a single output. O(logn) happens when an algorithm divides the active portion of the input in half on each iteration. O(n^k) occurs for any constant k when we search over all subsets of size k. O(2^n) is the runtime of a natural brute-force search algorithm, and O(n!) comes about when the algorithm operates on every possible permutation of the input.
Notes
Most problems have a natural “search space”. A unifying theme in algorithm design is the search for algorithms whose performance is more efficient than a brute-force enumeration of the search space. When we get a new problem, we have to think about two different bounds: the runtime we hope to achieve and the size of the problem’s natural search space (the runtime of a brute-force search algorithm).
Linear Time
Process the input in a single pass, spending a constant amount of time on each item of input encountered.
Constant work per element yields a runtime of O(n).
Example: computing the maximum in a set of numbers, constant work per element
Example: merging two sorted lists. If you look at the top card on each stack, you know that the smaller of the two should go first in the output pile; so you remove that card, place it on the output, and iterate on what's left.
People typically describe linear-time algorithms as doing constant work per element, but that is not quite right. It is more accurate to say each iteration involves a constant amount of work, and there are at most n iterations, so the total runtime is O(n).
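A minimal Python sketch of the merging example (my own code, not the book's): each iteration does a constant amount of work and moves one element to the output, so merging lists with n total elements takes O(n) time.

    def merge(a, b):
        # a and b are sorted lists. Each iteration does constant work and
        # moves one element to the output, so the total is O(n) for n
        # total elements.
        out = []
        i = j = 0
        while i < len(a) and j < len(b):
            if a[i] <= b[j]:          # the smaller "top card" goes first
                out.append(a[i]); i += 1
            else:
                out.append(b[j]); j += 1
        out.extend(a[i:])             # one list is exhausted; append the rest
        out.extend(b[j:])
        return out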
O(nlogn) Time
O(nlogn) is the runtime of any algorithm that splits its input into two equal-sized pieces, solves each piece recursively, and then combines the two solutions in linear time.
Sorting: The merge sort algorithm divides the set of input numbers into two equal-sized pieces, sorts each half recursively, and then merges the two sorted halves into a single sorted output list.
O(nlogn) algorithms are so common because, typically, the most expensive step is to sort the input.
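Following on, a merge sort sketch of my own that reuses the merge function from the linear-time sketch above: split the input into two equal halves, sort each recursively, and combine in linear time, giving O(nlogn) overall.

    def merge_sort(nums):
        # Base case: a list of 0 or 1 elements is already sorted.
        if len(nums) <= 1:
            return nums
        mid = len(nums) // 2
        # Solve the two halves recursively, then combine in linear time
        # using the merge routine from the linear-time example above.
        return merge(merge_sort(nums[:mid]), merge_sort(nums[mid:]))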
Quadratic Time
Quadratic time arises naturally from a pair of nested loops: an algorithm consists of a loop with O(n) iterations, and each iteration of the loop launches an internal loop that takes O(n) time. When you multiply these together, you get O(n^2) runtime.
Example: given n points in the plane, find the pair that is closest together
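A quick sketch of the brute-force closest-pair example (my own Python, with hypothetical names): the pair of nested loops examines all O(n^2) pairs and does constant work per pair.

    import math

    def closest_pair(points):
        # points: a list of (x, y) tuples. The nested loops examine all
        # O(n^2) pairs and do a constant amount of work per pair.
        best = (math.inf, None, None)
        for i in range(len(points)):
            for j in range(i + 1, len(points)):
                (x1, y1), (x2, y2) = points[i], points[j]
                d = math.hypot(x1 - x2, y1 - y2)   # Euclidean distance
                if d < best[0]:
                    best = (d, points[i], points[j])
        return best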
Cubic Time
More elaborate nested loops lead to algorithms that run in O(n^3) time.
Example: we want to know whether some pair of the sets S1, S2, ..., Sn (each a subset of {1, ..., n}) is disjoint
Each of the sets has a maximum size of n, so the innermost loop takes time O(n). Looping over the sets Sj involves O(n) iterations around this innermost loop, and looping over the sets Si involves O(n) iterations around this. Multiply these three factors of n together, and you get O(n^3).
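A sketch of the disjointness check (my code, assuming each set is given as a Python set, so a hash-based membership test stands in for the book's constant-time membership check):

    def has_disjoint_pair(sets):
        # sets: a list of n Python sets, each a subset of {1, ..., n}.
        n = len(sets)
        for i in range(n):                 # O(n) iterations around...
            for j in range(i + 1, n):      # ...O(n) iterations around...
                # ...an innermost O(n) loop over the elements of sets[i];
                # the hash-set membership test plays the book's O(1) check.
                if not any(x in sets[j] for x in sets[i]):
                    return (i, j)          # found a disjoint pair
        return None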
O(n^k) Time
We get a runtime of O(n^k) for any constant k when we search over all subsets of size k.
Example: finding independent sets in a graph. The natural brute-force algorithm enumerates all subsets of k nodes, and for each subset S it checks whether there is an edge joining any two members of S. The total runtime is O(k^2 · n^k), but because k is a constant, the k^2 factor is a constant too, so the total runtime is O(n^k).
The total number of k-element subsets of an n-element set is C(n, k), read "n choose k", which is at most n^k/k! = O(n^k)
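A brute-force independent-set sketch of my own in Python, assuming the graph is given as a node list and an edge list:

    from itertools import combinations

    def find_independent_set(nodes, edges, k):
        # Enumerate all C(n, k) = O(n^k) subsets of k nodes; for each,
        # check all O(k^2) pairs of members for an edge. Since k is a
        # constant, O(k^2 * n^k) is O(n^k).
        edge_set = {frozenset(e) for e in edges}
        for subset in combinations(nodes, k):
            if all(frozenset(pair) not in edge_set
                   for pair in combinations(subset, 2)):
                return subset              # no edge joins any two members
        return None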
Beyond Polynomial Time
Two types of bounds come up frequently, which are 2^n and n!
2^n comes about naturally as a runtime for search algorithms that must consider all subsets of an n-element set
2^n is the natural size of a search space for many problems, and for many of them we can find efficient polynomial-time algorithms. But, sometimes you can by-pass the brute-force search algorithm and sometimes you can’t.
Search spaces of size n! come about for two reasons: n! is the number of ways to match up n items with n other items (despite this, we were able to solve the Stable Matching Problem in O(n^2)), or the search space consists of all ways to arrange n items in order (as in the Traveling Salesman Problem)
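For the arrange-n-items case, a brute-force Traveling Salesman sketch (my own code, assuming a hypothetical distance matrix dist): fixing the starting city leaves (n-1)! orderings to try.

    from itertools import permutations

    def shortest_tour(dist):
        # dist[i][j]: distance between cities i and j (a full matrix).
        # Fix city 0 as the start; try all (n-1)! orderings of the rest.
        n = len(dist)
        best_len, best_tour = float("inf"), None
        for rest in permutations(range(1, n)):
            tour = (0,) + rest
            # Sum the legs of the tour, wrapping back to the start city.
            length = sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))
            if length < best_len:
                best_len, best_tour = length, tour
        return best_len, best_tour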
Sublinear Time
Smaller than linear
These situations happen when the input can be “queried” indirectly rather than read completely. The goal is to minimize the amount of querying that must be done.
Example: binary search
We shrink the size of the active region by a factor of two with every probe. So, after k probes its size is at most (1/2)^k · n.
When k = log2(n) the size of the active region has reduced to a constant, the algorithm bottoms out and we can search the remainder of the array directly in constant time.
Binary search is O(logn) because of the successive shrinking of the search region.
O(logn) happens when an algorithm does a constant amount of work to throw away a constant fraction of the input. O(logn) such iterations suffice to shrink the input down to constant size, at which point the problem can be solved directly.
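A binary search sketch in Python (my own code): each probe discards half of the active region, so at most about log2(n) probes are needed.

    def binary_search(arr, target):
        # arr must be sorted. Each probe discards half of the active
        # region [lo, hi), so at most about log2(n) probes are needed.
        lo, hi = 0, len(arr)
        while lo < hi:
            mid = (lo + hi) // 2
            if arr[mid] == target:
                return mid
            elif arr[mid] < target:
                lo = mid + 1       # discard the lower half
            else:
                hi = mid           # discard the upper half
        return -1                  # target is not present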
Questions
When depicting exponential runtime, do you want us to give a specific base like O(2^n), or would a general O(k^n) suffice?
Are there any instances in which O(n^2) arises without the presence of nested loops?
In class, we didn't really talk about O(2^n) or O(n!) in too much depth, so it was really interesting and helpful to see that in the reading. Until doing the reading, I wasn't quite sure why O(2^n) was such a common or important runtime, but now it is obvious why: it is the natural runtime of a brute-force search algorithm, which we try to improve upon in all of the algorithms that we create in this class. Moreover, the way a factorial runtime comes about was more explicitly explained in the reading than in class. Although we would hope never to have an algorithm that runs this inefficiently, it is still important to know how it comes about: n! is the number of ways to match up n items with n other items, and it also arises when the search space consists of all ways to arrange n items in order.
This section was a 9 out of 10 in readability because the runtimes were explained very clearly. The only thing that got a little confusing was the explanation of some of the examples. However, overall, the language was very precise and the common runtimes were explained in great detail.
2.5 Priority Queues
Summary
A priority queue consists of elements that each have a priority value, also known as a key, and each time we select an element from the set, we take the element with the highest priority. We want to be able to add, delete, and select in O(logn) time. The heap is the best data structure for implementing a priority queue. A heap is a balanced binary tree: the tree has a root, and each node can have up to two children, a left and a right child. Elements in a heap are in heap order if the key of any element is at least as large as the key of the element at its parent node in the tree. We will use an array to implement the heap: an element at index i has its left child at position 2i and its right child at position 2i+1. When adding an element, we employ an algorithm called Heapify-up, in which you add the new element after the current elements in the array, compare it to its parent, and swap the two if the child's priority is higher; it is called recursively until the element reaches its correct spot. This runs in O(logn) time. When deleting, we employ an algorithm called Heapify-down, in which you delete the item at the given position, patch the hole with the last element, and push that element down by repeatedly swapping it with its higher-priority child. This is also recursive and executes in O(logn). The heap combines the benefits of a sorted array and a list.
Notes
Main goal in the book is to seek algorithms that improve qualitatively on brute-force search. We use polynomial-time solvability as the concrete formulation of this. Once we have an efficient algorithm to solve a problem, it can be possible to find improvements by being mindful of the implementation, and by possibly using more complex data structures. The priority queue is one of the most broadly useful sophisticated data structures.
The Problem
The priority queue is designed for applications in which elements have a priority value (key), and each time we select an element from the set, we take the element with the highest priority.
Priority Queue
Maintains a set of elements S, where each element v has an associated value key(v) that denotes the priority of the element v
Smaller keys represent higher priorities
Support addition and deletion of elements, and the selection of an element
Each of the processes has a priority but processes do not arrive in order of their priorities. We have a current set of active processes and we want to extract the one with the highest priority to run.
We want to be able to add, delete, and select the element with the smallest key in O(logn)
A sequence of O(n) priority queue operations can be used to sort a set of n numbers.
Create a priority queue and insert each number with its value as the key. Then extract the smallest number one by one until all numbers have been extracted.
With a priority queue that can perform insertions and extraction of the element with the minimum key in O(logn), we can sort n numbers in O(nlogn) time.
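A sketch of priority-queue sorting using Python's standard heapq module (my example; heapq maintains exactly this kind of min-heap): n insertions and n extractions at O(logn) each give O(nlogn).

    import heapq

    def pq_sort(nums):
        # n insertions at O(log n) each, then n extract-mins at O(log n)
        # each: O(n log n) in total.
        heap = []
        for x in nums:
            heapq.heappush(heap, x)    # insert with the number as its own key
        return [heapq.heappop(heap) for _ in range(len(heap))]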
A Data Structure for Implementing a Priority Queue
We will use a heap to implement the priority queue.
Using an unsorted list with a pointer to the minimum to implement the priority queue results in O(n) extraction of the minimum.
Using a sorted list with a pointer to the minimum to implement the priority queue results in O(n) addition of an element.
The heap combines the benefits of a sorted array and list. A heap is a balanced binary tree. The tree will have a root, and each node can have up to two children, a left and a right child.
Elements in a heap are said to be in heap order if the key of any element is at least as large as the key of the element at its parent node in the tree.
Heap order: for every element v at a node i, the element w at i's parent satisfies key(w) <= key(v)
We can use pointers: each node of the heap can keep the element it stores, its key, and three pointers pointing to the node's two children and its parent.
Or we can completely avoid using pointers by using an array H indexed i = 1,…,N if a bound N is known in advance. H[1] is the root and for any node at position i, the child nodes are at leftChild = 2i and rightChild = 2i+1
Implementing Heap Operations
The heap element with the smallest key is at the root, so it takes O(1) to identify the minimal element.
To add a new element we will use Heapify-up.
Heapify-Up
Heapify-up(H, i):
  If i > 1 then
    Let j = parent(i) = ⌊i/2⌋
    If key[H[i]] < key[H[j]] then
      Swap the array entries H[i] and H[j]
      Heapify-up(H, j)
    Endif
  Endif
Heapify-up fixes the heap property in O(logn) time, assuming the array H is almost a heap with the key of H[i] too small. Using Heapify-up we can insert a new element into a heap of n elements in O(logn) time.
By applying Heapify-up(H, j) recursively, we produce a heap as required. This process follows the tree-path from position i up to the root, so it takes O(logn) time.
ExtractMin(H) requires the definition of another, more general, operation Delete(H, i), which deletes the element at position i.
After deleting the element at position i there will only be n-1 elements and a hole at H[i], so we patch the hole by moving the element from position n into position i and then employ Heapify-down (or Heapify-up, if the moved element's key is too small) to restore the heap order.
Heapify-Down
Heapify-down(H, i):
  Let n = length(H)
  If 2i > n then
    Terminate with H unchanged
  Else if 2i < n then
    Let left = 2i and right = 2i+1
    Let j be the index among left and right that minimizes key[H[j]]
  Else if 2i = n then
    Let j = 2i
  Endif
  If key[H[j]] < key[H[i]] then
    Swap the array entries H[i] and H[j]
    Heapify-down(H, j)
  Endif
The process Heapify-down(H, i) fixes the heap property in O(logn) time, assuming that H is almost a heap with the key value of H[i] too big. Using Heapify-up or Heapify-down, we can delete an element from a heap of n elements in O(logn) time.
Swapping the elements w and v takes O(1) time
The algorithm repeatedly swaps the element originally at position i down along a tree-path, so after O(logn) iterations the process results in a heap.
Implementing Priority Queues with Heaps
StartHeap(N): O(N)
Insert(H, v): O(logn)
FindMin(H): O(1)
Delete(H, i): O(logn)
ExtractMin(H): a combination of FindMin and Delete; O(logn)
Delete(H, Position[v]): delete an element v; O(logn)
ChangeKey(H, v, a): changes the key value of element v to key(v) = a; O(logn)
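To tie the operations together, here is a minimal Python sketch of a heap-based priority queue of my own (keys only, and 0-indexed, so the index arithmetic differs from the book's 1-indexed version: the children of i are at 2i+1 and 2i+2, and parent(i) = (i-1)//2). It implements FindMin, Insert via Heapify-up, and ExtractMin via the patch-the-hole-then-Heapify-down idea from the section.

    class MinHeap:
        def __init__(self):
            self.h = []

        def find_min(self):                    # FindMin(H): O(1)
            return self.h[0]

        def insert(self, key):                 # Insert(H, v): O(log n)
            self.h.append(key)                 # add at the end of the array
            self._heapify_up(len(self.h) - 1)  # bubble it up to its spot

        def extract_min(self):                 # ExtractMin(H): O(log n)
            root, last = self.h[0], self.h.pop()
            if self.h:                         # patch the hole at the root with
                self.h[0] = last               # the last element, then fix order
                self._heapify_down(0)
            return root

        def _heapify_up(self, i):
            while i > 0:
                j = (i - 1) // 2               # parent of i
                if self.h[i] < self.h[j]:      # child's key is smaller: swap
                    self.h[i], self.h[j] = self.h[j], self.h[i]
                    i = j
                else:
                    break                      # heap order restored

        def _heapify_down(self, i):
            n = len(self.h)
            while True:
                left, right = 2 * i + 1, 2 * i + 2
                j = i                          # index of the smallest key among
                if left < n and self.h[left] < self.h[j]:    # i and its children
                    j = left
                if right < n and self.h[right] < self.h[j]:
                    j = right
                if j == i:
                    return                     # heap order restored
                self.h[i], self.h[j] = self.h[j], self.h[i]
                i = j

For instance, inserting 5, 2, and 8 and then calling extract_min() twice returns 2 and then 5.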
Questions
Should the heap method Delete(H, i) be used only in the context of ExtractMin(H) when a heap implements a priority queue? Wouldn't extracting any element besides the one with the highest priority violate the essence of a priority queue?
I'm pretty sure the answer is yes, but I'm just double-checking: can a priority queue have multiple elements with the same priority? If so, how does the heap-implemented queue decide which element to move to the root when the minimum is extracted and both children have the same priority? I'm pretty sure this would have to be dealt with in the part of Heapify-down that figures out the minimum of the two children of a deleted item.
After going to class and having Professor Sprinkle really break down the components of Heapify-down, what exactly was happening in that algorithm became way more clear. She really emphasized that the three different if/else-if statements were for the cases of how many children a node has: "If 2i > n" means the element has no children, "Else if 2i < n" means the element has two children, and "Else if 2i = n" means the element has one child. The reason for having Heapify-down also became more clear after going to class: it is there to fill the hole that deleting an element leaves while maintaining the heap order.
This section's readability was an 8 out of 10. The progression from the definitions of priority queues and heaps to the ultimate explanation was very precise and natural. The only reason that it wasn't a 10 is because it took me a while to fully grasp the concepts of Heapify-up and Heapify-down. Ultimately, the section used very clear language and precise explanations of priority queues and heaps, and it included a lot of definitions and explanations of properties of different data structures, which were super helpful in going back over my notes.