Chapter 2: Basics of Algorithm Analysis

This chapter serves as an introduction to understanding the resources required by algorithms, namely their running time and space usage, and how different algorithms compare to one another. It begins with basic algorithms and progresses to more complicated and useful ones that require sophisticated data structures.

2.1: Computational Tractability

This section started off by noting that the problems we study are fundamentally discrete. That is, they “involve an implicit search over a large set of combinatorial possibilities,” with the goal of finding a solution to the problem at hand efficiently. The rest of the section discussed the term “efficiency” and possible definitions for it, settling on the definition that an algorithm is efficient if it has a polynomial run time. Because this definition is an “absolute notion,” it gives us a strict test of whether or not a problem has an efficient solution. One of my main takeaways was that the best way to state a run time is as the worst-case run time. That way there are no disagreements over what an “average” input means, and since it is rare to actually achieve the best case, a best-case bound would tell us very little.

2.2: Asymptotic Order of Growth

Here the different notations for bounding a run time were stated. The upper bound is denoted by big-O notation: saying a run time is O(f(n)) means it grows no faster than f(n). Ω represents the lower bound, meaning the run time grows at least as fast as the function inside Ω(·). The tight bound is represented by Θ. I had not previously known what the tight bound was, but I know now that if a run time is both O(f(n)) and Ω(f(n)), then it is Θ(f(n)) as well. Also, though the base does not matter when stating logarithmic run times, it matters greatly for exponential run times.
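
To make the definitions concrete, here is a small worked example of my own (not from the book): take T(n) = 3n^2 + 2n. For all n ≥ 1 we have 3n^2 + 2n ≤ 5n^2, so T(n) is O(n^2); we also have 3n^2 + 2n ≥ n^2, so T(n) is Ω(n^2). Since the same function n^2 works as both the upper and the lower bound, T(n) is Θ(n^2).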

2.3: Implementing the Stable Matching Algorithm Using Lists and Arrays

In this section we analyzed the pros and cons of implementing the stable matching algorithm using lists and arrays. What we are looking for is the simplest implementation that uses a structure efficient enough to suit the needs at hand.

The simplest structure that we could use is the array, because it is very easy to implement in pretty much every coding language. The run times are desirable as well: we can access an element by index in constant time, search for an item in linear time, and search a sorted array in logarithmic time. The drawback to using an array, though, is that it doesn't change size automatically. To grow or shrink it, we would have to use insert and delete operations that would cost us significant time. One question I was wondering about is just how much time that would cost.
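
(It turns out the answer is O(n) per insertion or deletion, since every element after the affected position has to shift by one.) Here is a minimal Python sketch of my own, not from the book, showing where that cost comes from; the function name is hypothetical:

<code python>
def insert_sorted(arr, x):
    """Insert x into the sorted list arr, keeping it sorted.

    Finding the position takes O(log n) with binary search,
    but shifting the later elements over costs O(n).
    """
    lo, hi = 0, len(arr)
    while lo < hi:                 # binary search for the insertion point
        mid = (lo + hi) // 2
        if arr[mid] < x:
            lo = mid + 1
        else:
            hi = mid
    arr.insert(lo, x)              # shifts everything after index lo: O(n)

nums = [1, 3, 5, 9]
insert_sorted(nums, 4)             # nums is now [1, 3, 4, 5, 9]
</code>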

The more efficient and preferable structure to use in our case is a (doubly) linked list, because its pointers let us insert directly at the front or back of the list in constant time. On the other hand, we cannot find the ith element in constant time, but rather in O(i), because we have to walk through the list until the desired value is found.
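
A minimal sketch of that trade-off in Python (my own code, with hypothetical names): pushing onto either end is just pointer rewiring, while finding a value has to walk the list.

<code python>
class Node:
    def __init__(self, value):
        self.value = value
        self.prev = None
        self.next = None

class DoublyLinkedList:
    def __init__(self):
        self.head = None
        self.tail = None

    def push_front(self, value):   # O(1): rewire a couple of pointers
        node = Node(value)
        node.next = self.head
        if self.head:
            self.head.prev = node
        else:
            self.tail = node
        self.head = node

    def push_back(self, value):    # O(1) because we keep a tail pointer
        node = Node(value)
        node.prev = self.tail
        if self.tail:
            self.tail.next = node
        else:
            self.head = node
        self.tail = node

    def find(self, value):         # O(i): walk until we hit the value
        cur = self.head
        while cur and cur.value != value:
            cur = cur.next
        return cur
</code>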

I thought it was very interesting how they said that in order to implement the GS Algorithm in quadratic time, we must be able to implement each iteration of it in constant time. I also wonder why a stack would not be a good implementation for the preference lists, as I feel like being able to just pop off the next preferred person (for the men) would be relatively quick and simple.

The fact that we can make step 4 of the GS Algorithm (finding out whether w prefers m or m') work in constant time is pretty cool to me. At face value, I would have thought it impossible, but the n×n ranking array is a pretty useful structure, and this whole section cleared up a lot of what I was still not 100% sure about after the discussion in class. I would give it a 6 of 10 for readability, because it was interesting, although most of it was review and reinforcement of ideas previously discussed.
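
Here is a rough Python sketch of my own (the names and the tiny instance are made up, and this is not the book's pseudocode) of how that works: an O(n^2) preprocessing pass builds a ranking table from each woman's preference list, after which "does w prefer m or m'?" is a single lookup and comparison.

<code python>
# pref[w] is w's preference list of men, most preferred first
pref = {
    "w1": ["m2", "m1", "m3"],
    "w2": ["m1", "m3", "m2"],
    "w3": ["m3", "m2", "m1"],
}

# O(n^2) preprocessing: rank[w][m] = position of m in w's list
rank = {w: {m: i for i, m in enumerate(lst)} for w, lst in pref.items()}

def prefers(w, m, m_prime):
    """O(1): does w prefer m to m_prime?"""
    return rank[w][m] < rank[w][m_prime]

print(prefers("w1", "m2", "m1"))   # True: w1 ranks m2 first
</code>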

2.4: A Survey of Common Running Times

The goal of this section was to take a look at the most common run-time bounds and the ways that we interpret them. It started off by talking about the “search space” (the set of all possible solutions) and noted that one of the main goals of algorithm design is to find something more efficient than a brute-force enumeration of the search space. The best way to approach this is to come up with a bound on the run time you hope to achieve as well as on the worst-case run time, which would be that of the brute-force algorithm.

For O(n), we will usually process the input in a single pass, as the book tells us. The example provided is finding the maximum of a list of numbers: we iterate through the list, checking each number and storing the largest one seen so far until the list is finished. This gives us a linear run time. A slightly more advanced example of this run time is merging two sorted lists. Here we would place a pointer in each list at its smallest element. We would append the smaller of the two pointed-to elements to our final sorted list, moving that pointer forward each time an item is appended. Once one list is empty, we simply append the rest of the elements from the other list. Because we can compare two items and append an item in constant time, we are left with a linear run time, as it takes at most 2n steps to work through the items in both lists.
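
A sketch of that merge in Python (my own version of the idea):

<code python>
def merge(a, b):
    """Merge two sorted lists into one sorted list in O(len(a) + len(b))."""
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:           # append the smaller front element
            result.append(a[i])
            i += 1
        else:
            result.append(b[j])
            j += 1
    result.extend(a[i:])           # one of these is empty; append the rest
    result.extend(b[j:])
    return result

print(merge([1, 4, 7], [2, 3, 9]))  # [1, 2, 3, 4, 7, 9]
</code>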

Next is O(n log n) time. As the book describes it, this predominantly shows up in algorithms that split the input into two equal-sized pieces, solve each piece recursively, and then combine the two solutions in linear time. The area where we most commonly see this run time is sorting algorithms.
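
Mergesort is the canonical example. Here is a minimal sketch of my own in Python, leaning on the standard library's heapq.merge for the linear-time combine step:

<code python>
from heapq import merge  # linear-time merge of two sorted iterables

def merge_sort(xs):
    """Split in half, sort each half recursively, merge in linear time.
    The recurrence T(n) = 2T(n/2) + O(n) gives O(n log n)."""
    if len(xs) <= 1:
        return xs
    mid = len(xs) // 2
    return list(merge(merge_sort(xs[:mid]), merge_sort(xs[mid:])))

print(merge_sort([5, 2, 8, 1, 9, 3]))  # [1, 2, 3, 5, 8, 9]
</code>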

In quadratic run time, we usually have two sets of input, each of size n, and we examine every pair of elements between them, giving us O(n^2). Equivalently, we can have nested loops, where the outer loop runs a linear number of times and so does the inner loop, giving us O(n^2).
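
A classic instance of this shape is the brute-force closest pair of points; here is a sketch of my own in Python showing the two nested loops over the input:

<code python>
def closest_pair(points):
    """Brute-force closest pair: two nested loops over n points -> O(n^2)."""
    best = float("inf")
    best_pair = None
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            (x1, y1), (x2, y2) = points[i], points[j]
            d = (x1 - x2) ** 2 + (y1 - y2) ** 2   # squared distance suffices
            if d < best:
                best, best_pair = d, (points[i], points[j])
    return best_pair

print(closest_pair([(0, 0), (3, 4), (1, 1)]))  # ((0, 0), (1, 1))
</code>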

Cubic time is not as widely seen, but it is definitely an important run time. It can come from a triply nested loop, which is less common but can be practical.
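
One example with this shape, sketched below in my own Python (the inputs are made up), is asking whether any two of n sets are disjoint: a loop over pairs of sets with an inner scan over elements gives three nested levels.

<code python>
def some_pair_disjoint(sets):
    """For each of O(n^2) pairs of sets, scan one set's up-to-n elements
    against the other: three nested levels -> O(n^3)."""
    for i in range(len(sets)):
        for j in range(i + 1, len(sets)):
            if not any(x in sets[j] for x in sets[i]):  # inner scan
                return (i, j)      # indices of a disjoint pair
    return None

print(some_pair_disjoint([{1, 2}, {2, 3}, {4, 5}]))  # (0, 2)
</code>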

More general polynomial time, O(n^k), describes algorithms that work over all subsets of size k. The example given here is finding independent sets of size k in a graph: we would check each size-k subset of nodes to see whether it is independent. This gives us an O(k^2 * n^k) run time, but because k is a constant, we know that it will just be O(n^k).
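
A brute-force sketch of my own in Python (hypothetical inputs) that makes the O(k^2 * n^k) count visible: there are O(n^k) subsets of size k, and each independence check compares O(k^2) pairs. (Dropping the fixed k and trying every subset is exactly where the exponential brute force below comes from.)

<code python>
from itertools import combinations

def has_independent_set(nodes, edges, k):
    """Check every size-k subset of nodes: O(n^k) subsets, and each
    independence check tests O(k^2) pairs against the edge set."""
    edge_set = {frozenset(e) for e in edges}
    for subset in combinations(nodes, k):
        if all(frozenset((u, v)) not in edge_set
               for u, v in combinations(subset, 2)):
            return True
    return False

nodes = [1, 2, 3, 4]
edges = [(1, 2), (2, 3), (3, 4)]
print(has_independent_set(nodes, edges, 2))  # True, e.g. {1, 3}
</code>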

Looking beyond polynomial time, we have exponential and even factorial run times, but these are highly inefficient and should be avoided whenever possible. Exponential time usually shows up in search algorithms that consider all subsets, while factorial time shows up in algorithms that try to match n items with n other items, and also in algorithms over all orderings.

Lastly, we have sublinear time, which gives the best run times we as programmers could ask for, though they are difficult to achieve. The book describes these as most commonly being algorithms where items can be “queried,” so we don't have to iterate through every item in a set. The binary search algorithm is the most popular example of an O(log n) run time. We search for an item by probing the middle of a sorted array and deciding which way to go (left or right) depending on whether the probed item is smaller or larger than the desired item. In doing so, we ignore half of the remaining array at every step, which is how we achieve a sublinear run time.
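
A sketch of binary search in Python (my own version):

<code python>
def binary_search(arr, target):
    """Probe the middle of a sorted list; each probe discards half
    the remaining range, so at most O(log n) probes are needed."""
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            lo = mid + 1           # target can only be to the right
        else:
            hi = mid - 1           # target can only be to the left
    return -1                      # not found

print(binary_search([1, 3, 5, 9, 14], 9))  # 3
</code>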

After reading this section, I'd still like to see some other examples of O(n log n), O(n^k), O(2^n), and O(log n) algorithms, although they all make a lot more sense now, especially O(n^k) and O(log n).

Overall readability I'd say was a 9/10, because it was a great review of the simpler run times I already knew and gave good explanations of the ones I was less familiar with. I would have liked to see an example for each run time, though, so I could not give it the full 10/10.

2.5: A More Complex Data Structure: Priority Queues

The priority queue is a widely useful and efficient structure for managing values by importance. It uses keys, or priorities, to order values, with the most important values at the smallest keys. Because we want to be able to perform each operation in O(log n), for a total of O(n log n) over n elements, we will use a heap; with lists and arrays, some operation (insertion or extracting the minimum) would be forced to take O(n) time.

In a heap, we have a tree, stored in an array, where the smallest key is at the top (index 1) and every key below is greater than or equal to the key above it (the children of index i sit at indices 2i and 2i + 1). This makes it perfect to use for a priority queue, as the highest priority is always at the top of the heap and we won't need to iterate through the entire structure to find it.

The procedure described for adding elements to a heap is called “Heapify-up.” We start by appending the new element after the last current element in the heap. We then find its parent at index ⌊i/2⌋ (i divided by 2, rounded down) and check whether the new element is smaller than that parent. If so, we swap them and recursively run the procedure from the parent's position; if not, the element stays where it is, since the keys above it are smaller and the ones below are greater. This restores the heap property in O(log i) time.
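
Here is a sketch of Heapify-up in my own Python (I pad index 0 with a dummy so the book's 1-based parent arithmetic carries over directly, and I wrote it as a loop rather than recursion):

<code python>
def heapify_up(heap, i):
    """Swap heap[i] upward while it is smaller than its parent.
    heap[0] is a dummy so that parent(i) = i // 2, as in the book.
    Runs in O(log i)."""
    while i > 1 and heap[i] < heap[i // 2]:
        heap[i], heap[i // 2] = heap[i // 2], heap[i]
        i = i // 2

heap = [None, 2, 5, 4, 9]   # a valid min-heap, 1-indexed
heap.append(3)              # insert after the last element...
heapify_up(heap, len(heap) - 1)   # ...then fix: [None, 2, 3, 4, 9, 5]
</code>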

If we delete an element from a heap, the hole it leaves must be filled, typically by moving the heap's last element into it. If the moved element is too small for its new position, we can use Heapify-up to fix this. But if it is too large, we must use a new procedure, called “Heapify-down,” to fix the problem. Heapify-down first checks whether 2i (for input i) is greater than the length of the heap, and if so terminates, since i has no children. Otherwise it compares the children of i, swaps our current position with the smaller child, and continues recursively until the element is in the correct position. This runs in O(log n).
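
And a matching sketch of Heapify-down (my own code, same 1-indexed convention), together with the delete step it fixes up:

<code python>
def heapify_down(heap, i):
    """Swap heap[i] downward while it is larger than a child.
    Children sit at 2i and 2i + 1 (heap[0] is a dummy). O(log n)."""
    n = len(heap) - 1
    while 2 * i <= n:                       # stop once i has no children
        child = 2 * i
        if child + 1 <= n and heap[child + 1] < heap[child]:
            child += 1                      # pick the smaller child
        if heap[i] <= heap[child]:
            break                           # heap order restored
        heap[i], heap[child] = heap[child], heap[i]
        i = child

heap = [None, 2, 5, 4, 9, 6]    # delete the minimum (index 1):
heap[1] = heap.pop()            # move the last element into the hole
heapify_down(heap, 1)           # fix: [None, 4, 5, 6, 9]
</code>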

This section as a whole seemed more like review, as I felt I had a pretty good grasp on it after we covered it in class. I like the priority queue and its pairing with heaps, as they are quite efficient and seemingly simple to implement and navigate, serving their respective purposes well. I would give this section an 8/10, as it was mainly review but was quite thorough in its explanations.
