--- courses:cs211:winter2018:journals:beckg:ch2 [2018/01/15 23:50] – beckg
+++ courses:cs211:winter2018:journals:beckg:ch2 [2018/01/30 04:48] (current) – beckg
@@ Line 49: / Line 49: @@
     * For every //r > 1// and every //d > 0//, we have //n<sup>d</sup> = O(r<sup>n</sup>)//. That is, every exponential grows faster than //every// polynomial.
     * For the most part, this renders exponential-time algorithms useless in practice.
 This was a nice section, also a 9 out of 10. Very readable, and I particularly like how they are getting down into the mathematical foundations a bit more now.
+===== 2.3: Implementation of Stable Matching with Lists and Arrays =====
+Our previous analysis of the G-S Stable Matching Algorithm was at a relatively high level of abstraction--finding the //O(n<sup>2</sup>)// bound based simply on the maximum number of iterations of the ''While'' loop. To analyze the actual number of computational steps of any algorithm, we must consider its implementation--primarily, what data structures we will use and how they will interact. Here, we discuss an implementation of stable matching using only lists and arrays. First, the book weighs the pros and cons of the two and considers their advantages in certain situations (I am not copying those down here; at this point in the CSCI minor it's all ingrained pretty permanently).
+=== Implementing ===
+Importantly, because we know the ''While'' loop can run up to //n<sup>2</sup>// iterations in the worst case, for the implementation to run in //O(n<sup>2</sup>)// time we need each iteration to run in constant time. For simplicity, each man and each woman is associated with a number from //1// to //n// at the outset. We use two arrays to represent the preference lists, one for the men and one for the women. Note the space required for this is //O(n<sup>2</sup>)//, because each person must rank the //n// members of the opposite sex. Now, we need a way to do each of the following in constant time:
+  - Identify a free man (maintain set of free men as linked list; then access to first, deletions, and additions are all in constant time)
+  - For man //m//, identify the highest-ranked woman to whom he hasn't proposed (single array, with one index for each man; each value initialized as "1", then updated to always refer to the index of the woman to whom he should propose next)
+  - For woman //w//, check if she is engaged, and if so, who her current partner is (similar to above: single array with an index for each woman; values initialized to a //null// symbol, and then updated to be the index of the man to whom she is engaged)
+  - For woman //w//, see whether she prefers //m// or //m'// (at the start of the algorithm, create an //n x n// array which contains a row for each woman, whose position indices correspond to the IDs of the men; the value at each index is the numerical ranking that the woman gave to the corresponding man; then, given two men, we can access their respective rankings and compare them in constant time)
+The last 2-d array is essentially the preference lists of the women, but indexed by the men's IDs rather than by ranking. Note that we can create this at the outset with a single pass through the women's preference lists, which takes //O(n<sup>2</sup>)// time and so does not exceed the cost of the algorithm itself. With the above structures, we implement the algorithm in //O(n<sup>2</sup>)// time.
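+The four structures above can be sketched in Python as follows. This is only a sketch under a few assumptions of my own: IDs run from //0// to //n - 1// instead of //1// to //n//, a ''deque'' stands in for the linked list of free men, and the names ''stable_matching'', ''ranking'', ''next_choice'', and ''current'' are made up for illustration.

```python
from collections import deque


def stable_matching(men_prefs, women_prefs):
    """Return current[w] = man matched to woman w.

    men_prefs[m] and women_prefs[w] list partner IDs 0..n-1, best first.
    """
    n = len(men_prefs)

    # ranking[w][m] = position of man m in woman w's list.
    # Built in one O(n^2) pass over the women's preference lists.
    ranking = [[0] * n for _ in range(n)]
    for w in range(n):
        for rank, m in enumerate(women_prefs[w]):
            ranking[w][m] = rank

    free_men = deque(range(n))  # stands in for the linked list of free men
    next_choice = [0] * n       # index into men_prefs[m] of his next proposal
    current = [None] * n        # current[w]: man engaged to w, or None

    while free_men:
        m = free_men[0]                   # identify a free man: O(1)
        w = men_prefs[m][next_choice[m]]  # his best not-yet-proposed woman: O(1)
        next_choice[m] += 1
        if current[w] is None:            # w is free: she accepts
            current[w] = m
            free_men.popleft()
        elif ranking[w][m] < ranking[w][current[w]]:  # w prefers m: O(1)
            free_men.popleft()
            free_men.appendleft(current[w])  # her old partner becomes free
            current[w] = m
        # otherwise w rejects m, who stays at the front of the free list
    return current
```

+Every step inside the loop is a constant-time array or deque operation, so the total cost is the //O(n<sup>2</sup>)// setup plus at most //n<sup>2</sup>// constant-time iterations.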
+I found this section //very// readable, a 9/10. Their use of ''monospacing'' when referring to data structures and the values within them is incredibly helpful in understanding what they are trying to convey.
+===== 2.4: Survey of Common Running Times =====
+Over the course of analyzing algorithms, a number of running times come up quite frequently, such as //O(n), O(n<sup>2</sup>),// and //O(n log n)//. This is not a coincidence, as these tend to originate from very common algorithm designs and structures. We will go over these in the coming section. Additionally, note that a good starting point in much of algorithm analysis is the //natural search space//: the set of all possible solutions. Thus, the point of algorithm design is to find algorithms which perform better than a brute-force search through all of them.
+=== Linear Time ===
+Algorithms that run in //O(n)// time have a running time that is at most a constant factor times the input size. Many of these end up being "one-pass" algorithms, in which we access everything and compute what we need by passing through it once. An example is computing the maximum of an arbitrary list of numbers.
+However, as we learned in class, these can be more nuanced, as in the case of merging two sorted lists. Here, we always compare the smallest remaining item of each list and remove the smaller of the two, adding it to the end of our third, merged list. The important part of proving this linear is noting that on each comparison, one of the total //2n// items (supposing two lists of length //n//) is added to the new list. Therefore, there are at most //2n// iterations total.
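+The merge described above might look like the following in Python. This is a minimal sketch; the function name ''merge_sorted'' and the use of index pointers instead of physically removing items are my own choices.

```python
def merge_sorted(a, b):
    """Merge two already-sorted lists into one sorted list in O(len(a) + len(b))."""
    merged = []
    i = j = 0
    # Each iteration moves exactly one item into `merged`,
    # so the loop runs at most len(a) + len(b) times.
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            merged.append(a[i])
            i += 1
        else:
            merged.append(b[j])
            j += 1
    # One list is exhausted; the rest of the other is already in order.
    merged.extend(a[i:])
    merged.extend(b[j:])
    return merged
```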
+=== "Linearithmic" Time ===
+A running time of //O(n log n)// is also very common. Specifically, this is seen in //Mergesort// and other good sorting algorithms. The key property that gives rise to this time is splitting the input data in two, solving the problem on each half recursively, and then combining the two solutions in linear time. We also saw in class how this is the time needed to find the largest gap between consecutive time stamps, given unsorted time stamps as input. This costs //O(n log n)// to sort the times, followed by one linear pass through them to find the maximum interval. So, it ends up being //O((n log n) + n) = O(n log n)//.
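+Here is a short sketch of the time-stamp example, reading it as "find the largest gap between consecutive time stamps." The function name ''largest_gap'' and the choice to return 0 for fewer than two time stamps are my own assumptions.

```python
def largest_gap(timestamps):
    """Largest difference between consecutive time stamps once sorted."""
    ordered = sorted(timestamps)  # O(n log n)
    best = 0
    # One linear pass over adjacent pairs: O(n)
    for earlier, later in zip(ordered, ordered[1:]):
        best = max(best, later - earlier)
    return best
```

+The sort dominates, so the whole thing is //O(n log n)//.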
+=== Quadratic Time ===
+Quadratic time arises naturally from the //Closest-Pair Problem//--finding the closest pair among //n// input points in the plane. The brute-force approach simply searches the full natural search space: it looks at all pairs of input points (//n(n - 1)/2//, which is //O(n<sup>2</sup>)// pairs) and spends constant time per pair. We will later go over a more efficient algorithm for this problem in Ch. 5, though.
+Importantly, quadratic time arises from a pair of //nested loops//, each of which iterates //O(n)// times.
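+The brute-force version of closest pair is just those two nested loops. A minimal sketch, assuming points are ''(x, y)'' tuples; the name ''closest_pair'' and the return shape are made up here.

```python
import math


def closest_pair(points):
    """Return (distance, p, q) for the closest pair, checking every pair."""
    best = (math.inf, None, None)
    n = len(points)
    for i in range(n):              # outer loop: O(n)
        for j in range(i + 1, n):   # inner loop: O(n)
            d = math.dist(points[i], points[j])  # constant time per pair
            if d < best[0]:
                best = (d, points[i], points[j])
    return best
```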
+=== Cubic and Higher Order Polynomial Time ===
+We get higher orders of polynomial time from similar situations. For example, we arrive at //O(n<sup>3</sup>)// cubic time when, given //n// subsets of an //n//-element set, we check whether some pair of them is disjoint. This is simply a triple of nested loops (for each pair of sets, go through the elements of one and check whether each belongs to the other), and the same cubic time would be attained by searching a set of planar points for the closest triplet.
+Both examples quickly generalize to //O(n<sup>k</sup>)// time, as that occurs whenever we search over all subsets of size //k// (e.g. instead of looking for pairs or triplets, we look for "//k//-tuplets" from //n// planar points). A noteworthy example of this is the //Independent Set// problem: given a graph of //n// nodes, does there exist an independent set of //k// nodes? This requires simply searching all combinations of //k// nodes and seeing if they are independent (no two of them joined by an edge).
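+The disjoint-sets check can be sketched as three nested loops. This assumes each subset is stored as a Python ''set'' so that a membership test takes constant time on average; the name ''find_disjoint_pair'' is my own.

```python
def find_disjoint_pair(sets):
    """Return indices (i, j) of some disjoint pair of sets, or None if there is none."""
    for i in range(len(sets)):             # loop 1: choose the first set
        for j in range(i + 1, len(sets)):  # loop 2: choose the second set
            # loop 3: check every element of sets[i] for membership in sets[j]
            if all(p not in sets[j] for p in sets[i]):
                return (i, j)
    return None
```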
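+A brute-force check for an independent set of size //k// could look like this. It is a sketch under my own assumptions: nodes are numbered //0// to //n - 1//, edges are given as pairs, and the function names are invented for illustration.

```python
from itertools import combinations


def is_independent(nodes, edge_set):
    """True if no two of the given nodes are joined by an edge."""
    return all(
        (u, v) not in edge_set and (v, u) not in edge_set
        for u, v in combinations(nodes, 2)
    )


def has_independent_set(n, edges, k):
    """Try every size-k subset of the n nodes: O(n^k) subsets for constant k."""
    edge_set = set(edges)
    return any(
        is_independent(candidate, edge_set)
        for candidate in combinations(range(n), k)
    )
```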
+=== Beyond Polynomial Time ===
+The Independent Set problem gets much harder if we instead try to find an independent set of //maximum size//. In this case we arrive at exponential time: there are //2<sup>n</sup>// subsets to consider, and the book's bound is //O(n<sup>2</sup>2<sup>n</sup>)// once the cost of checking each subset is included. Exponential time like this arises naturally in cases where //all// subsets must be considered.
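+To make the "all subsets" search concrete, here is a brute-force sketch that enumerates every subset as a bitmask. The node numbering (//0// to //n - 1//) and the name ''max_independent_set'' are assumptions of mine, and this is only practical for very small //n//.

```python
from itertools import combinations


def max_independent_set(n, edges):
    """Return one largest independent set by trying all 2^n subsets of nodes."""
    edge_set = {frozenset(e) for e in edges}
    best = ()
    for mask in range(1 << n):  # 2^n subsets, one per bitmask
        nodes = [v for v in range(n) if (mask >> v) & 1]
        if len(nodes) <= len(best):
            continue  # cannot improve on the best found so far
        # O(n^2) pairs to check per subset
        if all(frozenset(pair) not in edge_set for pair in combinations(nodes, 2)):
            best = tuple(nodes)
    return best
```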
+Lastly, the fastest growing of those that we consider is the //O(n!)// runtime. This factorial time arises from the natural search space of matching //n// items with //n// others (e.g. the Stable Matching search space), as well as from all possible ways of arranging //n// items in order. An example of the latter is the famous Traveling Salesman problem.
+=== Sublinear Time ===
+Back to more comfortable running times: the most common sublinear time is the logarithmic //O(log n)// time, of which the most famous example is binary search. Such algorithms cannot even read all of the input, so they typically require some pre-existing knowledge of the data set, just as binary search requires that the data set be sorted ahead of time. So, algorithms like this can often require preprocessing.
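+For reference, a standard iterative binary search over a sorted Python list. The name ''binary_search'' and the convention of returning -1 when the target is missing are my own choices.

```python
def binary_search(sorted_items, target):
    """Return an index of target in sorted_items, or -1 if it is absent."""
    lo, hi = 0, len(sorted_items) - 1
    # Each iteration halves the remaining range, so there are O(log n) iterations.
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return mid
        elif sorted_items[mid] < target:
            lo = mid + 1   # target can only be in the upper half
        else:
            hi = mid - 1   # target can only be in the lower half
    return -1
```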
+This was a good section, 9/10. Definitely interesting and explains everything quite well.
+===== 2.5: More Complex Data Structure: Priority Queues =====
+As we know, a priority queue (PQ) is a data structure that maintains a set of elements, each of which has an associated numeric value, or key. The smaller this key, the higher the priority. A PQ supports insertion, deletion, and selection of the element with the smallest key. As is shown in the book, a sequence of //O(n)// priority queue operations can be used to sort a list of //n// numbers. So, if we can get each PQ operation to work in //O(log n)// time, then we will have an implementation of sorting in //O(n log n)// time.
+This PQ implementation takes the form of a //heap//. A heap is a balanced binary tree whose keys are in //heap order//, meaning that the values of each node's children are greater than or equal to the parent node's value. An easy implementation of this is an array in which we call the first index 1 (the root of the heap), and in which each parent node at index //i// has its left child at //2i// and its right child at //2i + 1//.
+To implement the heap operations, we define ''Heapify-up'' and ''Heapify-down''. The former operates on a given node by checking if its parent's value is greater than its own, and if so, swapping the two. This is then repeated recursively until the parent is no larger or the value makes its way to the heap root. ''Heapify-down'' works similarly: it checks if the given node's value is greater than either of its children's values, and if so, swaps it with the "smaller" of the children. Then this is done recursively down. Both of these guarantee that heap order and balance are kept upon insertion and deletion in //O(log n)// time due to a particular bit of cleverness: we only add to or remove from the very last spot in the array. So, to delete a value at an arbitrary spot in the heap, we swap that value with the value in the last spot, remove it from the end, and then let the value that we swapped into its location ''Heapify-up'' or ''Heapify-down'' to its rightful place.
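+An array-based heap along these lines might be sketched as below. Some assumptions of mine: the class and method names are invented, index 0 is left unused so the root sits at index 1 as in the text, and the heapify routines are written as loops rather than recursively (the effect is the same).

```python
class MinHeap:
    def __init__(self):
        self.a = [None]  # index 0 unused so that the root sits at index 1

    def __len__(self):
        return len(self.a) - 1

    def _heapify_up(self, i):
        # Swap with the parent (at i // 2) while the parent's key is larger.
        while i > 1 and self.a[i] < self.a[i // 2]:
            self.a[i], self.a[i // 2] = self.a[i // 2], self.a[i]
            i //= 2

    def _heapify_down(self, i):
        n = len(self)
        while 2 * i <= n:
            child = 2 * i  # left child; the right child is at 2i + 1
            if child + 1 <= n and self.a[child + 1] < self.a[child]:
                child += 1  # pick the smaller of the two children
            if self.a[i] <= self.a[child]:
                break
            self.a[i], self.a[child] = self.a[child], self.a[i]
            i = child

    def insert(self, key):
        self.a.append(key)            # add at the very last spot
        self._heapify_up(len(self))

    def find_min(self):
        return self.a[1]              # the root holds the smallest key

    def delete(self, i):
        if not 1 <= i <= len(self):
            raise IndexError("heap position out of range")
        removed = self.a[i]
        last = self.a.pop()           # remove from the very last spot
        if i <= len(self):            # position i still exists: fill the hole
            self.a[i] = last
            self._heapify_up(i)
            self._heapify_down(i)
        return removed

    def extract_min(self):
        return self.delete(1)
```

+Each heapify call follows a single root-to-leaf path, whose length is //O(log n)// because the tree is balanced.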
+=== Implementing PQs with Heaps ===
+Initializing the PQ array takes //O(N)// time, where //N// is the size of the array and therefore an upper bound on the number of elements the PQ can hold. Due to the above Heapify operations, we can then insert and delete any element in //O(log n)// time. Finding the minimum value takes //O(1)// time because it is simply the root of the heap--always the first element in our array. Extracting the minimum means finding it and then deleting it, and therefore takes //O(log n)// time.
+So, as mentioned in our goal, we have found an implementation of the PQ for which each operation is //O(log n)//. We can then use this to implement a sort in //O(n log n)// time.
+This was a good section which complemented class very well. They seemed to get a little more caught up in notation, which led to some confusion at times, but still an 8/10 in my book.
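+As a quick illustration of sorting with a PQ, the sketch below uses Python's standard ''heapq'' module (a binary min-heap over a plain list) rather than the book's pseudocode; the function name ''pq_sort'' is made up.

```python
import heapq


def pq_sort(numbers):
    """Sort by inserting everything into a heap, then extracting the minimum n times."""
    heap = []
    for x in numbers:
        heapq.heappush(heap, x)   # n inserts, O(log n) each
    # n extract-min operations, O(log n) each
    return [heapq.heappop(heap) for _ in range(len(heap))]
```

+That is //2n// PQ operations at //O(log n)// each, for //O(n log n)// overall.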