

Cory's Journal

Preface Summary

Algorithmic ideas are useful within and far beyond computer science. Not only do they have many applications; algorithms are also a great way to see the overarching goals of the field of computer science in general. Dealing with algorithms, there are two main components: getting to the mathematical center of the problem, and then identifying the appropriate algorithm design techniques. At their most effective, algorithmic ideas don't just provide solutions to well-posed problems; they allow us to express the underlying questions effectively. This section was a very simple introduction to the ideas we will be learning throughout the course. I do not have any questions on the introduction, but I do believe that I am beginning to get a feel for how important this course is to understanding major concepts of computer science (and also for being able to effectively tackle technical interviews). We are also supposed to rank each section that we summarize on a scale of 1 to 10, 10 being most interesting and readable. This section was roughly a 3/10. In the future, I will just put this number at the end of each summary with the understanding that I am ranking the section on this type of scale.

1.1 Summary

The Stable Matching Problem was the result of two mathematical economists, Gale and Shapley, trying to find a college admissions or job recruiting process that was self-enforcing. In a situation where matches repeatedly change based on individuals' preferences, a lot of chaos may be generated, and both sides of the matches can end up unhappy with the process and the outcome. The Stable Matching Problem is a method to reduce this chaos. To understand the essence of the problem, we eliminate real-life complications and just look at a more bare-bones version: Consider a set M = {m1, …, mn} of n men and a set W = {w1, …, wn} of n women. M x W denotes the set of all possible ordered pairs of the form (m, w) where m is a man in the set M and w is a woman in the set W. A matching S is a set of ordered pairs from M x W where each member of M and each member of W appears in at most one pair in S. A perfect matching S' is a matching with the property that each member of M and W appears in exactly one pair in S'. Adding preferences means that each m in set M ranks all women. m prefers w to w' if m ranks w higher than w' in his preference list. Likewise, all women rank all the men. Our goal is a perfect matching with no instabilities, where an instability is a pair (m, w) not in S such that m prefers w to his partner in S and w prefers m to her partner in S. S is stable if it is perfect and there is no instability with respect to S. It is possible for an instance to have more than one stable matching. To go through the algorithm, initially all m in M and w in W are single, or free. While there is a man m who is free and hasn't proposed to every woman, choose such a free man m and have him propose to the women in the order that they appear on his preference list. If w is free, (m, w) become engaged. Otherwise, w is currently engaged to some m'. If w prefers m' to m, m remains free and keeps searching; otherwise, w prefers m to m', so (m, w) become engaged and m' becomes free. This process is repeated until no man is free, or every free man has proposed to every woman. (7/10) My question for section 1.1: How useful would this solution be in a real-life situation with multiple real-life complications?
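To make the proposal loop concrete, here is a minimal Python sketch of it (my own code; the dictionary names like men_prefs are made up for illustration):

<code python>
# A minimal sketch of the Gale-Shapley proposal loop summarized above.
# men_prefs[m] lists m's women from most to least preferred;
# women_prefs[w] lists w's men from most to least preferred.
def stable_matching(men_prefs, women_prefs):
    # women_rank[w][m]: position of m on w's list (lower = more preferred)
    women_rank = {w: {m: r for r, m in enumerate(prefs)}
                  for w, prefs in women_prefs.items()}
    next_choice = {m: 0 for m in men_prefs}  # next woman m will propose to
    engaged_to = {}                          # current engagements: w -> m
    free_men = list(men_prefs)
    while free_men:
        m = free_men.pop()                   # pick any free man
        w = men_prefs[m][next_choice[m]]     # highest-ranked woman not yet asked
        next_choice[m] += 1
        if w not in engaged_to:              # w is free: (m, w) become engaged
            engaged_to[w] = m
        elif women_rank[w][m] < women_rank[w][engaged_to[w]]:
            free_men.append(engaged_to[w])   # w prefers m: m' becomes free
            engaged_to[w] = m
        else:
            free_men.append(m)               # w prefers m': m stays free
    return {m: w for w, m in engaged_to.items()}
</code>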

2.1 Summary

A major focus of the book is finding efficient algorithms for computational problems, which seems to encompass all of computer science. To approach this specifically, we first identify broad themes and design principles in the development of algorithms. A property shared by many of the problems we study is that they are discrete, meaning that they involve a search over many combinatorial possibilities with the goal of efficiently finding a solution that satisfies clearly delineated conditions. Algorithms must be efficient. In particular, the space or memory used by an algorithm is a big issue. The first proposed definition of efficiency is that an algorithm is efficient if, when implemented, it runs quickly on real input instances. This definition omits where and how well we implement an algorithm, and it does not account for how an algorithm may scale as the problem size grows. Analyzing the worst case of an algorithm has often been found to do a reasonable job of capturing its efficiency in practice, and if the worst case can be shown to be efficient, the algorithm is efficient. The search space of the Stable Matching Problem, for example, is very large, and a brute-force algorithm would plow through all perfect matchings. This makes brute force a very slow, inefficient solution. So another proposed definition of efficiency is that “an algorithm is efficient if it achieves qualitatively better worst-case performance, at an analytical level, than brute-force search.” An algorithm has polynomial running time if there are constants c > 0 and d > 0 such that, on every input instance of size N, its running time is bounded by cN^d primitive computational steps. The final proposed definition of efficiency is that an algorithm is efficient if it has a polynomial running time. This definition is also good because it is negatable: it makes it possible to express the notion that there is no efficient algorithm for a particular problem. (6/10) This leads to my question: Just how efficient is the Gale-Shapley solution to the Stable Matching Problem?

2.2 Summary

An algorithm's worst-case running time on inputs of size n grows at most at a rate proportional to some function f(n). Algorithms can be expressed in pseudo-code, which resembles a high-level programming language. We want to express the growth rate of running times and other functions so that constant factors and low-order terms are not counted; for example, with a running time of 1.62n^2 + 3.5n + 8, we'd say that it grows like n^2 up to constant factors.

Asymptotic Upper Bounds

Let T(n) be a function; say, the worst-case running time of an algorithm on inputs of size n. Given another function f(n), we say T(n) is O(f(n)) if T(n) is bounded above by a constant multiple of f(n), written T(n) = O(f(n)). More precisely, T(n) = O(f(n)) if there are constants c > 0 and n0 ≥ 0 so that for all n ≥ n0, we have T(n) ≤ c*f(n). In this case, T is asymptotically upper-bounded by f. A function can have many upper bounds.

Asymptotic Lower Bounds

To show that the upper bound is the best possible running time, we show that for arbitrarily large input sizes n, T(n) is at least a constant multiple of some specific function f(n). Thus, T(n) = Ω(f(n)) if there exist constants ε > 0 and n0 ≥ 0 so that for all n ≥ n0, we have T(n) ≥ ε*f(n). Here, T is asymptotically lower-bounded by f. ε must be fixed, independent of n.

Asymptotically Tight Bounds

If we can show that T(n) is both O(f(n)) and Ω(f(n)), then we've found the “right” bound: T(n) grows exactly like f(n) to within a constant factor. The notation to express this is as follows: if T(n) is O(f(n)) and Ω(f(n)), then T(n) is Θ(f(n)), and we say f(n) is an asymptotically tight bound for T(n). These are nice things to find on worst-case running times because they characterize the worst-case performance of algorithms precisely up to constant factors. One can obtain such bounds by closing the gap between upper and lower bounds.
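To check my understanding, here is a worked example (my own) using the running time from section 2.2: take T(n) = 1.62n^2 + 3.5n + 8. For all n ≥ 1 we have 3.5n ≤ 3.5n^2 and 8 ≤ 8n^2, so T(n) ≤ (1.62 + 3.5 + 8)n^2 = 13.12n^2; taking c = 13.12 and n0 = 1 shows T(n) = O(n^2). In the other direction, T(n) ≥ 1.62n^2 for all n ≥ 0, so taking ε = 1.62 shows T(n) = Ω(n^2). Together these give T(n) = Θ(n^2).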

Some properties that these asymptotic growth rates have are transitivity (if f = O(g) and g = O(h), then f = O(h)) and sums of functions (if f and g are both O(h), then f + g is also O(h)). Some functions that come up repeatedly in the analysis of algorithms are polynomials, logarithms, and exponentials.

Though I do not have a direct question for this section, I am still having difficulty fully grasping the concept of these bounds: how they are found, and what exactly we use them for. I will continue to study them in hopes of fully understanding them as soon as possible. (8/10)

2.3 Summary

For each algorithm, the choice of data structure is up to the algorithm's designer. An array has these properties:
- We can find the ith element in the array in O(1) (constant) time.
- Looking for an element e in an unsorted array takes O(n) (linear) time.
- If the elements in the array are sorted, we can find element e in O(log n) time with binary search.
Arrays are less good for maintaining a dynamic list of elements that are frequently added or removed. Linked lists are an alternative, where each element points to the next in the list. Each pointer must be maintained, and the list can be traversed in time proportional to its length. A doubly linked list can also be used, with each element pointing to its previous and next elements. Linked lists are not as useful for finding the ith element, because this takes O(i) time. Arrays and linked lists can be used to implement the Stable Matching algorithm; with these, the G-S algorithm can be implemented in O(n^2) time. Question: Is there an even faster way? Interest: 8/10
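As a sketch of the O(log n) lookup mentioned above, here is a small binary search in Python (my own example code):

<code python>
# Binary search: find element e in a sorted array in O(log n) time.
def binary_search(arr, e):
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2      # halve the search range each iteration
        if arr[mid] == e:
            return mid            # found e at index mid
        elif arr[mid] < e:
            lo = mid + 1          # e can only be in the upper half
        else:
            hi = mid - 1          # e can only be in the lower half
    return -1                     # e is not in the array
</code>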

2.4 Summary

Linear Time: Algorithms that run in O(n) time have running times at most a constant factor times the size of the input. This is generally achieved by processing the input in a single pass. O(n log n) is another common running time: it is the running time of any algorithm that splits its input into two equal-sized pieces, solves each piece recursively, and then combines the solutions in linear time. Another common running time is quadratic time, or O(n^2). This commonly arises from performing a search over all pairs of input items and spending constant time per pair, or from a pair of nested loops. Cubic (O(n^3)) time can be seen with more elaborate sets of nested loops. It is unclear whether the algorithms that lead to this are practical on inputs of reasonable size. Finally, we have O(n^k) running time when, for some constant k, we search over all subsets of size k. Question: Are there other running times between these that just aren't as common? Interest: 9/10
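Mergesort is the classic example of the split-recurse-combine pattern that gives O(n log n); here is a short sketch (my own code, not from the book):

<code python>
# Mergesort: split into halves, sort each recursively, merge in linear time.
def merge_sort(a):
    if len(a) <= 1:
        return a                  # base case: already sorted
    mid = len(a) // 2
    left = merge_sort(a[:mid])
    right = merge_sort(a[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):  # linear-time combine step
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]     # append whichever half remains
</code>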

2.5 Summary

Priority queues are useful when describing how to implement graph algorithms. A priority queue is designed for applications where elements have a priority value, or key, and each time an element is selected from the set, we want to take the one with highest priority. It is a data structure that maintains a set of elements where each element has an associated key that denotes the element's priority. A heap is used to implement a priority queue. The heap data structure combines the benefits of a sorted array and a list. We think of a heap as a balanced binary tree; the tree has a root, and each node can have up to 2 children. The keys in the binary tree are in heap order if the key of any element is at least as large as the key of its parent. Heapify-up is used to “fix” the tree if this condition is not met. Heapify-down is used if a key is too big, i.e., larger than the key of one of its children. Together, the Heapify-down and Heapify-up operations can efficiently implement a priority queue. Question: What are priority queues good for compared to arrays or linked lists? Interest: 8/10
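Here is a minimal array-based heap sketch (my own code) showing Heapify-up and Heapify-down for the heap order described above, where every key is at least as large as its parent's key:

<code python>
# Array-based min-heap: children of index i live at 2*i + 1 and 2*i + 2.
def heapify_up(h, i):
    while i > 0 and h[i] < h[(i - 1) // 2]:  # smaller than parent: swap up
        h[i], h[(i - 1) // 2] = h[(i - 1) // 2], h[i]
        i = (i - 1) // 2

def heapify_down(h, i):
    while True:
        smallest = i
        for c in (2 * i + 1, 2 * i + 2):     # compare with both children
            if c < len(h) and h[c] < h[smallest]:
                smallest = c
        if smallest == i:
            return                           # heap order restored
        h[i], h[smallest] = h[smallest], h[i]
        i = smallest

def insert(h, key):                          # add key, then fix order upward
    h.append(key)
    heapify_up(h, len(h) - 1)

def extract_min(h):                          # remove the highest-priority key
    top = h[0]
    h[0] = h[-1]                             # move last element to the root
    h.pop()
    if h:
        heapify_down(h, 0)
    return top
</code>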

3.1 Summary

A graph G is a way of encoding pairwise relationships among objects. In a graph we have a collection V of nodes and a collection E of edges, where each edge joins two nodes. If we want to represent asymmetric relationships between nodes (one node points to the other, but not vice versa), we use a directed graph. When we want to emphasize that a graph is not directed, we call it an undirected graph, but by default, “graph” refers to an undirected graph. Some useful examples of graphs are as follows: transportation networks (maps of routes between airports), communication networks (maps with computers as nodes and network connections as edges), information networks (the World Wide Web graph), social networks (nodes are people who interact, edges are their connections or interactions), and dependency networks (directed graphs that show interdependencies among a collection of objects, like a food web). A traversal of a sequence of nodes connected by edges is called a path. The path is simple if all its vertices are distinct. A cycle is a path in which all nodes but the last are distinct (if there are more than 2 nodes). An undirected graph is connected if there is a path to every node from every other node. The distance between two nodes is the minimum number of edges required to get from one node to the other. An undirected graph is a tree if it is connected and does not contain a cycle. In a tree rooted at a node r, a node x is the parent of node v if x directly precedes v on the path from r to v. Node w is the child of v if v is the parent of w, and w is a descendant of v if v lies on the path from the root r to w. Furthermore, a node y is a leaf if it has no descendants. Such rooted trees are fundamental in computer science because they lead to the notion of a hierarchy. Q: Are there other types of graphs that we will see? Interest: 6/10 (seen this several times before in other classes, but it's a nice review)

3.2 Summary

We want to find an efficient algorithm that determines s-t connectivity, or in other words, determines whether there is a path from node s to node t in a graph G. This could also be called the Maze-Solving Problem. The simplest algorithm for determining s-t connectivity is breadth-first search (BFS), which explores outward from s in all possible directions, adding nodes one layer at a time. This continues until no new nodes are found. Breadth-first search naturally produces a tree T rooted at s on the set of nodes reachable from s; a tree produced this way is called a breadth-first search tree. All nodes reachable from s are found by the BFS and are known as the connected component of G containing s. Another way to find the nodes reachable from s is the approach that might be taken if G were actually a maze: follow the first edge leading out of the current node, and continue until you reach a node whose neighbors have all already been explored. Then backtrack to a node with an unexplored neighbor and resume the search from there. This algorithm is called depth-first search (DFS) because it explores G by going as deeply as possible in one direction and only backtracks when necessary. The tree built this way is called a depth-first search tree of the connected component R. Both DFS and BFS build the connected component containing s and achieve qualitatively similar levels of efficiency. DFS visits the same set of nodes as BFS, but usually in a different order. For any two nodes s and t in a graph, their connected components are either identical or disjoint. Q: Is there another type of search not based on the connected components? Would that be much less efficient? Interest: 7/10
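Here is a small BFS sketch over an adjacency-list graph (my own code); swapping popleft() for pop() would turn the deque into a stack and give a depth-first exploration of the same component:

<code python>
from collections import deque

# BFS from s: explore outward one layer at a time, collecting the
# connected component of the graph (given as adjacency lists) containing s.
def bfs_component(adj, s):
    discovered = {s}
    queue = deque([s])
    while queue:
        u = queue.popleft()          # first-in, first-out order
        for v in adj[u]:
            if v not in discovered:  # v is newly discovered
                discovered.add(v)
                queue.append(v)
    return discovered
</code>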

3.3 Summary

BFS and DFS differ essentially in that one uses a queue and one uses a stack. There are two main ways to represent graphs: by an adjacency matrix and by an adjacency list representation. Adjacency lists are used mainly in our book. An adjacency matrix is an n x n matrix A where A[u, v] is equal to 1 if the graph contains the edge (u, v) and 0 otherwise. The matrix A is symmetric if the graph is undirected. This method allows us to check whether a given edge is in the graph in constant time. However, it takes up Θ(n²) space, and many algorithms need to examine all the edges incident to a given node, which is slow in this representation. An adjacency list works better for sparse graphs with many fewer than n² edges. This representation has a record for each node, containing a list of the nodes to which it has edges. An adjacency list only requires O(m+n) space. A queue is a set from which we extract elements in first-in, first-out (FIFO) order. A stack is a set from which we extract elements in last-in, first-out (LIFO) order. An adjacency list is ideal for implementing a breadth-first search, and can also be useful for depth-first searches. Discovering a node is the first time it is seen in an algorithm, while exploring a node means that all edges incident to it are scanned, resulting in the potential discovery of further nodes. There is a distinction between the two in both BFS and DFS. The overall running time of both with an adjacency list is just O(m + n). Q: When is an adjacency matrix most useful? Interest: 7/10
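To compare the two representations, here is a toy example (my own) building both for the same small undirected graph:

<code python>
# The same 4-node undirected graph in both representations.
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
n = 4

matrix = [[0] * n for _ in range(n)]   # adjacency matrix: Theta(n^2) space
adj = [[] for _ in range(n)]           # adjacency lists: O(m + n) space
for u, v in edges:
    matrix[u][v] = matrix[v][u] = 1    # symmetric, since the graph is undirected
    adj[u].append(v)
    adj[v].append(u)

print(matrix[1][3] == 1)  # constant-time edge test with the matrix -> False
print(3 in adj[1])        # edge test with the list scans 1's neighbors -> False
</code>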

3.4 Summary

A bipartite graph is a graph where the set of nodes V can be partitioned into sets X and Y such that every edge has one end in X and the other in Y. If a graph is bipartite, it cannot contain an odd cycle. In fact, odd cycles are the only obstacle to a graph being bipartite. Say that each edge must connect one “red” node and one “blue” node. Pick a node in the graph and color it red. Then all of its neighbors must be colored blue. It then follows that all the neighbors of THESE nodes must be red, their neighbors must be blue, and so on until the whole graph is colored. If this works, the graph is bipartite. If an edge ends up with both ends the same color, then there is nothing that can be done, and the graph is not bipartite. This coloring process is essentially identical to a BFS. Following this, with the graph G and the layers L1, L2, … produced by the BFS, exactly one of the following two things must hold: 1) There is no edge of G joining two nodes of the same layer. In this case G is a bipartite graph in which the nodes in even-numbered layers can be colored red and the nodes in odd-numbered layers can be colored blue. 2) There is an edge of G joining two nodes of the same layer. In this case, G contains an odd-length cycle, and so it cannot be bipartite.
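The red/blue coloring above translates almost directly into a BFS; here is a sketch (my own code) for a single connected component starting from node s:

<code python>
from collections import deque

# 2-coloring by BFS: returns False exactly when an edge joins two
# nodes of the same layer (i.e., the component contains an odd cycle).
def is_bipartite(adj, s):
    color = {s: 0}                       # 0 = red, 1 = blue
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in color:
                color[v] = 1 - color[u]  # neighbors get the opposite color
                queue.append(v)
            elif color[v] == color[u]:   # same color on both ends of an edge
                return False
    return True
</code>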

3.5 Summary

Many basic definitions and algorithms, like the adjacency list representation and the graph search algorithms (BFS and DFS), have applications to directed graphs, not just undirected ones. Directed graphs are represented with a version of the adjacency list that is used for undirected graphs, but now each node has two lists associated with it: one list of nodes to which it has edges, and another list of nodes from which it has edges. The graph search algorithms are almost the same in directed graphs as in undirected ones, and the running time is still O(m+n). A directed graph is strongly connected if, for every two nodes u and v, there is a path from u to v and a path from v to u. If there is a path from u to v and from v to u, the two nodes are mutually reachable. If u and v are mutually reachable, and v and w are mutually reachable, then u and w are mutually reachable. For any two nodes s and t in a directed graph, their strong components are either identical or disjoint.
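Here is a toy example (my own) of the two lists kept per node in a directed graph:

<code python>
# Directed graph with edges 0->1, 1->2, 2->0, 2->3.
out_edges = {0: [1], 1: [2], 2: [0, 3], 3: []}   # nodes to which u has edges

in_edges = {u: [] for u in out_edges}            # nodes from which u has edges
for u, vs in out_edges.items():
    for v in vs:
        in_edges[v].append(u)   # edge (u, v): record u on v's incoming list

# Nodes 0, 1, and 2 are mutually reachable here; node 3 is not.
</code>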

3.6 Summary

If an undirected graph has no cycles, each of its connected components is a tree. A directed graph with no cycles is called a directed acyclic graph, or a DAG. DAGs can be used to encode precedence relations or dependencies in a natural way. If G has a topological ordering, then G is a DAG. Likewise, if G is a DAG, then G has a topological ordering. In every DAG G, there is a node v with no incoming edges. Computing a topological ordering of a graph can be done (and is done in the book) in O(n^2) time.
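Here is a sketch (my own code) of the node-removal idea: repeatedly delete a node with no incoming edges and append it to the ordering. Maintaining the in-degree counts incrementally, as below, actually brings the running time down to O(m + n):

<code python>
# Topological ordering of a DAG given as out-edge adjacency lists.
def topological_order(out_edges):
    indegree = {v: 0 for v in out_edges}
    for u in out_edges:
        for v in out_edges[u]:
            indegree[v] += 1
    ready = [v for v, d in indegree.items() if d == 0]  # no incoming edges
    order = []
    while ready:
        u = ready.pop()             # "delete" a node with no incoming edges
        order.append(u)
        for v in out_edges[u]:
            indegree[v] -= 1        # remove u's outgoing edges
            if indegree[v] == 0:
                ready.append(v)
    return order                    # contains every node iff G is a DAG
</code>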
