

Anne Bailey's Journal

Preface:

Algorithms unravel the secrets of every science. They solve problems through mathematical cores and identify the nature of the problems they are solving. The point of algorithms is both to provide solutions and to expose the framework underlying the problems they are used to evaluate. We see this framework while trying to develop the most efficient algorithms possible. The point of studying algorithms is to better understand how to efficiently answer questions as they are posed. I would consider this preface a 3/10 in terms of how interesting I find it. (3/10)

1.1: Stable Matching

A self-enforcing match is one in which agents acting in their own self-interest could not break a match for mutual gain on each side. The idea developed by Gale and Shapley behind this algorithm is that individual self-interest will reinforce an algorithm to assure that any matches made will not be broken after all assignments have been doled out. It is easiest to approach a one-to-one match so that applicants and acceptances are equal and mutual. Though a matching is just an assignment of one set of outcomes, a perfect matching is one that follows this one-to-one rule. Preferences are the basis of the self-interest that influences stability; without preferences no one would care with whom they were matched, and matching could be a rolling process in pursuit of convenience and efficiency alone. To match stably, we start pairing people and give every person the chance to break their initial agreement in favor of another person – who may either accept or reject them. If both are better off in their own self-interest by switching, this eliminates an unstable match from the process. The algorithm to match men and women exclusively and heterosexually in marriage specifically favors men – they propose in order of preference while women do not. This process runs at worst for n^2 iterations, which is a fairly long running time. Since the algorithm will not terminate until all are matched and it checks at each point for stability, it accomplishes what it sets out to do by producing a stable set of couples. In the long run, all orders of running the algorithm will yield the same stable matching. (7/10)

2.1: Computational Tractability

Studying algorithms helps introduce students to the structure of computational theory; the problems are effectively discrete, at most countably infinite, since computers may not truly evaluate infinite data sets. That would require an infinite amount of memory with which the computer could store data to analyze its implications. For this reason, efficient algorithms are a cornerstone of effective computation: without an efficient algorithm, a computer will waste time and memory resources that could be used for other problems. Efficiency is defined by how poorly the worst case may go. Though average-case analysis may seem more realistic for evaluating running times and memory use, worst-case analysis is helpful for ruling out brute-force methods of computation. Finding an approach with a lower worst-case running time is often the better choice. “Proposed definition of efficiency (3): An algorithm is efficient if it has a polynomial running time.” This still does not assure that the most efficient algorithm will run better in every practical case. “There is no efficient algorithm for a particular problem.” My question here is how we can know that no algorithm for a problem is efficient – does a lack of proof suggest this conclusion, or is it purely subjective? (9/10)

2.2: Asymptotic Order of Growth

Algorithms are mainly expressed in terms of pseudo-code and their possible running times. The growth rate of running times as inputs are added is important because, over time, an algorithm that slows significantly with added inputs will not use computer resources effectively. With f(n) being a mathematical expression for a running time, O(f(n)) is an asymptotic upper bound, Omega(f(n)) is an asymptotic lower bound, and if a running time is both O(f(n)) and Omega(f(n)), it is Theta(f(n)), the asymptotically tight bound that shows the actual growth rate of the algorithm. This is because the running time is eventually at most a constant times the upper bound and at least a constant times the lower bound; due to this inequality, if the two meet in the middle, the running time must be the point at which they meet. These tight bounds may come from limits as n goes toward infinity, or they may be calculated from ratios with other functions whose tight bounds are known. For polynomial-time algorithms, there is a constant d such that O(n^d) is the asymptotic running time regardless of the size of the inputs. The same logic applies to logarithms, which are O(n^x) for every x > 0, and to exponential algorithms, which are O(r^n). (8/10)
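A quick numeric check I like for the tight-bound idea: the ratio of a polynomial running time to its leading term settles toward the leading constant, which is why the lower-order terms drop out of the Theta bound. The function below is a made-up running time, just for illustration.

```python
def f(n):
    # A hypothetical running time with lower-order terms.
    return 3 * n**2 + 10 * n + 5

# The ratio f(n) / n^2 approaches the leading constant 3,
# which is the limit argument for f(n) = Theta(n^2).
for n in [10, 1000, 100000]:
    print(n, f(n) / n**2)
```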

2.3: Implementing the Stable Matching Algorithm Using Lists and Arrays

We use algorithms to look at specific problems in which we use different data structures. Our prior Stable Matching algorithm ran, at most, in O(n^2), and we can achieve this using lists and arrays. Choosing a data structure is, however, subjective: we weigh the running-time trade-offs in order to select the best data structure for the problem at hand. Arrays are not great for keeping lists that change constantly, since they cannot grow and their elements are stuck in specific positions. Linked lists are better for lists that change frequently, since it is possible to insert and delete elements in sorted order with shorter running times. We must iterate through a linked list to reach its ith element, though, which is one such trade-off: access of the ith element of an array is O(1), whereas in the list implementation it is O(i). Translating between the two forms takes O(n) running time. To implement Stable Matching with arrays and linked lists, we should use arrays for preference lists, with indices to refer to each man and woman. In this style, we assign a specific data structure to every pertinent piece of information. Whichever structure requires the highest runtime to access its data will combine with the number of run-throughs to determine the worst-case running time. These steps combine to make the G-S algorithm run in O(n^2) at worst with implementations of arrays and doubly linked lists. (4/10 – We already knew this – No questions)
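The O(1) versus O(i) access trade-off can be seen with a minimal hand-rolled linked list next to a plain array. This is only a sketch of the contrast, with names of my own choosing; real Python lists are arrays under the hood.

```python
class Node:
    """One cell of a singly linked list."""
    def __init__(self, value, nxt=None):
        self.value = value
        self.next = nxt

def list_get(head, i):
    """Access the i-th element of a linked list: O(i) pointer hops."""
    node = head
    for _ in range(i):
        node = node.next
    return node.value

# Build the linked list 0 -> 1 -> 2 -> 3 by prepending.
head = None
for v in [3, 2, 1, 0]:
    head = Node(v, head)

arr = [0, 1, 2, 3]
# The array index is a single O(1) lookup; the list walks i links to get there.
assert arr[2] == list_get(head, 2) == 2
```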

2.4: A Survey of Common Running Times

To build an algorithm, we should know what running time the algorithm will require, both by nature of the amount of information, and by how long it will take to use this information to attain an answer. O(n) is normally the size of the input times constant access, so it is likely that this type of algorithm will have to look through all data once. This can occur in a maximum-finding program that goes through elements one by one and records the so-far maximum at every data point. This order also applies to merging two sorted lists, since it involves simply going through each list and adding the next smallest item until each list is empty. O(n log n) applies to algorithms that split the input in half, solve each half recursively, and combine the two solutions in O(n) time. Sorting can be solved this way (Mergesort) by dividing inputs into two pieces, solving each, and then merging the two sorted lists. This running time is also common where an algorithm's longest step is sorting. O(n^2) will apply to most algorithms where you must run through n inputs n times; it is often a brute-force response that comes from nested loops. Cubic time will describe a triple nested loop of any type where there are three data sets to compare. O(n^k) follows the above pattern to describe any search over all subsets of size k drawn from n items. It is not a quick running time, but it can be necessary in a brute-force approach. Above and beyond polynomial time will often be O(2^n) or O(n!), which come from comparing subsets or orderings of different sizes. n! grows more quickly than 2^n, but both grow so quickly that there will often be a better running time solution available. Finally, some cases will be larger than constant but smaller than linear; binary search, for example, runs in O(log n) this way.
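The O(n) merge and the O(n log n) divide-and-conquer pattern fit in a few lines, so here is a sketch of both (my own code, following the Mergesort description above):

```python
def merge(a, b):
    """Merge two sorted lists in O(len(a) + len(b)): repeatedly take the smaller head."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i]); i += 1
        else:
            out.append(b[j]); j += 1
    out.extend(a[i:])   # one of these two tails is empty
    out.extend(b[j:])
    return out

def mergesort(xs):
    """Split in half, solve each half recursively, combine in O(n): total O(n log n)."""
    if len(xs) <= 1:
        return xs
    mid = len(xs) // 2
    return merge(mergesort(xs[:mid]), mergesort(xs[mid:]))
```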

My question about this section is how we can know with certainty that a proposed solution to an algorithm is efficient. The section provided a wealth of examples without sketches, and it felt like a bit much information in one place. (7/10)

2.5: A More Complex Data Structure: Priority Queues

Our approach to algorithms is to find a brute-force solution, then to improve upon it as best we can; a good goal is polynomial time or better. The structure to evaluate in this section is the priority queue. For this, each element has a priority value (key), and we always want to select the element with the highest priority from the set. Items won't come in in order of priority, so each process must be assigned a position in the data structure on addition, independently of the priority it has been assigned. “A sequence of O(n) priority queue operations can be used to sort a set of n numbers,” since extracting the smallest number may be executed n times. Therefore, if items can be put in and taken out in O(log n), the queue will sort n numbers in O(n log n) running time. The data structure that yields this result is a heap, since it can be maintained as a balanced binary tree of height at most log n. “Heap order: For every element v, at a node i, the element w at i's parent satisfies key(w) ≤ key(v).” An array of n nodes can maintain this heap with parent nodes at i and children at 2i and 2i+1. When we add or move items in the heap, we use Heapify-up and Heapify-down to move items around and make sure that they are in the proper nodes, such that their priorities are correct in relation to the other nodes' keys. A table of running times for implementing priority queues with heaps is available on page 64. (7/10 – No questions since we already studied this in class)
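Here is a sketch of the 1-indexed array heap with Heapify-up and Heapify-down as described above. This is my own minimal version (Python's heapq does the same job in practice); slot 0 of the array is left unused so that the parent/child arithmetic matches the text.

```python
class MinHeap:
    """1-indexed array heap: parent of node i is i // 2, children are 2i and 2i + 1."""

    def __init__(self):
        self.a = [None]                 # dummy slot 0 keeps the index arithmetic clean

    def insert(self, key):
        self.a.append(key)
        i = len(self.a) - 1
        while i > 1 and self.a[i] < self.a[i // 2]:   # Heapify-up: swap with parent
            self.a[i], self.a[i // 2] = self.a[i // 2], self.a[i]
            i //= 2

    def extract_min(self):
        a = self.a
        top, a[1] = a[1], a[-1]         # move the last leaf to the root
        a.pop()
        i = 1
        while 2 * i < len(a):           # Heapify-down: swap with the smaller child
            child = 2 * i
            if child + 1 < len(a) and a[child + 1] < a[child]:
                child += 1
            if a[i] <= a[child]:
                break
            a[i], a[child] = a[child], a[i]
            i = child
        return top
```

Both operations walk one root-to-leaf path, so each is O(log n), and n inserts plus n extractions sort n numbers in O(n log n).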

3.1: Basic Definitions and Applications

A graph G is a way of showing pairwise relationships, called edges, amongst nodes. Edges can be undirected and display a mutual relationship, or edges may be directed and convey an asymmetric relationship. An edge between two nodes u and v in an undirected graph may be shown as e = {u,v}, where u and v have no particular order. For a directed graph, however, edges are represented as e = (u,v), which shows u and v as an ordered pair in which u directs toward v. These graphs model communication and transportation networks well in both directed and undirected forms, depending on the nature of the relationship tracked. A path in an undirected graph is a sequence of nodes such that each consecutive pair is joined by an edge in the graph. Simple paths hit any given vertex at most once, and a cycle is a path in which the sequence of nodes visited returns to its start – at least one node will be in the path more than once. Undirected graphs are connected if there is always some path between u and v, even if it reaches through other nodes. Strong connectivity, for directed graphs, means that every node has a directed path out to every other node. We can use connectivity to see how objects in a graph are related, and through how many steps we must go to connect any items. Trees are connected undirected graphs that do not contain a cycle. By picking a root in the tree below which to place all other nodes, we can view the tree as a hierarchical structure. With such hierarchy, the direction of relationships gains a new meaning, such as how one node above another may be its parent, and the node below a child thereof. Trees of n nodes have n-1 edges. (7/10 – very helpful, very readable)

3.2: Graph Connectivity and Graph Traversal

Within graphs, we will want to be able to evaluate the connectivity of two elements without depending on visual representation, so we must evaluate the breadth-first and depth-first searches, as they allow us to answer questions about such connectivity. Breadth-first search selects the initial node as a root, then works out from the initial node in layers by how closely connected nodes are to the root (how many edges it takes to get from the initial node to the one for which we are evaluating connectivity). The strength of this search is that it finds the shortest path from one node to another: in whatever layer the target node occurs, it is connected as closely as possible to the original node. At worst, this algorithm runs in O(n+m), since for each node in the connected graph, the algorithm will try to add a new element to a new layer along every single edge (n = number of nodes, m = number of edges). Depth-first search explores single paths at a time instead of multiple layers of connections at once. It will look at long and twisting paths before coming closer to the root node in question. This search also runs at worst on a connected graph in O(n+m), since for every edge the algorithm looks at what is on the other end of that edge. When no more unexplored edges exist from a node, the algorithm stops looking for a child, goes back up to a previous node, and checks for an unexplored edge off of that node. Every single node will be checked at least once, and every edge will eventually be traversed. I'm not entirely certain where the O(n+m) comes from in DFS; I completely understand that each edge traversal adds to the time, but it seems that each node would be visited as such just by following the edges. (7/10)
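A short BFS sketch in Python (my own code, with the graph as an adjacency-list dictionary) shows where the layers and the O(n+m) bound come from: each node enters the queue once, and each edge is looked at a constant number of times.

```python
from collections import deque

def bfs_layers(adj, s):
    """BFS from s; returns each reachable node's layer number,
    which is its shortest-path distance from s."""
    layer = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()             # each node is dequeued at most once: O(n)
        for v in adj[u]:                # each edge is scanned a constant number of times: O(m)
            if v not in layer:
                layer[v] = layer[u] + 1
                queue.append(v)
    return layer
```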

3.3: Implementing Graph Traversal Using Queues and Stacks

Graphs can be represented as adjacency matrices or adjacency lists, which record between which nodes an edge exists. The number n of vertices and m of edges are the parameters on which a graph is built. O(m+n), as stated in my entry on 3.2, is a good runtime for searches through different graphs; it functions as linear time, and in connected graphs, O(m+n) = O(m), since m is greater than or equal to n-1. This answers my only lingering question from the previous section. Adjacency matrices take up Theta(n^2) space, so if the graph is sparse, there may be better ways to condense this data. Adjacency lists take only O(m+n) space: each edge contributes twice to the sum, so with m edges the total number of entries to store is 2m, which is O(m). For DFS and BFS, the order in which elements are taken and considered is critical. Queues are first-in, first-out, while stacks are last-in, first-out data structures. DFS works well with a stack, since we are constantly popping elements back off of our list to retrace steps through nodes and take paths that we could have taken before. BFS runs in O(m+n) on a queue implementation, since the time spent setting up lists and managing an array recording which nodes have already been discovered is O(n) and the time on edges O(m): this therefore runs in O(m+n). Finally, we can use these searches to find all of the connected components in a graph by re-running them while adding newly discovered nodes to an array or list. (9/10 – I found this section very helpful in clearing up my confusions on BFS and DFS and their implementations)
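The connected-components idea in the last sentence can be sketched as restarting a search from every node not yet discovered. This is my own version; it uses an explicit stack, so the inner traversal is DFS order, though a queue would work the same way.

```python
def connected_components(adj):
    """All connected components of an undirected graph (adjacency-list dict):
    restart a traversal from each undiscovered node."""
    seen, components = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = [], [s]           # a stack gives DFS; a deque would give BFS
        seen.add(s)
        while stack:
            u = stack.pop()
            comp.append(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        components.append(comp)
    return components
```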

3.4: Testing Bipartiteness: An Application of Breadth-First Search

We are testing whether nodes fit cleanly into two sets in which nodes from the same set are never connected: every edge has a node in each of the sets. Pick a node s, color it red, declare all of its neighbors blue, their neighbors red, and so on. If we can go through the whole graph using this method without ever being forced to give a node both colors, we have a bipartite graph. This approach is effectively a breadth-first search, since we look in layers at which nodes are connected. Further, in this algorithm we color nodes based on which layer they are in, which is a simple constant action. Then, we run through the edges and see if any edge has both of its nodes in a single color class. This process has the same runtime as a basic breadth-first search, O(m+n) with n nodes and m edges in the graph. We can use this algorithm to split data into two distinct groups. I have no questions about this problem; it is a very straightforward algorithm. (5/10 – This was a very readable section, but we already learned this well.)
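The layer-coloring test fits directly on top of BFS. A sketch of my own, for a connected graph given as an adjacency-list dictionary:

```python
from collections import deque

def is_bipartite(adj, s):
    """2-color a connected graph by BFS layer parity; it is bipartite
    iff no edge ends up with both endpoints the same color."""
    color = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in color:
                color[v] = 1 - color[u]   # next layer gets the opposite color
                queue.append(v)
            elif color[v] == color[u]:
                return False              # an edge inside one layer: an odd cycle exists
    return True
```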

3.5: Connectivity in Directed Graphs

Edge (u,v) in a directed graph goes from node u to node v; it is the same type of relationship as following hyperlinks on a Wikipedia page. Once you follow a hyperlink in the middle of a Wikipedia article, there is no guarantee of a link taking you back along the same path to the page you were just reading. We use adjacency lists and matrices to keep track of relationships in directed graphs. One list keeps track of nodes that lead to a target node, and another list keeps track of nodes to which the target node leads with an edge originating at the target. Breadth-first search is similar to how it worked in an undirected graph, but instead we only explore edges that lead out from a node. The nodes in a layer j appear by the shortest path from s to that node – the shortest such path has j edges. The worst case is that this breadth-first search runs the same as the original, O(m+n). Strong connectivity means that each node leads to all other nodes and all other nodes lead back to it. Testing this takes only linear time: we run a BFS in the normal graph G, then in the reverse graph Grev. Failing to reach every node in either of the searches means there is no strong connectivity. Running the search twice keeps the original runtime of O(m+n). A strong component is the set of nodes with which a node s is mutually reachable. I found this section very helpful in understanding directed graphs and how to traverse them. (8/10)
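The two-search test for strong connectivity is short enough to sketch. This is my own code: build the reverse graph, search forward from some node s in G and in Grev, and check that both searches reach everything.

```python
def reachable(adj, s):
    """Set of nodes reachable from s following edges forward."""
    seen, stack = {s}, [s]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def strongly_connected(adj):
    """G is strongly connected iff some node s reaches all nodes
    both in G and in the reverse graph Grev."""
    rev = {u: [] for u in adj}
    for u in adj:
        for v in adj[u]:
            rev[v].append(u)              # flip every edge
    s = next(iter(adj))
    n = len(adj)
    return len(reachable(adj, s)) == n and len(reachable(rev, s)) == n
```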

3.6: Directed Acyclic Graphs and Topological Ordering

An undirected graph with no cycles is a tree in each connected component, but a directed graph can have a complex structure while containing no directed cycles. Such a graph is called a DAG. DAGs are very useful for prerequisites and other dependency relationships. A topological ordering of a DAG is any order of the nodes such that a node always comes before any node to which it leads. If such an ordering exists, the graph is a DAG by implication. To create a topological ordering we use an algorithm that evaluates these relationships. First, we need a root node with no incoming edges, which exists in every DAG. Then, by induction, we can create a topological ordering with O(n^2) run time in total. An O(m+n) algorithm exists that repeatedly deletes nodes that have no incoming edges left from unexplored nodes. I found this piece well written and helpful. (7/10)
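The O(m+n) deletion idea can be sketched with an indegree counter instead of actually deleting nodes. This is my own version of that algorithm, with the DAG as an adjacency-list dictionary:

```python
from collections import deque

def topological_order(adj):
    """O(n + m) topological ordering: repeatedly take a node with no
    remaining incoming edges. Returns None if the graph has a cycle."""
    indegree = {u: 0 for u in adj}
    for u in adj:
        for v in adj[u]:
            indegree[v] += 1
    queue = deque(u for u in adj if indegree[u] == 0)   # the "root" nodes
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in adj[u]:
            indegree[v] -= 1            # "delete" the edge (u, v)
            if indegree[v] == 0:        # all of v's prerequisites are now placed
                queue.append(v)
    return order if len(order) == len(adj) else None
```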

4 Preface: Greedy Algorithms

Greedy algorithms build up solutions slowly by picking pieces of a solution that are locally best by a certain optimization criterion. A greedy algorithm follows whichever choice looks most optimal at each step to build up the best possible solution. When this works, our greedy algorithm produces an optimal solution – one on which we cannot improve. We use such algorithms for finding shortest paths, and for a multitude of other applications that require we choose the best possible combination of solution outputs. (1/10)

4.1: Interval Scheduling: The Greedy Algorithm Stays Ahead

Interval scheduling, which was first addressed in the first chapter, looks to select a group of compatible (non-overlapping) tasks of the largest possible size. Starting with the earliest start time may not be optimal, since accepting a long first task can reject many tasks that overlap it. Choosing the smallest interval of time doesn't guarantee that conflicts will be minimized. Choosing the task with the fewest conflicts also fails: the best choice may be the one with the most conflicts when it overlaps many very short tasks that all happen at the same time. We eventually realize that the earliest finish time is the right way to pick an optimal collection of tasks. We run through and collect the next-closest finishing time that doesn't conflict with any prior selections. This always forms an optimal solution, since at each step the greedy choice stays ahead, and any other choices could not improve on the maximum size of the solution already found. The runtime can be O(n log n), since we sort the n requests by finishing time and then find an optimal solution with our greedy algorithm in O(n). Similarly, an algorithm that schedules every interval using as few resources as possible looks for the maximum depth of overlapping intervals and labels each interval; it always uses the minimum number of labels. (9/10)
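The earliest-finish-time rule is a one-pass loop after sorting. Here is a sketch of my own, with intervals as (start, finish) pairs:

```python
def interval_schedule(intervals):
    """Greedy by earliest finish time; returns a maximum-size compatible set."""
    chosen, last_finish = [], float('-inf')
    for start, finish in sorted(intervals, key=lambda iv: iv[1]):  # O(n log n) sort
        if start >= last_finish:        # compatible with everything chosen so far
            chosen.append((start, finish))
            last_finish = finish
    return chosen
```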

4.2 Scheduling to Minimize Lateness: An Exchange Argument

We want to use a resource to process n requests, each using the resource for a certain interval of time. Each request i hopes to be done by deadline d_i and has length t_i. If we start request i at time s(i), it finishes at f(i) = s(i) + t_i, and its lateness is the amount by which f(i) exceeds d_i (zero if it finishes on time). We choose an order in which to finish the requests and try to minimize the maximum lateness over all of them. Our greedy algorithm goes in order of earliest deadline first. This algorithm leaves no gaps of unused time, and all solutions it produces have the same maximum lateness L. An inversion is a pair of jobs in which one with a later deadline is scheduled before one with an earlier deadline. Optimal solutions produced by our specific algorithm have no inversions, and swapping the jobs of an inversion in a schedule will not increase its maximum lateness. I finally understand what the deal with inversions is now. (7/10)
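Earliest-deadline-first reduces to a sort plus a running clock. A sketch of my own, with jobs as (length t_i, deadline d_i) pairs:

```python
def max_lateness(jobs):
    """Schedule jobs earliest-deadline-first with no idle gaps;
    return the maximum lateness max(0, f(i) - d_i) over all jobs."""
    time, worst = 0, 0
    for t, d in sorted(jobs, key=lambda job: job[1]):  # sort by deadline d_i
        time += t                       # job runs from time - t to time
        worst = max(worst, time - d)    # lateness is 0 if the job finishes early
    return worst
```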

4.4 Shortest Paths in a Graph

This is the application of greedy design to finding shortest paths in graphs. We have a start node s and want to find the shortest path from s to every other node. We go through each incident node and build up shortest paths, then go through the incident nodes and those edges not yet explored from them. If we reach an already-visited node through a path shorter than the one already recorded, we re-record the path length. Newly explored nodes get the length of the shortest path to the prior node plus the length of the edge from that node to itself. This is Dijkstra's algorithm, and it works so long as edges do not have negative lengths. It runs in a total of O(mn) time naively, since for each of the n nodes it may run through each attached edge to determine the minimum path length. With heap-based priority queues this can run in O(m log n). I'm not very clear on the implementation run times given for this problem. (7/10 – didn't enjoy, but learned a lot from it)
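A heap-based sketch of my own, using Python's heapq as the priority queue; adj[u] holds (neighbor, edge length) pairs. Every edge can push at most one heap entry, and each push or pop costs O(log n), which is where the O(m log n) bound comes from.

```python
import heapq

def dijkstra(adj, s):
    """Heap-based Dijkstra's algorithm; returns shortest distances from s.
    Assumes all edge lengths are non-negative."""
    dist = {s: 0}
    heap = [(0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float('inf')):
            continue                    # stale entry: u was already settled by a shorter path
        for v, length in adj[u]:
            nd = d + length
            if nd < dist.get(v, float('inf')):
                dist[v] = nd            # re-record the shorter path length
                heapq.heappush(heap, (nd, v))
    return dist
```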

4.5 The Minimum Spanning Tree Problem

Problem: we want to build a communication network over a graph of n nodes connected by individual relationships, where there is a cost for each edge between two nodes. The minimum-cost solution to our problem will be a tree over all the nodes in V, our connected graph. A solution that is a tree is a spanning tree, and a lowest-cost solution (with the lowest sum of the weights of all its edges) is a minimum spanning tree.

Kruskal’s: Start with no edges, build a spanning tree by inserting edges by increasing weights as long as they do not create a cycle. Produces a minimum spanning tree.

Prim’s: Start with root node s and build a tree out by adding the cheapest edge to our current tree that adds a new node. Produces a minimum spanning tree. We can implement both of the above in O(mlogn) time. Prim’s uses a heap-based priority queue to achieve this time.
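A sketch of Prim's with a heap-based priority queue (my own code; adj[u] holds (neighbor, weight) pairs for an undirected graph). The heap always surfaces the cheapest edge leaving the current tree.

```python
import heapq

def prim(adj, s):
    """Prim's MST from root s; returns tree edges as (u, v, weight).
    Each edge pushes at most one heap entry, giving O(m log n)."""
    in_tree = {s}
    heap = [(w, s, v) for v, w in adj[s]]
    heapq.heapify(heap)
    tree = []
    while heap:
        w, u, v = heapq.heappop(heap)   # cheapest edge out of the current tree
        if v in in_tree:
            continue                    # both ends already in the tree: would form a cycle
        in_tree.add(v)
        tree.append((u, v, w))
        for x, wx in adj[v]:
            if x not in in_tree:
                heapq.heappush(heap, (wx, v, x))
    return tree
```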

Reverse-Delete: Start with a full tree and delete the edges with the largest cost first as long as the edge deleted would not disconnect the graph. Produces a minimum spanning tree.

Cut Property: assume all edge costs are distinct. Let e be the minimum-cost edge with one end in any subset S and the other in V-S. Then every minimum spanning tree contains e.

Cycle Property: When C is a cycle, and e is the most expensive edge in the cycle, e will not be in any minimum spanning tree. (8/10 – not particularly readable)

4.6 Implementing Kruskal’s Algorithm: The Union-Find Data Structure

Find the set of connected components: we look at a graph over time as edges are added, so we store the components in a Union-Find structure for easy search and update operations. Find(u) returns the name of the set containing u. As we add an edge (u,v), if u and v are already in the same set we have no problem; otherwise, we use Union(Find(u),Find(v)) to merge the connected components of the nodes on each end of the edge before adding the edge itself. This structure has three operations: MakeUnionFind(S), Find(u), and Union(A,B). In the array implementation, we make an array Component to hold the name of the set containing each element: setting and finding an element's set takes constant time, merging two sets takes at most O(n), and any sequence of k Union operations takes at most O(k log k). We can even out the running times using pointers: every node holds a pointer toward the name of the set to which it belongs, so Union takes O(1), MakeUnionFind O(n), and Find O(log n). I don't quite get why Find becomes O(log n) here. We use this data structure to implement Kruskal's by sorting edges by cost in O(m log m) time; m is less than or equal to n^2, so this is also O(m log n). After sorting, we use the structure to keep track of connected components. We go through at most 2m Finds and n-1 Union operations, so we have a total of O(m log n) runtime. (9/10)
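Putting the pieces together, here is my own sketch of Kruskal's on top of a pointer-based Union-Find. The union-by-size rule is what keeps Find at O(log n): a node's set can only be renamed when its set at least doubles in size, so any node follows at most log n pointers.

```python
def kruskal(n, edges):
    """Kruskal's MST with Union-Find; edges are (weight, u, v), nodes 0..n-1."""
    parent = list(range(n))             # each node points toward its set's name
    size = [1] * n

    def find(u):                        # walk pointers up to the set's name: O(log n)
        while parent[u] != u:
            u = parent[u]
        return u

    def union(a, b):                    # union by size: point the smaller set at the larger
        ra, rb = find(a), find(b)
        if size[ra] < size[rb]:
            ra, rb = rb, ra
        parent[rb] = ra
        size[ra] += size[rb]

    mst = []
    for w, u, v in sorted(edges):       # O(m log m) = O(m log n) sort by cost
        if find(u) != find(v):          # adding this edge creates no cycle
            union(u, v)
            mst.append((w, u, v))
    return mst
```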

courses/cs211/winter2014/journals/annebailey/home.1393121814.txt.gz · Last modified: 2014/02/23 02:16 by dickensa
CC Attribution-Noncommercial-Share Alike 4.0 International