Joey's Journal

Chapter 2

2.1 This chapter starts off by critiquing the proposed definition of efficiency: “An algorithm is efficient if, when implemented, it runs quickly on real input instances.” Upon reading that proposed definition, I knew that it was wrong (and not just because the text said “proposed” definition). Sure enough, the text continued and stated that even bad algorithms can run fast on small data sets and good hardware, and good algorithms can run slowly if they aren't coded well. The text states that we need a definition of efficiency that is platform-independent, instance-independent, and of predictive value with respect to increasing input sizes. The next proposed definition of efficiency given by the text is “An algorithm is efficient if it achieves qualitatively better worst-case performance, at an analytical level, than brute-force search.” This definition isn't good either, because it still does not give a concrete definition of what counts as efficient… even non-efficient algorithms might run faster than a brute-force search. The third proposed definition of efficiency is an algorithm that has a polynomial running time. This seems to be pretty good because, according to the text, in practice this has held true. Another reason it is good is that it makes it possible to express that there is “no efficient algorithm for a particular problem.” The main point of this section is that our measure of efficiency needs to be concrete, platform-independent, instance-independent, and of predictive value with respect to increasing input sizes. The polynomial running-time definition achieves those criteria.

2.2 This section starts off discussing how to express efficiency. It gives the example of “On any input of size n, the algorithm runs for at most 1.62n^2 + 3.5n + 8 steps.” As we discussed in class, this example is far too exact: as n gets arbitrarily large, the constant coefficients and lower-order terms are outpaced by the growth of n and become negligible. The text then starts explaining asymptotic upper and lower bounds. Without going into as much detail as the text, the asymptotic upper and lower bounds are not affected by constant coefficients (as I stated a moment ago). The text then proves that the asymptotic rate of growth is determined by the highest-order term. What this means is that f(n) = n^2 + n^3 has an upper bound of O(n^3).
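
For example (a quick check of my own, not from the text): for every n >= 1 we have n^2 <= n^3, so f(n) = n^2 + n^3 <= 2n^3. That matches the definition of an O(n^3) upper bound with constant c = 2 and threshold n0 = 1, and since f(n) >= n^3 trivially, f(n) also has n^3 as an asymptotic lower bound.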

2.3 The first point made in this section is that we have to take into account how the data will be represented and manipulated in an implementation of the algorithm, because this also affects the running time. This section compares and contrasts arrays and lists and gives advantages and disadvantages of both. The text goes on to suggest that each time we need to use a data structure, we should analyze the running time of that piece of the algorithm and try to keep it as small as possible.

2.4 There are some common running-time bounds in algorithm analysis, and they are usually due to some common structural element of the algorithm: O(n), O(n^2), and O(n log(n)). Linear time means the number of steps is at most a constant factor times the size of the input. It arises from “one-pass” type situations. O(n log(n)) running times commonly arise when the algorithm splits its input into two equal-sized pieces, solves each piece recursively, and then combines the processed data back together. O(n^2) running times often come from a pair of nested loops. Reasons for cubic running times and O(n^k) are very similar to the reasons for quadratic running times… usually nested loops k deep. Other common running times that are not polynomial, and therefore not efficient, are 2^n and n!. Those running times usually arise from brute-force algorithms. Sub-linear running times like O(log(n)) can arise in algorithms like binary search, where not all of the input needs to be read.
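
To make the “one pass” versus “nested loops” distinction concrete, here is a little Python sketch of my own (not from the text; the function names are made up):

  def max_of(values):
      # One pass over the input: O(n). Each element is looked at exactly once.
      best = values[0]
      for v in values[1:]:
          if v > best:
              best = v
      return best

  def closest_gap(values):
      # Nested loops over all pairs: O(n^2) brute force.
      best = None
      for i in range(len(values)):
          for j in range(i + 1, len(values)):
              gap = abs(values[i] - values[j])
              if best is None or gap < best:
                  best = gap
      return best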

2.5 This section introduces the idea of a priority queue, one of the most broadly useful sophisticated data structures. We will be using the priority queue to sort items and are aiming for a running time of O(n log(n)). Our implementation of the priority queue will be a heap. A heap is a balanced binary tree structure in which each child's key is at least as large as its parent's key. The definition provided by the text is “Heap Order: For every element v, at node i, the element w at i's parent satisfies key(w) ≤ key(v).” We will put the heap into an array and define the children of node i to be 2i and 2i+1. The text then gives the algorithms for adding and removing items from the heap. Here is the algorithm for adding an element to the heap as provided by the text (O(log(n))):

Heapify-up(H, i):
  If i > 1 then
    let j=parent(i)=Floor[i/2]
    If key[H[i]] < key[H[j]] then
      swap the array entries H[i] and H[j]
      Heapify-up(H, j)
    Endif
  Endif

And here is how to remove an element from the heap (O(log(n))):

Heapify-down(H, i):
  Let n = length(H)
  If 2i > n then 
    Terminate with H unchanged
  Else if 2i < n then 
    Let left = 2i, and right = 2i + 1
    Let j be the index that minimizes key[H[left]] and key[H[right]]
  Else if 2i = n then
    Let j = 2i
  Endif
  If key[H[j]] < key[H[i]] then
    swap the array entries H[i] and H[j]
    Heapify-down(H, j)
  Endif

Then the text gives the different functions and running times of the heap:

  • StartHeap(N) - O(N)
  • Insert(H, v) - O(log(n))
  • FindMin(H) - O(1)
  • Delete(H, i) - O(log(n))
  • ExtractMin(H) - O(log(n))
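
To check my understanding of Heapify-up and Heapify-down, here is a small Python version of a min-heap (my own sketch, not the book's; it stores the heap in a 0-indexed list, so the children of position i are 2i+1 and 2i+2 instead of 2i and 2i+1):

  class MinHeap:
      def __init__(self):
          self.h = []  # heap stored in a list

      def insert(self, key):            # O(log n)
          self.h.append(key)
          self._heapify_up(len(self.h) - 1)

      def find_min(self):               # O(1)
          return self.h[0]

      def extract_min(self):            # O(log n)
          self.h[0], self.h[-1] = self.h[-1], self.h[0]
          smallest = self.h.pop()
          if self.h:
              self._heapify_down(0)
          return smallest

      def _heapify_up(self, i):
          while i > 0:
              parent = (i - 1) // 2
              if self.h[i] < self.h[parent]:      # child smaller than parent: swap
                  self.h[i], self.h[parent] = self.h[parent], self.h[i]
                  i = parent
              else:
                  break

      def _heapify_down(self, i):
          n = len(self.h)
          while True:
              left, right = 2 * i + 1, 2 * i + 2
              j = i
              if left < n and self.h[left] < self.h[j]:
                  j = left
              if right < n and self.h[right] < self.h[j]:
                  j = right
              if j == i:                          # heap order restored
                  break
              self.h[i], self.h[j] = self.h[j], self.h[i]
              i = j

Inserting all n items and then calling extract_min n times sorts them, which is the O(n log n) priority-queue sort mentioned above.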

Thoughts The part about asymptotic upper and lower bounds of algorithms was kind of dense, and I didn't really understand how the same algorithm could have two upper bounds at the same time… Everything else was pretty clearly conveyed in class and the reading just reinforced it. Learning about the heap structure this section was pretty cool.

Readable/Interesting 7

Chapter 3 - Graphs

3.1 Basic Definitions and Applications A graph is a way of encoding pairwise relationships among a set of objects. It consists of nodes and edges joining the nodes. A directed graph consists of edges that are ordered pairs. Edges “leave” and “enter” nodes. The term graph alone means undirected. A node is also called a vertex. One of the fundamental operations in a graph is traversing a sequence of nodes connected by edges. The distance between two nodes is the length of the shortest path between them. An undirected graph is connected if there is a path between every pair of nodes; a directed graph is strongly connected if every pair of nodes is mutually reachable. An undirected graph is a tree if it is connected and contains no cycles. Deleting any edge from a tree disconnects it.

  • 3.1) Every n-node tree has exactly n-1 edges.
  • 3.2) Let G be an undirected graph on n nodes. Any two of the following statements imply the third: (i) G is connected, (ii) G does not contain a cycle, (iii) G has n-1 edges.

3.2 Graph Connectivity and Graph Traversal Determining s-t connectivity means determining whether there is a path between nodes s and t. This is also called the maze-solving problem. Breadth-first search (BFS) = explore outward from s in all possible directions, adding nodes one “layer” at a time.

  • 3.3) For each j >= 1, layer L(j) produced by BFS consists of all nodes at distance exactly j from s. There is a path from s to t iff t appears in some layer.
  • 3.4) Let T be a breadth-first search tree, let x and y be nodes in T belonging to layers L(i) and L(j) respectively and let (x, y) be an edge of G. Then i and j differ by at most 1.

Depth-first search (DFS) = keep following edges deeper into the graph until reaching a dead end, then backtrack.

Algorithms:
  BFS
    R will consist of nodes to which s has a path
    Initially R = {s}
    While there is an edge (u, v) where u∈R and v∉R
      Add v to R
    Endwhile
  End
  
  DFS(u):
    Mark u as "Explored" and add u to R
    For each edge (u, v) incident to u
      If v is not marked "Explored" then
        Recursively invoke DFS(v)
      Endif
    Endfor
  • 3.6) For a given recursive call DFS(u), all nodes that are marked “Explored” between the invocation and end of this recursive call are descendants of u in T.
  • 3.7) Let T be a depth-first search tree, let x and y be nodes in T , and let (x,y) be an edge of G that is not an edge of T. Then one of x or y is an ancestor of the other.
  • 3.8) For any two nodes s and t in a graph, their connected components are either identical or disjoint.

I think that is the most significant part of this subsection (connected components are either identical or disjoint).

3.3 Implementing Graph Traversal Using Queues and Stacks There are two ways to represent graphs: adjacency lists and adjacency matrices. We will be using adjacency lists. Running times of algorithms dealing with graphs will be in terms of the number of nodes (n) and edges (m). Linear time is O(m+n). When working with connected graphs, O(m+n) = O(m), because a connected graph has at least n-1 edges (m >= n-1).

  • 3.10) An adjacency matrix uses O(n^2) space, while an adjacency list uses O(m+n) space.

The chapter does a bit of explanation of queues and stacks and FIFO and LIFO and states that both can be implemented with doubly linked lists, so the insertions and deletions can be done in constant time. BFS takes advantage of queues, while DFS takes advantage of stacks. BFS and DFS can be implemented in time O(m+n) if the graph is given by the adjacency list representation.
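
Here is a quick Python sketch of BFS on an adjacency list, using a queue (my own code, not the book's; DFS is the same idea with a stack or recursion instead of the queue):

  from collections import deque

  def bfs_layers(adj, s):
      # adj: dict mapping each node to a list of its neighbors (adjacency list)
      dist = {s: 0}              # layer number of each discovered node
      queue = deque([s])
      while queue:
          u = queue.popleft()    # FIFO order explores the graph layer by layer
          for v in adj[u]:
              if v not in dist:
                  dist[v] = dist[u] + 1
                  queue.append(v)
      return dist                # O(m + n) with the adjacency-list representation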

3.4 Testing Bipartiteness: An Application of Breadth-First Search A bipartite graph is one where the node set V can be partitioned into sets X and Y in such a way that every edge has one end in X and the other end in Y.

  • 3.14) If a graph is bipartite, then it cannot contain an odd cycle.

Proving the above isn't hard. Imagine a cycle with an odd number of nodes: you cannot give alternating nodes different colors, because eventually there will be a collision. To test bipartiteness, run a BFS and color the layers with alternating colors. If there is a collision (an edge between two nodes of the same color), then the graph is not bipartite.
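
A Python sketch of that coloring test (mine, not the book's; it assumes the graph is connected and given as an adjacency list like in 3.3):

  from collections import deque

  def is_bipartite(adj, s):
      # Color the start node 0, then alternate colors layer by layer with BFS.
      color = {s: 0}
      queue = deque([s])
      while queue:
          u = queue.popleft()
          for v in adj[u]:
              if v not in color:
                  color[v] = 1 - color[u]
                  queue.append(v)
              elif color[v] == color[u]:
                  return False   # edge within a layer: odd cycle, not bipartite
      return True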

  • 3.15) Let G be a connected graph, and let L(1), L(2), ... be the layers produced by BFS starting at node s. Then exactly one of the following two things must hold.
    (i) There is no edge of G joining two nodes of the same layer. In this case G is a bipartite graph in which the nodes in even-numbered layers can be colored red, and the nodes in odd-numbered layers can be colored blue.
    (ii) There is an edge of G joining two nodes of the same layer. In this case, G contains an odd-length cycle, and so it cannot be bipartite.

3.5 Connectivity in Directed Graphs Representing directed graphs is a lot like representing undirected graphs. You use an adjacency list, but for each node you keep track of both the edges coming into it and the edges leaving it, so an algorithm examining a node can see both. This is useful in determining whether a graph is strongly connected. To determine whether a graph is strongly connected, do a BFS on G starting at s. Then do a BFS on G with all edges reversed, again starting at s. If any node is undiscovered in either BFS, then the graph is not strongly connected.

  • 3.16) If u and v are mutually reachable, and v and w are mutually reachable, then u and w are mutually reachable.
  • 3.17) For any two nodes s and t in a directed graph, their strong components are either identical or disjoint.

3.6 Directed Acyclic Graphs and Topological Ordering

If a directed graph has no cycles, it is called a Directed Acyclic Graph, or DAG. DAGs can be used to encode precedence relations or dependencies in a natural way, for example, ordered tasks or a pipeline of computing tasks. If there were a cycle in a directed graph representing dependencies, then no task in the cycle could get done, because none could be done before the others. A topological ordering of G is an ordering of its nodes such that every edge points “forward” (from an earlier node to a later one).

  • 3.18) If G has a topological ordering, then G is a DAG.
  • 3.19) In every DAG G, there is a node v with no incoming edges. (Have to start somewhere).
  • 3.20) If G is a DAG, then G has a topological ordering.
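
Fact 3.19 suggests an algorithm: repeatedly peel off a node with no incoming edges and delete its outgoing edges. Here is a Python sketch of that idea (my own code, not the book's pseudocode):

  from collections import deque

  def topological_order(nodes, edges):
      # edges: list of (u, v) pairs meaning u must come before v
      indegree = {u: 0 for u in nodes}
      out = {u: [] for u in nodes}
      for u, v in edges:
          out[u].append(v)
          indegree[v] += 1
      ready = deque(u for u in nodes if indegree[u] == 0)  # nodes with no incoming edges
      order = []
      while ready:
          u = ready.popleft()
          order.append(u)
          for v in out[u]:
              indegree[v] -= 1
              if indegree[v] == 0:
                  ready.append(v)
      return order if len(order) == len(nodes) else None   # None means G has a cycle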

Readability/Interest Very readable, pretty interesting. 8.

Chapter 4 - Greedy Algorithms

A greedy algorithm builds up a solution in small steps, choosing a decision at each step myopically to optimize some underlying criterion. There are two methods for proving that a greedy algorithm produces an optimal solution to a problem. One is to establish that the greedy algorithm “stays ahead”: at each step, its partial solution is at least as good as any other algorithm's partial solution. The other way is the exchange argument… one considers any possible solution to the problem and gradually transforms it into the solution found by the greedy algorithm without hurting its quality. It follows that the greedy algorithm must have found a solution that is at least as good as any other solution.

4.1 Interval Scheduling A subset of the requests is compatible if no two of them overlap in time, and optimal if the subset is as large as possible.

The greedy approach: Use a simple rule to select the first request. Then reject all other requests that are not compatible with it, and so on and so forth. The hard part is choosing the right rule; three unsuccessful rules are outlined on page 117. The solution to this scheduling problem is to select first the request that finishes first, and then keep doing that, eliminating requests that are not compatible with the requests we have already chosen. A visual example is on page 119. The proof that this algorithm returns an optimal set is explained on page 121. A variation of this scheduling problem is the weighted scheduling problem, where some tasks are worth more than others.
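
A Python sketch of the earliest-finish-time rule (my own; intervals are (start, finish) pairs, and I treat two intervals that merely touch as compatible):

  def interval_schedule(intervals):
      # Sort by finish time, then greedily accept each request that is
      # compatible with the last request accepted.
      chosen = []
      last_finish = float("-inf")
      for start, finish in sorted(intervals, key=lambda iv: iv[1]):
          if start >= last_finish:
              chosen.append((start, finish))
              last_finish = finish
      return chosen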

A further variation of the scheduling problem arises if we have many identical resources available and we wish to schedule all the requests using as few resources as possible. This problem is called the Interval Partitioning Problem. Some terminology: the depth of a set of intervals is the maximum number that pass over any single point on the timeline. In any instance of Interval Partitioning, the number of resources needed is at least the depth of the set of intervals. The greedy algorithm here assigns a label to each interval, where the labels are 1 through the depth of the set. Overlapping intervals are labeled with different numbers, and each label corresponds to the resource to which the interval is assigned.

4.2 Scheduling to Minimize Lateness There is a single resource, but several somewhat flexible requests. Each request has to be completed in one contiguous time interval and has a fixed deadline; a job that finishes after its deadline is late. The goal of this greedy algorithm is to schedule all the requests, or jobs, in such a way as to minimize the maximum lateness. The solution to this problem is simply to sort the jobs by deadline and run them in that order. The algorithm does not even look at the lengths of the jobs. This creates a schedule with no idle time, that is, no time when the machine is not running.
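
A sketch of that earliest-deadline-first rule in Python (mine; jobs are (length, deadline) pairs and everything starts at time 0):

  def min_max_lateness(jobs):
      # Sort by deadline and run the jobs back to back with no idle time.
      t = 0
      max_late = 0
      schedule = []
      for length, deadline in sorted(jobs, key=lambda j: j[1]):
          start, finish = t, t + length
          schedule.append((start, finish))
          max_late = max(max_late, finish - deadline)  # lateness is 0 if on time
          t = finish
      return schedule, max_late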

4.3 Optimal Caching This problem arises when dealing with memory. Which pieces of data should you keep at hand? The cache maintenance algorithm determines what to keep in the cache and what to evict from the cache when new data needs to be brought in. To set up this problem, assume that the memory can hold n pieces of data and the cache can hold k < n pieces of data. Also, assume that the cache holds some set of k items. We want to have as few cache misses as possible. The cache algorithm determines an eviction schedule which specifies which items should be evicted from the cache at which points in the sequence. The Farthest-in-Future Algorithm determines the eviction schedule, but it requires knowledge of memory references in the future (which is not necessarily known in practice). In real life, the best variant of this algorithm is the Least-Recently-Used Algorithm which evicts the thing that was referenced longest ago.
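
A tiny sketch of the LRU rule using Python's OrderedDict (my own; a real cache would also store the data, not just the keys):

  from collections import OrderedDict

  class LRUCache:
      def __init__(self, k):
          self.k = k                  # cache capacity
          self.items = OrderedDict()  # keys ordered from least to most recently used

      def access(self, key):
          if key in self.items:
              self.items.move_to_end(key)     # hit: now the most recently used
              return True
          if len(self.items) >= self.k:
              self.items.popitem(last=False)  # miss: evict the least recently used
          self.items[key] = True
          return False                        # False means a cache miss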

4.4 Shortest Paths in a Graph We are given a weighted, directed graph with a designated start node, s, and we are to determine the shortest path from s to an end node. The solution to this is Dijkstra's Algorithm. Basically, it figures out the shortest path from s to every other node, and when it is finished, it has determined the shortest path to the end node. This algorithm is proved correct by showing that it “stays ahead” of any other solution: the path to each node v that the algorithm determines is at least as short as any other possible path to v. Using a priority queue, Dijkstra's Algorithm can be implemented on a graph with n nodes and m edges to run in O(m log n) time.
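
A Python sketch of Dijkstra's Algorithm using the heapq module as the priority queue (my own code; the adjacency list maps each node to (neighbor, edge length) pairs, and edge lengths are assumed non-negative):

  import heapq

  def dijkstra(adj, s):
      dist = {s: 0}
      pq = [(0, s)]                       # priority queue keyed by tentative distance
      while pq:
          d, u = heapq.heappop(pq)
          if d > dist.get(u, float("inf")):
              continue                    # stale entry; a shorter path was found already
          for v, length in adj[u]:
              nd = d + length
              if nd < dist.get(v, float("inf")):
                  dist[v] = nd
                  heapq.heappush(pq, (nd, v))
      return dist                         # O(m log n) with a binary heap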

4.5 The Minimum Spanning Tree A minimum spanning tree connects all the nodes with the minimum total edge cost; since it has no cycles, it is a tree. There are three greedy algorithms that solve this problem.

  1. Kruskal's Algorithm: Start with no edges and build a spanning tree by successively inserting edges from E in order of increasing cost. Moving through the edges, insert edge e as long as it does not create a cycle when added to the edges already inserted. If an edge will cause a cycle, ditch it.
  2. Prim's Algorithm: Start with a node s and try to greedily grow a tree from s outward. At each step, simply add the node that can be attached as cheaply as possible to the partial tree.
  3. Reverse-Delete Algorithm: Start with the full graph and delete edges in order of decreasing cost. Considering each edge e in turn, delete it as long as doing so doesn't disconnect the graph.

The cycle property tells us when an edge is guaranteed not to be in the minimum spanning tree. Assume that all edge costs are distinct. Let C be any cycle in G, and let edge e=(v,w) be the most expensive edge belonging to C. Then e does not belong to any minimum spanning tree of G.
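
A Python sketch of Kruskal's Algorithm (mine, not the book's; it uses a simple Union-Find structure to detect when adding an edge would create a cycle):

  def kruskal(n, edges):
      # edges: list of (cost, u, v) with nodes numbered 0..n-1
      parent = list(range(n))             # Union-Find: each node starts in its own component

      def find(x):
          while parent[x] != x:
              parent[x] = parent[parent[x]]   # path halving
              x = parent[x]
          return x

      tree = []
      for cost, u, v in sorted(edges):    # consider edges in increasing order of cost
          ru, rv = find(u), find(v)
          if ru != rv:                    # different components, so no cycle is created
              parent[ru] = rv
              tree.append((u, v, cost))
      return tree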

Chapter 5 - Divide and Conquer

Divide and conquer refers to a class of algorithmic techniques in which one breaks the input into several parts, solves the problem in each part recursively, and then combines the solutions to these sub-problems into an overall solution. A recurrence relation usually bounds the running time recursively in terms of the running time on smaller instances. The main hurdle in solving problems efficiently is often just improving on brute-force methods.

5.1 A First Recurrence: The Mergesort Algorithm The mergesort is a sorting algorithm that executes in O(n log n) time. Its abstract behavior is that of many common divide-and-conquer algorithms:

Divide the input into two pieces of equal size; solve the two sub-problems on these pieces separately by recursion; and then combine the two results into an overall solution, spending only linear time for the initial division and final recombining.

There needs to be a base case for the recursion. In Mergesort, once the input has been reduced to size 2, we stop the recursion and sort the two elements by simply comparing them to each other. The running time T(n) satisfies the following recurrence relation:

for some constant c,
  T(n) <= 2T(n/2) + cn
when n > 2, and
  T(2) <= c.
  
or,

T(n) <= 2T(n/2) + O(n)

There are two methods to solve a recurrence:

  1. Unrolling the Mergesort Recurrence
    1. Analyze the first few levels
    2. Identify the pattern
    3. Sum over all levels of recursion
  2. Substitute a solution
    1. Here, we guess a running time that we believe to be correct, plug it into the recurrence, and verify that it holds (usually by induction)
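
Here is a Python sketch of Mergesort itself, just to tie the recurrence to code (my own, not the book's pseudocode):

  def mergesort(a):
      if len(a) <= 1:
          return a                    # base case
      mid = len(a) // 2
      left = mergesort(a[:mid])       # T(n/2)
      right = mergesort(a[mid:])      # T(n/2)
      # The merge step is the "combine" part and takes linear time (the cn term).
      merged, i, j = [], 0, 0
      while i < len(left) and j < len(right):
          if left[i] <= right[j]:
              merged.append(left[i]); i += 1
          else:
              merged.append(right[j]); j += 1
      merged.extend(left[i:])
      merged.extend(right[j:])
      return merged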

5.2 Further Recurrence Relations Did we cover this in class?

5.3 Counting Inversions This is a problem about rankings, the type of which might occur in websites that make use of collaborative filtering, in which they try to match your preferences with those of other people on the internet. Once a website identifies people with “similar” tastes to yours, it can recommend new things that other people like you have liked. In such rankings, two people do not always rank things the same way (duh). The way to tell how different two rankings are is to label your interests from 1 to n, then use those labels to label the other person's interests and count how many things are out of order, or inverted. Two indices i < j form an inversion if a_i > a_j.

The brute-force method of solving this is to compare everything with everything, which takes O(n^2) time. It can be done in O(n log n) time, however. The way to solve this is to split the list in half, process each half separately, then combine the results. While splitting up, the algorithm should also sort and count the inversions within each half; this makes the combining step easier, and the whole thing can be accomplished in O(n log n) time. At the combine step there is a running count of inversions, and the two halves A and B are each sorted. We combine them like in merge sort, repeatedly appending the smaller of the two front elements and counting inversions on the way: every time an element from list A is appended, no inversions are counted, but every time an element from list B is appended, the inversion count is incremented by the number of elements still left in list A (they all come before it in the original order but are larger).
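
A Python sketch of the "sort and count" step (my own; it returns both the sorted list and the inversion count, exactly as described above):

  def sort_and_count(a):
      if len(a) <= 1:
          return a, 0
      mid = len(a) // 2
      left, inv_left = sort_and_count(a[:mid])
      right, inv_right = sort_and_count(a[mid:])
      merged, i, j = [], 0, 0
      inv = inv_left + inv_right
      while i < len(left) and j < len(right):
          if left[i] <= right[j]:
              merged.append(left[i]); i += 1
          else:
              merged.append(right[j]); j += 1
              inv += len(left) - i     # everything remaining in left is inverted with right[j]
      merged.extend(left[i:])
      merged.extend(right[j:])
      return merged, inv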

5.4 Finding the Closest Pair of Points This problem is to find the closest pair of points in a plane. Again, the brute-force method takes O(n^2) time. We can beat that with a divide and conquer algorithm. First, split the plane into two sides with about the same number of points on each side. Then find the minimum distance on each side. After that, compare the two minimums and let δ be the smaller one. This is not necessarily the overall minimum, because the closest pair might have one point on each side of the dividing line L. To check for this, consider the subset Z of the plane consisting of all points within distance δ of L, and partition Z into square boxes with side length δ/2, aligned with L. There cannot be two points in the same box, because they would lie on the same side of L at distance less than δ, contradicting the choice of δ. So each box contains at most one point. If two points in Z are more than two rows of boxes apart, their distance is greater than δ, so they cannot be the closest pair. Because of this, each point only has to be compared against a constant number of nearby points, so this step is O(n) and not O(n^2).

5.5 Integer Multiplication In this algorithm, more than two recursive calls are spawned at each level. The algorithm we learn in elementary school is O(n^2). The improved algorithm goes like this: break the two numbers, x and y, into their higher-order bits and lower-order bits. Done naively, this gives 4 subproblems of half the size, which is no better than O(n^2). The trick is to compute the answer using only 3 multiplications of half-size numbers and to keep recurring the same way, which gives the recurrence T(n) <= 3T(n/2) + cn and a running time of O(n^(log2 3)), roughly O(n^1.59).
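
A Python sketch of the three-multiplication trick (my own; Python already has big integers, so this is only to show the recursion, not to be fast):

  def karatsuba(x, y):
      if x < 16 or y < 16:
          return x * y                           # base case: small numbers
      half = max(x.bit_length(), y.bit_length()) // 2
      x_hi, x_lo = x >> half, x & ((1 << half) - 1)
      y_hi, y_lo = y >> half, y & ((1 << half) - 1)
      a = karatsuba(x_hi, y_hi)                  # only three recursive multiplications
      b = karatsuba(x_lo, y_lo)
      c = karatsuba(x_hi + x_lo, y_hi + y_lo) - a - b
      # x*y = a*2^(2*half) + c*2^half + b, so T(n) <= 3T(n/2) + cn
      return (a << (2 * half)) + (c << half) + b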

Question: Is the multiplication algorithm we used in computer organization theoretically faster? I think it's O(n). If there are two numbers x and y being multiplied, just set y' to be y and x' to be x. Then while x' is not 0, add x' to y' and decrement 1 from x'. After that while loop, the answer to that multiplication is y'. I can see that this is practically much slower than the other algorithms, but isn't it theoretically faster?

Readability: I thought this was a pretty tough chapter, especially the integer multiplication part. 4. Interest: 7.

Chapter 6 - Dynamic Programming

The idea of Dynamic Programming is to implicitly explore the space of all possible solutions, by carefully decomposing things into a series of subproblems, and then building up correct solutions to larger and larger subproblems. “Dangerously close” to brute force, but it explores possible solutions implicitly instead of explicitly.

6.1 Weighted Interval Scheduling Problem - A Recursive Procedure

This is a different animal than Interval Scheduling because now the intervals have values (weights). No natural greedy algorithm is known to work for this.

The goal for the algorithm is to select a subset of mutually compatible intervals so as to maximize the sum of the values of the selected intervals. Two intervals are compatible if they do not overlap. p(j) is defined like this: for an interval j, p(j) is the largest index i < j such that intervals i and j are disjoint. In other words, i is the rightmost interval that ends before j begins.

When we look at these things, we have two obvious observations: either the optimal solution uses interval n or it does not. Also, no interval indexed strictly between p(n) and n can belong to an optimal solution that contains n, by the definition of p(n). So if the optimal solution uses interval n, it consists of n plus an optimal solution for intervals 1 through p(n); otherwise it is simply an optimal solution for intervals 1 through n-1.

The solution is this: Opt(j) = max(v[j] + Opt(p(j)), Opt(j-1)), but computing this by plain recursion could still take exponential time.

We need to memoize the recursion. Memoization is the technique of saving values that have already been computed. The memoized algorithm can be found on pg 256. It is basically the same as the above recursion, but results are stored in an array rather than recomputed on each recursive call. With memoization there are at most O(n) calls that do real work; therefore the algorithm is O(n) (given the intervals sorted by finish time and the p(j) values). This gives an array filled with the optimal values of the sub-problems, and the actual optimal set of intervals can be recovered from this array in O(n) time.
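
A Python sketch of the memoized recursion (my own code, with the sorting and the p(j) computation folded in; the book's version assumes those are already given):

  import bisect
  from functools import lru_cache

  def weighted_interval_scheduling(intervals):
      # intervals: list of (start, finish, value) triples
      iv = sorted(intervals, key=lambda t: t[1])            # sort by finish time
      finishes = [f for _, f, _ in iv]
      # p[j] = number of intervals that finish no later than interval j starts
      p = [bisect.bisect_right(finishes, s) for s, _, _ in iv]

      @lru_cache(maxsize=None)                              # memoization
      def opt(j):            # best total value using the first j intervals
          if j == 0:
              return 0
          s, f, v = iv[j - 1]
          return max(v + opt(p[j - 1]), opt(j - 1))         # take interval j, or don't

      return opt(len(iv))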

6.2 - Principles of Dynamic Programming: Memoization or Iteration over Sub-problems

The way we solved the previous problem was to write down an exponential recursive solution and then memoize it so that it takes only O(n) time instead of exponential time.

When designing the algorithm, the key to efficiency is the array M. Once that array is filled, it is trivial to find the optimal solution. The algorithm to find the optimal solution is on pg 259.

6.3 Segmented Least Squares: Multi-way Choice

This has to do with the line of best fit through a set of scientific or statistical data. The single line with minimum error can be found with calculus. But what if the data is best fit by a number of line segments? Our problem is this: we need to formulate an algorithm that identifies a few points in the sequence at which a discrete change occurs (from one segment to another).

For this problem:
  OPT(j) = minimum total cost for points p1, p2, … , pj
  e(i, j) = minimum sum of squared errors for a single line fit to points pi, pi+1, … , pj
  C = the fixed cost paid for each segment used

To compute the problem: the last segment contains points pi, pi+1, … , pj for some i, and its contribution to the cost is e(i, j) + C + OPT(i-1).

The recurrence is OPT(j) = min over 1 <= i <= j of ( e(i, j) + C + OPT(i-1) ).

The algorithm for this problem is on pg 265.

6.4 Subset Sums and Knapsacks: Adding a variable

The knapsack problem is when we have a collection of items, each with a weight and a value. The goal is to fill the knapsack with items of maximum total value, subject to a weight limit W.

Basically,

If w < w[i], then OPT(i, w) = OPT(i-1, w) (item i does not fit in the remaining capacity w). Otherwise OPT(i, w) = max(OPT(i-1, w), v[i] + OPT(i-1, w-w[i])).
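
A Python sketch of the resulting table-filling algorithm (my own; items are (weight, value) pairs and W is the weight limit):

  def knapsack(items, W):
      n = len(items)
      # OPT[i][w] = best value using the first i items with capacity w
      OPT = [[0] * (W + 1) for _ in range(n + 1)]
      for i in range(1, n + 1):
          wi, vi = items[i - 1]
          for w in range(W + 1):
              if wi > w:
                  OPT[i][w] = OPT[i - 1][w]                  # item i does not fit
              else:
                  OPT[i][w] = max(OPT[i - 1][w],             # skip item i
                                  vi + OPT[i - 1][w - wi])   # take item i
      return OPT[n][W]      # O(nW) time, which is "pseudo-polynomial"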

Interest: 7 Readability: 5, I think the lectures helped the most with this section compared to the others (maybe not 5… that was difficult to get through as well).

Chapter 7 - Network Flow

7.0 - Introduction

Matchings in bipartite graphs can model situations in which objects are being assigned to other objects. One natural example is when the nodes in X represent jobs and the nodes in Y represent machines. A perfect matching is a way of assigning each job to a machine that can process it, with the property that each machine is assigned exactly one job. To create a bipartite matching algorithm, we will develop network flow algorithms and then look at the bipartite graph as a special case. The algorithms will be applicable to a host of other problems as well.

7.1 - The Maximum-Flow Problem and the Ford-Fulkerson Algorithm

The problem: Consider a network where the nodes are switches, and the directed edges have capacities. Source nodes in this graph create traffic, and sink nodes absorb traffic. The traffic across this graph will be called flow, and we want to maximize the flow being sent/received. No edges enter the source and no edges leave the sink. In our models, we will only analyze a steady flow of traffic, not bursty traffic. So… given a flow network, how do we arrange the traffic so as to make as efficient use as possible of the available capacity? Or, given a flow network, find a flow of maximum possible value.

In order to solve this problem, we use a residual graph, in which we can push flow backward on edges that already carry flow, to divert it in a different direction. The Ford-Fulkerson Algorithm repeatedly finds an s-t path in the residual graph, pushes as much flow along it as the bottleneck (minimum residual capacity) edge allows, and then updates the residual graph. The algorithm can be found on pg 344 of the text. Each iteration takes O(m+n), or just O(m), time; with integer capacities there are at most C iterations (where C is the total capacity of the edges leaving the source), so the whole algorithm runs in O(mC) time.
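
A compact Python sketch of the idea (my own; it finds augmenting paths with BFS, which is technically the Edmonds-Karp variant rather than the book's generic version, and it assumes integer capacities given as a dict of dicts):

  from collections import deque

  def max_flow(capacity, s, t):
      # residual[u][v] starts at capacity[u][v] and changes as flow is pushed
      residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
      for u in list(residual):
          for v in residual[u]:
              residual.setdefault(v, {}).setdefault(u, 0)   # backward residual edges
      flow = 0
      while True:
          # BFS for an s-t path that still has residual capacity
          parent = {s: None}
          queue = deque([s])
          while queue and t not in parent:
              u = queue.popleft()
              for v, cap in residual[u].items():
                  if cap > 0 and v not in parent:
                      parent[v] = u
                      queue.append(v)
          if t not in parent:
              return flow               # no augmenting path left, so the flow is maximum
          # find the bottleneck on the path, then push that much flow along it
          bottleneck, v = float("inf"), t
          while parent[v] is not None:
              u = parent[v]
              bottleneck = min(bottleneck, residual[u][v])
              v = u
          v = t
          while parent[v] is not None:
              u = parent[v]
              residual[u][v] -= bottleneck
              residual[v][u] += bottleneck   # pushing flow backward "undoes" it later
              v = u
          flow += bottleneck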

7.2 - Maximum Flows and Minimum Cuts in a Network

For this section, we split the nodes of the graph into two sets, a cut, with s on one side and t on the other. All of the flow must cross from one side to the other at some point, so the capacity of any cut is an upper bound on the value of any flow: the value of every flow is upper-bounded by the capacity of every cut. If we find a flow whose value equals the capacity of some cut, we can be sure both are optimal.

In every network, the maximum value of an s-t flow is equal to the minimum capacity of an s-t cut.

7.5 - A First Application: The Bipartite Matching Problem

In order to solve this one, just add a source node and a sink node, direct all the original edges from one side of the bipartite graph to the other, and connect the source and sink so that everything flows in the same direction, giving every edge capacity 1. Then it is as simple as running the Ford-Fulkerson Algorithm on this graph. The algorithm runs in O(mn) time. This finds a matching of the largest possible size.

Sometimes there is no perfect matching in a bipartite graph. We can check whether the matching found is perfect by checking whether the value of the flow (the flow leaving the source, which equals the flow entering the sink) is half the total number of nodes, i.e., equal to the size of each side.

7.7 - Extensions to the Maximum-Flow Problem

Imagine the situation in which there are multiple sinks and sources. In this case, instead of max flow, each node has a demand or a supply. In order to solve this problem, again, add a sink and source, or super-sink and super-source. The capacities of the edges going from the super-source to the supply nodes are equal to the supply offered by those nodes; and the edges going from the demand nodes to the super-sink are equal to the demand by those nodes. Then it is just a max flow problem.

Interest: 10 (taking Networks with Stough and this is pretty relevant)

Readability: 5 (the language could have been clearer)
