===== Section 4: Shortest Paths in a Graph =====
Another application of greedy algorithms is finding shortest paths in a graph, which results in the creation of a shortest-path tree rooted at the start node. The setup of the shortest paths problem consists of a directed graph G with a designated start node s that has a path to every other node in G. Each edge of the graph is given a weight to denote the time it takes to traverse the edge, and the length of a path is the sum of the weights of all the edges traversed along it. The goal of the problem is to determine the shortest path from s to every other node in the graph. An algorithm that finds these shortest paths is Dijkstra's algorithm, which maintains a set of explored nodes and repeatedly adds the node outside the set whose distance from s through the explored set is smallest. However, there are some limitations to this algorithm: it does not always find shortest paths if some of the edges can have negative lengths. A more complex algorithm, due to Bellman and Ford, is required for cases with negative edge lengths. Dijkstra's algorithm can also be thought of as a 'continuous' version of Breadth-First Search for traversing a graph, where the tree created gives the shortest path from the start node to any given node. A direct implementation of this algorithm runs in O(mn) time: there are n-1 iterations over the nodes of the graph, each adding a new node to the current set, and each iteration has to consider every node not in the current set and all the edges between the set and that node to determine the minimum distance, so up to m edges are traversed per iteration. However, if we instead keep the nodes outside the explored set in a priority queue keyed by their current distances, the algorithm needs only O(m) priority queue operations. Each priority queue operation runs in O(log n), for a total overall time of O(m log n) for the implementation (a sketch appears below). This section was rather difficult to follow; I would give it a 4/10.
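To make the priority queue idea concrete, here is a minimal sketch of Dijkstra's algorithm in Python; the graph representation and function name are my own choices, not from the text, and it uses the standard heapq module with lazy deletion of stale entries in place of an explicit decrease-key operation.

<code python>
import heapq

def dijkstra(graph, s):
    """Shortest-path distances from s in a directed graph with
    non-negative edge weights. graph maps node -> list of (neighbor, weight)."""
    dist = {s: 0}
    explored = set()
    pq = [(0, s)]  # (distance estimate, node)
    while pq:
        d, u = heapq.heappop(pq)
        if u in explored:
            continue  # stale entry; a shorter path was already found
        explored.add(u)
        for v, w in graph.get(u, []):
            if v not in dist or d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(pq, (d + w, v))
    return dist

# Example: distances from 'a' in a small directed graph
g = {'a': [('b', 1), ('c', 4)], 'b': [('c', 2), ('d', 6)], 'c': [('d', 3)]}
print(dijkstra(g, 'a'))  # {'a': 0, 'b': 1, 'c': 3, 'd': 6}
</code>

Each node is popped from the queue at most once with its final distance, and every edge causes at most one push, which is where the O(m log n) bound comes from.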
===== Section 5: The Minimum Spanning Tree Problem =====
Another fundamental problem that can be solved using graphs is the Minimum Spanning Tree Problem. This problem focuses on a set of locations (nodes) and the connections between them; the graph is connected, so there exists a path between every pair of nodes, and each edge has a certain cost to traverse it. The goal of the problem is to find a subset of the edges so that the graph stays connected and the total cost (the sum of the chosen edges' costs) is minimized. A theorem about this problem states: 'Let T be a minimum-cost solution to the Minimum Spanning Tree problem. Then (V, T) is a tree.' We know that by definition a solution to this problem must be connected, and we can show that it will not contain any cycles: removing one edge of a cycle does not disconnect the graph, so if a solution contained a cycle we could remove the most expensive edge of that cycle and obtain a cheaper connected solution; therefore a minimum-cost solution must not have any cycles. If we allow edges that have 0 cost, a solution may have extra edges, but there will always be a minimum-cost solution that is a tree. There are three main greedy algorithms that correctly find the minimum spanning tree of a graph. The first, known as Kruskal's Algorithm, starts with a graph without any edges and builds the minimum spanning tree by successively adding edges in order of increasing cost; you iterate through the edges in this order, and only add an edge to the graph if it does not create a cycle. The second, known as Prim's Algorithm, uses an approach very similar to Dijkstra's algorithm: you start with a root node and greedily grow a tree outwards, at each step adding the node that can be attached to the current tree by the cheapest edge (a sketch appears after this section). The third, known as the Reverse-Delete Algorithm, is the backwards version of Kruskal's Algorithm: you start with the full graph and consider edges in order of decreasing cost, deleting an edge as long as doing so does not disconnect the graph. All three of these algorithms produce an optimal solution to the Minimum Spanning Tree Problem. The crucial fact about edge insertion for these algorithms is known as the Cut Property. It states: 'Assume that all edge costs are distinct. Let S be any subset of nodes that is neither empty nor equal to all of V, and let edge e = (v,w) be the minimum-cost edge with one end in S and the other in V-S. Then every minimum spanning tree contains the edge e.' We can use the Cut Property to prove the optimality of both Kruskal's and Prim's Algorithms. The crucial fact about edge deletion is known as the Cycle Property. It states: 'Assume that all edge costs are distinct. Let C be any cycle in G, and let edge e = (v,w) be the most expensive edge belonging to C. Then e does not belong to any minimum spanning tree of G.' We can use the Cycle Property to prove the optimality of the Reverse-Delete Algorithm. Both Prim's and Kruskal's Algorithms can be implemented to run in O(m log n) time using a priority queue. This section made what we talked about in class much more clear, and was easy to follow. I'd give it an 8/10.
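As an illustration of Prim's approach, here is a minimal sketch in Python; the adjacency-list format and function name are my own assumptions. It grows the tree from a root node using a priority queue of candidate edges leaving the current tree.

<code python>
import heapq

def prim(graph, root):
    """Minimum spanning tree edges of a connected undirected graph.
    graph maps node -> list of (neighbor, cost)."""
    in_tree = {root}
    tree_edges = []
    # Candidate edges out of the current tree: (cost, tree node, outside node)
    pq = [(c, root, v) for v, c in graph[root]]
    heapq.heapify(pq)
    while pq:
        cost, u, v = heapq.heappop(pq)
        if v in in_tree:
            continue  # both endpoints are now in the tree; skip this edge
        in_tree.add(v)
        tree_edges.append((u, v, cost))
        for w, c in graph[v]:
            if w not in in_tree:
                heapq.heappush(pq, (c, v, w))
    return tree_edges

# Example: a square with one diagonal; the MST avoids the expensive edges
g = {'a': [('b', 1), ('c', 5)], 'b': [('a', 1), ('c', 2), ('d', 4)],
     'c': [('a', 5), ('b', 2), ('d', 3)], 'd': [('b', 4), ('c', 3)]}
print(prim(g, 'a'))  # [('a', 'b', 1), ('b', 'c', 2), ('c', 'd', 3)]
</code>

At every step the popped edge is the cheapest edge crossing the cut between the current tree and the rest of the graph, which is exactly the edge the Cut Property says must belong to the minimum spanning tree.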
===== Section 6: Implementing Kruskal's Algorithm: The Union-Find Data Structure =====
A basic graph problem that we covered in a previous chapter is finding the set of connected components, which can be achieved using Depth-First Search or Breadth-First Search. When implementing Kruskal's Algorithm, we want a data structure that stores a representation of the connected components so we can check whether an edge would create a cycle. When considering an edge between two nodes, we check whether the connected components containing the two nodes are different: if they are, we can include the edge; otherwise we should omit it. The Union-Find data structure allows us to maintain disjoint sets. Given a node, the FIND operation returns the name of the set containing the node, so we can use it to check whether two nodes are contained in the same set. The structure also implements an operation UNION, which merges two distinct sets into a single set. Using these two operations, we are able to maintain the connected components of the evolving graph as edges are added. The structure does not support the deletion of edges from the graph. The Union-Find data structure supports three operations. The first, MAKEUNIONFIND(S) for a set S, returns a Union-Find data structure on S in which all elements are in separate sets, which can be used at the very start of the algorithm. The second, FIND(u), returns the name of the set containing u. The third, UNION(a,b), merges the sets a and b into a single set. The goal running time for the first operation is O(n), and for the second and third it is O(log n). The main decision is how to implement the Union-Find data structure. 'Consider the array implementation of the Union-Find data structure for some set S of size n, where unions keep the name of the larger set. The FIND operation takes O(1), MAKEUNIONFIND(S) takes O(n), and any sequence of k UNION operations takes at most O(k log k) time.' 'Consider the pointer-based implementation of the Union-Find data structure for some set S of size n, where unions keep the name of the larger set. A UNION operation takes O(1) time, MAKEUNIONFIND(S) takes O(n) time, and FIND takes O(log n).' Using the Union-Find data structure, we can implement Kruskal's algorithm on a graph with n nodes and m edges to run in O(m log n) time; a sketch follows below. Understanding the setup of the Union-Find data structure was rather easy to follow, although it became more complex when discussing the pointer-based implementation. I would give this section a 6/10.
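Here is a minimal sketch of the pointer-based idea in Python, with unions keeping the larger set, used inside Kruskal's algorithm. The class and function names are my own assumptions; the text's version names sets explicitly rather than following parent pointers to a root.

<code python>
class UnionFind:
    """Disjoint sets over elements 0..n-1; union by size keeps FIND at O(log n)."""
    def __init__(self, n):          # MAKEUNIONFIND: every element in its own set
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, u):              # FIND: follow parent pointers to the set's root
        while self.parent[u] != u:
            u = self.parent[u]
        return u

    def union(self, a, b):          # UNION: point the smaller set at the larger one
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]

def kruskal(n, edges):
    """MST of a graph on nodes 0..n-1; edges is a list of (cost, u, v)."""
    uf = UnionFind(n)
    tree = []
    for cost, u, v in sorted(edges):     # consider edges by increasing cost
        if uf.find(u) != uf.find(v):     # different components, so no cycle
            uf.union(u, v)
            tree.append((u, v, cost))
    return tree

# Example: the same square graph as before, with nodes numbered 0..3
edges = [(1, 0, 1), (5, 0, 2), (2, 1, 2), (4, 1, 3), (3, 2, 3)]
print(kruskal(4, edges))  # [(0, 1, 1), (1, 2, 2), (2, 3, 3)]
</code>

Because the smaller set is always pointed at the larger, a chain of parent pointers doubles the size of the set at every hop, which is why FIND takes O(log n).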
===== Section 7: Clustering =====
Another area that minimum spanning trees can be applied to is clustering. Clustering is the problem of trying to classify or organize a collection of objects into coherent groups. 'A common approach to this problem is to define a distance function on the objects, with the interpretation that objects at a larger distance from one another are less similar to each other.' Given these pairwise distances, the clustering problem asks us to divide the objects into groups so that objects within a group are close together, and objects in different groups are far apart. 'The spacing of a k-clustering is the minimum distance between any pair of points that lie in different clusters.' In this problem, we want to find a clustering with the maximum possible spacing. 'To find a clustering of maximum spacing, we consider growing a graph on the vertex set U. The connected components will be the clusters, and we will try to bring nearby points together into the same cluster as rapidly as possible.' We add edges in order of increasing distance between their endpoints, and only add an edge if it connects nodes that are not already in the same cluster. Adding an edge between two distinct connected components merges the two clusters. 'In clustering lingo, the iterative merging of clusters is termed single-link clustering, a special case of hierarchical agglomerative clustering. Agglomerative means that we combine clusters; single-link means we do it as soon as a single link joins them together.' This procedure is exactly the same as Kruskal's Minimum Spanning Tree Algorithm, except that when we seek a k-clustering, we stop the algorithm once we obtain k distinct connected components (see the sketch below). 'The components C1, C2, ..., Ck formed by deleting the k-1 most expensive edges of the minimum spanning tree T constitute a k-clustering of maximum spacing.' This section was relatively easy to follow, and hopefully will become more clear after discussing it in class. I would give it a 7/10.
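Since the procedure is just Kruskal's algorithm stopped early, a small Python sketch makes the connection explicit. The function name and the distance-list format are my own assumptions; reporting the spacing as the next cross-cluster distance is my own addition, justified because all closer pairs have already been merged into the same cluster.

<code python>
def k_clustering(n, distances, k):
    """Single-link k-clustering of points 0..n-1; distances is a list of
    (d, u, v). Returns cluster labels and the spacing achieved."""
    parent = list(range(n))                 # simple union-find over the points
    def find(u):
        while parent[u] != u:
            u = parent[u]
        return u
    components = n
    spacing = None
    for d, u, v in sorted(distances):       # Kruskal's order: closest pairs first
        ru, rv = find(u), find(v)
        if ru == rv:
            continue                        # already in the same cluster
        if components == k:
            spacing = d                     # first cross-cluster distance = spacing
            break
        parent[rv] = ru                     # merge the two clusters
        components -= 1
    return [find(i) for i in range(n)], spacing

# Example: two tight pairs far apart; k = 2 recovers them with spacing 10
dists = [(1, 0, 1), (2, 2, 3), (10, 1, 2), (11, 0, 2), (12, 0, 3), (13, 1, 3)]
print(k_clustering(4, dists, 2))  # ([0, 0, 2, 2], 10)
</code>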
===== Section 8: Huffman Codes and Data Compression =====
Now we consider a problem where we use a greedy rule to 'shrink the size of the problem instance, so that an equivalent smaller problem can then be solved by recursion.' In this manner, solving the smaller instance still leads to an optimal solution of the original larger instance, but the final result isn't produced until the whole recursion is completed. The problem we look at deals with data compression, which is an important part of digital communication. Computers encode data into sequences of bits (1s and 0s) so that they can process information and perform operations. The problem of data compression is finding the best way to store the data in the smallest number of bits possible, so that it can be transmitted in the least amount of time. The average number of bits per letter is calculated by taking the number of bits it takes to encode each letter, multiplying it by the frequency of that letter, and summing over all the letters. The goal of data compression is to minimize this average, so that more frequent letters are encoded with fewer bits than less frequent letters. One example of data compression is Morse code, which was used to transmit messages using sequences of dots and dashes to represent letters. However, the encoding of words in this manner was ambiguous: because the encoding of one letter could be a prefix of the encoding of another, there was no way to tell where one letter ended and the next began. To deal with this issue, Morse code transmissions had to include short pauses between letters, which is inefficient and can be improved upon. To optimize data transmission, we want to encode all of the letters so that no letter's encoding is a prefix of another's, removing any ambiguity about what the data represents, while still representing more frequent letters with fewer bits. To do this, we represent prefix codes using a binary tree, where following an edge to the right represents a 1 and following an edge to the left represents a 0. 'The encoding of alphabet S constructed from binary tree T is a prefix code.' To ensure that no letter's encoding is a prefix of another's, all letters must be stored on leaf nodes of the tree, with no letters stored on interior nodes. 'The binary tree corresponding to the optimal prefix code is full.' 'There is an optimal prefix code, with corresponding tree T*, in which the two lowest-frequency letters are assigned to leaves that are siblings in T*.' The algorithm to build this tree is as follows: if S has two letters, encode one with 0 and the other with 1; otherwise, let y and z be the two lowest-frequency letters, replace them in the alphabet by a combined letter yz, treated as a supernode with y and z as its children, and recursively continue this process until only one letter remains in the alphabet. 'The Huffman code for a given alphabet achieves the minimum average number of bits per letter of any prefix code.' The running time of Huffman's algorithm relies on the use of a priority queue in the implementation, whose operations run in O(log n), so the algorithm runs in O(n log n) time (a sketch appears below). A major drawback of prefix codes is that they cannot adapt to changes in the text. This section was very easy to read and follow after having learned about it in class. I would give it a 9/10.
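Here is a minimal sketch of Huffman's algorithm in Python using the standard heapq module as the priority queue; the function name and the tuple-based tree representation are my own assumptions, not the text's notation.

<code python>
import heapq
from itertools import count

def huffman(freqs):
    """Build a prefix code from a dict letter -> frequency.
    Returns a dict letter -> bit string."""
    tie = count()  # tiebreaker so heapq never has to compare tree nodes
    # Each heap entry: (frequency, tiebreaker, tree), tree = letter or (left, right)
    heap = [(f, next(tie), letter) for letter, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Greedy step: merge the two lowest-frequency letters into a supernode
        f1, _, t1 = heapq.heappop(heap)
        f2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tie), (t1, t2)))
    code = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + '0')  # left edge is labeled 0
            walk(tree[1], prefix + '1')  # right edge is labeled 1
        else:
            code[tree] = prefix or '0'   # single-letter alphabet edge case
    walk(heap[0][2], '')
    return code

# Example: the most frequent letter gets the shortest codeword
print(huffman({'a': 0.45, 'b': 0.25, 'c': 0.15, 'd': 0.15}))
# {'a': '0', 'b': '10', 'c': '110', 'd': '111'}
</code>

There are n-1 merges, each costing O(log n) heap work, which matches the O(n log n) running time claimed above.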