Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revisionPrevious revision
Next revision
Previous revision
courses:cs211:winter2014:journals:deirdre:chapter4 [2014/02/25 13:38] – [4.5 - Interval Scheduling: The Greedy Algorithm Stays Ahead] tobindcourses:cs211:winter2014:journals:deirdre:chapter4 [2014/03/05 04:55] (current) – [4.8 - Huffman Codes and Data Compression] tobind
Line 113: Line 113:
  
 ====== 4.5 - The Minimum Spanning Tree Problem ====== ====== 4.5 - The Minimum Spanning Tree Problem ======
 +**The Problem**
 +Suppose we have a set of locations to connect but we want to build it as cheaply as possible. For certain pairs //(vi, vj)//, we may build a direct link between //vi// and //vj// for a certain cost //c(vi, vj) > 0//. thus we can represent the set of possible links that may be built using a graph //G = (V,E)//, with a positive cost c_e associated with each edge //e = (vi, vj)//. The problem is to find a subset of the edges //T ⊆ E// so that the graph //(V,T)// is connected and the total cost sum(//ce//) is as small as possible.
 +  (4.16) Let T be a minimum-cost solution to the network design problem defined above. Then (V,T) is a tree.
 +
 +We will call a subset //T ⊆ E// a spanning tree of //G// if //(V, T)// is a tree. 
 +
 +**Designing Algorithms**
 +Three greedy algorithms for this problem:  
 +  * Kruskal's - Starts with no edges and builds a spanning tree by successively inserting edges from //E// in order of increasing cost. As we move through the edges in this order, we insert each edge //e// as long as it does not create a cycle when added ot the edges we've already inserted. If, on the other hand, inserting //e// would result in a cycle, we don't and move on.
 +  * Prim's - Like Dijkstra's. Start with a root node //s// and try to grow a tree from //s// outward. At each step, add the node that can be attached as cheaply as possibly. We maintain a set //S ⊆ V// on which a spanning tree has been constructed so far. Initially, //S = {s}//. In each iteration, we grow //S// by one node, adding the node that minimizes the attachment cost and including the edge //e = (u,v)// that achieves this minimum in the tree.
 +  * Reverse-Delete - Start with the full graph //(V, E)// and begin deleting edges in order of decreasing cost. Starting from the most expensive, we delete each edge //e// as long as it doesn't disconnect the graph. 
 +
 +**Analyzing the algorithms**
 +The three algorithms work by repeatedly inserting or deleting edges from a partial solution. So, when is it safe to include or eliminate an edge? (Note: assume all edge costs are distinct from one another. 
 +
 +  Cut Property: Assume all edge costs are distinct. Let S be any subset of nodes that is neither empty nor equal to 
 +  all of V and let edge e = (v, w) be the minimum-cost edge with one end in S and the other in V - S. then every minimum spanning
 +  tree contains the edge e.
 +  
 +  Cycle Property: Assume that all edge costs are distinct. Let C be any cycle in G and let edge e = (v,w) be the most 
 +  expensive edge belonging to C. Then e does not belong to any minimum spanning tree of G.
 +  
 +To eliminate the assumption that all distances are distinct, we just "perturb" them all by realllly small amounts. 
 +
 +**All three algorithms produce a minimum spanning tree of //G//.**
 +
 +
 +**Implementing Prim's**
 +The implementation of Prim's and Dijkstra's are almost identical. We need to be ale to decide which node //v// to add next to the growing set //S// by maintaining the attachment costs for each node //v ∈ V - S//. We keep the nodes in a pq with the attachment costs //a(v)// as the keys; we select a node with an ExtractMin operation and update the attachment costs using ChangeKey operations. There are //n// - iterations in which we perform ExtractMin and we perforem ChangeKey at most once for each edge. Thus,
 +  Using a pq, Prim's can be implemented on a graph with n nodes and m edges to run in O(m) time,
 +  plus the time for n ExtractMin and mChangeKey operations. Overall running time is O(m log n).
 +  
 +I found this section to be a 9.5. I really felt like I was confident in understanding the concepts.
 +  
 +====== 4.6 - Implementing Kruskal's: Union-Find ======
 +**The Problem** The Union-Fidn data structure allows us to maintan disjoint sets. Given a node //u//, the operation Find(//u//) will return the name of the set containing //u//. This op can be used to test if two nodes //u// and //v// are in the same set by checking Find(//u//) == Find(//v//). The data structure will also implement an operation Union(//X, Y//) to merge two sets X and Y. 
 +
 +Find returns the name of the component containing //u//. If the components aren't the same, we can add the edge. 
 +
 +UF will support three operations:
 +  * MakeUnionFind(//S//) will return a UF on set //S// where all elements are in separate sets. 
 +  * Find(//u//) will return the name of the set containing //u//.
 +  * Union(//X,Y//) will merge the sets //X// and //Y// into a single set. 
 +
 +The simple structure for UF is to maintain an array. For an array, Find takes O(one) (my one key won't work :( ), MakeUnionFind(//S//) takes O(//n//) and k Union operations takes at most O(//k// log //k//) time. 
 +
 +The better structure for UF uses pointers. Then, Union takes O(one), MakeUnionFind(//S//) takes O(//n//) time and Find takes O(log //n//) time.
 +
 +8.5
 +
 +====== 4.7 - Clustering ======
 +**The Problem** One common approach to clustering is to define a distance function on the objects, with the interpretation that objects at a larger distance from one another are less similar to each other. Now, given a distance function on the objects, the clustering problem seeks to divide them into groups so that, intuitively, objects within the same group are "close" and different groups are "far apart".
 +
 +Suppose we are given a set //U// of //n// objects, labeled //p1, p2, ..., pn//.  For each pair, //pi// and //pj//, we have a numerical distance //d(pi, pj)//. We require only that //d(pi,pi)//=0, that //d(pi,pj)// >0 for distinct //pi// and //pj//; and that distances are symmetric: //d(pi,pj)// = //d(pj, pi)//. If we want to divide the objects into //k// groups, we can say that is a //k//-clustering. We define the spacing of a //k//-clustering to be the minimum distance between any pair of points lying in different clusters. How can we efficiently find the //k//-clustering that has maximum spacing? 
 +
 +**Designing the Algorithm**  To find a clustering of maximum spacing, we consider growing a graph on the vertex set //U//. The connected components will be the clusters. So, we start by drawing an edge between the closest pair of points and continue adding edges between pairs of points in order of increasing distance //d(pi, pj)//. (Note: if we are about to add the edge //(pi, pj)// and find that //pi// and //pj// already belong to the same cluster, we will refrain from adding the edge. So, we'll never have a cycle.) 
 +
 +Look at that! We're doing exactly what Kruskal's Minimum Spanning Tree Algorithm would do if given a graph //G// on //U// in which there was an edge of cost //d(pi,pj)// between each pair of nodes //(pi, pj)//.The only difference is that we seek a //k//-clustering, so we stop the procedure once we obtain //k// connected components. In other words, we are running Kruskal's but stopping it just before it adds its last //k// - 1 edges.
 +
 +**Analyzing the Algorithm**
 +The components C1, C2, ..., Ck formed by deleting the //k// - 1 most expensive edges of the minimum spanning tree T constitute a //k//-clustering of maximum spacing. (proof p160)
 +
 +I give this section an 8.5. This made a lot of sense during class so I don't think I got too much more out of reading it.
 +====== 4.8 - Huffman Codes and Data Compression ======
 +**The Problem** Since computes operate on sequences of bits, one needs encoding schemes that take text written in alphabets and converts this text into long strings of bits. If we take the simple solution and say 2^5 = 32 which is more than enough options to cover basic 26 letters and symbols of English. But this is spending an average of five bits per symbol. Since some letters are used much more frequently, if we used less bits for them, we'd have a lower average of bits. 
 +
 +If we think of Morse code, more frequent letters are mapped to shorter bit strings like 'e' is a single dot. To deal with ambiguity, Morse code involves short pauses between letters.
 +
 +**Prefix codes**
 +The ambiguity problem in Morse code arises because there exist pairs of letters where the bit string that encodes one letter is a prefix of the bit string that encodes another. To eliminate this problem, it is necessary to map letters to bit strings in such a way that no encoding is a prefix of another encoding. Concretely, we say that a prefix code for a set //S// of letters is a function γ that maps each letter //x ∈ S// to some sequence of zeros and ones, in such a way that for distinct //x,y ∈ S//, the sequence γ(//x//) is not a prefix of the sequence γ(//y//). 
 +
 +To reconstruct text that uses a prefix code,
 +  * Scan the bit sequence from left to right.
 +  * As soon as you have enough bits to match an encoding of a letter, output that letter.
 +  * Delete the corresponding bits from the front of the message and repeat.
 +
 +**Optimal Prefix Codes**
 +Suppose that for each letter //x ∈ S//, there is a frequency //fx//, representing the fraction of letters in the text that are equal to //x//. In other words, assuming there are //n// letters total, //nfx// of these letters are equal to //x//. We notice that the frequencies sum to 1. 
 +
 +We want a prefix code that minimizes the average number of bits per letter. -> optimal prefix code.
 +
 +**Designing the algorithm**
 +We want to represent prefix codes using binary trees. For each letter //x ∈ S//, we follow the path from the root to the leaf labelled //x//, each time the path goes from a node to its left child, we write down a 0, and each time the path goes from a node to its right child, we write down a 1. The resulting string of bits is the encoding of //x//. This works the other way too. Given a prefix code γ, we can build a binary tree recursively. We start with a root. All letters whose encodings begin with a 0 will be leaves in the left subtree and all letters whose encodings begin with a 1 will be leaves in the right subtree of the root.  
 +
 +So the search for the optimal prefix code can be viewed as the search for a binary tree //T//, together with a labeling of the leaves of //T//, that minimizes the average number of bits per letter. The length of the encoding of a letter is simply the length of the  path from the root to the leaf labeled //x//. We will refer to the length of this path as the depth of teh leaf. So, we are looking for the labeled tree that minimizes the weighted average of the depths of all leaves where the average is weighted by the frequencies of the letters that label the leaves. As a first step, we note that the binary tree corresponding to the optimal prefix code is full. 
 +
 +
 +    Suppose that u and v are leaves of T* such that depth(u) < depth(v). Further, suppose that in a labeling of T* corresponding to an optimal prefix code, leaf u is labeled with y and leaf v is labeled wiht z. Then fy >= fz.
 +    
 +    There is an optimal prefix code, with corresponding tree T*, in which the two lowest-frequency letters are assigned to leaves that are siblings in T*.
 +    
 +Huffman's Algorithm:
 +   If S has two letters then
 +       Encode one letter using 0 and the other letter using 1
 +   Else
 +       Let y* and z* be the two lowest-frequency letters
 +       Form a new alphabet S' by deleting y* and z* and replacing them with a new letter w of frequency fy* + fz*
 +       Recursively construct a prefix code γ' for S' with tree T'
 +       Define a prefix code for S as follows:
 +           Take the leaf labeled w and add two children below it labeled y* and z*
 +
 +The Huffman code for a given alphabet achieves the min average number of bits per letter of any prefix code. <--- optimality (proof 174-5)
 +
 +By maintaining the alphabet //S// in a pq, using each letter's frequency as its key, and by extracting the min twice (two lowest freq numbers) and inserting a new letter whose key is the sum of those two, we can obtain a running time of //O(k// log //k)//.
 +
 +I thought this section was really interesting - 9.
  
-====== 4.6 - Interval Scheduling: The Greedy Algorithm Stays Ahead ====== 
courses/cs211/winter2014/journals/deirdre/chapter4.1393335530.txt.gz · Last modified: 2014/02/25 13:38 by tobind
CC Attribution-Noncommercial-Share Alike 4.0 International
Driven by DokuWiki Recent changes RSS feed Valid CSS Valid XHTML 1.0