This section was not bad, because it was very general information. The nitty-gritty details that I'm sure are related to clustering were not really covered. The overall idea of clustering makes a lot of sense and the basic algorithm was not complicated to follow (especially since it built off of the last section's Kruskal's Algorithm). I am looking forward to seeing more examples/applications of clustering.
Readability: 7 Interesting: 7

===== 4.8 Huffman Codes and Data Compression =====

  * Greedy rule to shrink the size of the problem instance so that an equivalent smaller problem can then be solved by recursion
  * Data compression: an area that forms part of the foundations for digital communication

**The Problem**

  * __Encoding Symbols Using Bits__
    * Use a fixed number of bits to represent each symbol in the alphabet
    * Concatenate bit strings to form text
    * Translating all letters into the same number of bits is tremendously wasteful
      * Use a small number of bits for frequent letters and a larger number of bits for less frequent ones
    * Need an optimal way to take advantage of the nonuniform frequencies of the letters
  * __Variable-Length Encoding Schemes__
    * Morse code: translates each letter into a sequence of dots and dashes (0s and 1s, for comparison) -> maps more frequent letters to shorter bit strings
      * Dealing with ambiguity: short pauses between letters
  * __Prefix Codes__
    * Ambiguity arises when the bit string that encodes one letter is a prefix of the bit string that encodes another
    * Prefix code: for a set S of letters, a function γ that maps each letter x in S to some sequence of zeros and ones, in such a way that for distinct x, y in S, the sequence γ(x) is not a prefix of the sequence γ(y)
    * When scanning text, as soon as you've seen enough bits to match the encoding of some letter, output it as the first letter of the text and continue
  * __Optimal Prefix Codes__
    * For each letter x in S, there is a frequency fx, representing the fraction of letters in the text that are equal to x. Frequencies sum to 1.
      * Total length of the encoding = sum, over all letters x in S, of the number of times x occurs times the length of the bit string γ(x) used to encode x
        * Average number of bits required per letter = ABL(γ)

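As a quick illustration of these definitions, here is a minimal Python sketch (the alphabet, codewords, and frequencies are made-up examples, not taken from the text): decoding scans bits until the buffer matches a codeword, which is unambiguous precisely because no codeword is a prefix of another, and ABL(γ) is the frequency-weighted sum of codeword lengths.

```python
# Made-up five-letter alphabet with a valid prefix code over it.
code = {"a": "11", "b": "01", "c": "10", "d": "001", "e": "000"}  # a prefix code
freq = {"a": 0.32, "b": 0.25, "c": 0.20, "d": 0.18, "e": 0.05}    # sums to 1

def decode(bits):
    """Scan left to right; emit a letter as soon as the buffer matches a codeword."""
    inverse = {w: x for x, w in code.items()}
    out, buf = [], ""
    for bit in bits:
        buf += bit
        if buf in inverse:        # matched a full codeword: emit its letter
            out.append(inverse[buf])
            buf = ""
    return "".join(out)

def abl():
    """ABL(gamma) = sum over x in S of f_x * |gamma(x)|."""
    return sum(freq[x] * len(code[x]) for x in code)

print(decode("0111001"))  # "01" + "11" + "001" -> "bad"
print(abl())
```

Because the code is prefix-free, the decoder never has to look ahead or backtrack: the first codeword that matches is the only one that can.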
**Designing the Algorithm**
  * __Representing Prefix Codes Using Binary Trees__
    * Rooted binary tree T: the number of leaves equals the size of the alphabet S; each leaf is labeled with a distinct letter in S
    * For each letter x in S, we follow the path from the root to the leaf labeled x; each time the path goes from a node to its left child, we write down a 0, and each time the path goes from a node to its right child, we write down a 1.
    * //The encoding of S constructed from T is a prefix code.//
      * For the encoding of x to be a prefix of the encoding of y, the path from the root to x would have to be a prefix of the path from the root to y. That is the same as saying that x would lie on the path from the root to y, which isn't possible if x is a leaf.
    * Given a prefix code γ, we can build the binary tree recursively
      * Start with the root; all letters x in S whose encodings begin with a 0 become leaves in the left subtree of the root, and the opposite for encodings beginning with a 1; recurse
    * A binary tree is full if each node that is not a leaf has two children: there are no nodes with exactly one child.
    * //The binary tree corresponding to the optimal prefix code is full//
      * Proof using an exchange argument
      * Let T be the binary tree corresponding to the optimal prefix code
      * Suppose it contains a node u with exactly one child v
      * Convert T into T' by replacing node u with v
        * if u is the root, delete u and use v as the root
        * otherwise, let w be the parent of u in T, delete u, and make v a child of w in place of u
  * __A First Attempt: The Top-Down Approach__
    * Goal: produce a labeled binary tree in which the leaves are as close to the root as possible
    * Pack leaves as tightly as possible: want a "nearly balanced" split of the alphabet
      * Shannon-Fano codes: no version of this top-down splitting strategy is guaranteed to always produce an optimal prefix code
  * //What If We Knew the Tree Structure of the Optimal Prefix Code?//
    * Assume we have the binary tree T* corresponding to an optimal prefix code - we still need to figure out the labeling
    * Suppose that u and v are leaves of T* such that depth(u) < depth(v). Further suppose that in a labeling of T* corresponding to an optimal prefix code, leaf u is labeled with y in S and leaf v is labeled with z in S. Then fy >= fz.
      * Proof using an exchange argument
      * If fy < fz, then exchanging the labels y and z has the following effect: the multiplier on fy increases [from depth(u) to depth(v)] and the multiplier on fz decreases by the same amount [from depth(v) to depth(u)].
        * Since fz > fy, the change to the overall sum is a negative number, contradicting the supposed optimality of the prefix code.
      * So an optimal solution cannot have a lower-frequency letter at a strictly smaller depth than some higher-frequency letter
      * Take all leaves of depth 1, label them with the highest-frequency letters, then do the same at each level of increasing depth -> leads to an optimal labeling of T*
    * Let v be a leaf of maximum depth in T*, let u be its parent, and let w be u's other child (which exists because T* is full). Then w is a leaf of T*.
      * If w were not a leaf, there would be some leaf w' in the subtree below it. But then w' would have a depth greater than that of v, contradicting our assumption that v is a leaf of maximum depth in T*.
        * There is an optimal prefix code, with corresponding tree T*, in which the two lowest-frequency letters are assigned to leaves that are siblings in T*.
  * //An Algorithm to Construct an Optimal Prefix Code//
    * Suppose y* and z* are the two lowest-frequency letters in S
      * We can lock them together because we know they end up as sibling leaves below a common parent
      * Replace y* and z* with a meta-letter (of frequency fy* + fz*), obtaining an alphabet that is one letter smaller. Recursively find a prefix code for the smaller alphabet, and then "open up" the meta-letter back into y* and z* to obtain a prefix code for S:
//Huffman's Algorithm//

{{:courses:cs211:winter2018:journals:patelk:optimalprefixcodesiblings.png?nolink&400|}}

{{:courses:cs211:winter2018:journals:patelk:optimalprefixcodesiblings2.png?nolink&400|}}

  * Bottom-up approach: focuses on the leaves representing the two lowest-frequency letters, and then continues by recursion

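The "lock together, replace with a meta-letter, recurse, open up" procedure above can be sketched directly (a hypothetical Python version, not from the text; the frequencies are illustrative, and naming the meta-letter by concatenating its two constituents is purely for readability — it assumes the concatenated name doesn't collide with a real letter):

```python
def huffman(freq):
    """Return {letter: codeword} for a dict of letter -> frequency (|S| >= 2)."""
    if len(freq) == 2:
        x, y = sorted(freq)                  # base case: one bit per letter
        return {x: "0", y: "1"}
    # identify the two lowest-frequency letters y*, z*
    y_star, z_star = sorted(freq, key=freq.get)[:2]
    meta = y_star + z_star                   # meta-letter standing in for both
    smaller = {x: f for x, f in freq.items() if x not in (y_star, z_star)}
    smaller[meta] = freq[y_star] + freq[z_star]
    code = huffman(smaller)                  # recurse on the smaller alphabet
    prefix = code.pop(meta)                  # "open up" the meta-letter:
    code[y_star] = prefix + "0"              # y* and z* become sibling leaves
    code[z_star] = prefix + "1"              # below the meta-letter's position
    return code

freq = {"a": 0.32, "b": 0.25, "c": 0.20, "d": 0.18, "e": 0.05}  # made-up example
print(huffman(freq))
```

Each recursive call shrinks the alphabet by one letter, mirroring the greedy rule stated at the top of this section.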
**Analyzing the Algorithm**
  * //The Optimality of the Algorithm//
    * Recursive algorithm: proof by induction
    * Base case: optimal for all two-letter alphabets (one bit per letter)
    * Inductive hypothesis: optimal for all alphabets of size k-1
    * Inductive step: consider an alphabet S of size k
    * The Huffman code for a given alphabet achieves the minimum average number of bits per letter of any prefix code
  * //Implementation and Running Time//
    * A straightforward implementation is polynomial in k, the number of letters in the alphabet
      * recursive calls: a sequence of k-1 iterations
      * each iteration consists simply of identifying the two lowest-frequency letters -> O(k)
      * combined: O(k^2)
    * Use a priority queue to maintain the set of k elements
      * allows for insertion and extraction of the minimum
      * Each iteration: extract the minimum twice; insert a new letter whose key is the sum of the two minimum frequencies
      * Implementing the priority queue with a heap: each insertion and extraction takes O(log k) -> over k iterations, O(k log k)

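A sketch of this priority-queue implementation using Python's `heapq` (illustrative code, not from the text; a counter is included in each heap entry so ties in frequency never fall through to comparing tree nodes): k-1 iterations, each doing two extract-mins and one insert at O(log k) apiece, then reading codewords off the finished tree.

```python
import heapq
from itertools import count

def huffman_heap(freq):
    """Return {letter: codeword} via a heap-based priority queue."""
    tiebreak = count()                       # breaks frequency ties deterministically
    heap = [(f, next(tiebreak), x) for x, f in freq.items()]
    heapq.heapify(heap)
    for _ in range(len(freq) - 1):           # k - 1 merge iterations
        f1, _, t1 = heapq.heappop(heap)      # extract-min: O(log k)
        f2, _, t2 = heapq.heappop(heap)      # extract-min: O(log k)
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (t1, t2)))  # insert: O(log k)
    _, _, tree = heap[0]
    # read codewords off the tree: 0 = left child, 1 = right child
    code, stack = {}, [(tree, "")]
    while stack:
        node, path = stack.pop()
        if isinstance(node, str):            # a leaf: path so far is its codeword
            code[node] = path or "0"         # degenerate one-letter alphabet
        else:
            stack.append((node[0], path + "0"))
            stack.append((node[1], path + "1"))
    return code

print(huffman_heap({"a": 0.32, "b": 0.25, "c": 0.20, "d": 0.18, "e": 0.05}))
```

The merged entry's key being the sum of the two extracted frequencies is exactly the meta-letter step; the heap just makes finding y* and z* cheap.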
**Extensions**
  * Data compression: black-and-white images already use just one bit per pixel
    * To be highly compressed, they need a "fraction of a bit" for each white pixel
  * A fixed code cannot adapt to changes in the text: the frequencies may be stable for half the text, but then change radically
    * Halfway into the sequence, insert some kind of instruction to completely change the encoding
    * Investing bits to describe a new encoding can pay off in the long run: adaptive compression schemes

==== Personal Thoughts ====

While arguably one of the more interesting topics of this textbook, this section was long and very dense. There was a lot of information that was hard to digest all at once. Our discussion in class was helpful, but not long enough to make this section very clear. I will most likely need to review this chapter section again.
Readability: 5 Interesting: 8
  
courses/cs211/winter2018/journals/patelk/chapter4.1520293977.txt.gz · Last modified: by patelk
CC Attribution-Noncommercial-Share Alike 4.0 International