Differences

This shows you the differences between two versions of the page.

--- courses:cs211:winter2014:journals:deirdre:chapter4 [2014/03/05 02:35] – [4.8 - Huffman Codes and Data Compression] tobind
+++ courses:cs211:winter2014:journals:deirdre:chapter4 [2014/03/05 04:55] (current) – [4.8 - Huffman Codes and Data Compression] tobind
@@ Line 199: / Line 199: @@
 So the search for the optimal prefix code can be viewed as the search for a binary tree //T//, together with a labeling of the leaves of //T//, that minimizes the average number of bits per letter. The length of the encoding of a letter is simply the length of the  path from the root to the leaf labeled //x//. We will refer to the length of this path as the depth of teh leaf. So, we are looking for the labeled tree that minimizes the weighted average of the depths of all leaves where the average is weighted by the frequencies of the letters that label the leaves. As a first step, we note that the binary tree corresponding to the optimal prefix code is full.
+    Suppose that u and v are leaves of T* such that depth(u) < depth(v). Further, suppose that in a labeling of T* corresponding to an optimal prefix code, leaf u is labeled with y and leaf v is labeled wiht z. Then fy >= fz.
+    There is an optimal prefix code, with corresponding tree T*, in which the two lowest-frequency letters are assigned to leaves that are siblings in T*.
+Huffman's Algorithm:
+   If S has two letters then
+       Encode one letter using 0 and the other letter using 1
+   Else
+       Let y* and z* be the two lowest-frequency letters
+       Form a new alphabet S' by deleting y* and z* and replacing them with a new letter w of frequency fy* + fz*
+       Recursively construct a prefix code γ' for S' with tree T'
+       Define a prefix code for S as follows:
+           Take the leaf labeled w and add two children below it labeled y* and z*
+The Huffman code for a given alphabet achieves the min average number of bits per letter of any prefix code. <--- optimality (proof 174-5)
+By maintaining the alphabet //S// in a pq, using each letter's frequency as its key, and by extracting the min twice (two lowest freq numbers) and inserting a new letter whose key is the sum of those two, we can obtain a running time of //O(k// log //k)//.
+I thought this section was really interesting - 9.