courses:cs211:winter2018:journals:ahmadh:ch4 [2018/03/11] ahmadh

We describe a greedy method to construct an optimal prefix code efficiently. First, however, it is useful to develop a tree-based means of representing prefix codes that exposes their structure more clearly than simply the lists of function values (γ).
  
A labeled, //full// binary tree T naturally describes a prefix code. For each letter x ∈ S, we follow the path from the root to the leaf labeled x: each time the path goes from a node to its left child, we write down a 0, and each time the path goes from a node to its right child, we write down a 1. We take the resulting string of bits as the encoding of x.
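As a concrete sketch (my own, not from the textbook), the walk above can be coded directly. The tree shape and labels below are illustrative; leaves are represented as strings and internal nodes as (left, right) pairs:

```python
def codes_from_tree(node, prefix="", table=None):
    """Read a prefix code off a labeled binary tree: 0 per left edge, 1 per right edge."""
    if table is None:
        table = {}
    if isinstance(node, str):            # a leaf labeled with a letter
        table[node] = prefix or "0"      # guard for a degenerate one-leaf tree
        return table
    left, right = node                   # internal node: (left subtree, right subtree)
    codes_from_tree(left, prefix + "0", table)
    codes_from_tree(right, prefix + "1", table)
    return table

# The full tree ((a, b), c) yields a = 00, b = 01, c = 1.
print(codes_from_tree((("a", "b"), "c")))  # {'a': '00', 'b': '01', 'c': '1'}
```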
  
We claim that the encoding of S from T is a prefix code. This is true because in order for the encoding of x to be a prefix of the encoding of y, the path from the root to x would have to be a prefix of the path from the root to y. However, this is the same as saying that x would lie on the path from the root to y, which isn't possible if x is a leaf node. What's more, this relationship between binary trees and prefix codes works in the other direction as well.
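The prefix property is also easy to check mechanically; here is a small sketch (my own) that tests whether a code table is prefix-free:

```python
def is_prefix_code(code):
    """True iff no codeword is a proper prefix of another codeword."""
    words = list(code.values())
    return not any(u != v and v.startswith(u) for u in words for v in words)

print(is_prefix_code({"a": "00", "b": "01", "c": "1"}))  # True
print(is_prefix_code({"a": "0", "b": "01", "c": "11"}))  # False: "0" is a prefix of "01"
```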

As such, the search for an optimal prefix code can be viewed as the search for a binary tree T, together with a labeling of the leaves of T, that minimizes the average number of bits per letter. Note that the length of the encoding of a letter x ∈ S is simply the length of the path from the root to the leaf labeled x. Thus we are seeking the labeled tree that minimizes the weighted average of the depths of all leaves, where the average is weighted by the frequencies of the letters that label the leaves: ∑_{x ∈ S} f_x × depth(x). We will use ABL(T) to denote this quantity.
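Since depth(x) is just the length of x's codeword, ABL(T) can be computed directly from a code table. A minimal sketch; the frequencies and code below are my own illustration, not taken from the text:

```python
def abl(code, freq):
    """Frequency-weighted average codeword length: sum of f_x * depth(x) over the total frequency."""
    total = sum(freq.values())
    return sum(freq[x] * len(code[x]) for x in freq) / total

freq = {"a": 32, "b": 25, "c": 20, "d": 18, "e": 5}   # counts per 100 letters
code = {"a": "00", "b": "01", "c": "10", "d": "110", "e": "111"}
print(abl(code, freq))  # 2.23 bits per letter on average
```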

=== 4.8.2.1 An Algorithm to Construct an Optimal Prefix Code ===

  To construct a prefix code for an alphabet S, with given frequencies:
     If S has two letters then
        Encode one letter using 0 and the other letter using 1
     Else
        Let y* and z* be the two lowest-frequency letters
        Form a new alphabet S’ by deleting y* and z* and
          replacing them with a new letter ω of frequency f_y* + f_z*
        Recursively construct a prefix code γ’ for S’, with tree T’
        Define a prefix code for S as follows:
           Start with T’
           Take the leaf labeled ω and add two children below it labeled y* and z*
     Endif

The proof of correctness and optimality for this algorithm can be found on pages 174-175 of the textbook, and is omitted here to avoid redundancy.

This algorithm is referred to as Huffman's Algorithm, and the prefix code it generates is called a Huffman Code. With the most basic implementation, identifying the two lowest-frequency letters can be done in a single scan of the alphabet, in time O(k), and summing this over the k - 1 iterations gives O(k^2) time. However, this can be improved: we can maintain the alphabet S in a priority queue, using each letter's frequency as its key. In each iteration we extract the minimum twice (this gives us the two lowest-frequency letters), and then insert a new letter whose key is the sum of these two minimum frequencies. The priority queue then contains a representation of the alphabet that we need for the next iteration. Each insertion and extraction of the minimum runs in time O(log k); hence each iteration, which performs just three of these operations, takes time O(log k). Summing over all k iterations, we get a total running time of O(k log k).
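The priority-queue version described above might be sketched as follows using Python's standard heapq module. The frequencies are my own illustration, and ties could be broken differently without changing the codeword lengths:

```python
import heapq
from itertools import count

def huffman(freq):
    """Build a Huffman code: repeatedly merge the two lowest-frequency letters."""
    tie = count()  # unique tiebreaker so the heap never compares subtrees directly
    heap = [(f, next(tie), letter) for letter, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)   # extract the minimum twice
        f2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tie), (t1, t2)))  # the merged letter ω
    _, _, tree = heap[0]

    code = {}
    def walk(node, prefix):               # read the code off the finished tree
        if isinstance(node, str):
            code[node] = prefix or "0"
        else:
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
    walk(tree, "")
    return code

code = huffman({"a": 32, "b": 25, "c": 20, "d": 18, "e": 5})
print(sorted(len(w) for w in code.values()))  # [2, 2, 2, 3, 3]
```

Each of the k - 1 merges performs three heap operations, matching the O(k log k) bound above.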

==== 4.8.3 Comments ====

I feel it is safe to say that this was by far the most interesting section I remember reading in the book. I am not sure why, but the problem seemed really practical and important to me, and I can totally see how big of a deal this algorithm was when it came out. The idea of encoding the most frequent letters using fewer bits seems like a no-brainer. The section attracted and then kept my attention //throughout// the reading, and that is a first. 10/10. :)