Differences

This shows you the differences between two versions of the page.

--- courses:cs211:winter2018:journals:devlinn:chapter4 [2018/03/03 19:38] – devlinn
+++ courses:cs211:winter2018:journals:devlinn:chapter4 [2018/03/12 21:00] (current) – devlinn
@@ Line 82: / Line 82: @@
 This section was also a bit challenging to understand. We started to discuss this in class. I would have liked more visuals in the book to guide me through the maximum clustering problem. I would rate it a 6.
+===== 4.8 Huffman Codes and Data Compression =====
+For this problem (data compression), the goal is to produce an algorithm that shrinks the size of a problem instance and uses recursion to solve the smaller problem. If we want to encode symbols using bits, we can create a greedy algorithm to efficiently solve this problem and reduce the average number of bits per letter. We would also like our encoding scheme to avoid ambiguity, so we need to make sure no encoding for a particular symbol is a prefix of another symbol's encoding.
+Another thing to consider when creating an optimal encoding of symbols is how frequently certain letters appear in everyday language. It is preferable to assign these letters a smaller number of bits to minimize our use of bits. To do this, we assign a frequency to every letter, with the total frequency of all letters summing to 1.
+To construct a greedy algorithm for this problem, we will use binary trees. This ensures that our encoding constructed from the binary tree T will be a prefix code. In addition, the binary tree which corresponds to the //optimal// prefix code is full. Here is the algorithm, //Huffman's Algorithm//, to construct an optimal prefix code, called a //Huffman Code//:
+    To construct a prefix code for alphabet S, with given frequencies:
+        If S has two letters then
+            Encode one letter using 0 and the other using 1
+        Else
+            Let y* and z* be the two lowest-frequency letters
+            Form a new alphabet S' by deleting y* and z* and replacing them with a new letter w of frequency fy* + fz*
+            Recursively construct a prefix code y' for S', with tree T'
+            Define a prefix code for S as follows:
+                Start with T'
+                Take the leaf labeled w and add two children below it labeled y* and z*
+            Endif
+When implemented using a priority queue, Huffman's Algorithm can run in //O//(//k//log//k//) time, where //k// is the number of letters in the alphabet. This unit is very clear to me. It was explained well in class and the examples we went over were straightforward. In addition, I find the logic behind this unit very intuitive. I rate this a 9.5 for my understanding.