Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revisionPrevious revision
Next revision
Previous revision
courses:cs211:winter2018:journals:donohuem:chapter4 [2018/03/11 21:56] donohuemcourses:cs211:winter2018:journals:donohuem:chapter4 [2018/03/11 21:59] (current) – [4.8 Huffman Codes] donohuem
Line 54: Line 54:
 ===== 4.8 Huffman Codes ===== ===== 4.8 Huffman Codes =====
 Our problem is to encode symbols (letters of the alphabet) into bits. However, there are two important stipulations we must meet. First, we want our encoding to be as concise as possible (minimize ABL). This means that frequently used letters like e, t, a, o, and i should be smaller in size than less frequently used letters. Second, there should be no ambiguity in our encoding. Encoded symbols should be unique and not confusable with other symbols. To meet these two requirements, we use //Optimal Prefix Codes//, where each prefix of an encoded letter is distinct and the size of each prefix corresponds to a letter's frequency. An algorithm that creates optimal prefix codes would utilize a binary tree. More specifically, we would take the two most frequent letters, make them leaf nodes in the tree, have their parent be their combined frequencies, then recursively repeat for the remaining letters. Huffman's Algorithm: Our problem is to encode symbols (letters of the alphabet) into bits. However, there are two important stipulations we must meet. First, we want our encoding to be as concise as possible (minimize ABL). This means that frequently used letters like e, t, a, o, and i should be smaller in size than less frequently used letters. Second, there should be no ambiguity in our encoding. Encoded symbols should be unique and not confusable with other symbols. To meet these two requirements, we use //Optimal Prefix Codes//, where each prefix of an encoded letter is distinct and the size of each prefix corresponds to a letter's frequency. An algorithm that creates optimal prefix codes would utilize a binary tree. More specifically, we would take the two most frequent letters, make them leaf nodes in the tree, have their parent be their combined frequencies, then recursively repeat for the remaining letters. Huffman's Algorithm:
-if S has two letters: + 
-Encode one letter as 0 and the other letter as 1 + 
-else: +     if S has two letters: 
-Let y* and z* be the two lowest-frequency letters +        Encode one letter as 0 and the other letter as 1 
-Form a new alphabet S’ by deleted y* and z* and replacing +     else: 
-them with a new letter w of freq fy* + fz* +        Let y* and z* be the two lowest-frequency letters 
-Recursively construct a prefix code y’ for S’ with tree T’ +        Form a new alphabet S’ by deleted y* and z* and replacing 
-Define a prefix code for S as follows: +         them with a new letter w of freq fy* + fz* 
-Start with T’ +        Recursively construct a prefix code y’ for S’ with tree T’ 
-Take the leaf labeled w and add two children below it +        Define a prefix code for S as follows: 
-labeled y* and z*+            Start with T’ 
 +            Take the leaf labeled w and add two children below it 
 +             labeled y* and z* 
 + 
 +The run time of this algorithm can be O(klogk) when it uses a priority queue, where k is the number of letters in the alphabet. The readability of this section is 6/10.
  
  
  
courses/cs211/winter2018/journals/donohuem/chapter4.1520805393.txt.gz · Last modified: by donohuem
CC Attribution-Noncommercial-Share Alike 4.0 International
Driven by DokuWiki Recent changes RSS feed Valid CSS Valid XHTML 1.0