Huffman Coding

Huffman coding is an algorithm that creates a variable-length prefix code to encode messages. It builds a binary tree based on symbol frequencies, with more frequent symbols nearer the root. Each symbol is assigned a code consisting of the path from root to its leaf node. This results in shorter codes for more frequent symbols, allowing the entire message to be encoded using the fewest possible bits compared to any other prefix code.

Uploaded by

Ricardo Lazo Jr.

Applications of Trees

Encoding messages

• Encode a message composed of a string of characters
• Codes used by computer systems
  - ASCII
    - a 7-bit code (128 characters), usually stored in 8 bits per character
    - 8-bit extensions can encode 256 characters
  - Unicode (as originally designed, a 16-bit code)
    - 16 bits per character
    - can encode 65536 characters
    - includes all characters encoded by ASCII
• ASCII and 16-bit Unicode are fixed-length codes
  - all characters are represented by the same number of bits
Problems

• Suppose that we want to encode a message constructed from the symbols A, B, C, D, and E using a fixed-length code
• How many bits are required to encode each symbol?
  - at least 3 bits are required
  - 2 bits are not enough (can only encode four symbols)
• How many bits are required to encode the message DEAACAAAAABA?
  - there are twelve symbols, each requires 3 bits
  - 12*3 = 36 bits are required
Drawbacks of fixed-length codes

• Wasted space
  - 16-bit Unicode uses twice as much space as 8-bit ASCII
    - inefficient for plain-text messages containing only ASCII characters
• Same number of bits used to represent all characters
  - ‘a’ and ‘e’ occur more frequently than ‘q’ and ‘z’
• Potential solution: use variable-length codes
  - use a variable number of bits per character when the frequency of occurrence is known
  - short codes for characters that occur frequently
Advantages of variable-length codes

• The advantage of variable-length codes over fixed-length codes is that short codes can be given to characters that occur frequently
  - on average, the encoded message is shorter than with fixed-length encoding
• Potential problem: how do we know where one character ends and another begins?
  - not a problem if the number of bits is fixed!

Symbol  Code
A       00
B       01
C       10
D       11

0010110111001111111111 decodes to ACDBDADDDDD
Prefix property

• A code has the prefix property if no character code is the prefix (start of the code) of another character code
• Example:

Symbol  Code
P       000
Q       11
R       01
S       001
T       10

01001101100010 decodes to RSTQPT

  - 000 is not a prefix of 11, 01, 001, or 10
  - 11 is not a prefix of 000, 01, 001, or 10 …
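
As a minimal sketch of the definition, the check below compares every pair of codewords, and the decoder reads bits until the buffer matches a codeword. The helper names `has_prefix_property` and `decode` are illustrative, and the decoder assumes the code it is given has the prefix property.

```python
def has_prefix_property(code: dict[str, str]) -> bool:
    """True if no codeword is a prefix of a different symbol's codeword."""
    words = list(code.values())
    return not any(
        i != j and longer.startswith(shorter)
        for i, shorter in enumerate(words)
        for j, longer in enumerate(words)
    )


def decode(bits: str, code: dict[str, str]) -> str:
    """Decode a bit string; only unambiguous for a code with the prefix property."""
    symbol_for = {word: symbol for symbol, word in code.items()}
    decoded, buffer = [], ""
    for bit in bits:
        buffer += bit
        if buffer in symbol_for:  # a complete codeword has been read
            decoded.append(symbol_for[buffer])
            buffer = ""
    if buffer:
        raise ValueError("trailing bits do not form a complete codeword")
    return "".join(decoded)


prefix_code = {"P": "000", "Q": "11", "R": "01", "S": "001", "T": "10"}
print(has_prefix_property(prefix_code))        # → True
print(decode("01001101100010", prefix_code))   # → RSTQPT
```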
Code without prefix property

• The following code does not have the prefix property

Symbol  Code
P       0
Q       1
R       01
S       10
T       11

• The pattern 1110 can be decoded as QQQP, QTP, TQP, QQS, or TS
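
To make the ambiguity concrete, the sketch below lists every way of splitting a bit string into codewords. The function name `all_decodings` is illustrative, and the recursive search is only practical for short inputs like this one.

```python
def all_decodings(bits: str, code: dict[str, str]) -> list[str]:
    """Return every symbol sequence whose encoding equals `bits`."""
    if not bits:
        return [""]  # one way to decode nothing: the empty message
    results = []
    for symbol, word in code.items():
        if bits.startswith(word):
            for rest in all_decodings(bits[len(word):], code):
                results.append(symbol + rest)
    return results


ambiguous_code = {"P": "0", "Q": "1", "R": "01", "S": "10", "T": "11"}
print(sorted(all_decodings("1110", ambiguous_code)))
# → ['QQQP', 'QQS', 'QTP', 'TQP', 'TS']
```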

Problem

• Design a variable-length prefix-free code such that the message DEAACAAAAABA can be encoded using 22 bits
• Possible solution:
  - A occurs eight times while B, C, D, and E each occur once
  - represent A with a one-bit code, say 0
    - remaining codes cannot start with 0
  - represent B with the two-bit code 10
    - remaining codes cannot start with 0 or 10
  - represent C with 110
  - represent D with 1110
  - represent E with 11110
Encoded message

DEAACAAAAABA

Symbol Code
A 0
B 10
C 110
D 1110
E 11110

1110111100011000000100 22 bits
Another possible code

DEAACAAAAABA

Symbol Code
A 0
B 100
C 101
D 1101
E 1111

1101111100101000001000 22 bits
Better code

DEAACAAAAABA

Symbol Code
A 0
B 100
C 101
D 110
E 111

11011100101000001000 20 bits
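
The three candidate codes can be compared side by side with a small sketch. The variable names `first_code`, `second_code` and `better_code` are labels chosen here for the three tables above.

```python
def encode(message: str, code: dict[str, str]) -> str:
    """Concatenate the codeword of each symbol in the message."""
    return "".join(code[symbol] for symbol in message)


message = "DEAACAAAAABA"
first_code = {"A": "0", "B": "10", "C": "110", "D": "1110", "E": "11110"}
second_code = {"A": "0", "B": "100", "C": "101", "D": "1101", "E": "1111"}
better_code = {"A": "0", "B": "100", "C": "101", "D": "110", "E": "111"}

for name, code in [("first", first_code), ("second", second_code), ("better", better_code)]:
    bits = encode(message, code)
    print(name, bits, len(bits))
# → first 1110111100011000000100 22
# → second 1101111100101000001000 22
# → better 11011100101000001000 20
```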
What code to use?

• Question: Is there a variable-length code that makes the most efficient use of space?

Answer: Yes!
Huffman coding tree

• Binary tree
  - each leaf contains a symbol (character)
  - label the edge from a node to its left child with 0
  - label the edge from a node to its right child with 1
• The code for any symbol is obtained by following the path from the root to the leaf containing the symbol
• The code has the prefix property
  - a leaf node cannot appear on the path to another leaf
  - note: a fixed-length code corresponds to a code tree in which every leaf is at the same depth, so it clearly has the prefix property
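
Reading codes off a tree can be sketched as a recursive walk that appends 0 when going left and 1 when going right. The tree representation used here (a leaf is a one-symbol string, an internal node is a `(left, right)` tuple) is an assumption made for illustration, as is the name `codes_from_tree`.

```python
from typing import Union

# Assumed representation: a leaf is a symbol, an internal node is (left, right).
Tree = Union[str, tuple]


def codes_from_tree(tree: Tree, path: str = "") -> dict[str, str]:
    """Map each leaf symbol to the 0/1 labels on its path from the root."""
    if isinstance(tree, str):
        # A tree consisting of a single leaf still needs a one-bit code.
        return {tree: path or "0"}
    left, right = tree
    codes = codes_from_tree(left, path + "0")          # left edge is labelled 0
    codes.update(codes_from_tree(right, path + "1"))   # right edge is labelled 1
    return codes


# This tree yields the 20-bit code for DEAACAAAAABA.
tree = ("A", (("B", "C"), ("D", "E")))
print(codes_from_tree(tree))
# → {'A': '0', 'B': '100', 'C': '101', 'D': '110', 'E': '111'}
```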
Building a Huffman tree

• Find the frequency of each symbol occurring in the message
• Begin with a forest of single-node trees
  - each contains a symbol and its frequency
• Repeat
  - select the two trees with the smallest frequencies at their roots
  - produce a new binary tree with the selected trees as children and store the sum of their frequencies in the root
• The process ends when there is one tree
  - this is the Huffman coding tree
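
The steps above can be sketched with a priority queue holding the forest. Using `heapq` and a running counter to break ties is an implementation choice made here, not something the slides prescribe, so ties may be resolved differently from a hand-built tree; the total encoded length is the same either way. The names `build_huffman_tree` and `code_lengths` are illustrative.

```python
import heapq
from collections import Counter
from itertools import count


def build_huffman_tree(frequencies: dict):
    """Repeatedly merge the two lightest trees until one tree remains.

    Leaves are symbols; internal nodes are (left, right) tuples.
    """
    if not frequencies:
        raise ValueError("at least one symbol is required")
    tie_breaker = count()  # keeps heap entries comparable when weights tie
    forest = [(weight, next(tie_breaker), symbol) for symbol, weight in frequencies.items()]
    heapq.heapify(forest)
    while len(forest) > 1:
        weight1, _, tree1 = heapq.heappop(forest)
        weight2, _, tree2 = heapq.heappop(forest)
        heapq.heappush(forest, (weight1 + weight2, next(tie_breaker), (tree1, tree2)))
    return forest[0][2]


def code_lengths(tree, depth: int = 0) -> dict:
    """Code length of each symbol equals the depth of its leaf."""
    if isinstance(tree, str):
        return {tree: max(depth, 1)}
    left, right = tree
    return {**code_lengths(left, depth + 1), **code_lengths(right, depth + 1)}


def encoded_length(message: str) -> int:
    """Total bits needed to encode the message with a Huffman code."""
    frequencies = Counter(message)
    lengths = code_lengths(build_huffman_tree(frequencies))
    return sum(frequencies[symbol] * lengths[symbol] for symbol in frequencies)


print(encoded_length("DEAACAAAAABA"))  # → 20
```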
Example

• Build the Huffman coding tree for the message "This is his message" (letters are shown in upper case and _ denotes a space)
• Character frequencies

Character  A  G  M  T  E  H  _  I  S
Frequency  1  1  1  1  2  2  3  3  5

• Begin with a forest of single-node trees, one per character, each holding its frequency
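Step 1
• Start from the forest of single-node trees
A:1  G:1  M:1  T:1  E:2  H:2  _:3  I:3  S:5

Step 2
• Merge A and G, and merge M and T, into two trees of weight 2
(A G):2  (M T):2  E:2  H:2  _:3  I:3  S:5

Steps 3 and 4
• Merge E and H into a tree of weight 4
(A G):2  (M T):2  (E H):4  _:3  I:3  S:5

Step 5
• Merge _ and I into a tree of weight 6
• Strictly, the two trees of weight 2 are the smallest at this point; merging them first and _ and I second gives the same forest after Step 6
(A G):2  (M T):2  (E H):4  (_ I):6  S:5

Step 6
• Merge (A G) and (M T) into a tree of weight 4
((A G) (M T)):4  (E H):4  (_ I):6  S:5

Step 7
• Merge the two trees of weight 4 into a tree of weight 8, and merge (_ I) and S into a tree of weight 11
(((A G) (M T)) (E H)):8  ((_ I) S):11

Step 8
• Merge the remaining two trees into the Huffman coding tree, whose root has weight 19
(((_ I) S) (((A G) (M T)) (E H))):19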
Label edges

19
  0: 11
    0: 6
      0: _ (3)        code 000
      1: I (3)        code 001
    1: S (5)          code 01
  1: 8
    0: 4
      0: 2
        0: A (1)      code 1000
        1: G (1)      code 1001
      1: 2
        0: M (1)      code 1010
        1: T (1)      code 1011
    1: 4
      0: E (2)        code 110
      1: H (2)        code 111
Huffman code & encoded message
This is his message

S 01
E 110
H 111
_ 000
I 001
A 1000
G 1001
M 1010
T 1011

10111110010100000101000111001010001010110010110001001110
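
As a check on the table above, the sketch below encodes the message with that code and decodes it again. The helper names `encode` and `decode` are illustrative, and the message is written with `_` for spaces to match the table.

```python
huffman_code = {
    "S": "01", "E": "110", "H": "111", "_": "000", "I": "001",
    "A": "1000", "G": "1001", "M": "1010", "T": "1011",
}


def encode(message: str, code: dict[str, str]) -> str:
    """Concatenate the codeword of each symbol in the message."""
    return "".join(code[symbol] for symbol in message)


def decode(bits: str, code: dict[str, str]) -> str:
    """Read bits until the buffer matches a codeword, then emit its symbol."""
    symbol_for = {word: symbol for symbol, word in code.items()}
    decoded, buffer = [], ""
    for bit in bits:
        buffer += bit
        if buffer in symbol_for:
            decoded.append(symbol_for[buffer])
            buffer = ""
    if buffer:
        raise ValueError("trailing bits do not form a complete codeword")
    return "".join(decoded)


message = "THIS_IS_HIS_MESSAGE"
bits = encode(message, huffman_code)
print(len(bits))                    # → 56
print(decode(bits, huffman_code))   # → THIS_IS_HIS_MESSAGE
```

A fixed-length code for these nine symbols would need 4 bits per symbol, or 4*19 = 76 bits for the same message.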
Huffman Coding

• An algorithm that takes as input the frequencies (which are the probabilities of occurrence) of symbols in a string and produces as output a prefix code that encodes the string using the fewest possible bits among all possible binary prefix codes for these symbols.
• This algorithm, known as Huffman coding, was developed by David Huffman in a term paper he wrote in 1951 while a graduate student at MIT.
• (Note that this algorithm assumes that we already know how many times each symbol occurs in the string, so we can compute the frequency of each symbol by dividing the number of times this symbol occurs by the length of the string.)
• Huffman coding is a fundamental algorithm in data compression, the subject devoted to reducing the number of bits required to represent information.
• Huffman coding is extensively used to compress bit strings representing text, and it also plays an important role in compressing audio and image files.
Example 2

• Use Huffman coding to encode the following symbols with the frequencies listed: A: 0.08, B: 0.10, C: 0.12, D: 0.15, E: 0.20, F: 0.35. What is the average number of bits used to encode a character?

Solution

Step 1: start with a forest of six single-node trees
A: 0.08  B: 0.10  C: 0.12  D: 0.15  E: 0.20  F: 0.35

Step 2: merge B (0.10) and A (0.08) into a tree of weight 0.18
Step 3: merge D (0.15) and C (0.12) into a tree of weight 0.27
Step 4: merge E (0.20) and the 0.18 tree into a tree of weight 0.38
Step 5: merge F (0.35) and the 0.27 tree into a tree of weight 0.62
Step 6: merge the 0.62 and 0.38 trees into the final tree of weight 1.00
Step 7: label each edge to a left child with 0
Step 8: label each edge to a right child with 1
Step 9: read each code from the path from the root to the leaf

1.00
  0: 0.62
    0: F (0.35)       code 00
    1: 0.27
      0: D (0.15)     code 010
      1: C (0.12)     code 011
  1: 0.38
    0: E (0.20)       code 10
    1: 0.18
      0: B (0.10)     code 110
      1: A (0.08)     code 111
Example 2: average number of bits

• The average number of bits used to encode a character is the sum, over all symbols, of code length * frequency

Symbol  Code  Length * frequency
A       111   3*0.08 = 0.24
B       110   3*0.10 = 0.30
C       011   3*0.12 = 0.36
D       010   3*0.15 = 0.45
E       10    2*0.20 = 0.40
F       00    2*0.35 = 0.70
Total         2.45
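
The weighted sum can be reproduced with a few lines. The dictionary names below are illustrative, and the codes are copied from the table rather than recomputed.

```python
frequencies = {"A": 0.08, "B": 0.10, "C": 0.12, "D": 0.15, "E": 0.20, "F": 0.35}
codes = {"A": "111", "B": "110", "C": "011", "D": "010", "E": "10", "F": "00"}

# Average bits per character = sum of (code length * frequency) over all symbols.
average_bits = sum(len(codes[symbol]) * frequencies[symbol] for symbol in frequencies)
print(round(average_bits, 2))  # → 2.45
```

For comparison, a fixed-length code for six symbols needs 3 bits per character.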
Try this!

• Construct a Huffman code for the letters of the English alphabet where the frequencies of letters in typical English text are as shown in this table.
Thank you
for learning discrete math
with me!
