HUFFMAN CODING
(VARIABLE LENGTH CODE)
BY:-ADITYA
(CSE,3RD YEAR)
CONTENT
• Definition
• Types
• Variable length
• Problem
• Application
WHAT IS HUFFMAN CODING ?
Data can be encoded efficiently using Huffman Codes.
It is a widely used and beneficial technique for compressing data.
• Huffman’s greedy algorithm uses a table of the frequencies of occurrences of
each character to build up an optimal way of representing each character as a
binary string.
TYPES OF HUFFMAN CODING
• Fixed length code
• Variable length code
VARIABLE LENGTH CODE
It uses variatextble length encoding.
It assigns variable length code to all the characters.
The code length of a character depends on how frequently it occurs in the given
text.
The character which occurs most frequently gets the smallest code.
• The character which occurs least frequently gets the largest code.
MAJOR STEPS IN HUFFMAN CODING
There are two major steps in Huffman Coding-
Building a Huffman Tree from the input characters.
• Assigning code to the characters by traversing the Huffman Tree.
HUFFMAN TREE
Step-01:
Create a leaf node for each character of the text.
Leaf node of a character contains the occurring frequency of that
character .
HUFFMAN TREE
Step-02:
Arrange all the nodes in increasing order of their frequency value.
Step-03:
Considering the first two nodes having minimum frequency
Create a new internal node.
The frequency of this new node is the sum of frequency of those two
nodes .
Make the first node as a left child and the other node as a right child of
newly created node.
HUFFMAN TREE
Step-04:
Keep repeating Step-02 and Step-03 until all the nodes form a single
tree .
The tree finally obtained is the desired Huffman Tree.
IMPORTANT FORMULAS
Formula-01:
Formula-02:
Total number of bits in Huffman encoded message
= Total number of characters in the message x Average code length per character
= ∑ ( frequencyi x Code lengthi )
PROBLEM
A file contains the following characters with the frequencies as shown. If Huffman Coding is
used for data compression, determine-
Huffman Code for each character
Average code length
Length of Huffman encoded message (in bits)
Characters Frequencies
a 10
e 15
i 12
o 3
u 4
s 13
t 1
SOLUTION
First let us construct the Huffman Tree.
Huffman Tree is constructed in the following steps-
Step 1:-
SOLUTION
Step 2:-
Step 3:-
SOLUTION
Step 4 :-.
SOLUTION
Now
We assign weight to all
the edges of the
constructed Huffman Tree.
Let us assign weight ‘0’ to
the left edges and weight
‘1’ to the right edges.
SOLUTION
Huffman Code for each Huffmancharacter
Following this rule, the Huffman Code for each character is-
a = 111
e = 10
i = 00
o = 11001
u = 1101
s = 01
t = 11000
SOLUTION
2. Average Code Length-
Using formula-01, we have-
Average code length
= ∑ ( frequencyi x code lengthi ) / ∑ ( frequencyi )
= { (10 x 3) + (15 x 2) + (12 x 2) + (3 x 5) + (4 x 4) + (13 x 2) + (1 x 5) } / (10 + 15 + 12 + 3 + 4 + 13 + 1)
= 2.52
3. Length of Huffman Encoded Message-
Using formula-02, we have-
Total number of bits in Huffman encoded message
= Total number of characters in the message x Average code length per character
= 58 x 2.52
= 146.16
≅ 147 bits
APPLICATION
Huffman coding is used in conventional compression formats like GZIP, BZIP2,
PKZIP, etc.
• For text and fax transmissions
THANK
YOU
REFRENCE:-HTTPS://WWW.GATEVIDYALAY.COM/HUFFMAN-CODING-HUFFMAN-ENCODING