5.3. Entropy Coding: Prof. Dr. Paul Müller, ICSY Lab, University of Kaiserslautern, Germany

- Entropy coding schemes use shorter codes for more probable symbols to shorten the average code length and simplify encoding/decoding. They allow lossless data compression.
- Entropy coding relies on statistically independent events to achieve optimal compression according to information theory. Common entropy coding methods include Huffman coding and arithmetic coding.
- Huffman coding assigns variable-length binary codes to symbols based on their probabilities, with shorter codes assigned to more frequent symbols. It constructs a prefix code from a probability distribution to allow unique decoding.
5.3. Entropy Coding
• The differing probabilities with which individual symbols occur are used
– to shorten the average code length by assigning shorter codes to more probable symbols => Morse, Huffman and arithmetic codes
– to simplify encoding/decoding by assigning simpler codes to more probable symbols => e.g. Braille
• Entropy coding schemes are lossless compression schemes.
• Entropy coding procedures rely on statistically independent information events to produce optimal results (maximum theoretical compression).

Remark:
A prefix code (or prefix-free code) is a code in which no codeword is a prefix of another codeword. This enables unique decodability with variable code lengths without any separator symbol.

Prof. Dr. Paul Müller, ICSY Lab, University of Kaiserslautern, Germany 1


Braille

Article on Wikipedia



Morse Code
Example: letter statistics compared to Morse code

Letter  Probability (German)  Probability (English)  Morse code
e       16.65%                12.41%                 .
n       10.36%                6.41%                  _.
i       8.14%                 6.46%                  ..
t       5.53%                 8.90%                  _
a       5.15%                 8.09%                  ._
o       2.25%                 8.13%                  ___
x       0.03%                 0.20%                  _.._
y       0.03%                 2.14%                  _.__

• variable code length & non-prefix code
=> needs a separator symbol for unique decodability
– Example: "..-..." => eeteee, ini, ...
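
The ambiguity can be checked with a small sketch. It uses only the eight letters from the table above, writes a dash as "_" as the table does, and the helper names `encode` and `decodings` are hypothetical:

```python
# Morse codes copied from the table above ("_" is a dash, "." a dot).
MORSE = {"e": ".", "n": "_.", "i": "..", "t": "_", "a": "._",
         "o": "___", "x": "_.._", "y": "_.__"}

def encode(word):
    """Concatenate the Morse codes of a word without any separator."""
    return "".join(MORSE[ch] for ch in word)

def decodings(signal):
    """Return every letter sequence whose concatenated codes equal `signal`."""
    if not signal:
        return [""]
    results = []
    for letter, code in MORSE.items():
        if signal.startswith(code):
            results += [letter + rest for rest in decodings(signal[len(code):])]
    return results

print(encode("eeteee") == encode("ini"))   # True: both give ".._..."
print(decodings(".._..."))                 # contains "eeteee" and "ini", among others
```

Because "." is a prefix of "..", a decoder without separators cannot tell where one letter ends.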
Huffman Coding (1)
• Idea: Use short bit patterns for frequently used symbols
1. Sort all symbols by probability
2. Take the two symbols with the lowest probability, remove them from the list and insert a "parent symbol" into the sorted list whose probability is the sum of both symbols
3. If there are at least two elements left, continue with step 2
4. Assign 0 and 1 to the two branches at every node of the resulting tree
• Example: with a fixed-length ("regular") code of 3 bits per symbol, 460 * 3 = 1380 bits are used

Symbol  Count  Regular code
E       135    000
A       120    001
D       110    010
B       50     011
C       45     100
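
The merging procedure can be sketched with a priority queue. This is a minimal illustration, not the slides' own implementation: the function name `huffman_code_lengths` is hypothetical, and it computes only the code lengths, not the bit patterns themselves.

```python
import heapq
from itertools import count

def huffman_code_lengths(counts):
    """Return {symbol: code length in bits} for a dict of symbol counts."""
    order = count()  # tie-breaker so the heap never has to compare symbol lists
    heap = [(weight, next(order), [symbol]) for symbol, weight in counts.items()]
    heapq.heapify(heap)
    lengths = dict.fromkeys(counts, 0)
    while len(heap) > 1:
        # Merge the two least probable entries into one "parent symbol".
        w1, _, group1 = heapq.heappop(heap)
        w2, _, group2 = heapq.heappop(heap)
        for symbol in group1 + group2:
            lengths[symbol] += 1  # each merge adds one bit to these symbols
        heapq.heappush(heap, (w1 + w2, next(order), group1 + group2))
    return lengths

counts = {"E": 135, "A": 120, "D": 110, "B": 50, "C": 45}
lengths = huffman_code_lengths(counts)
total_bits = sum(counts[s] * lengths[s] for s in counts)
print(lengths)     # {'E': 2, 'A': 2, 'D': 2, 'B': 3, 'C': 3}
print(total_bits)  # 1015
```

The merges happen in the order B+C (95), P1+D (205), A+E (255), matching the intermediate tables on the following slides.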



Huffman Coding (2)

Symbol Count
E 135
A 120
D 110
P1 95



Huffman Coding (3)

Symbol Count
P2 205
E 135
A 120



Huffman Coding (4)

Symbol Count
P3 255
P2 205



Huffman Coding (5)

Symbol  Count  Huffman  p_i
E       135    00       0.294
A       120    01       0.261
D       110    10       0.239
B       50     110      0.109
C       45     111      0.098

=> 1015 bits are used for Huffman coding



Huffman Coding (6)
Symbol  Count  Huffman  p_i
E       135    00       0.2935
A       120    01       0.2609
D       110    10       0.2391
B       50     110      0.1087
C       45     111      0.0978

• Entropy:
$H = -\sum_{i=1}^{5} p_i \log_2 p_i$
=> H ≈ 2.1944 bits per symbol
• Average code length:
l = 2*p1 + 2*p2 + 2*p3 + 3*p4 + 3*p5 ≈ 2.2065 bits per symbol
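
Both figures can be recomputed from the counts. The sketch below assumes the standard Shannon entropy with the base-2 logarithm and uses the code table from this slide:

```python
from math import log2

counts = {"E": 135, "A": 120, "D": 110, "B": 50, "C": 45}
code = {"E": "00", "A": "01", "D": "10", "B": "110", "C": "111"}

total = sum(counts.values())
probs = {s: n / total for s, n in counts.items()}

# Entropy: lower bound on the average number of bits per symbol.
entropy = -sum(p * log2(p) for p in probs.values())
# Average code length actually achieved by the Huffman code.
avg_length = sum(probs[s] * len(code[s]) for s in code)

print(f"H = {entropy:.3f} bits/symbol")
print(f"l = {avg_length:.3f} bits/symbol")
```

The average code length equals 1015/460, so the Huffman code here stays within roughly 0.012 bits per symbol of the entropy.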



Adaptive Huffman Coding (1)
• Problem:
– The previous algorithm requires statistical knowledge (the probability distribution), which is often not available (e.g. live audio, video).
– Even when statistical knowledge is available, sending it could cause too much overhead.
• Solution:
– use adaptive algorithms
– here: Adaptive Huffman Coding, which continuously estimates the required probability distribution from the previously encoded/decoded symbols by means of an update procedure



Adaptive Huffman Coding (2)



Adaptive Huffman Coding (3)
• Important: encoder and decoder have to use exactly the same initialization and update_model routines
• update_model does two things:
– increment the count
– update the resulting Huffman tree
• during the updates, the Huffman tree must maintain its sibling property, i.e. the nodes are arranged in order of increasing weights
• when swapping is necessary, the farthest node with weight W is swapped with the node whose weight has just been increased to W+1 (if the node with weight W has a subtree beneath it, the subtree goes with it)
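
The requirement that both sides share initialization and update_model can be illustrated with a deliberately simplified sketch. As an assumption made here for brevity, it rebuilds the whole Huffman code from the counts after every symbol instead of performing the incremental node swaps described above, and it assumes a known alphabet of at least two symbols with all counts starting at 1. The function names other than `update_model` are hypothetical.

```python
import heapq
from itertools import count

def build_code(counts):
    """Rebuild a Huffman code {symbol: bits} from the current counts."""
    order = count()
    heap = [(w, next(order), s) for s, w in sorted(counts.items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(order), (left, right)))
    code = {}
    def walk(node, prefix):
        if isinstance(node, tuple):          # internal node: (left, right)
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                                # leaf: a symbol
            code[node] = prefix
    walk(heap[0][2], "")
    return code

def initialize(alphabet):
    """Shared initialization: every symbol starts with count 1."""
    return {s: 1 for s in alphabet}

def update_model(counts, symbol):
    """Shared update: increment the count, then rebuild the code."""
    counts[symbol] += 1
    return build_code(counts)

def encode(message, alphabet):
    counts = initialize(alphabet)
    code = build_code(counts)
    bits = []
    for symbol in message:
        bits.append(code[symbol])
        code = update_model(counts, symbol)
    return "".join(bits)

def decode(bits, alphabet):
    counts = initialize(alphabet)
    inverse = {v: k for k, v in build_code(counts).items()}
    out, buffer = [], ""
    for bit in bits:
        buffer += bit
        if buffer in inverse:                # prefix code: first hit is the symbol
            symbol = inverse[buffer]
            out.append(symbol)
            inverse = {v: k for k, v in update_model(counts, symbol).items()}
            buffer = ""
    return "".join(out)

message = "AADCAAEEBA"
print(decode(encode(message, "ABCDE"), "ABCDE") == message)  # True
```

No code table is transmitted: the decoder reproduces every model update from the symbols it has already decoded.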



Adaptive Huffman Coding (4)
• Example: n-th Huffman tree

• resulting code for A: 000


Adaptive Huffman Coding (5)
• (n+2)-th Huffman tree
– the node A was incremented twice
– A and D were swapped

• new code for A: 011



Adaptive Huffman Coding (6)
• A was incremented twice

• The 4th (A) and the 5th node have to swap.


Adaptive Huffman Coding (7)
• resulting tree after 1st swap

• The 8th (E) and the 7th node have to swap.



Adaptive Huffman Coding (8)
• (n+4)-th Huffman tree

• new code for A: 10



Huffman Coding Evaluation (1)
• Advantages
– The algorithm is simple.
– Huffman coding is optimal according to information theory when the probability of each input symbol is a negative power of two.
– If pmax is the largest probability in the probability model, then the average code length is bounded from above by H + pmax + ld[(2 ld e)/e] = H + pmax + 0.086, where ld denotes the logarithm to base 2 (see Gallager's proof).

This is a tighter bound than the general upper bound H + 1.



Huffman Coding Evaluation (2)
• Limitations:
– Huffman coding is designed to code single characters only. Therefore at least one bit is required per character, e.g. a word of 8 characters requires a code of at least 8 bits.
– It is not suitable for strings of different length, nor for character probabilities that change with the context.
Examples for both aspects of that problem:
• Huffman coding does not support different probabilities for "c", "h", "s" and "sch"
• For ordinary German text p("e") > p("h") holds, but if the preceding characters were "sc", then p("h") > p("e") holds
– (unsatisfactory) solutions for both scenarios:
• Solution: use a special coding in which "sch" is a single character, but this requires advance knowledge of frequent character combinations
• Solution: use different Huffman codes depending on the context, but this leads to large code tables which must be appended to the coded data



Arithmetic Coding (1)
• Arithmetic coding generates codes that are optimal according to information theory; it supports average code word lengths of less than 1 bit per character and strings of different length.
• The probability model is separated from the encoding operation:
– Probability model: Encode strings rather than single characters. Assign a "codeword" to each possible string. A codeword is a half-open subinterval of the interval [0,1). Subintervals for different strings do not intersect.
– Encoding: a given subinterval can be uniquely identified by any value within it. Use the value with the shortest unambiguous binary representation to identify the subinterval.
• In practice, the subinterval is refined incrementally using the probabilities of the individual events.
– If strings of different length are used, extra information is needed to detect the end of a string.



Arithmetic Coding (2)
• Example I:
– Four characters a, b, c, d with probabilities p(a)=0.4, p(b)=0.2, p(c)=0.1, p(d)=0.3. Codeword assignment for string "abcd":
a: [0, 0.4), ab: [0.16, 0.24), abc: [0.208, 0.216), abcd: [0.2136, 0.216)

=> codeword (subinterval): [0.21360, 0.21600)



Arithmetic Coding (3)
• Example I: (cont.)
– Encoding:
Any value within the subinterval [0.21360, 0.21600) is now a well-defined identifier of the string "abcd".

– "0.00110111" is the shortest binary representation of a value within the subinterval.
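
The interval and its shortest binary identifier can be reproduced with exact rational arithmetic. This is a minimal sketch, assuming the characters occupy [0,1) in the order a, b, c, d; the function names are hypothetical, and practical coders use scaled integer arithmetic instead of fractions.

```python
from fractions import Fraction
from math import ceil

PROBS = {"a": Fraction(4, 10), "b": Fraction(2, 10),
         "c": Fraction(1, 10), "d": Fraction(3, 10)}

def subinterval(message, probs=PROBS):
    """Refine [0, 1) symbol by symbol; return the final (low, high)."""
    low, width = Fraction(0), Fraction(1)
    for ch in message:
        cumulative = Fraction(0)
        for symbol, p in probs.items():
            if symbol == ch:
                break
            cumulative += p          # probability mass of the symbols before ch
        low += width * cumulative
        width *= probs[ch]
    return low, low + width

def shortest_binary(low, high):
    """Shortest bit string b1..bn with low <= 0.b1..bn < high."""
    n = 1
    while True:
        k = ceil(low * 2**n)         # smallest multiple of 2^-n that is >= low
        if Fraction(k, 2**n) < high:
            return format(k, f"0{n}b")
        n += 1

low, high = subinterval("abcd")
print(float(low), float(high))      # 0.2136 0.216
print(shortest_binary(low, high))   # 00110111
```

No value with seven or fewer binary digits falls into the interval, which is why eight bits are needed here.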



Arithmetic Coding (4)
• Example I: (cont.)
– Decoding: "0.00110111", i.e. 0.21484375

– The receiver can retrace the subinterval refinement, but the code value also lies in further, nested subintervals of [0.21360, 0.21600), so the decoder cannot tell where the string ends
=> an extra symbol is necessary to mark the end of the string
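
A decoding sketch makes the termination problem visible. It assumes the same interval order a, b, c, d and, as a stand-in for an end-of-string symbol, is told how many characters to decode; the function name `decode` is hypothetical.

```python
from fractions import Fraction

PROBS = {"a": Fraction(4, 10), "b": Fraction(2, 10),
         "c": Fraction(1, 10), "d": Fraction(3, 10)}

def decode(value, n_symbols, probs=PROBS):
    """Decode `n_symbols` characters from a code value in [0, 1)."""
    out = []
    for _ in range(n_symbols):
        cumulative = Fraction(0)
        for symbol, p in probs.items():
            if cumulative <= value < cumulative + p:
                out.append(symbol)
                # Rescale so the chosen subinterval becomes [0, 1) again.
                value = (value - cumulative) / p
                break
            cumulative += p
    return "".join(out)

code = Fraction(int("00110111", 2), 2**8)   # 0.21484375
print(decode(code, 4))   # abcd
print(decode(code, 5))   # abcdb
```

The same code value decodes to "abcd", "abcdb" and longer strings, so without a length or a terminator the decoder has no stopping criterion.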



Arithmetic Coding (5)
• Example II
– Probabilities:



Arithmetic Coding (6)
Example II: (cont.)
• Termination rules:
– "a" and "!" are termination symbols
– Strings starting with "b" have a maximum length of 4.
• Decode: 10010001010

Bit  Output
1    -
0    b
0    -
1    b
0    a

0    -
0    a

1    -
0    b
1    bb
0    b
Arithmetic Coding (7)
Remarks:
• The more characters a string has, the better the result in terms of average code word length.
• Implementing arithmetic coding is more complex than implementing Huffman coding.
• For alphabets with k characters and strings of length m, coding whole strings with Huffman coding becomes impractical as m grows (the codebook has size k^m). Arithmetic coding can be adapted.
• If the alphabet is small and the probabilities are highly unbalanced, arithmetic coding is superior to Huffman coding.
• Efficient arithmetic coding implementations rely exclusively on integer arithmetic.
• United States Patent 4,122,440; Method and means for arithmetic string coding; International Business Machines Corporation; October 24, 1978
• A precise comparison between arithmetic coding and Huffman coding can be found in [Sayood].



Lempel - Ziv (LZ77) (1)
• Algorithm for compression of character sequences:
– assumption: sequences of characters are repeated
– idea: replace a character sequence by a reference to an earlier occurrence

1. Define a
• search buffer = (portion of) recently encoded data
• look-ahead buffer = not yet encoded data
2. Find the longest match between
• the first characters of the look-ahead buffer
• and an arbitrary character sequence in the search buffer
3. Produce the output <offset, length, next_character>
• offset + length = reference to the earlier occurrence
• next_character = the first character following the match in the look-ahead buffer



Lempel - Ziv (LZ77) (2)
• Example:

Pos   1  2  3  4  5  6  7  8  9  10
Char  A  A  B  C  B  B  A  B  C  A

Step  Pos  Match  Char  Output
1     1    -      A     <0,0,A>
2     2    A      B     <1,1,B>
3     4    -      C     <0,0,C>
4     5    B      B     <2,1,B>
5     7    ABC    A     <5,3,A>
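
The five steps can be reproduced with a short sketch. As simplifying assumptions, the search buffer is unbounded (all previously encoded data), ties between equally long matches go to the nearest occurrence, and the function names are hypothetical.

```python
def lz77_encode(data):
    """Encode `data` as a list of (offset, length, next_character) triples."""
    triples, i = [], 0
    while i < len(data):
        best_offset = best_length = 0
        for start in range(i - 1, -1, -1):          # nearest candidates first
            length = 0
            # Keep one character in reserve to serve as next_character.
            while (i + length < len(data) - 1
                   and data[start + length] == data[i + length]):
                length += 1
            if length > best_length:
                best_offset, best_length = i - start, length
        triples.append((best_offset, best_length, data[i + best_length]))
        i += best_length + 1
    return triples

def lz77_decode(triples):
    out = []
    for offset, length, next_char in triples:
        for _ in range(length):
            out.append(out[-offset])                 # copy from already decoded data
        out.append(next_char)
    return "".join(out)

triples = lz77_encode("AABCBBABCA")
print(triples)
# [(0, 0, 'A'), (1, 1, 'B'), (0, 0, 'C'), (2, 1, 'B'), (5, 3, 'A')]
print(lz77_decode(triples))  # AABCBBABCA
```

A real implementation would bound both buffers, which fixes the number of bits needed for offset and length.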



Lempel - Ziv (LZ77) (3)
• Remarks:
– the search and look-ahead buffers have a limited size
• the number of bits needed to encode pointers and length information depends on the buffer sizes
• worst case: the repeated character sequences are longer than one of the buffers
• typical buffer size is 4 to 64 KB
– sometimes other representations of the triple are used
• next_character only if necessary (i.e. no match found)
• enabling dynamic change of buffer sizes
– LZ77 or its variants are often used before entropy coding
• LZ77 + Huffman coding is used by "gzip" and for "PNG"
• "PKZip", "Zip", "LHarc" and "ARJ" use LZ77-based algorithms
– LZW patent information
• Unisys U.S. LZW Patent No. 4,558,302 expired on June 20, 2003; the counterpart patents in the UK, France, Germany, and Italy expired on June 18, 2004; the Japanese counterpart patents expired on June 20, 2004; and the counterpart Canadian patent expired on July 7, 2004.
• Unisys Corporation also holds patents on a number of improvements on the inventions claimed in the expired patents.
Lempel - Ziv - Welch (LZ78, LZW) (4)
• LZ78:
– drops the search buffer and keeps an explicit dictionary (built at encoder and decoder in an identical manner)
– produces output <index, next_character>
– example: adaptive encoding of "abababa" results in:
Step  Output  Dictionary entry
1     <0,a>   "a"
2     <0,b>   "b"
3     <1,b>   "ab"
4     <3,a>   "aba"
• LZW:
– produces only output <index>
– the dictionary has to be initialized with all letters of the source alphabet
– new patterns are added to the dictionary
– used by Unix "compress", "GIF", "V.42bis", "TIFF"
• In practice, the size of the dictionary is limited.
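
The LZ78 example can be reproduced with the following sketch. The function names are hypothetical, index 0 stands for the empty phrase, and the dictionary is unbounded here, whereas practical implementations cap its size.

```python
def lz78_encode(data):
    """Encode `data` as (index, next_character) pairs; index 0 = empty phrase."""
    dictionary = {}          # phrase -> index, rebuilt identically by the decoder
    output, phrase = [], ""
    for ch in data:
        if phrase + ch in dictionary:
            phrase += ch                      # keep extending the match
        else:
            output.append((dictionary.get(phrase, 0), ch))
            dictionary[phrase + ch] = len(dictionary) + 1
            phrase = ""
    if phrase:                                # input ended inside a known phrase
        output.append((dictionary[phrase], ""))
    return output

def lz78_decode(pairs):
    phrases, out = [""], []                   # phrases[0] is the empty phrase
    for index, ch in pairs:
        phrases.append(phrases[index] + ch)
        out.append(phrases[-1])
    return "".join(out)

pairs = lz78_encode("abababa")
print(pairs)               # [(0, 'a'), (0, 'b'), (1, 'b'), (3, 'a')]
print(lz78_decode(pairs))  # abababa
```

The decoder never receives the dictionary; it rebuilds the same entries from the pairs it reads.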
Lempel - Ziv - Welch (LZW) (5)
• Example: wabbapwabbapwabbapwabbapwoopwoopwoo

=> Encoder output sequence (first six codes): 5 2 3 3 2 1



Lempel - Ziv - Welch (LZW) (6)

=> Encoder output sequence: 5 2 3 3 2 1 6 8 10 12 9 11 7 16 5 4 4 11 21 23 4
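
The output sequence can be reproduced with a short LZW encoder sketch. The initial dictionary order (p=1, a=2, b=3, o=4, w=5) is an assumption inferred from the output sequence above, and the function name `lzw_encode` is hypothetical.

```python
def lzw_encode(data, alphabet):
    """Return the list of dictionary indices that LZW emits for `data`."""
    # Indices start at 1, in the order in which the alphabet is given.
    dictionary = {symbol: i for i, symbol in enumerate(alphabet, start=1)}
    output, phrase = [], ""
    for ch in data:
        if phrase + ch in dictionary:
            phrase += ch                      # longest known pattern so far
        else:
            output.append(dictionary[phrase])
            dictionary[phrase + ch] = len(dictionary) + 1   # add new pattern
            phrase = ch
    if phrase:
        output.append(dictionary[phrase])
    return output

codes = lzw_encode("wabbapwabbapwabbapwabbapwoopwoopwoo", "pabow")
print(codes[:6])  # [5, 2, 3, 3, 2, 1]
print(codes)
```

The 35 input characters are represented by 21 indices; the first new entries are "wa" (6), "ab" (7) and "bb" (8).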



Run-length Coding
• Run-length coding compresses runs of identical successive symbols
• Example:

– "x" marks the beginning of a coded sequence
– special coding is required to code "x" itself as a symbol; here "xx" is used.
– How to code a bit stream?
100101111100000001010010000000000011110
– Zero suppression is a special form of run-length coding
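
One possible answer for the bit stream, sketched under the assumption that the stream is non-empty: since runs of 0s and 1s must alternate, it is enough to send the first bit followed by the run lengths. The function names are hypothetical, and this is not necessarily the scheme the slide has in mind.

```python
from itertools import groupby

def rle_encode(bits):
    """Return (first_bit, run_lengths) for a non-empty string of '0'/'1'."""
    runs = [len(list(group)) for _, group in groupby(bits)]
    return bits[0], runs

def rle_decode(first_bit, runs):
    out, bit = [], first_bit
    for run in runs:
        out.append(bit * run)
        bit = "1" if bit == "0" else "0"      # runs alternate between 0 and 1
    return "".join(out)

stream = "100101111100000001010010000000000011110"
first, runs = rle_encode(stream)
print(first, runs)
print(rle_decode(first, runs) == stream)  # True
```

Whether this saves space depends on how the run lengths are coded; many runs of length 1 or 2 can make the result longer than the input.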



5.4. Quantization (1)
• one of the simplest and most general ideas in lossy compression
• quantization
Process of representing a large (possibly infinite) set of values with a much smaller set, i.e. representing each source output by one of a small number of codewords.
• quantizer
An encoder-decoder pair with
– encoder: divides the range of values that a source generates into a number of intervals, each represented by a distinct codeword (interval number), and maps every source output that falls into a particular interval to the codeword representing that interval
– decoder: generates for every codeword a reconstruction value that in some sense best represents all the values in the interval
• uniform versus non-uniform quantization, scalar versus vector quantization
Quantization (2)
Uniform scalar quantization
• uniform: equal interval width Δ
• quantization (encoding):

• reconstruction (decoding):

• quantization error:
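
A minimal sketch of one common uniform scalar quantizer, the midtread quantizer, which is assumed here and may differ from the variant used on this slide. The codeword is the interval index, the reconstruction value is the interval midpoint, and the function names are hypothetical.

```python
from math import floor

def quantize(x, delta):
    """Encoder: index of the interval of width `delta` that contains x."""
    return floor(x / delta + 0.5)

def reconstruct(index, delta):
    """Decoder: the interval midpoint represents every value in the interval."""
    return index * delta

delta = 2.0
for x in (0.9, 3.5, -3.2):
    q = quantize(x, delta)
    x_hat = reconstruct(q, delta)
    # The quantization error x - x_hat never exceeds delta / 2 in magnitude.
    print(x, q, x_hat, x - x_hat)
```

With Δ = 2 the value 3.5 is mapped to index 2 and reconstructed as 4.0, an error of -0.5.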



Quantization (3)

Non-uniform scalar quantization: pdf-optimized, perceptually optimized, SNR-optimized



5.5. Differential Encoding (1)
• Belongs to the predictive and pre-coding techniques
• Takes advantage of the high correlation between neighboring samples in sources such as speech and images by encoding differences.
• The main objective is to decrease the signal variance. The smaller the signal variance, the more the signal amplitudes are concentrated around a certain value. This decreases the entropy and therefore benefits a subsequent entropy coding stage.
• Commonly applied as part of composite compression strategies, e.g. to compress low-frequency components (high sample-to-sample correlation)



Differential Encoding (2)
• Block diagram: simple differential encoding system (without quantization)

with



Differential Encoding (3)
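• Simple differential encoding system (without quantization), with $x_n$ the n-th source sample and $p_n$ its prediction:
– simplest prediction: predict each symbol directly by its predecessor, i.e.
$p_n = x_{n-1}$
– general case:
$p_n = f(x_{n-1}, x_{n-2}, \ldots, x_0)$
– assumption: linear dependencies, i.e.
$p_n = \sum_{k=1}^{N} a_k x_{n-k}$
– optimal choice of $a_k$: minimization of the prediction error
$E[(x_n - p_n)^2]$
=> discrete Wiener-Hopf equations, which involve the autocorrelation function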



Differential Encoding (4)
Example: simplest prediction, quantization involved
• source sequence: 6.2, 9.7, 13.2, 5.9, 8, 7.4, 4.2, 1.8
• difference sequence: 6.2, 3.5, 3.5, -7.3, 2.1, -0.6, -3.2, -2.4
• quantizer with output values: -6, -4, -2, 0, 2, 4, 6
• quantized difference sequence: 6, 4, 4, -6, 2, 0, -4, -2
• reconstructed sequence: 6, 10, 14, 8, 10, 10, 6, 4
• error between original and reconstruction: 0.2, -0.3, -0.8, -2.1, -2, -2.6, -1.8, -2.2
• Problem: the quantization error accumulates as the process continues
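
The numbers of this example can be reproduced with a short sketch. It assumes that the predecessor of the first sample is 0 and that the quantizer picks the nearest of its seven output values, clipping anything outside the range; the function name `quantize` is hypothetical.

```python
from math import floor

def quantize(d, step=2, max_index=3):
    """Nearest output value from {-6, -4, ..., 6}; out-of-range inputs are clipped."""
    index = max(-max_index, min(max_index, floor(d / step + 0.5)))
    return index * step

source = [6.2, 9.7, 13.2, 5.9, 8, 7.4, 4.2, 1.8]

# Encoder: differences between ORIGINAL neighboring samples.
diffs = [x - prev for prev, x in zip([0] + source[:-1], source)]
quantized = [quantize(d) for d in diffs]

# Decoder: accumulate the quantized differences.
reconstructed, value = [], 0
for q in quantized:
    value += q
    reconstructed.append(value)

errors = [round(x - r, 1) for x, r in zip(source, reconstructed)]
print(quantized)       # [6, 4, 4, -6, 2, 0, -4, -2]
print(reconstructed)   # [6, 10, 14, 8, 10, 10, 6, 4]
print(errors)
```

From the fourth sample on, the reconstruction stays about two units above the source because each new quantization error is added on top of the earlier ones.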



Differential Encoding (4a)
• Problem: the quantization error (here denoted by "q") accumulates as the process continues
• Reason: The encoder generates a difference sequence based on the original data; the decoder adds the quantized differences back onto a distorted version of the original signal!



Differential Encoding (4b)
• Solution: force both sides to use the same data, i.e. force the encoder to use the reconstructed sequence, too
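
A sketch of this closed-loop variant, applied to the source sequence of the earlier example. It assumes an initial prediction of 0 and a quantizer that picks the nearest value from {-6, -4, ..., 6} with clipping; the function name `quantize` is hypothetical.

```python
from math import floor

def quantize(d, step=2, max_index=3):
    """Nearest output value from {-6, -4, ..., 6}; out-of-range inputs are clipped."""
    index = max(-max_index, min(max_index, floor(d / step + 0.5)))
    return index * step

source = [6.2, 9.7, 13.2, 5.9, 8, 7.4, 4.2, 1.8]

reconstructed, prediction = [], 0
for x in source:
    # The encoder predicts from the RECONSTRUCTED predecessor, as the decoder will.
    q = quantize(x - prediction)
    prediction += q            # identical update on the encoder and decoder side
    reconstructed.append(prediction)

errors = [round(x - r, 1) for x, r in zip(source, reconstructed)]
print(reconstructed)   # [6, 10, 14, 8, 8, 8, 4, 2]
print(errors)
```

The error no longer accumulates: apart from the fourth sample, where the difference of -8.1 exceeds the quantizer range and is clipped to -6, every error stays within half a quantizer step.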



Differential Encoding (5)
• Block diagram: simple differential encoding system
(with quantization)



Differential Encoding (6)
• DPCM (differential pulse code modulation)
– linear prediction; predictor coefficients are obtained by solving the Wiener-Hopf equations
– adaptive variants (predictor and quantizer)
– invented at Bell Laboratories
– primarily known as a speech encoding system and widely used in telephone communications
• DM (delta modulation)
– a very simplified version of DPCM with a 1-bit (two-level) quantizer
– the two-level quantizer with output values +/- Δ can only represent a sample-to-sample difference of Δ
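
Delta modulation can be sketched in a few lines. The starting estimate of 0, the sample values and the function names are assumptions chosen for illustration.

```python
def dm_encode(samples, delta):
    """1-bit quantizer: emit 1 for a step of +delta, 0 for a step of -delta."""
    bits, estimate = [], 0.0
    for x in samples:
        bit = 1 if x >= estimate else 0
        estimate += delta if bit else -delta   # track the decoder's reconstruction
        bits.append(bit)
    return bits

def dm_decode(bits, delta):
    out, estimate = [], 0.0
    for bit in bits:
        estimate += delta if bit else -delta
        out.append(estimate)
    return out

samples = [0.4, 1.1, 1.9, 2.2, 2.0, 1.5, 3.9]
bits = dm_encode(samples, delta=0.5)
print(bits)                         # [1, 1, 1, 1, 1, 0, 1]
print(dm_decode(bits, delta=0.5))   # [0.5, 1.0, 1.5, 2.0, 2.5, 2.0, 2.5]
```

The last sample jumps to 3.9, but the reconstruction can only move by Δ = 0.5 per step, so it lags behind.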



Differential Encoding (7)
• Differential encoding for two-dimensional signals:
– general (linear) case:

– raster scan, row by row:
