Chapter 5
Basic Compression
Data Compression
Redundancy
Variable Length Coding
Huffman encoding
Run Length Encoding (RLE)
Quantization (Lossy)
Data Compression
Two categories
• Information Preserving
– Error free compression
– Original data can be recovered completely
• Lossy
– Original data is approximated
– Less than perfect
– Generally allows much higher compression
Basics
Data Compression
– Process of reducing the amount of data required to
represent a given quantity of information
Data vs. Information
– Data and Information are not the same thing
– Data
• the means by which information is conveyed
• various amounts of data can convey the same
information
– Information
• “A signal that contains no uncertainty”
Redundancy
Redundancy
– “data” that provides no relevant information
– “data” that restates what is already known
• For example
– Consider that N1 and N2 denote the number of “data
units” in two sets that represent the same information
– The "Compression Ratio" Cr is defined as
• Cr = N1 / N2
• EXAMPLE
– N1 = 10 and N2 = 1 can encode the same information
– Cr = N1/N2 = 10 (or 10:1)
– implying 90% of the data in N1 is redundant (relative redundancy = 1 - 1/Cr = 0.9)
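To make the formula concrete, here is a small Python sketch (not from the slides) that computes the compression ratio and the implied relative redundancy for the 10:1 example:

    # Compression ratio Cr = N1/N2 and relative redundancy 1 - 1/Cr (example values from the slide)
    def compression_ratio(n1, n2):
        # n1 and n2 are the data-unit counts of two representations of the same information
        return n1 / n2

    cr = compression_ratio(10, 1)     # the 10:1 example
    redundancy = 1 - 1 / cr           # relative redundancy
    print(cr, redundancy)             # 10.0 0.9 -> 90% of the data in N1 is redundant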
Variable Length Coding (1)
Our usual binary representation (called natural binary) is not always efficient
from a compression point of view
Consider the following string of data:
abaaaaabbbcccccaaaaaaaabbbbdaaaaaaa
There are 4 different pieces of information (let’s say 4 symbols)
a, b, c, d
In natural binary we would need at least 2 bits to represent this, assigning bits as
follows:
a=00, b=01, c=10, d=11
There are 35 pieces of data, so we need 35 * 2 bits = 70 bits
Variable Length Coding (2)
Now, consider the occurrence of each symbol:
a,b,c,d
abaaaaabbbcccccaaaaaaaabbbbdaaaaaaa
a = 21/35 (60%)
b = 8/35 (23%)
c = 5/35 (14%)
d = 1/35 (3%)
Variable Length Coding (3)
Idea of variable length coding: assign fewer bits to more frequent symbols
and more bits to less frequent symbols
Bit assignment using VLC:
a=1, b=01, c=001, d=0001
Now, compute the number of bits used to encode the same
data:
21*(1) + 8*(2) + 5*(3) + 1*(4) = 56 bits
So, we have a compression ratio of 70:56, or 1.25, meaning that 20% of the data
used by the natural binary encoding is redundant
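As a rough illustration (a Python sketch, not part of the original slides), the two encodings of the example string can be compared directly by counting bits:

    # Compare natural binary (2 bits/symbol) with the variable-length code above
    data = "abaaaaabbbcccccaaaaaaaabbbbdaaaaaaa"

    natural = {"a": "00", "b": "01", "c": "10", "d": "11"}
    vlc     = {"a": "1",  "b": "01", "c": "001", "d": "0001"}

    natural_bits = sum(len(natural[s]) for s in data)        # 35 * 2 = 70 bits
    vlc_bits     = sum(len(vlc[s]) for s in data)            # 21*1 + 8*2 + 5*3 + 1*4 = 56 bits
    print(natural_bits, vlc_bits, natural_bits / vlc_bits)   # 70 56 1.25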
Huffman encoding
This is an example of error-free coding: the information is exactly the same,
only the data is different
• This idea of variable length coding is used in many places
– Area codes for cities in large countries (China/India)
• Prof. David A. Huffman developed an algorithm to take a data set and compute
its “optimal” encoding (bit assignment)
– He developed this as a student at MIT
• This is very commonly applied to many compression techniques as a final stage
• There are other VLC techniques, such as arithmetic coding and LZW coding (used in ZIP)
Huffman encoding
Huffman encoding is the same idea as VLC, but uses the histogram counts to
measure the frequency of each gray level in the image.
See the next example.
Note: the image is coded with 3 bits per pixel.
Huffman encoding
p(i) = h(i) / n

• p(i) is the probability of occurrence of gray level i
• h(i) is the frequency of occurrence of gray level i (its histogram count)
• n is the total number of pixels in the image

Gray level   p
0            0.000
1            0.012
2            0.071
3            0.019
4            0.853
5            0.023
6            0.019
7            0.003

Digital Image Processing: A Practical Introduction Using Java, Nick Efford
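A minimal sketch of how these probabilities could be computed in Python with NumPy; the image array here is a hypothetical placeholder, not the image from the textbook example:

    import numpy as np

    image = np.random.randint(0, 8, size=(64, 64))   # placeholder 8-level gray image
    h = np.bincount(image.ravel(), minlength=8)      # h(i): histogram count of level i
    n = image.size                                   # n: total number of pixels
    p = h / n                                        # p(i) = h(i) / n
    print(p)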
Huffman encoding
The Huffman tree is built by repeatedly merging the two smallest probabilities:

0.000 + 0.003 = 0.003
0.003 + 0.012 = 0.015
0.015 + 0.019 = 0.034
0.019 + 0.023 = 0.042
0.034 + 0.042 = 0.076
0.071 + 0.076 = 0.147
0.147 + 0.853 = 1.000

Labelling the two branches of each merge with 0 and 1 and reading the bits back
from the root gives each gray level its codeword (shown on the next slide).

Digital Image Processing: A Practical Introduction Using Java, Nick Efford
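The same merging procedure can be sketched in Python (an illustrative implementation, not the textbook's code). The codeword lengths it produces match the table on the next slide, though the exact bit patterns depend on which branch is labelled 0 or 1:

    import heapq
    import itertools

    def huffman_codes(probs):
        # probs: dict mapping symbol -> probability; returns symbol -> codeword
        tie = itertools.count()   # tie-breaker so equal probabilities never compare dicts
        heap = [(p, next(tie), {sym: ""}) for sym, p in probs.items()]
        heapq.heapify(heap)
        while len(heap) > 1:
            p1, _, codes1 = heapq.heappop(heap)   # two smallest probabilities
            p2, _, codes2 = heapq.heappop(heap)
            merged = {s: "0" + c for s, c in codes1.items()}      # prepend 0 on one branch
            merged.update({s: "1" + c for s, c in codes2.items()})  # prepend 1 on the other
            heapq.heappush(heap, (p1 + p2, next(tie), merged))
        return heap[0][2]

    probs = {0: 0.000, 1: 0.012, 2: 0.071, 3: 0.019,
             4: 0.853, 5: 0.023, 6: 0.019, 7: 0.003}
    codes = huffman_codes(probs)
    avg_len = sum(probs[s] * len(c) for s, c in codes.items())
    print(codes)
    print(round(avg_len, 3), "bits/pixel on average")   # 1.317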
Huffman encoding
Gray level   p       codeword   l
0            0.000   111111     6
1            0.012   11110      5
2            0.071   10         2
3            0.019   1101       4
4            0.853   0          1
5            0.023   1100       4
6            0.019   1110       4
7            0.003   111110     6

Digital Image Processing: A Practical Introduction Using Java, Nick Efford
Huffman encoding
Multiplying each codeword length l by its probability p:

Gray level   p       codeword   l   l*p
0            0.000   111111     6   0.000
1            0.012   11110      5   0.060
2            0.071   10         2   0.142
3            0.019   1101       4   0.076
4            0.853   0          1   0.853
5            0.023   1100       4   0.092
6            0.019   1110       4   0.076
7            0.003   111110     6   0.018

Digital Image Processing: A Practical Introduction Using Java, Nick Efford
Huffman encoding
Summing the l*p column gives the average codeword length:

0.000 + 0.060 + 0.142 + 0.076 + 0.853 + 0.092 + 0.076 + 0.018 = 1.317 bits per pixel

Since the original image used 3 bits per pixel, the compression ratio achieved by
Huffman coding is

Cr = 3 / 1.317 = 2.28

That means there is about 56% data redundancy (1 - 1/2.28 ≈ 0.56).

Digital Image Processing: A Practical Introduction Using Java, Nick Efford
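A quick check of the arithmetic, using the values from the worked example above:

    avg_len = 1.317                 # average Huffman codeword length, bits per pixel
    original_bits = 3               # the image was originally coded with 3 bits per pixel
    cr = original_bits / avg_len    # compression ratio
    redundancy = 1 - 1 / cr         # relative data redundancy
    print(round(cr, 2), round(redundancy, 2))   # 2.28 0.56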
Run-length-encoding (RLE)
Consider the following
aaaaaaaabbbaaaaaaaabbbba
We could instead represent this by “runs”
8a,3b,8a,4b,1a
RLE is often more compact, especially when data contains lots of runs
of the same number(s)
RLE is also lossless: you can always reconstruct the original exactly
Fax machines transmit RLE-encoded line scans of black-and-white documents
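A minimal run-length encoder and decoder in Python (a sketch for illustration, not the encoding used by any particular fax standard):

    from itertools import groupby

    def rle_encode(s):
        # "aaaab" -> [(4, 'a'), (1, 'b')]
        return [(len(list(g)), sym) for sym, g in groupby(s)]

    def rle_decode(runs):
        # Exact inverse of rle_encode: RLE is lossless
        return "".join(sym * count for count, sym in runs)

    runs = rle_encode("aaaaaaaabbbaaaaaaaabbbba")
    print(runs)   # [(8, 'a'), (3, 'b'), (8, 'a'), (4, 'b'), (1, 'a')]
    assert rle_decode(runs) == "aaaaaaaabbbaaaaaaaabbbba"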
Quantization (Lossy)
Another thing we can do is actually quantize the data such that it cannot be
recovered completely
Consider the following string of numbers:
ORIGINAL = {10, 12, 15, 18, 20, 3, 5, 2, 13}
(all the numbers are unique so Huffman coding won’t help)
Integer divide this by 5 (i.e. quantize it)
QUANTIZED = {2, 2, 3, 3, 4, 0, 1, 0, 2}
The natural binary range is smaller; we could also apply Huffman encoding to get a
bit more compression.
Of course, the values are not the same; on "reconstruction" (multiply by 5) we get
only an approximation of the original:
RECOVERED = {10, 10, 15, 15, 20, 0, 5, 0, 10}
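The quantize / reconstruct round trip from this example, as a short Python sketch:

    original = [10, 12, 15, 18, 20, 3, 5, 2, 13]
    step = 5

    quantized = [v // step for v in original]   # integer divide by the quantization step
    recovered = [q * step for q in quantized]   # reconstruction is only an approximation

    print(quantized)   # [2, 2, 3, 3, 4, 0, 1, 0, 2]
    print(recovered)   # [10, 10, 15, 15, 20, 0, 5, 0, 10]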
Lossy vs. Lossless
• For things like text documents and computer data files, lossy compression
doesn’t make sense
– An approximation of the original is no good!
• But for data like audio or images, small errors are not easily detected by
our senses
– An approximation is acceptable
• This is one reason we can get significant compression of images and audio,
vs. other types of data
– Lossless 10:1 is typically possible
– Lossy 300:1 is possible with no significant perceptual loss
• With lossy we can even talk about “quality”
– The more like the original, the higher the quality
– The less like the original, the lower the quality