
Multimedia Systems

Chapter 7: Data Compression (1)


References
• Chapter 7 from our Textbook: “Multimedia
Fundamentals: Media Coding and Content
Processing”
• Slides from the reference book:
Fundamentals of Multimedia
Outline
• Introduction
• Motivation for compression
• Coding requirements
• Compression types
• General Data Compression Scheme
• Compression Techniques
• Entropy Encoding
– Run-Length Encoding
– Huffman Coding

Introduction
• Video and audio have much higher storage
requirements than text
• Data transmission rates (bandwidth requirements) for
sending continuous media are considerably higher
than for text
• Efficient compression of audio and video
data, including some compression standards,
will be considered in this chapter
Motivation for Compression
• Terminology
– 1 kbit = 1000 bit
– 1 Kbit = 1024 bit (= 2^10)
– 1 Mbit = 1024 x 1024 bit (= 2^10 x 2^10 = 2^20)
• Discrete Data: Considering a small window of 640 x
480 pixels on a display
– Text
– Vector Image
– Bitmap Image
• Continuous Data: Required storage space per
second
– Uncompressed speech of telephone quality
– Uncompressed stereo audio signal of CD quality
– Video sequence
Motivation for Compression:
Discrete Data
• Text
– Assuming 2 bytes are used for every 8 x 8 pixel character,
• Characters per screen page = ...
• Storage required per screen page = ...

• Vector Image
– Assuming that a typical image consists of 500 lines, each of which
is defined by its coordinates in the x direction and the y direction,
and an 8-bit attribute field
• Coordinates in the x direction require ...
• Coordinates in the y direction require ...
• Bits per line = ...
• Storage required per screen page
Motivation for Compression: Discrete Data
• Bitmap Image
– Assuming using 256 colors requiring a single byte per pixel
• Storage required per screen page = ...
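A possible worked computation of the blanks above, sketched in Python. The window size, character size, number of lines, and color depth come from the slides; treating each vector line as two (x, y) endpoints plus an 8-bit attribute is an assumption, since the slide leaves the exact line format open.

import math

WIDTH, HEIGHT = 640, 480

# Text: 8 x 8 pixel characters, 2 bytes per character
chars_per_page = (WIDTH // 8) * (HEIGHT // 8)   # 80 * 60 = 4800 characters
text_bytes = 2 * chars_per_page                 # 9600 bytes per screen page

# Vector image: 500 lines, each assumed two endpoints + 8-bit attribute
x_bits = math.ceil(math.log2(WIDTH))            # 10 bits to address 0..639
y_bits = math.ceil(math.log2(HEIGHT))           # 9 bits to address 0..479
bits_per_line = 2 * (x_bits + y_bits) + 8       # 46 bits per line
vector_bytes = 500 * bits_per_line / 8          # 2875 bytes per screen page

# Bitmap image: 256 colors -> 1 byte per pixel
bitmap_bytes = WIDTH * HEIGHT                   # 307,200 bytes = 300 KByte

print(text_bytes, vector_bytes, bitmap_bytes)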

Motivation for Compression: Continuous Data
• Uncompressed speech of telephone
quality
– Assuming sampling at 8 kHz, quantization using 8 bits per
sample, yielding a data stream of 64 Kbit/s
• Storage space required per second = ...
Motivation for Compression: Continuous Data
• Uncompressed stereo audio signal of
CD quality
– Assuming sampling at 44.1 kHz, quantization using 16 bits
per sample
• Data rate = ...
• Storage space required per second = ...
Motivation for Compression: Continuous Data
• Video sequence
– Assuming 25 full frames per second, luminance
and chrominance of each pixel are coded using
3 bytes, luminance sampled at 13.5 MHz while
chrominance (R-Y and B-Y) is sampled at 6.75
MHz, each, and samples are uniformly coded
using 8 bits.
• Bandwidth = ...
• Data Rate = ...
• Storage space required per second = ...
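A sketch of the continuous-media numbers, using only the sampling rates and sample sizes stated above (1 Kbit = 1024 bit and 1 KByte = 1024 byte, per the terminology slide):

# Telephone-quality speech: 8 kHz sampling, 8 bits per sample
speech_bits_per_s = 8000 * 8                    # 64,000 bit/s
speech_bytes_per_s = speech_bits_per_s // 8     # 8000 bytes/s ~ 7.8 KByte/s

# CD-quality stereo audio: 44.1 kHz sampling, 16 bits, 2 channels
cd_bits_per_s = 44_100 * 16 * 2                 # 1,411,200 bit/s ~ 1.35 Mbit/s
cd_bytes_per_s = cd_bits_per_s // 8             # 176,400 bytes/s ~ 172 KByte/s

# Video: luminance at 13.5 MHz, two chrominance components at 6.75 MHz each,
# every sample uniformly coded with 8 bits
video_bits_per_s = (13_500_000 + 2 * 6_750_000) * 8   # 216,000,000 bit/s
video_bytes_per_s = video_bits_per_s // 8             # 27,000,000 bytes/s

print(speech_bytes_per_s / 1024)        # KByte per second of speech
print(cd_bits_per_s / (1024 * 1024))    # Mbit per second of CD audio
print(video_bits_per_s / 1e6)           # 216.0 Mbit per second of video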
Motivation for Compression:
Continuous Data
• Processing uncompressed video data streams
requires
– Storage space in the gigabyte range
– Buffer space in the megabyte range
– Data transfer rates of 140 Mbit/s [per
unidirectional connection]
• These requirements can be considerably
lowered by employing compression
Can Multimedia Data be
Significantly Compressed?
• Redundancy can be exploited to do
compression
• Spatial redundancy
– correlation between neighboring pixels in
image/video
• Spectral redundancy
– correlation among colors
• Psycho-visual redundancy
– Perceptual properties of human visual system
What Makes “Good” Compression
• Quality of compressed and decompressed data
should be as good as possible
• Compression/decompression process should be as
simple as possible
• Decompression time must not exceed certain
thresholds
• Compression/decompression requirements can be divided into
– Dialogue mode (video conferencing)
– Retrieval mode (digital libraries)
– Both
Coding Requirements: Dialogue
Mode
• End-to-end delay should not exceed 150 ms
for compression and decompression alone.
– Ideally, compression and decompression should
not exceed 50 ms in order to ensure a natural
dialogue.
• In addition, the total delay includes
– delay in the network,
– communications protocol processing in the end system,
– data transfer to and from the respective input and output
devices.

Coding Requirements: Retrieval Mode
• Fast forward and fast rewind with simultaneous
display (or playback) of the data should be
possible
• Random access to single images or audio passages
in a data stream should be possible in less than 0.5 s.
– Maintains interaction aspects in retrieval systems
• Decompression of images, video or audio
passages should be possible without interpreting
all preceding data.
– Allows random access and editing
Coding Requirements: Both Modes
• Support display of the same data in different systems
– Formats have to be independent of frame size and video frame rate
• Audio and video compression should support different
data rates at different qualities
• Precisely synchronize audio and video
• Support for economical solution
– Software
– Few VLSI chips
• Enable cooperation of different systems
– Data generated on a multimedia system can be reproduced on
another system (e.g. course materials).
Compression Types
• Physical versus logical Compression
– Physical
• Performed on data regardless of what information it contains
• Translates a series of bits to another series of bits
– Logical
• Knowledge-based
– e.g. United Kingdom to UK

• Spatial Compression – 2D or single image


• Temporal Compression – 3D or video
• Codec – Compression / Decompression
• Color / intensity: the same techniques apply to color and to gray-level (intensity) data
Compression Types
• Symmetric
– Compression and decompression roughly use the same
techniques and take just as long
– Data transmission which requires compression and
decompression on-the-fly will require these types of
algorithms
• Asymmetric
– Most common is where compression takes a lot more
time than decompression
• In an image database, each image will be compressed once and
decompressed many times
– Less common is where decompression takes a lot more
time than compression
• Creating many backup files which will hardly ever be read

Compression Types
• Non-adaptive
– Contain a static dictionary of predefined
substrings to encode which are known to occur
with high frequency
• Adaptive
– Dictionary is built from scratch
• Lossless
– decompress(compress(data)) = data
– Used for computer data, medical images, etc.
Compression Types
• Lossy
– decompress(compress(data)) ≠ data
– Some distortion
– A small change in pixel values may be invisible
– Suited for audio and video
General Data Compression Scheme

Input Data → Encoder (compression) → Codes / Codewords → Storage or Networks →
Codes / Codewords → Decoder (decompression) → Output Data

B0 = # bits required before compression
B1 = # bits required after compression
Compression Ratio = B0 / B1
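As a quick illustration of the ratio (the byte counts below are made-up example values):

def compression_ratio(b0: int, b1: int) -> float:
    # B0 = bits required before compression, B1 = bits after compression
    return b0 / b1

# e.g. a 307,200-byte bitmap compressed down to 30,720 bytes
print(compression_ratio(307_200 * 8, 30_720 * 8))   # 10.0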

Compression Techniques
Coding Type      | Basis               | Technique
-----------------|---------------------|--------------------------
Entropy Encoding |                     | Run-Length Coding
                 |                     | Huffman Coding
                 |                     | Arithmetic Coding
Source Coding    | Prediction          | DPCM
                 |                     | DM
                 | Transformation      | FFT
                 |                     | DCT
                 | Layered Coding      | Bit Position
                 |                     | Subsampling
                 |                     | Sub-band Coding
                 | Vector Quantization |
Hybrid Coding    |                     | JPEG
                 |                     | MPEG
                 |                     | H.263
                 |                     | Many Proprietary Systems

Compression Techniques
• Entropy Coding
– Semantics of the information to be encoded are ignored
– Lossless compression technique
– Can be used for different media regardless of their
characteristics
• Source Coding
– Takes into account the semantics of the information to be
encoded.
– Often lossy compression technique
– Characteristics of medium are exploited
• Hybrid Coding
– Most multimedia compression algorithms are hybrid techniques

Entropy Encoding
• Information theory is a discipline in applied mathematics involving
the quantification of data with the goal of enabling as much data as
possible to be reliably stored on a medium and/or communicated
over a channel.
• According to Claude E. Shannon, the entropy η (eta) of an
information source with alphabet S = {s1, s2, ..., sn} is defined as

  η = Σ_{i=1}^{n} p_i log2(1/p_i) = − Σ_{i=1}^{n} p_i log2(p_i)

  where p_i is the probability that symbol s_i in S will occur.

Entropy Encoding
• In science, entropy is a measure of the disorder of a
system.
– More entropy means more disorder
– Negative entropy is added to a system when more order is
given to the system.
• The measure of data, known as information entropy, is
usually expressed by the average number of bits needed
for storage or communication.
– The Shannon Coding Theorem states that the entropy is the best
we can do (under certain conditions). i.e., for the average length
of the codewords produced by the encoder, l’,
η ≤ l’
Entropy Encoding
• Example 1: What is the entropy of an image with
uniform distributions of gray-level intensities (i.e.
pi = 1/256 for all i)?
• Example 2: What is the entropy of an image
whose histogram shows that one third of the
pixels are dark and two thirds are bright?
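A small sketch that evaluates both questions with the entropy formula above:

import math

def entropy(probs):
    # Shannon entropy in bits: sum over i of p_i * log2(1 / p_i)
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

# Example 1: uniform distribution over 256 gray levels
print(entropy([1 / 256] * 256))    # 8.0 bits per pixel

# Example 2: one third dark pixels, two thirds bright pixels
print(entropy([1 / 3, 2 / 3]))     # ~0.918 bits per pixel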
Entropy Encoding: Run-Length
• Data often contains sequences of identical bytes.
Replacing these repeated byte sequences with the
number of occurrences reduces considerably the
overall data size.
• Many variations of RLE
– One form of RLE is to use a special marker byte (M-byte)
indicating that the byte following it is the number of
occurrences of the preceding character
• “c”!# (character, M-byte, count)
– How many bytes are used above? When do you think the M-byte
should be used?

Entropy Encoding: Run-Length
• ABCCCCCCCCDEFGGG is encoded as ABC!8DEFGGG
– Note: This encoding is DIFFERENT from what is mentioned in your book

– What if the string contains the “!” character?


– What is the compression ratio for this example?
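A minimal sketch of this marker-based RLE, assuming "!" as the M-byte and single-digit run lengths; escaping a literal "!" is deliberately left open, which is exactly what the first question above asks about:

def rle_encode(s: str, marker: str = "!", min_run: int = 4) -> str:
    out, i = [], 0
    while i < len(s):
        j = i
        while j < len(s) and s[j] == s[i]:
            j += 1                     # extend the run of identical bytes
        run = j - i
        if min_run <= run <= 9:        # worth paying the 3-byte "c!n" cost
            out.append(f"{s[i]}{marker}{run}")
        else:                          # short runs (and runs > 9) stay literal
            out.append(s[i] * run)
        i = j
    return "".join(out)

encoded = rle_encode("ABCCCCCCCCDEFGGG")
print(encoded)               # ABC!8DEFGGG
print(16 / len(encoded))     # compression ratio 16/11 ~ 1.45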

• Many variations of RLE :


– Zero-suppression: one character that is repeated very
often is the only character encoded with RLE. Only the
M-byte and the number of additional occurrences are
stored.
• When do you think the M-byte should be used, as opposed to
using the regular representation without any encoding?
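A possible sketch of zero-suppression, assuming blanks are the frequently repeated character and that "!n" stands for a run of n + 1 blanks (one occurrence plus n additional ones, matching the wording above):

def zero_suppress(s: str, ch: str = " ", marker: str = "!") -> str:
    out, i = [], 0
    while i < len(s):
        if s[i] != ch:
            out.append(s[i])
            i += 1
            continue
        j = i
        while j < len(s) and s[j] == ch:
            j += 1
        run = j - i
        if run > 2:                          # beats the 2-byte "!n" pair
            out.append(f"{marker}{run - 1}")  # number of ADDITIONAL blanks
        else:
            out.append(ch * run)
        i = j
    return "".join(out)

print(zero_suppress("A     B"))   # "A!4B": five blanks -> marker + 4 more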
Entropy Encoding: Run-Length
• Many variations of RLE:
– If we are encoding black and white images (e.g.
Faxes), one such version is as follows:
(row#, col# run_1 begin, col# run_1 end, col# run_2 begin, col# run_2 end, ..., col# run_k begin, col# run_k end)
(row#, col# run_1 begin, col# run_1 end, col# run_2 begin, col# run_2 end, ..., col# run_r begin, col# run_r end)
...
(row#, col# run_1 begin, col# run_1 end, col# run_2 begin, col# run_2 end, ..., col# run_s begin, col# run_s end)
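A sketch of this row-wise encoding for a small binary image (the helper name encode_rows is ours, not from the book):

def encode_rows(image):
    # image: rows of 0/1 ints -> list of (row#, begin, end, begin, end, ...)
    encoded = []
    for r, row in enumerate(image):
        entry, c = [r], 0
        while c < len(row):
            if row[c] == 1:                # start of a run of black pixels
                start = c
                while c < len(row) and row[c] == 1:
                    c += 1
                entry.extend([start, c - 1])   # begin and end columns
            else:
                c += 1
        if len(entry) > 1:                 # all-white rows are omitted
            encoded.append(tuple(entry))
    return encoded

image = [[0, 1, 1, 0, 0, 1],
         [1, 1, 0, 0, 0, 0]]
print(encode_rows(image))   # [(0, 1, 2, 5, 5), (1, 0, 1)]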
Entropy Encoding: Huffman Coding
• One form of variable length coding
• Greedy algorithm
• Has been used in fax machines, JPEG and
MPEG
Algorithm huffman
Input: A set C = {c1, c2, ..., cn} of n characters and
their frequencies {f(c1), f(c2), ..., f(cn)}
Output: A Huffman tree (V, T) for C.
1. Insert all characters into a min-heap H according to their
frequencies.
2. V = C; T = {}
3. for j = 1 to n – 1
4.   c = deletemin(H)
5.   c' = deletemin(H)
6.   f(v) = f(c) + f(c')   // v is a new node
7.   Insert v into the min-heap H
8.   Add (v, c) and (v, c') to tree T, making c and c' children of v in T
9. end for
• Example
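A possible Python rendering of the greedy algorithm above, using heapq as the min-heap; the symbols and frequencies are made up. It also prints the average code length l' next to the entropy η, to illustrate the η ≤ l' < η + 1 bound discussed below:

import heapq
import math
from itertools import count

def huffman_codes(freqs):
    # freqs: dict symbol -> frequency; returns dict symbol -> bit string
    tiebreak = count()   # keeps heapq from ever comparing tree nodes
    heap = [(f, next(tiebreak), sym) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)    # the two least frequent nodes
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (left, right)))
    codes = {}
    def walk(node, code):
        if isinstance(node, tuple):          # internal node: recurse
            walk(node[0], code + "0")
            walk(node[1], code + "1")
        else:                                # leaf: assign the symbol's code
            codes[node] = code or "0"
    walk(heap[0][2], "")
    return codes

freqs = {"A": 15, "B": 7, "C": 6, "D": 6, "E": 5}
codes = huffman_codes(freqs)
print(codes)
total = sum(freqs.values())
avg_len = sum(freqs[s] / total * len(codes[s]) for s in freqs)
eta = -sum((f / total) * math.log2(f / total) for f in freqs.values())
print(f"eta = {eta:.3f} bits <= l' = {avg_len:.3f} bits < eta + 1")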
Entropy Encoding: Huffman Coding
• Most important properties of Huffman Coding
– Unique Prefix Property: No Huffman code is a
prefix of any other Huffman code
• For example, 101 and 1010 cannot both be Huffman codes. Why?
– Optimality: The Huffman code is a minimum-redundancy
code (given an accurate data model)
• The two least frequent symbols will have the same length for
their Huffman code, whereas symbols occurring more frequently
will have shorter Huffman codes
• It has been shown that the average code length of an information
source S is strictly less than η + 1, i.e.
  η ≤ l' < η + 1
Entropy Encoding: Adaptive
Huffman Coding
• The Huffman method assumes that the
frequencies of occurrence of all the
symbols of the alphabet are known
a priori.
– This is rarely the case in practice
– Semi-adaptive Huffman coding has been
employed where data is read twice, the first
pass being to determine the frequencies
Entropy Encoding: Adaptive Huffman Coding
• Disadvantage: Too slow for real-time applications
– Another solution is Adaptive Huffman Coding
• Employed by Unix’s “compact” program.
• Decoder “mirrors” the operations of the encoder, as
they both may occur at different times
• Main idea of the algorithm is as follows
– Encoder and Decoder both start with an empty Huffman
Coding Tree
• No symbol is assigned codes yet.
Entropy Encoding: Adaptive Huffman Coding
– First symbol read is written on the output stream in its
uncompressed form
• In fact, each character being read for the first time is written out
this way
– That is why we need an escape character to determine when we read an
uncompressed character for the first time.
– This escape character is denoted by NEW and given frequency 0 all the
time
– This symbol is then added to the tree and a code is
assigned to it.
Entropy Encoding: Adaptive
Huffman Coding
– The next time this symbol is encountered, its code is
written to the output stream and its frequency is
increased by 1.
– Since this modifies the tree, it is checked whether it is
still a Huffman tree
• If not, it will be rearranged, through swaps, and new
codes will be assigned
• Sibling Property
– Must be preserved during swaps
Entropy Encoding: Adaptive Huffman Coding
– All nodes are arranged in the order of increasing counts, left to right
and bottom to top.
– During a swap, the farthest node with count N is swapped with the
node whose count has just been increased to N + 1.

• Example
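A deliberately simplified sketch of this scheme: instead of maintaining the tree with sibling-property swaps as the real algorithm does, it naively rebuilds the tree from the current counts before every symbol, which is equivalent in spirit but far slower. NEW is the zero-frequency escape symbol described above.

import heapq
from itertools import count

NEW = "<NEW>"   # escape symbol announcing a first-time character

def build_codes(freqs):
    # same greedy Huffman construction as before, from the current counts
    tiebreak = count()
    heap = [(f, next(tiebreak), s) for s, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, a = heapq.heappop(heap)
        f2, _, b = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (a, b)))
    codes = {}
    def walk(n, c):
        if isinstance(n, tuple):
            walk(n[0], c + "0")
            walk(n[1], c + "1")
        else:
            codes[n] = c or "0"
    walk(heap[0][2], "")
    return codes

def adaptive_encode(text):
    counts = {NEW: 0}                  # NEW keeps frequency 0 all the time
    out = []
    for ch in text:
        codes = build_codes(counts)    # the tree as the decoder also sees it
        if ch in counts:
            out.append(codes[ch])      # known symbol: emit its current code
        else:
            out.append(codes[NEW])     # escape first ...
            out.append(repr(ch))       # ... then the symbol, uncompressed
            counts[ch] = 0
        counts[ch] += 1                # this changes the tree for next time
    return " ".join(out)

print(adaptive_encode("ABABBA"))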
