Digital Image Processing - Unit 5 - ASRao

The document discusses image compression, detailing its purpose, methods (lossless and lossy), and the importance of reducing data size while retaining quality. It covers various types of redundancies, compression ratios, and fidelity criteria, including objective and subjective measures of image quality. Additionally, it explains the image compression model, including the roles of encoding, quantization, and coding techniques like Huffman coding.

Allanki Sanyasi Rao

AMIE; M.Tech; (Ph.D); MISTE; MIETE


Associate Professor & HOD
Dept. of ECE
“Without compression, a CD stores only 200 pictures or 8 seconds of movie”
Introduction
 Image compression is a process that reduces the size of a digital image
file while retaining as much quality as possible.
 It's a type of data compression that's used to make images easier to
store, transmit, and display.
Purpose
Image compression reduces the amount of data needed to represent an
image, which can help with storage space, transmission speed, and ease of
computation.
Methods
There are two main types of image compression methods: Lossless and
Lossy. Lossless compression is often used for archival purposes or in medical
imaging, while lossy compression is often used for photographs.
Lossy compression
This method permanently removes less critical information from an image,
which can significantly reduce the file size. However, it can also reduce
image quality to the point of distortion, especially if the image is overly
compressed.
Fundamentals

 Data Compression: It refers to the process of reducing the amount of data required to represent a given quantity of information.

 Data Vs Information
 Data and information are not synonymous terms…!
 Data refers to the raw, unprocessed bits and bytes that make up an image. It's the numerical representation of the image's pixels, including their color values, intensity, and spatial relationships.
 Information, on the other hand, refers to the meaningful content or knowledge that can be extracted from the data.
 The same information can be represented by different amounts of data – for example:
Example 1: Your brother, Mr. Ram, will meet you at IGI airport in New Delhi at 6 pm tomorrow night.
Example 2: Your brother will meet you at IGI airport at 6 pm tomorrow night.
Example 3: Mr. Ram will meet you at IGI at 6 pm tomorrow.

Data ≠ Information

DATA = INFORMATION + REDUNDANT DATA
 The different ways to represent this information can be either:
 Through the pixel intensities (spatial domain), or
 Through extracting features of that image (a higher level of information, e.g. the edges of an image), or
 Through simply noting the frequencies of the intensity values existing in the image along with their locations (frequency domain), etc.

 Data compression aims to reduce the amount of data while preserving as much information as possible.

 Because various amounts of data can be used to represent the same amount of information, representations that contain irrelevant or repeated information are said to contain redundant data.

 To understand the need for image compression, consider the following scenario:
Mr. Ram wishes to download the X-Men movie in 720p (1280 × 720) format. The movie is 2 hours long and was recorded at 60 fps (frames per second).
1. Calculate the amount of storage required to store this movie.
2. How many dual-layer Blu-ray disks would be required to store this movie?
3. How much time would it take to transmit this video over a broadband connection with a bandwidth of 100 Mbps?

 A video is a sequence of video frames where each frame is a full-color image.
 Storage requirement of 1 second of video:
   1 pixel of an RGB frame → 3 Bytes
   1 video frame → 1280 × 720 × 3 B
   1 second → 60 fps × (1280 × 720) ppf × 3 B ≈ 158.2 MB
   (fps – frames per second, ppf – pixels per frame, B – Bytes per pixel)
 Storage requirement of the complete 2-hour movie:
   158.2 MB × 60 × 60 × 2 ≈ 1194 GB
 Number of dual-layer (50 GB) Blu-ray disks required:
   1194 / 50 ≈ 24
 Time required to transmit the complete 2-hour movie:
   100 Mbps → 100 × 1024 × 1024 bits in 1 sec
   (1194 × 1024 × 1024 × 1024 × 8 bits) / (100 × 1024 × 1024 bits/sec) ≈ 27.17 hours
 This explains the power of, and need for, image compression.
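
As a quick check of the arithmetic above, the sketch below reproduces the slide's numbers (it mirrors the slide's mixed use of decimal GB for storage and binary multiples for the link speed):

```python
# Sketch reproducing the slide's storage and transmission-time calculation.
width, height, bytes_per_pixel = 1280, 720, 3
fps, duration_s = 60, 2 * 60 * 60

total_bytes = width * height * bytes_per_pixel * fps * duration_s
total_gb = total_bytes / 1e9                       # ~1194 GB
blu_rays = total_gb / 50                           # ~24 dual-layer 50 GB disks

total_bits = round(total_gb) * 1024**3 * 8         # as computed on the slide
tx_seconds = total_bits / (100 * 1024**2)          # 100 Mbps link
print(round(total_gb), round(blu_rays), round(tx_seconds / 3600, 2))  # 1194 24 27.17
```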
Areas of Application
 Web page images and high-resolution digital camera photos are compressed to save storage space and reduce transmission time.
 In addition to these applications, image compression plays an important role in many other areas, including:
 Tele-video conferencing
 Broadcast TV via satellite
 Military communications
 Computer communications
 Storing weather maps
 Remote sensing
 Document and medical imaging
 Fax transmission, etc.
 Image compression involves reducing the size of image data files while retaining necessary information.
 What counts as necessary information depends upon the application.
 The reduced file created by the compression process is called the compressed file and is used to reconstruct the image, resulting in the decompressed image.
 The original image, before any compression is performed, is called the uncompressed image file.
 The ratio of the size of the original, uncompressed image file to the size of the compressed file is referred to as the compression ratio.
Compression Ratio

 Let n1 and n2 denote the number of bits in two representations of the same information, where n1 and n2 are the number of bits before and after compression, respectively.

 Compression Ratio, C = n1 / n2 = Uncompressed File Size / Compressed File Size

 Relative Data Redundancy of the representation with n1 bits, RD = 1 − 1/C

Information   Data 1      Data 2   Data 3
0             0000 0000   000      00
1             0000 0001   001      01
2             0000 0010   010      10
3             0000 0011   011      11

 In the table, Data 1 uses 8 bits, while Data 2 and Data 3 use 3 and 2 bits, respectively, making 5 and 6 of the 8 bits redundant. Image compression aims to remove such redundancy.
 A compression ratio of 10:1 means that 1 bit of compressed data represents 10 bits of the original uncompressed data, effectively reducing storage needs.
 With this compression ratio (10:1), the relative redundancy works out to 0.9, i.e. 90%, meaning that before compression 90% of the data was redundant (unwanted).
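
A minimal sketch of the two formulas, applied to the table rows and the 10:1 example above:

```python
# Compression ratio C and relative data redundancy RD.
def compression_stats(n1, n2):
    C = n1 / n2          # compression ratio
    RD = 1 - 1 / C       # relative redundancy of the n1-bit representation
    return C, RD

print(compression_stats(8, 3))   # Data 1 vs Data 2 -> (2.67, 0.625)
print(compression_stats(8, 2))   # Data 1 vs Data 3 -> (4.0, 0.75)
print(compression_stats(10, 1))  # the 10:1 example  -> (10.0, 0.9)
```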
Types of Data Redundancies
 Three basic data redundancies
 Coding Redundancy
 Interpixel (or) Spatial (or) Temporal Redundancy
 Psychovisual Redundancy (or) Irrelevant Information

The role of compression is to reduce one or more of these redundancy types.

 CR: some gray levels are more common than others.
 IR: the same gray level covers large areas.
 PVR: the eye can only resolve about 32 gray levels locally.
Coding Redundancy
 To avoid coding redundancy, codes should be selected according to the probabilities of the events.
 Idea: Variable Length Coding
 Assign fewer bits to the more probable events (i.e., gray levels in our case).

Let
  0 ≤ rk ≤ 1 : gray levels (discrete random variable)
  pr(rk) : probability of occurrence of rk
  nk : number of pixels in which rk appears in the image
  n : total number of pixels in the image
  L : number of gray levels
  l(rk) : number of bits used to represent rk
  Lavg : average length of the code words assigned to the gray levels

  Lavg = Σ (k = 0 to L−1) l(rk) pr(rk),  where pr(rk) = nk / n, k = 0, 1, ..., L−1

 Hence, the total number of bits required to code an M × N image is M · N · Lavg.
 If the original image is of size 100 × 100 and each pixel is represented with 8 bits, then it occupies 100 × 100 × 8 bits of space.
 With a variable-length code, Lavg = 0.25(2) + 0.47(1) + 0.25(3) + 0.03(3) = 1.81, i.e. on average only 1.81 bits are sufficient to represent each pixel intensity. So the required space is 100 × 100 × 1.81 bits – almost 4 times less space is required for storage.
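
A small sketch that plugs the probabilities and code lengths quoted above into the Lavg formula (the four probability/length pairs and the 100 × 100, 8-bit image are taken from the slide):

```python
# Average code length for a variable-length code.
probs   = [0.25, 0.47, 0.25, 0.03]   # p_r(r_k)
lengths = [2, 1, 3, 3]               # l(r_k) in bits

L_avg = sum(p * l for p, l in zip(probs, lengths))   # 1.81 bits/pixel
bits_fixed    = 100 * 100 * 8        # 8-bit fixed-length code, 100x100 image
bits_variable = 100 * 100 * L_avg
print(L_avg, round(bits_fixed / bits_variable, 2))   # 1.81, ~4.42x smaller
```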
Inter Pixel (or) Spatial (or) Temporal Redundancy

 In the spatial domain, images often have patterns where intensity values remain constant within horizontal strips.
 To reduce the interpixel redundancy, a mapping is used. The mapping scheme can be selected according to the properties of the redundancy.
 An example of mapping is to map the pixels of an image f(x,y) to a sequence of pairs: (g1,r1), (g2,r2), ..., (gi,ri), ...
   gi: ith gray level    ri: run length of the ith run
 For example, a line with 100 pixels of the same intensity value can be represented more efficiently.
 Instead of using 800 bits (100 pixels × 8 bits), a coding convention such as (5,100) can be used, indicating that the intensity value 5 repeats for 100 pixels. This reduces the required storage from 800 bits to just 16 bits, exploiting spatial redundancy to represent the same information with less data.

 Spatial redundancy refers to the similarity between neighboring pixels in an image.
 Temporal redundancy refers to the similarity between consecutive frames in a video sequence.
Psychovisual Redundancy (or) Irrelevant Information

 The concept of Psychovisual redundancy involves removing irrelevant


information from an image that doesn't significantly impact human
visualization or interpretation. In an image with a dominant intensity
value, such as 128, adjacent intensity values like 127, 129, or 130 may
not make a noticeable difference to the viewer. By representing the
image with a single intensity value, the irrelevant information is
neglected, allowing for data compression.
 If the image will only be used for visual observation (e.g. illustrations on the web), much of the information is usually psychovisually redundant. It can be removed without changing the perceived visual quality of the image. This kind of compression is usually irreversible.
Fidelity Criteria
 Fidelity Criteria is a measure of how well a compressed image retains its
original quality and characteristics.
 It evaluates the difference between the original and compressed images,
ensuring that the compressed image is a faithful representation of the
original.
 Divided into 2 classes:
 Objective Fidelity Criteria
 Subjective Fidelity Criteria
Objective Fidelity Criteria:

 Objective fidelity criteria are quantitative measures that evaluate the difference between the original and compressed images using mathematical formulas. These metrics are based on numerical calculations and do not involve human subjective evaluation.
 The smaller the value of the error metric, the better the compressed image.
Examples of objective fidelity criteria include:
Mean Squared Error (MSE): calculates the average squared difference
between the original and compressed images.
Peak Signal-to-Noise Ratio (PSNR): evaluates the ratio of the maximum
possible signal to the noise introduced during compression.
Mean Absolute Error (MAE): calculates the average absolute difference
between the original and compressed images.

Example:
Suppose we have an original image with pixel values: [10, 20, 30, 40, 50]
After compression, the image has pixel values: [12, 19, 31, 39, 51]
MSE Calculation:
Calculate the differences between the compressed and original pixel values: [2, -1, 1, -1, 1]
Square each difference: [4, 1, 1, 1, 1]
Calculate the average: (4 + 1 + 1 + 1 + 1) / 5 = 8/5 = 1.6
PSNR Calculation:
Take the maximum possible signal value (255 for 8-bit images): 255
Take the MSE (1.6)
Calculate PSNR: 10 · log10(255^2 / 1.6) ≈ 46.09 dB
MAE Calculation:
Calculate the absolute differences between the original and compressed pixel values: [2, 1, 1, 1, 1]
Calculate the average: (2 + 1 + 1 + 1 + 1) / 5 = 6/5 = 1.2
 In this example, the MSE, PSNR, and MAE values indicate the difference
between the original and compressed images. A lower MSE and MAE,
and a higher PSNR, indicate better fidelity and a more accurate
representation of the original image.
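
A short sketch computing the three objective metrics for the 5-pixel example above:

```python
import math

original   = [10, 20, 30, 40, 50]
compressed = [12, 19, 31, 39, 51]

diffs = [c - o for o, c in zip(original, compressed)]
mse  = sum(d * d for d in diffs) / len(diffs)       # 1.6
mae  = sum(abs(d) for d in diffs) / len(diffs)      # 1.2
psnr = 10 * math.log10(255**2 / mse)                # ~46.09 dB for 8-bit data
print(mse, mae, round(psnr, 2))
```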

Subjective Fidelity Criteria:

 Subjective fidelity criteria involve human evaluation and perception of image quality. These metrics are based on human observers' opinions and are often used to evaluate the visual quality of an image.

 Examples of subjective and perceptual fidelity criteria include:
Mean Opinion Score (MOS): a panel of human observers rates the image quality on a scale of 1-5, and the average score is calculated.
Visual Information Fidelity (VIF): a computed metric, designed to model human perception, that evaluates the amount of visual information preserved in the compressed image.
Structural Similarity Index (SSIM): a computed metric, also modeled on human perception, that evaluates the similarity between the original and compressed images based on luminance, contrast, and structural features.

In summary:
- Objective fidelity criteria (MSE, PSNR, MAE) provide numerical measures of the difference between the original and compressed images.
- Subjective and perceptual fidelity criteria (MOS, VIF, SSIM) are based on, or modeled after, human perception of image quality, providing an assessment closer to what a viewer would report.
Image Compression Model

 As the figure shows, a compression system consists of two distinct structural blocks: an encoder and a decoder.
 The encoder creates a set of symbols (the compressed representation) from the input data.
 The data is transmitted over the channel and fed to the decoder.
 The decoder reconstructs the output signal from the coded symbols.
 The source encoder removes the input redundancies, and the channel encoder increases the noise immunity.

 Source encoder and decoder – reduce or eliminate any coding, interpixel and/or psychovisual redundancies in the input image.
 Channel encoder and decoder – play an important role when the channel is noisy or prone to error, by inserting “controlled redundancy”.
Three Stages:
Mapper:
 Transforms the input data into a (non-visual) format designed to reduce the interpixel redundancy in the image.
 May be reversible or irreversible.
 Transforms such as the DFT or DCT, or run-length coding, are used for the mapping. The DCT is the transform used in JPEG compression.
Quantizer:
 Reduces the accuracy of the mapper's output according to a predefined fidelity criterion.
 It is an irreversible process.
 Must be omitted in an error-free compression scheme.
Symbol coder:
 Generates a fixed- or variable-length code to represent the quantizer output, and maps the output in accordance with that code.
Huffman Coding
 It is the most popular technique for removing coding redundancy. It is a variable-length code.
 When coding the symbols of an independent source individually, it yields the smallest possible number of code symbols per source symbol.
 The first step is a series of source reductions: the probabilities of the symbols under consideration are ordered, and the two lowest-probability symbols are combined into a single symbol that replaces them in the next source reduction.
 The second step is to code each reduced source, starting with the smallest source and working backwards to the original source.
 The code is reversible without loss.

Example (symbol probabilities, partial):
0 – 2/25
1 – 4/25
2 – 3/25
3 – 6/25
...

Lavg = 2.92 bits/symbol
Entropy (H) = 2.89287 bits/symbol
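
The two-step reduce-and-assign procedure can be sketched with a priority queue: repeatedly merge the two least probable symbol groups and prepend one bit to every code in each merged group. The slide's probability table is only partially visible here, so the fifth symbol and its probability (10/25) below are assumed purely to make the probabilities sum to 1.

```python
import heapq

def huffman_codes(probs):
    """Build a Huffman code for {symbol: probability}; returns {symbol: bit string}."""
    # Each heap entry: (group probability, tie-breaker, {symbol: code so far})
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, grp1 = heapq.heappop(heap)          # two least probable groups
        p2, _, grp2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in grp1.items()}
        merged.update({s: "1" + c for s, c in grp2.items()})
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1
    return heap[0][2]

# 2/25, 4/25, 3/25, 6/25 are from the slide; 10/25 is an assumed fifth symbol.
probs = {"0": 2/25, "1": 4/25, "2": 3/25, "3": 6/25, "4": 10/25}
codes = huffman_codes(probs)
L_avg = sum(probs[s] * len(codes[s]) for s in probs)
print(codes, "L_avg =", round(L_avg, 2), "bits/symbol")
```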
Arithmetic Coding
 It is also a lossless compression algorithm, so we get the same image back at the output. Lossless algorithms are used wherever reliability/data preservation is the main factor, e.g. medical imaging.
 For arithmetic coding:
  - Sequences of source symbols are encoded together.
  - There is no one-to-one correspondence between source symbols and code words.
 Arithmetic coding takes in a complete data stream and outputs one specific codeword, which is a floating-point number between 0 and 1, with as few digits as possible. As the number of symbols in the message increases, the interval used to represent it becomes smaller: the longer the symbol stream, the more precision is needed in the output number.
 Huffman coding encodes source symbols one at a time, which might not be efficient, while arithmetic coding encodes sequences of source symbols into variable-length codewords.
 Arithmetic coding is a non-block code, which means that it does not generate individual codes for each character.
 In principle, an efficiency of 100% can be achieved using arithmetic coding, as Lavg approaches the entropy H; it thus satisfies Shannon's first theorem, which says the minimum average number of coding bits equals the entropy H. The practical difficulty with arithmetic coding is that long messages are hard to implement, because arbitrary-precision floating-point libraries/registers are not generally available.
Differences between Huffman coding and Arithmetic coding

1. Huffman coding is more popular; arithmetic coding is less popular.
2. Huffman coding is not always optimal; arithmetic coding is always optimal.
3. Huffman coding is a simpler technique; arithmetic coding is a more complex technique.
4. Huffman coding's compression ratio is poorer; arithmetic coding's compression ratio is very good.
5. Huffman coding's compression speed is fast; arithmetic coding's compression speed is slow.
6. Huffman coding requires more memory; arithmetic coding requires less.
7. Precision is not an important factor for Huffman coding; it is an important consideration for arithmetic coding.

 Arithmetic coding is generally used for speech processing and text applications.
Steps of Arithmetic coding

1. Arrange the characters of the string/numbers into ascending order.
2. Divide the numeric range 0 to 1 into the number of different symbols present in the message.
3. Expand the first letter to be coded along with its range.
4. Further subdivide this range into the number of symbols.
5. Repeat the procedure until the termination character is encoded.

 A sequence of source symbols is assigned a single arithmetic codeword which corresponds to a sub-interval in [0, 1].
 Smaller intervals require more information units (i.e., bits) to be represented.
Example
 A source emits the symbols {a, b, c, d} with probabilities 0.4, 0.2, 0.1 and 0.3 respectively. Construct an arithmetic code to encode and decode the word “dad”.

Encoding Process:
 Step 1: Our objective here is to encode the word ‘dad’. Initially, the range of intervals is assumed to be between 0 and 1. The individual symbol ranges within the interval 0 to 1 are shown below.
Step 2:
 Step 2a: The first symbol to transmit is ‘d’. Hence, the new range is from 0.7 to 1.0, as shown below.
 Step 2b: Find the sub-range for each symbol within the interval 0.7 to 1.0.
 The formulas used to compute the low and high range for each symbol within the interval are

   Low  = low + range * (low_range)
   High = low + range * (high_range)

 Here, low_range and high_range refer to the lower and upper bounds of the symbol's own sub-interval in [0, 1].

(I) To compute the interval for the symbol ‘a’ within the interval 0.7 to 1.0:
 First, the range is determined: range = 1.0 – 0.7 = 0.3
 Using the formula given, the low value for the symbol ‘a’ is computed as
   Low = low + range * (low_range)
 In this case, low = 0.7, range = 0.3 and low_range for symbol ‘a’ is 0.0. Substituting these values, the lower interval is
   Low = 0.7 + 0.3 * 0.0 = 0.7
 To determine the upper interval for the symbol, the formula is
   High = low + range * (high_range)
   High = 0.7 + 0.3 * 0.4 = 0.82
 For symbol ‘a’, the lower interval is 0.7 and the upper interval is 0.82, as illustrated below.

(II) To compute the interval for the symbol ‘b’ within the interval 0.7 to 1.0:
 The lower interval of the symbol ‘b’ is Low = 0.7 + 0.3 * 0.4 = 0.82
 The upper interval of the symbol ‘b’ is High = 0.7 + 0.3 * 0.6 = 0.88

(III) To compute the interval for the symbol ‘c’ within the interval 0.7 to 1.0:
 The lower interval of the symbol ‘c’ is Low = 0.7 + 0.3 * 0.6 = 0.88
 The upper interval of the symbol ‘c’ is High = 0.7 + 0.3 * 0.7 = 0.91
Step 3:
 The next symbol to be transmitted in the word ‘dad’ is ‘a’. Hence the new interval is 0.7 to 0.82, as illustrated below.

 Step 3a: Find the interval for each symbol within the range 0.7 to 0.82.

(I) To find the interval for the symbol ‘a’:
 The lower interval is given by Low = low + range * (low_range)
 In this case, range = 0.82 – 0.7 = 0.12
   Low = 0.7 + 0.12 * 0.0 = 0.7
 The upper interval is given by High = low + range * (high_range)
   High = 0.7 + 0.12 * 0.4 = 0.748

(II) To find the lower and upper interval for the symbol ‘b’:
   Low = 0.7 + 0.12 * 0.4 = 0.748
   High = 0.7 + 0.12 * 0.6 = 0.772

(III) To find the lower and upper interval for the symbol ‘c’:
   Low = 0.7 + 0.12 * 0.6 = 0.772
   High = 0.7 + 0.12 * 0.7 = 0.784
Step 4:
 To transmit the symbol ‘d’ in the word ‘dad’:
 For the next symbol ‘d’, its sub-interval within 0.7 to 0.82, namely 0.784 to 0.82, is taken as the new lower and upper interval, as illustrated below.

Step 5: Compute the tag value
 The tag is the average of the lower and upper interval bounds. Here, the lower bound is 0.784 and the upper bound is 0.82. Hence the tag value is
   Tag = (0.784 + 0.82) / 2 = 0.802
Decoding Procedure:
 The tag value and the symbol probabilities are sent to the receiver.
 After receiving the tag value, the decoder decodes the encoded data.
 The step-by-step decoding procedure is given below.

Step 1:
 The tag received by the receiver is 0.802. The initial interval is assumed to be between 0 and 1.
 The tag value is compared with the symbol sub-ranges. We find that 0.802 lies between 0.7 and 1.0, hence the corresponding decoded symbol is ‘d’.

Step 2:
 The decoded symbol in Step 1 is ‘d’, and hence the new interval is fixed as 0.7 to 1.0, as illustrated below.

Step 3:
 The new tag value is computed in this step. The new tag value is given by

   t* = (tag − low) / range

 Here the low value is 0.7 and the range is 1.0 – 0.7 = 0.3, hence the new tag value is

   t* = (0.802 − 0.7) / 0.3 = 0.34

Step 4:
 0.34 lies in the interval 0 to 0.4, so the decoded symbol is ‘a’. Hence the new interval changes to 0 to 0.4.
Step 5:
 The new tag value for the interval found in Step 4 is

   t* = (tag − low) / range = (0.34 − 0) / 0.4 = 0.85

 This tag value lies between 0.7 and 1.0, and hence the decoded symbol is again ‘d’.

 From Steps 1, 3 and 5, it is clear that the decoded word is ‘dad’.

Thank you very much – ‘dad’.
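
A minimal sketch of the interval-subdivision procedure worked through above, assuming the same symbol intervals a:[0, 0.4), b:[0.4, 0.6), c:[0.6, 0.7), d:[0.7, 1.0); it reproduces the tag 0.802 and decodes it back to ‘dad’:

```python
# Toy arithmetic coder for the 'dad' example (not a practical bit-level coder).
intervals = {"a": (0.0, 0.4), "b": (0.4, 0.6), "c": (0.6, 0.7), "d": (0.7, 1.0)}

def encode(message):
    low, high = 0.0, 1.0
    for sym in message:
        rng = high - low
        s_low, s_high = intervals[sym]
        low, high = low + rng * s_low, low + rng * s_high
    return (low + high) / 2                    # tag = midpoint of final interval

def decode(tag, length):
    out = []
    for _ in range(length):
        for sym, (s_low, s_high) in intervals.items():
            if s_low <= tag < s_high:
                out.append(sym)
                tag = (tag - s_low) / (s_high - s_low)   # renormalize the tag
                break
    return "".join(out)

tag = encode("dad")          # 0.802
print(tag, decode(tag, 3))   # 0.802 dad
```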
Error Free Compression

 Error-free compression, also known as lossless compression, is a type of


image compression that reduces the amount of data required to represent
an image without losing any of the original image data. This means that
the compressed image can be exactly reconstructed from the compressed
data, without any loss of information.

 Error-free compression is achieved by using techniques that take


advantage of the redundancy in the image data. Redundancy in this
context refers to the presence of duplicate or unnecessary information in
the image.

 There are several techniques used for error-free compression, including:
Run-length encoding (RLE): This technique replaces sequences of identical
pixels with a single pixel value and a count of the number of times it appears
in the sequence.
Huffman coding: This technique assigns variable-length codes to pixel
values based on their frequency of occurrence. More frequently occurring
values are assigned shorter codes, while less frequently occurring values are
assigned longer codes.
Lempel-Ziv-Welch (LZW) coding: This technique builds a dictionary of
frequently occurring patterns in the image and replaces each occurrence of
the pattern with a reference to the dictionary entry.
Arithmetic coding: This technique encodes the image data as a single number, a fraction n where 0.0 <= n < 1.0, whose sub-interval is determined by the probabilities of the pixel values in the sequence.
Run-length encoding (RLE)
 Every code word is made up of a pair (g, l) where g is the gray level and l is the number of pixels with that gray level (the length of the “run”).

 Example: 56 56 56 82 82 82 83 80 80 80 80 56 56 56 56 56
   creates the run-length code (56,3) (82,3) (83,1) (80,4) (56,5)

 The code is calculated row by row.
 Very efficient coding for binary data.
 It is important to know the position, and the image dimensions must be stored with the coded image.
 Used in most fax machines.
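
A small sketch of the row-wise (graylevel, run-length) mapping described above, using the example row:

```python
# Run-length encode one image row into (graylevel, length) pairs.
def rle_encode(pixels):
    runs = []
    for g in pixels:
        if runs and runs[-1][0] == g:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([g, 1])       # start a new (graylevel, length) pair
    return [tuple(r) for r in runs]

row = [56, 56, 56, 82, 82, 82, 83, 80, 80, 80, 80, 56, 56, 56, 56, 56]
print(rle_encode(row))   # [(56, 3), (82, 3), (83, 1), (80, 4), (56, 5)]
```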
Difference Coding

   f(xi) = xi            if i = 0
   f(xi) = xi − xi−1     if i > 0

 Example:
   Original:   56 56 56 82 82 82 83 80 80 80 80
   Code f(xi): 56  0  0 26  0  0  1 -3  0  0  0

 The code is calculated row by row.
 Both run-length coding and difference coding are reversible.
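
A matching sketch of difference coding and its inverse for the same example row:

```python
# Difference coding: first sample kept, then successive differences.
def diff_encode(row):
    return [row[0]] + [row[i] - row[i - 1] for i in range(1, len(row))]

def diff_decode(code):
    out = [code[0]]
    for d in code[1:]:
        out.append(out[-1] + d)       # cumulative sum restores the pixels
    return out

row = [56, 56, 56, 82, 82, 82, 83, 80, 80, 80, 80]
code = diff_encode(row)               # [56, 0, 0, 26, 0, 0, 1, -3, 0, 0, 0]
assert diff_decode(code) == row       # the coding is reversible
print(code)
```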
 These techniques are often combined and used in a variety of image compression formats, such as the Graphics Interchange Format (GIF) and the Portable Network Graphics (PNG) format.

 Error-free compression has several advantages, including:
Exact reconstruction: The compressed image can be exactly reconstructed from the compressed data, without any loss of information.
Reversible: The compression process is reversible, meaning that the original image can be recovered exactly from the compressed data.
No loss of detail: Error-free compression does not discard any of the original image data, so there is no loss of detail or image quality.

 However, error-free compression also has some limitations, including:
Limited compression ratio: Error-free compression algorithms typically achieve lower compression ratios than lossy compression algorithms, which means that the compressed image may not be as small as desired.
Computational complexity: Error-free compression algorithms can be computationally intensive, which can make them slower than lossy compression algorithms.
Lossy Compression
 Lossy compression is a data encoding method that reduces file size by removing non-essential details. Lossy image compression permanently removes the less essential data.
 Lossy compression is the most effective way to manage digital images, especially for web use, where it can reduce file sizes by up to 90% while maintaining a high-quality appearance.

Step 1: Initially, the lossy algorithm scans the image for data, focusing on variations you cannot see with the naked eye. Then, the quantization process groups similar values to simplify the data.

Step 2: The next step encodes the simplified data, using transforms such as the DCT to eliminate data redundancies. Techniques such as the DCT represent the image data efficiently.

Step 3: The final compressed image is much smaller, but it still has good enough quality for most uses, especially for web graphics and digital media.

 Lossy image compression is irreversible, and over-compression can lead to noticeable quality loss.
Lossy Compression Algorithms
Discrete Cosine Transform (DCT): It is most commonly used in JPEG compression. It converts image data into frequency components and discards the less significant ones to reduce size, as sketched below.
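
A minimal sketch of the idea, assuming an orthonormal 8×8 DCT built from scratch (real JPEG codecs use optimized library routines and standard quantization tables); the "lossy" step here simply zeroes all but the largest-magnitude coefficients:

```python
import numpy as np

def dct2(block):
    """2-D DCT-II of an NxN block via the separable orthonormal DCT matrix."""
    N = block.shape[0]
    k = np.arange(N)
    # C[k, n] = sqrt(2/N) * cos(pi * (2n + 1) * k / (2N)); first row scaled to sqrt(1/N)
    C = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * N))
    C[0, :] = np.sqrt(1.0 / N)
    return C @ block @ C.T

block = np.arange(64, dtype=float).reshape(8, 8)   # toy 8x8 "image" block
coeffs = dct2(block - 128)                         # level shift, then transform
# Crude lossy step: keep only the 8 largest-magnitude coefficients
threshold = np.sort(np.abs(coeffs).ravel())[-8]
coeffs[np.abs(coeffs) < threshold] = 0
print(np.count_nonzero(coeffs), "of 64 coefficients kept")
```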
Lossy Compression File Types

JPEG: Ideal for colorful photos, supports millions of colors, and has adjustable compression. JPEG balances image quality and file size for websites, social media, and email.

WebP: This Google-developed web image format outperforms PNG and JPEG in transparency and animation. It is built to speed up web image loading and compress images while maintaining visual quality.

MP3: It revolutionized music streaming and storage by compressing audio with little perceptible loss of quality. It reduces file sizes by removing audio frequencies less perceivable to human ears and offers various bitrates to balance quality and size.

MP4: The versatile MP4 format supports video, audio, and images, making it popular for online streaming, editing, and distribution. Its compatibility, quality, and size make it a multimedia standard.

HEIC (High-Efficiency Image Format): Used in newer Apple devices, it provides high-quality images at small file sizes for modern photography storage. It supports advanced features such as live photos and depth maps.

TIFF (Tagged Image File Format): Although TIFF is known for lossless compression, its lossy variant is helpful in graphics and publishing for complex layers and pages. It can store multiple layers and pages for detailed image editing.
Pros of Lossy Compression:

Size reduction: Lossy compression techniques effectively reduce image file sizes and improve storage space efficiency.
Bandwidth efficiency: Lossy compression reduces file sizes and bandwidth usage, making it ideal for web browsing when bandwidth is limited.
Faster loading: Faster page loads and a better user experience result from smaller image sizes.
Versatile use cases: Lossy compression is suitable for photos, videos, and audio files where minor quality loss is acceptable.
Custom compression: You can select the compression level, balancing file size and image quality based on your requirements.
Supports many formats: Supports popular image formats such as JPEG, which is widely used on various platforms.

Cons of Lossy Compression:

Quality loss: Lossy compression can reduce image quality and cause artifacts, especially at high compression ratios. It gets worse when you edit and re-save lossy compressed images.
Permanent data loss: If you compress an image with lossy compression, you cannot regain the lost data. It is gone permanently.
Less professional: Lossy compression may not meet the high fidelity standards of graphic designers, artists, and photographers due to image quality compromises.
Predictive Coding
 It is an approach that achieves good compression without significant computational overhead.
 It is based on eliminating the interpixel redundancies of closely spaced pixels by extracting and coding only the new information in each pixel.
 The new information of a pixel is defined as the difference between the actual and predicted value of that pixel.
 There are two types:
 Lossless predictive coding
 Lossy predictive coding

Lossless Predictive Coding
 It consists of an encoder and a decoder.
 An identical predictor is present in the encoder and the decoder.
 Successive samples of the input image f(x,y) are fed to the encoder.
 The predictor generates an anticipated value based on previous samples.
 The output of the predictor is rounded to the nearest integer, denoted f^(n), and used to form the difference or prediction error
   e(n) = f(n) − f^(n)
which is encoded using a variable-length code to generate the next element of the compressed data stream.
 The output image sequence is regenerated by
   f(n) = e(n) + f^(n)
Q: Perform the encoding and decoding of lossless predictive coding for [93, 95, 91, 97, 98]

Solution: Encoder
 Initially the predictor value is taken as 0.
 One sample is taken at a time and coded.
 The error sequence generated by the encoder is [93, 2, -4, 6, 1].

Decoder: The decoder receives the error signal, with the predictor initially reset to 0.
 The output generated, [93, 95, 91, 97, 98], is exactly equal to the input [93, 95, 91, 97, 98].
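
A minimal sketch of the previous-sample predictor used in this worked example (prediction = previously reconstructed sample, initial prediction 0):

```python
# Lossless predictive coding with a previous-sample predictor.
def lossless_pred_encode(samples):
    errors, pred = [], 0
    for f in samples:
        errors.append(f - pred)   # prediction error e(n) = f(n) - f^(n)
        pred = f                  # next prediction is the current sample
    return errors

def lossless_pred_decode(errors):
    samples, pred = [], 0
    for e in errors:
        f = e + pred              # f(n) = e(n) + f^(n)
        samples.append(f)
        pred = f
    return samples

e = lossless_pred_encode([93, 95, 91, 97, 98])   # [93, 2, -4, 6, 1]
print(e, lossless_pred_decode(e))                # decodes back exactly
```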
Lossy Predictive Coding

 A quantizer, which replaces the nearest-integer function of the error-free encoder, is inserted between the symbol encoder and the point at which the prediction error is formed.
 It maps the prediction error into a limited range of outputs, e'(n), which establishes the amount of compression and distortion associated with lossy predictive coding.
 The reconstructed sample is
   f'(n) = e'(n) + f^(n)

 Example: Delta Modulation, in which
   f^(n) = α · f'(n−1)
   e'(n) = +ζ  if e(n) > 0
   e'(n) = −ζ  otherwise
   (α – prediction coefficient, ζ – a positive constant)
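
A minimal sketch of delta modulation as described above, with assumed values α = 1 and ζ = 4:

```python
# Delta modulation: a 1-bit special case of lossy predictive coding.
def delta_modulate(samples, alpha=1.0, zeta=4.0):
    codes, reconstructed = [], []
    prev = 0.0                           # predictor memory, starts at 0
    for f in samples:
        f_hat = alpha * prev             # prediction f^(n)
        e = f - f_hat                    # prediction error e(n)
        e_q = zeta if e > 0 else -zeta   # quantized error e'(n)
        f_rec = e_q + f_hat              # reconstruction f'(n), seen by the decoder too
        codes.append(1 if e > 0 else 0)
        reconstructed.append(f_rec)
        prev = f_rec
    return codes, reconstructed

codes, rec = delta_modulate([14, 15, 14, 15, 13, 15, 15, 14])
print(codes)   # 1-bit code stream
print(rec)     # staircase approximation of the input
```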
Transform Coding
 Transform coding is a technique used in image compression to convert spatial data (pixels) into frequency data (coefficients).
 This transformation helps to:
1. Reduce spatial redundancy (correlation between neighboring pixels)
2. Concentrate energy in a few coefficients
3. Enable efficient quantization and encoding
Basic Steps of Transform Coding:
1. Divide the image into blocks: Split the image into smaller blocks, typically 8x8 or 16x16 pixels.

2. Apply a transform: Use a mathematical transform to convert the spatial data (pixels) into frequency data (coefficients).
Common transforms include:
- Discrete Cosine Transform (DCT)
- Discrete Wavelet Transform (DWT)
- Fast Fourier Transform (FFT)

The transformation helps to:
- Decorrelate the pixels
- Concentrate the energy in a few coefficients
- Enable efficient quantization and encoding
3. Quantize the coefficients: The quantizer is a critical component of transform coding, responsible for reducing the precision of the transformed coefficients. The quantizer takes the transformed coefficients as input and produces a set of quantized coefficients as output. This is done to:
- Reduce the number of bits required to represent each coefficient
- Enable efficient encoding

Threshold coding and zonal coding are quantization techniques used in transform coding for image compression.

Threshold Coding: Threshold coding sets a threshold value for the transformed coefficients. Coefficients with absolute values below the threshold are set to zero, while coefficients above the threshold are quantized using a uniform or non-uniform quantizer.

Zonal Coding: Zonal coding divides the transformed coefficients into different zones based on their frequencies. Each zone is then quantized separately using a uniform or non-uniform quantizer.

Example:
Suppose we have a set of transformed coefficients divided into two zones: low-frequency coefficients (Zone 1) and high-frequency coefficients (Zone 2). Zone 1 coefficients are quantized with a fine quantizer (small step size), while Zone 2 coefficients are quantized more coarsely, since the eye is less sensitive to errors in high frequencies. A sketch of this follows below.
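
A minimal sketch of zonal quantization, assuming Zone 1 is the top-left 4×4 low-frequency corner (fine step) and Zone 2 is the rest (coarse step); the coefficient values are random stand-ins rather than a real DCT output:

```python
import numpy as np

coeffs = np.random.default_rng(0).normal(0.0, 50.0, (8, 8))  # stand-in transform coefficients

steps = np.full((8, 8), 32.0)   # Zone 2: coarse quantization step
steps[:4, :4] = 4.0             # Zone 1: fine quantization step

quantized = np.round(coeffs / steps).astype(int)   # what gets entropy-coded
dequantized = quantized * steps                    # decoder's reconstruction
print(np.count_nonzero(quantized), "nonzero coefficients survive")
```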
4. Encode the coefficients: Use entropy coding (e.g., Huffman coding, arithmetic coding) to assign shorter codes to more frequent coefficient values.

Decoding: The decoding process is the reverse of the encoding process, i.e., the encoded coefficients are decoded to obtain the quantized coefficients. The decoded coefficients are then used to reconstruct the frequency-domain signal.

Inverse Transform: Finally, the inverse transform is applied to the reconstructed signal to obtain the spatial-domain representation of the compressed image.

 Transform coding is widely used in audio, image, and video compression. The choice of transform and quantization parameters depends on the application and the desired level of compression.
Why another standard?

 Low bit-rate compression: At low bit-rates (e.g. below 0.25 bpp for highly detailed gray-level images) the distortion in JPEG becomes unacceptable.
 Lossless and lossy compression: There is a need for a standard that provides lossless and lossy compression in one code stream.
 Large images: JPEG cannot compress images greater than 64K x 64K (4 billion pixels) without tiling.
 Transmission in noisy environments: In JPEG, quality suffers dramatically when bit errors are encountered.
 Computer-generated imagery: JPEG is optimized for natural images and performs badly on computer-generated images.
 Compound documents: JPEG fails to compress bi-level (text) imagery.
JPEG 2000 Standards
 JPEG2000 is a compression standard for still images intended to overcome the shortcomings of the existing JPEG standard.
 It works best on continuous-tone images where adjacent pixels have similar colors.
 The standardization process is coordinated by the Joint Technical Committee on Information Technology of the International Organization for Standardization (ISO)/International Electrotechnical Commission (IEC).
 JPEG2000 utilizes wavelet and sub-band technologies, targeting markets such as the internet, printing, photography, and digital media.
 The core compression algorithm, Embedded Block Coding with Optimized Truncation (EBCOT), provides superior compression with scalable resolution, scalable SNR, and random access.
Figure: Reconstructed images compressed at 0.25 bpp by means of (a) JPEG and (b) JPEG2000.

Figure: Reconstructed images compressed at 0.125 bpp by means of (a) JPEG and (b) JPEG2000.
Features of JPEG2000
 Lossless and lossy compression: The standard provides lossy compression with superior performance at low bit-rates. It also provides lossless compression with progressive decoding. Applications such as digital libraries/databases and medical imagery can benefit from this feature.
 Protective image security: The open architecture of the JPEG2000 standard makes it easy to use protection techniques for digital images such as watermarking, labeling, stamping or encryption.
 Region-of-interest coding: In this mode, regions of interest (ROIs) can be defined. These ROIs can be encoded and transmitted with better quality than the rest of the image.
 Robustness to bit errors: The standard incorporates a set of error-resilient tools to make the bit-stream more robust to transmission errors.

Example of region-of-interest coding
 A region of interest in the Barbara image is reconstructed with quality scalability. The region of interest is decoded first, before any background information.
Main Goals of JPEG
 High compression ratio in cases where image quality is judged as very good to excellent.
 The use of many parameters, allowing the user to experiment and achieve the desired level of compression.
 Obtaining good results for any kind of continuous-tone image.
 A sophisticated but not too complex compression method, allowing software and hardware implementations on many platforms.
Modes of JPEG
 Sequential Mode: Each image component is compressed in a single left-to-right, top-to-bottom scan.
 Progressive Mode: The image is compressed in multiple passes so it can be viewed from coarse to final detail.
 Lossless Mode: Important for cases where the user decides no pixel information should be lost.
 Hierarchical Mode: The image is compressed at multiple resolutions, so that a lower-resolution version can be viewed without having to download or decompress the full, high-resolution version.
The JPEG2000 Compression Engine

Pre-processing

Image Tiling (Optional): Image tiling is an optional step in JPEG2000 pre-processing. It involves dividing the image into smaller tiles, which can help improve compression efficiency and reduce computational complexity.

DC Level Shifting: DC level shifting is a step in JPEG2000 pre-processing that involves subtracting a fixed value from each pixel in the image. This helps to center the pixel values around zero, reducing the dynamic range of the image and improving compression efficiency. For 8-bit samples, 2^7 = 128 is subtracted, giving a signed range of -128 to 127.

Color Transformation (Optional): Color transformation in JPEG2000 converts RGB to YCbCr, separating luminance (Y) from chrominance (Cb, Cr) to improve compression efficiency.
Supplementary material on DWT

The Discrete Wavelet Transform (DWT) is a mathematical technique that:
1. Breaks down a signal (like an image) into different frequency components.
2. Represents the signal using a combination of these frequency components.

Think of it like a prism that splits white light into its color components. DWT does something similar, but for signals, separating them into different frequency bands.
Discrete Wavelet Transform (DWT) is a mathematical tool used in JPEG2000
to compress images.
It works by:
- Breaking down the image into different frequency bands
- Separating the image into approximation (low-frequency) and
detail (high-frequency) components
- Representing the image using a combination of these components
This process helps to:
- Reduce spatial redundancy
- Represent the image more efficiently
- Achieve better compression ratios

Forward Transform
 The Discrete Wavelet Transform (DWT) is used to decompose each tile component into different sub-bands.
 The transform takes the form of a dyadic decomposition and uses bi-orthogonal wavelets.
 The DWT can be irreversible or reversible:
 Irreversible transform – Daubechies 9-tap/7-tap filter
 Reversible transform – Le Gall 5-tap/3-tap filter
2D-Forward Transform
 1-D sets of samples are decomposed into low-pass and high-pass samples.
 Low-pass samples represent a down-sampled, low-resolution version of the original set.
 High-pass samples represent a down-sampled residual version of the original set (the details).
Quantization
 After transformation, all coefficients are quantized using scalar quantization.
 Quantization reduces the precision of the coefficients. The operation is lossy unless the quantization step is 1 and the coefficients are integers (e.g. the reversible integer 5/3 wavelet).
 The process follows a deadzone scalar quantizer of the form (with quantization step Δb for sub-band b and transform coefficient y):

   q = sign(y) · floor(|y| / Δb)
Modes of Quantization
 Two modes of operation:
 Integer mode: Integer-to-integer transforms are employed. The quantization step is fixed to one. Lossy coding is still achieved by discarding bit-planes.
 Real mode: Real-to-real transforms are employed. Quantization steps are chosen in conjunction with rate control. In this mode, lossy compression is achieved by discarding bit-planes, or by changing the size of the quantization step, or both.
Code-blocks, Precincts and Packets
 Precinct: Each sub-band is divided into rectangular blocks called precincts.
 Packets: Three spatially consistent rectangles (one per sub-band at a given resolution) comprise a packet.
 Code-block: Each precinct is further divided into non-overlapping rectangles called code-blocks.
 Each code-block forms the input to the entropy encoder and is encoded independently.
 Within a packet, code-blocks are visited in raster order.
Entropy Coding: Bit-planes
 The coefficients in a code-block are separated into bit-planes. The individual bit-planes are coded in one to three coding passes.
Entropy Coding: Coding Passes
 Each of these coding passes collects contextual information about the bit-plane data. The contextual information, along with the bit-planes, is used by the arithmetic encoder to generate the compressed bit-stream.
 The coding passes are:
 Significance propagation pass – coefficients that are insignificant and have a certain preferred neighborhood are coded.
 Magnitude refinement pass – the current bits of significant coefficients are coded.
 Clean-up pass – the remaining insignificant coefficients, for which no information has yet been coded, are coded.
JPEG2000 Bit-stream
 For each code-block, a separate bit-stream is generated.
 The coded data of each code-block is included in a packet.
 If more than one layer is used to encode the image information, the code-block bit-streams are distributed across different packets corresponding to different layers.
