Digital Image Processing - 5 Unit - ASRao
Data Vs Information
Data refers to the raw, unprocessed bits and bytes that make up an
image. It's the numerical representation of the image's pixels,
including their color values, intensity, and spatial relationships.
The same information can be represented by different amounts of data. For example:
Example 1: Your brother, Mr. Ram, will meet you at IGI airport in New Delhi
at 6 pm tomorrow night.
Example 2: Your brother will meet you at IGI airport at 6pm tomorrow
night.
Example 3: Mr. Ram will meet you at IGI at 6pm tomorrow.
DATA = INFORMATION + REDUNDANT DATA
The different ways to represent this information can be either:
Through the pixel intensities (spatial domain), or
Through extracting features of the image (a higher level of information, e.g., the edges of an image), or
Through noting the frequencies of the intensity values existing in the image along with their locations (frequency domain), etc.
This explains the power and need of image compression.
Areas of Application
Web page images & high resolution digital camera photos are also
compressed to save storage space & reduce transmission time.
The ratio of the size of the original, uncompressed image file to that of the compressed file is referred to as the compression ratio.
Compression Ratio
In the table, Data 1 uses 8 bits, while Data 2 and Data 3 use 3 and 2 bits,
respectively, making 5 and 6 bits redundant. Image compression aims to
remove such redundancy.
A compression ratio of 10:1 means 1 bit can represent 10 bits of original
uncompressed data, effectively reducing storage needs.
If the original image is of size 100 × 100 and each pixel is represented with 8 bits, then it occupies 100 × 100 × 8 = 80,000 bits of space.
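The storage arithmetic above can be sketched in a few lines of Python. This is a minimal sketch; the 10:1 ratio is the example figure quoted earlier, not a property of any particular codec.

```python
# Storage and compression-ratio arithmetic for the 100x100, 8-bit example.

def compression_ratio(original_bits, compressed_bits):
    """Ratio of uncompressed size to compressed size."""
    return original_bits / compressed_bits

original = 100 * 100 * 8       # 80,000 bits for a 100x100, 8-bit image
compressed = original // 10    # assume the 10:1 ratio from the example
print(compression_ratio(original, compressed))  # 10.0
```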
Inter Pixel (or) Spatial (or) Temporal Redundancy
Instead of using 800 bits (100 pixels x 8 bits), a coding convention can be
used, such as (5,100), indicating that the intensity value 5 repeats for
100 pixels. This approach reduces the required storage from 800 bits to
just 16 bits, exploiting spatial redundancy to represent the same
information with less data.
Example:
Suppose we have an original image with pixel values: [10, 20, 30, 40, 50]
After compression, the image has pixel values: [12, 19, 31, 39, 51]
MSE Calculation:
Calculate the difference between original and compressed pixel values: [2, -1, 1, -1, 1]
Square each difference: [4, 1, 1, 1, 1]
Calculate the average: (4 + 1 + 1 + 1 + 1) / 5 = 8/5 = 1.6
PSNR Calculation:
Calculate the maximum possible signal (255 for 8-bit images): 255
Calculate the MSE (1.6)
Calculate PSNR: 10 * log10(255^2 / 1.6) ≈ 46.09 dB
MAE Calculation:
Calculate the absolute difference between original and compressed pixel
values: [2, 1, 1, 1, 1]
Calculate the average: (2 + 1 + 1 + 1 + 1) / 5 = 6/5 = 1.2
In this example, the MSE, PSNR, and MAE values indicate the difference
between the original and compressed images. A lower MSE and MAE,
and a higher PSNR, indicate better fidelity and a more accurate
representation of the original image.
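The three metrics in this worked example can be recomputed with a short Python sketch, using the same 5-pixel lists as above:

```python
import math

# Objective fidelity metrics for the 5-pixel example above.

def mse(orig, comp):
    return sum((o - c) ** 2 for o, c in zip(orig, comp)) / len(orig)

def psnr(orig, comp, max_val=255):
    return 10 * math.log10(max_val ** 2 / mse(orig, comp))

def mae(orig, comp):
    return sum(abs(o - c) for o, c in zip(orig, comp)) / len(orig)

original   = [10, 20, 30, 40, 50]
compressed = [12, 19, 31, 39, 51]
print(mse(original, compressed))             # 1.6
print(mae(original, compressed))             # 1.2
print(round(psnr(original, compressed), 2))  # 46.09 (dB)
```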
In summary:
- Objective fidelity criteria (MSE, PSNR, MAE) provide numerical measures of
the difference between the original and compressed images.
- Subjective fidelity criteria (e.g., the Mean Opinion Score, MOS) rely on human observers rating the image quality; perceptual metrics such as SSIM and VIF attempt to model this human judgment numerically.
Image Compression Model
The decoder reconstructs the output signal from the coded symbols.
The source encoder removes the input redundancies, and the channel
encoder increases the noise immunity.
Symbol coder:
Generates a fixed- or variable-length code to represent the quantizer output, and maps the output in accordance with the code.
Huffman Coding
It is the most popular technique for removing coding redundancy. It is a variable-length code.
When coding the symbols of an independent source individually, it yields
the smallest possible number of code symbols per source symbol.
It involves a series of source reductions by ordering the probabilities of
symbols under consideration and combining the two lowest probability
symbols into a single symbol that replaces them in the next source
reduction.
The second step is to code each reduced source starting with the
smallest source and working backwards to the original source.
0 – 2/25
1 – 4/25
2 – 3/25
3 – 6/25
…
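The two-step procedure above (source reduction, then backward code assignment) can be sketched with a heap. Since the probability table above is truncated, the probabilities used here are a hypothetical complete example, not the table from the notes.

```python
import heapq
import itertools

# Minimal Huffman coder: repeatedly merge the two lowest-probability
# entries, prefixing their codewords with 0 and 1, until one source remains.

def huffman_codes(probs):
    count = itertools.count()  # tie-breaker so dicts are never compared
    heap = [(p, next(count), {sym: ""}) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, group1 = heapq.heappop(heap)   # lowest probability
        p2, _, group2 = heapq.heappop(heap)   # second lowest
        merged = {s: "0" + c for s, c in group1.items()}
        merged.update({s: "1" + c for s, c in group2.items()})
        heapq.heappush(heap, (p1 + p2, next(count), merged))
    return heap[0][2]

# Hypothetical example distribution (not the truncated table above).
codes = huffman_codes({"a": 0.4, "b": 0.3, "c": 0.1, "d": 0.1,
                       "e": 0.06, "f": 0.04})
print(codes["a"])  # the most probable symbol gets the shortest codeword
```

More frequent symbols receive shorter codes, so the average code length approaches the source entropy.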
Encoding Process:
Step 1: Our objective here is to encode the word ‘dad’. Initially, the range of intervals is assumed to be between 0 and 1. The individual subranges within the interval 0 to 1 are shown below.
Step 2:
Step 2a: The first symbol to transmit is ‘d’. Hence, the new range is from
0.7 to 1.0 which is shown below.
Step 2b: Find the subrange for each symbol in between the intervals 0.7
and 1.0
The formula that is used to compute the low and high range for each
symbol within the interval is given by
(I) To compute interval for the symbol ‘a’ in the interval 0.7 to 1.0
First, the range is determined to be range = 1.0 – 0.7 = 0.3
Using the formula given, the low range for the symbol ‘a’ is computed as
Low = low + range * (low_range)
In this case, low = 0.7, range = 0.3 and low_range for symbol ‘a’ is 0.0. Substituting these values, the lower interval is computed as
Low = 0.7 + 0.3 * 0.0 = 0.7
To determine the higher interval for the symbol, the formula to be used is
High = Low + range * (high_range)
High = 0.7 + 0.3 * 0.4 = 0.82
For symbol ‘a’, the lower interval is computed to be 0.7 and the higher
interval is 0.82, which is illustrated below.
(II) To compute interval for the symbol ‘b’ in the interval 0.7 to 1.0
The lower interval of the symbol ‘b’ is computed as
Low = 0.7 +0.3 * 0.4 = 0.82
The upper interval of the symbol ‘b’ is computed as
High = 0.7 + 0.3 * 0.6 = 0.88
(III) To compute interval for the symbol ‘c’ in the interval 0.7 to 1.0
The lower interval of the symbol ‘c’ is computed as
Low = 0.7 +0.3 * 0.6 = 0.88
The upper interval of the symbol ‘c’ is computed as
High = 0.7 + 0.3 * 0.7 = 0.91
Step 3:
The next symbol to be transmitted in the word ‘dad’ is ‘a’. Hence the new interval is 0.7 to 0.82, which is illustrated below.
Step 3a: To find the interval for each symbol in the range 0.7 to 0.82. The range is first computed: range = 0.82 – 0.7 = 0.12
(I) To find the lower and upper interval for the symbol ‘a’,
Low = 0.7 + 0.12 * 0.0 = 0.7
High = 0.7 + 0.12 * 0.4 = 0.748
(II) To find the lower and upper interval for the symbol ‘b’,
We find the lower interval which is given by
Low = 0.7 + 0.12 * 0.4 = 0.748
The upper interval is given by
High = 0.7 +0.12 * 0.6 =0.772
(III) To find the lower and upper interval for the symbol ‘c’,
We find the lower interval which is given by
Low = 0.7 + 0.12 * 0.6 = 0.772
The upper interval is given by
High = 0.7 +0.12 * 0.7 =0.784
Step 4:
The final symbol to be transmitted is ‘d’. Its subrange within the interval 0.7 to 0.82 is
Low = 0.7 + 0.12 * 0.7 = 0.784
High = 0.7 + 0.12 * 1.0 = 0.82
The tag is the average of the lower and upper limits of this final interval:
Tag = (0.784 + 0.82)/2 = 0.802
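The interval-narrowing steps above condense into a short encoder sketch. The symbol subranges (a: 0.0–0.4, b: 0.4–0.6, c: 0.6–0.7, d: 0.7–1.0) are the ones used in this worked example.

```python
# Arithmetic encoder: narrow the working interval once per symbol,
# then emit the midpoint of the final interval as the tag.

RANGES = {"a": (0.0, 0.4), "b": (0.4, 0.6), "c": (0.6, 0.7), "d": (0.7, 1.0)}

def arithmetic_encode(message, ranges=RANGES):
    low, high = 0.0, 1.0
    for sym in message:
        span = high - low
        lo_r, hi_r = ranges[sym]
        low, high = low + span * lo_r, low + span * hi_r
    return (low + high) / 2       # tag = midpoint of final interval

tag = arithmetic_encode("dad")
print(round(tag, 3))  # 0.802
```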
Decoding Procedure:
The tag value and symbol probabilities will be sent to the receiver.
After receiving the tag value, the decoder decodes the encoded data.
The step by step procedure of decoding is given below.
Step 1:
The tag received by the receiver is 0.802. The initial interval is assumed
to be between 0 and 1
The tag value is compared with the symbol subranges. We find that 0.802 lies between 0.7 and 1.0, hence the corresponding decoded symbol is ‘d’.
Step 2:
The decoded symbol in step 1 is ‘d’ and hence the new interval is fixed as
0.7 and 1.0 which is illustrated below.
Step 3:
The new tag value is computed in this step. The new tag value is given by
t* = (tag – low) / range
Here the low value is 0.7 and the range is 1.0 – 0.7 = 0.3, hence the new tag value is given by
t* = (0.802 – 0.7) / 0.3 = 0.34
Step 4:
The new tag value 0.34 lies between 0.0 and 0.4, so the decoded symbol is ‘a’. Hence the new interval changes to 0 to 0.4.
Step 5:
The new tag value for the interval given in step 4 is
t* = (0.34 – 0.0) / 0.4 = 0.85
which lies between 0.7 and 1.0, so the decoded symbol is ‘d’. From steps 1, 3 and 5, it is clear that the decoded word is ‘dad’.
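The decoding loop above (find the subrange containing the tag, emit that symbol, rescale the tag) can be sketched as follows, with the same subranges as in the encoding example:

```python
# Arithmetic decoder for the worked example.

RANGES = {"a": (0.0, 0.4), "b": (0.4, 0.6), "c": (0.6, 0.7), "d": (0.7, 1.0)}

def arithmetic_decode(tag, length, ranges=RANGES):
    out = []
    for _ in range(length):
        for sym, (lo, hi) in ranges.items():
            if lo <= tag < hi:
                out.append(sym)
                tag = (tag - lo) / (hi - lo)   # t* = (tag - low) / range
                break
    return "".join(out)

print(arithmetic_decode(0.802, 3))  # dad
```

Note that the message length (or a dedicated end-of-message symbol) must also be known to the decoder so it knows when to stop.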
Error Free Compression
There are several techniques used for error-free compression, including:
Run-length encoding (RLE): This technique replaces sequences of identical
pixels with a single pixel value and a count of the number of times it appears
in the sequence.
Huffman coding: This technique assigns variable-length codes to pixel
values based on their frequency of occurrence. More frequently occurring
values are assigned shorter codes, while less frequently occurring values are
assigned longer codes.
Lempel-Ziv-Welch (LZW) coding: This technique builds a dictionary of
frequently occurring patterns in the image and replaces each occurrence of
the pattern with a reference to the dictionary entry.
Arithmetic coding: This technique encodes the image data by creating a single number, a fraction n where 0.0 <= n < 1.0, that represents the probability interval of the symbol sequence.
Run-length encoding (RLE)
Every code word is made up of a pair (g,l) where g is the graylevel and l is
the number of pixels with that graylevel (length, or “run”).
Example: 56 56 56 82 82 82 83 80 → (56,3) (82,3) (83,1) (80,1)
56 56 56 56 56 80 80 80 → (56,5) (80,3)
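A minimal RLE coder over the first example row; `itertools.groupby` does the run detection:

```python
from itertools import groupby

# Run-length pairs (g, l): graylevel g repeated l times.

def rle_encode(pixels):
    return [(g, len(list(run))) for g, run in groupby(pixels)]

def rle_decode(pairs):
    return [g for g, l in pairs for _ in range(l)]

row = [56, 56, 56, 82, 82, 82, 83, 80]
pairs = rle_encode(row)
print(pairs)  # [(56, 3), (82, 3), (83, 1), (80, 1)]
```

Decoding simply expands each pair back into a run, so the round trip is lossless.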
f(x_i) = x_i,              if i = 0
f(x_i) = x_i – x_(i–1),    if i > 0
Example:
Original: 56 56 56 82 82 82 83 80 80 80 80
Code f(xi): 56 0 0 26 0 0 1 -3 0 0 0
These techniques are often combined and used in a variety of image
compression algorithms, such as the Graphics Interchange Format (GIF)
and the Portable Network Graphics (PNG) format.
However, error-free compression also has some limitations, including:
Limited compression ratio: Error-free compression algorithms typically
achieve lower compression ratios than lossy compression algorithms, which
means that the compressed image may not be as small as desired.
Computational complexity: Error-free compression algorithms can be
computationally intensive, which can make them slower than lossy
compression algorithms.
Lossy compression is a data encoding method that reduces file size by permanently removing non-essential detail.
Step 1: Initially, the lossy algorithm scans the image for data, focusing on
variations you cannot see with the naked eye. Then, the quantization
process groups similar values to simplify the data.
Step 2: The next step transforms and encodes the simplified data; transforms such as the DCT represent the image data efficiently and expose redundancies that can be discarded.
Step 3: The final compressed image is much smaller, but it still has a good
enough quality for most uses, especially for web graphics and digital media.
Lossy Compression Algorithms
Discrete Cosine Transform (DCT): It is most commonly used in JPEG compression. It converts image data into frequency components and discards the less significant ones to reduce size.
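The idea can be illustrated with a naive 1-D orthonormal DCT-II on a single row of pixels. Real JPEG codecs apply a fast 2-D version to 8×8 blocks; this sketch only shows that the energy concentrates in the low-frequency coefficients, so small high-frequency ones can be discarded.

```python
import math

# Naive orthonormal DCT-II and its inverse (DCT-III) for one row of pixels.

def dct(x):
    N = len(x)
    out = []
    for k in range(N):
        s = sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                for n in range(N))
        scale = math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
        out.append(scale * s)
    return out

def idct(X):
    N = len(X)
    out = []
    for n in range(N):
        s = sum((math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)) * X[k]
                * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                for k in range(N))
        out.append(s)
    return out

row = [52, 55, 61, 66, 70, 61, 64, 73]
coeffs = dct(row)
# Discarding small high-frequency coefficients loses little visual detail:
truncated = [c if abs(c) > 5 else 0.0 for c in coeffs]
approx = idct(truncated)
```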
Lossy Compression File Types
JPEG: Ideal for colorful photos, supports millions of colors, and has
adjustable compression. JPEG balances image quality and file size for
websites, social media, and email.
MP4: The versatile MP4 format supports video, audio, and image, making it
popular for online streaming, editing, and distribution. Its compatibility,
quality, and size make it a multimedia standard.
TIFF (Tagged Image File Format): Although TIFF is known for lossless
compression, its lossy variant is helpful in graphics and publishing for
complex layers and pages. It can store multiple layers and pages for detailed
image editing.
Cons of Lossy Compression:
Quality loss: Lossy compression can reduce image quality and introduce artifacts, especially at high compression ratios. It gets worse when you re-edit and re-save lossy compressed images.
Permanent data loss: If you compress an image with lossy compression, you
can’t regain the lost data. It’s gone permanently.
Less professional: Lossy compression may not meet the high fidelity
standards of graphic designers, artists, and photographers due to image
quality compromises.
Predictive Coding
It is an approach that achieves good compression without significant computational overhead.
It is based on eliminating the inter pixel redundancies of closely spaced
pixels by extracting and coding only the new information in each pixel.
The new information of a pixel is defined as the difference between the
actual and predicted value of that pixel.
Two types are there:
Lossless Predictive coding
Lossy Predictive coding
Lossless Predictive Coding
It consists of an encoder and decoder.
An identical predictor is present in encoder and decoder.
Successive samples of the input image f(x, y) are fed to the encoder.
The predictor generates an anticipated value based on the previous samples, which is rounded to the nearest integer and denoted f^(n).
The prediction is used to form the difference, or prediction error,
e(n) = f(n) – f^(n)
which is encoded using a variable-length code to generate the compressed data stream.
The decoder regenerates the output image sequence by f(n) = e(n) + f^(n).
Q: Perform the encoding and decoding of lossless predictive coding for
[93, 95, 91, 97,98]
Solution: Encoder
Initially, the predictor's value is taken as 0.
One sample is taken at a time and coded; the prediction errors are e(n) = [93, 2, –4, 6, 1].
Decoder: adding each prediction error to the previous reconstructed sample generates [93, 95, 91, 97, 98], exactly equal to the input [93, 95, 91, 97, 98].
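The worked example can be sketched directly: a previous-sample predictor with initial prediction 0, as stated above.

```python
# Lossless predictive coding with a previous-sample predictor.

def encode(samples):
    errors, pred = [], 0          # initial prediction is 0
    for s in samples:
        errors.append(s - pred)   # e(n) = f(n) - f^(n)
        pred = s                  # next prediction = current sample
    return errors

def decode(errors):
    out, pred = [], 0
    for e in errors:
        out.append(pred + e)      # f(n) = e(n) + f^(n)
        pred = out[-1]
    return out

samples = [93, 95, 91, 97, 98]
errors = encode(samples)
print(errors)  # [93, 2, -4, 6, 1]
```

Because no quantizer is involved, the decoder reconstructs the input exactly.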
Lossy Predictive Coding
The quantizer, which replaces the nearest-integer function of the error-free encoder, is inserted between the symbol encoder and the point at which the prediction error is formed.
It maps the prediction error e(n) into a limited range of outputs, which establishes the amount of compression and distortion associated with lossy predictive coding.
Basic Steps of Transform Coding:
1. Divide the image into blocks: Split the image into smaller blocks, typically
8x8 or 16x16 pixels.
Example:
Suppose we have a set of transformed coefficients divided into two zones: low-frequency coefficients (Zone 1) and high-frequency coefficients (Zone 2). Zone 1 coefficients carry most of the image energy and are quantized with a finer quantizer, while Zone 2 coefficients are quantized more coarsely.
4. Encode the coefficients: Use entropy coding (e.g., Huffman coding,
arithmetic coding) to assign shorter codes to more frequent coefficients.
Decoding: The decoding process involves the reverse of the encoding
process, i.e., the encoded coefficients are decoded to obtain the quantized
coefficients. The decoded coefficients are then used to reconstruct the
frequency-domain signal.
Inverse Transform: Finally, the inverse transform is applied to the
reconstructed signal to obtain the time-domain representation of the
compressed signal.
Low bit-rate compression: At low bit-rates (e.g. below 0.25 bpp for highly
detailed gray-level images) the distortion in JPEG becomes unacceptable.
Lossless and lossy compression: Need for a standard which provides lossless and lossy compression in one code stream.
Large images: JPEG doesn't compress images greater than 64K × 64K pixels (about 4 billion pixels) without tiling.
Figure: Reconstructed images compressed at 0.125 bpp by means of (a) JPEG and (b) JPEG2000.
Features of JPEG2000
Lossless and Lossy Compression: The standard provides lossy
compression with a superior performance at low bit-rates. It also provides
lossless compression with progressive decoding. Applications such as
digital libraries/databases and medical imagery can benefit from this
feature.
Main Goals of JPEG
High compression ratio in cases where image quality is judged as very good to excellent.
Modes of JPEG
Sequential Mode: Each image component is compressed in a single left
to right, top to bottom scan.
Progressive Mode: The image is compressed in multiple blocks to be
viewed from coarse to final detail.
Lossless Mode: Important for cases where the user decides no pixel
should be lost.
Hierarchical Mode: The image is compressed at multiple resolutions, so a lower-resolution version can be viewed without having to download or decompress the full, high-resolution version.
The JPEG2000 Compression Engine
Pre-processing
Think of it like a prism that splits white light into its color components. DWT
does something similar, but for signals, separating them into different
frequency bands.
Discrete Wavelet Transform (DWT) is a mathematical tool used in JPEG2000
to compress images.
It works by:
- Breaking down the image into different frequency bands
- Separating the image into approximation (low-frequency) and
detail (high-frequency) components
- Representing the image using a combination of these components
This process helps to:
- Reduce spatial redundancy
- Represent the image more efficiently
- Achieve better compression ratios
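The approximation/detail split described above can be illustrated with the simplest possible wavelet, the Haar average/difference pair. JPEG2000 itself uses biorthogonal 5/3 and 9/7 wavelets; Haar is used here only as a stand-in for the idea.

```python
# One level of a Haar-style wavelet split: pairwise averages give the
# low-frequency approximation, pairwise half-differences the detail.

def haar_split(x):
    approx = [(x[2 * i] + x[2 * i + 1]) / 2 for i in range(len(x) // 2)]
    detail = [(x[2 * i] - x[2 * i + 1]) / 2 for i in range(len(x) // 2)]
    return approx, detail

def haar_merge(approx, detail):
    out = []
    for a, d in zip(approx, detail):
        out.extend([a + d, a - d])   # exact inverse of the split
    return out

signal = [10, 12, 14, 18, 20, 20, 8, 4]
a, d = haar_split(signal)
print(a)  # [11.0, 16.0, 20.0, 6.0]  smooth approximation
print(d)  # [-1.0, -2.0, 0.0, 2.0]   small detail coefficients
```

The detail coefficients are small for smooth signals, which is exactly the redundancy reduction the text describes; repeating the split on the approximation gives the dyadic decomposition used by JPEG2000.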
Forward Transform
Discrete Wavelet Transform (DWT) is used to decompose each tile
component into different sub-bands.
The transform takes the form of a dyadic decomposition and uses biorthogonal wavelets.
Quantization
After transformation, all coefficients are quantized using scalar
quantization.
Quantization reduces the precision of the coefficients. The operation is lossy unless the quantization step is 1 and the coefficients are integers (e.g. the reversible integer 5/3 wavelet).
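The scalar quantization step can be sketched in sign-magnitude (deadzone) form, q = sign(c) · floor(|c| / step); the step size 4 below is an arbitrary illustration value, not one prescribed by the standard.

```python
import math

# Deadzone scalar quantization of wavelet coefficients.

def quantize(c, step):
    return int(math.copysign(math.floor(abs(c) / step), c))

def dequantize(q, step):
    # reconstruct at the midpoint of the quantization interval
    return 0.0 if q == 0 else math.copysign((abs(q) + 0.5) * step, q)

coeffs = [13.7, -2.1, 0.4, -9.8]
qs = [quantize(c, 4) for c in coeffs]
print(qs)  # [3, 0, 0, -2]
```

Small coefficients collapse to zero (the deadzone), which is where most of the compression comes from; a larger step gives more compression and more distortion.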
Modes of Quantization
Two modes of operation:
Code-blocks, Precincts and Packets
Precinct: Each sub-band is divided
into rectangular blocks called
precincts.
Packets: Three spatially consistent rectangles comprise a packet.
Code-block: Each precinct is further
divided into non-overlapping
rectangles called code-blocks.
Entropy Coding: Coding Passes
Each of these coding passes collects contextual information about the
bit-plane data. The contextual information along with the bit-planes are
used by the arithmetic encoder to generate the compressed bit-stream.
JPEG2000 Bit-stream
For each code-block, a separate bit-stream is generated.
The coded data of each code-block is included in a packet.
If more than one layer is used to encode the image information, the code-block bit-streams are distributed across different packets corresponding to the different layers.