5 Digital-Image-Processing

Uploaded by

Naman Bhatia

PPT ON

DIGITAL IMAGE PROCESSING

Dr. Vijay Kr. Sharma. Associate Professor,

1
DIGITAL IMAGE FUNDAMENTALS &
IMAGE TRANSFORMS

2
What Is Digital Image Processing?
• The field of digital image processing refers to
processing digital images by means of a digital
computer.

3
What is a Digital Image ?
• An image may be defined as a two-dimensional
function f(x, y), where x and y are spatial (plane)
coordinates, and the amplitude of f at any pair of
coordinates (x, y) is called the intensity or gray level
of the image at that point.

• When x, y, and the amplitude values of f are all finite,
discrete quantities, we call the image a digital image.

4
Picture elements, image elements, pels, and pixels
• A digital image is composed of a finite number of
elements, each of which has a particular location and
value.
• These elements are referred to as picture elements,
image elements, pels, and pixels.

• Pixel is the term most widely used to denote the
elements of a digital image.

5
The Origins of Digital Image
Processing
• One of the first applications of digital images was in
the newspaper industry, when pictures were first sent
by submarine cable between London and New York.

• Specialized printing equipment coded pictures for
cable transmission and then reconstructed them at
the receiving end.

6
• The figure was transmitted in this way and reproduced on
a telegraph printer fitted with typefaces simulating a
halftone pattern.

• The initial problems in improving the visual quality of
these early digital pictures were related to the
selection of printing procedures and the distribution
of intensity levels.

7
• From the end of 1921, the printing technique was based on
photographic reproduction made from tapes punched at the
telegraph receiving terminal.

• The figure shows an image obtained using this method.

• The improvements are in tonal quality and in resolution.

8
• The early Bartlane systems were capable of coding
images in five distinct levels of gray.
• This capability was increased to 15 levels in 1929.

• The figure is typical of the type of images that could be
obtained using the 15-tone equipment.

9
• Figure shows the first image of the moon taken by
Ranger

10
Applications of DIP
• The field of image processing has applications in
medicine and the space program.

• Computer procedures are used to enhance the contrast
or code the intensity levels into color for easier
interpretation of X-rays and other images used in
industry, medicine, and the biological sciences.

• Geographers use the same or similar techniques to
study pollution patterns from aerial and satellite
imagery.
11
• Image enhancement and restoration procedures are
used to process degraded images of unrecoverable
objects, or of
• experimental results too expensive to duplicate.

12
Image Formation
• For example, suppose the observer is looking at a tree 15 m
high at a distance of 100 m.

• If h is the height in mm of that object in the retinal
image, the geometry of the figure (with 17 mm between the
centre of the lens and the retina) yields
15/100 = h/17, or h = 2.55 mm.

13
Light and the Electromagnetic Spectrum

• Sir Isaac Newton discovered that when a beam of sunlight is
passed through a glass prism,
• the emerging beam of light is not white but consists
instead of a continuous spectrum of colors ranging
from violet at one end to red at the other.

14
The electromagnetic spectrum

15
• The electromagnetic spectrum can be expressed in
terms of wavelength, frequency, or energy.

• Wavelength (λ) and frequency (ν) are related by the
expression λ = c/ν

• where c is the speed of light (2.998 × 10^8 m/s)

• The energy of the electromagnetic spectrum is given by
the expression E = hν, with frequency ν = c/λ

• where h is Planck's constant
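
The two relations above can be checked numerically. The sketch below is only an illustration: the rounded value of Planck's constant and the two example wavelengths are assumptions, not values from the slides.

```python
# Photon energy from wavelength: E = h * nu, with nu = c / lambda.
PLANCK_H = 6.626e-34   # Planck's constant in J*s (rounded value, assumed here)
LIGHT_C = 2.998e8      # speed of light in m/s, as quoted in the slide


def photon_energy(wavelength_m):
    """Return (frequency in Hz, energy in J) for a wavelength in metres."""
    frequency = LIGHT_C / wavelength_m
    return frequency, PLANCK_H * frequency


# Hypothetical example wavelengths: green (550 nm) and red (700 nm) light.
freq_green, energy_green = photon_energy(550e-9)
freq_red, energy_red = photon_energy(700e-9)
```

Shorter wavelengths give higher frequencies and therefore higher energy per photon, so `energy_green` exceeds `energy_red`.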


16
A Simple Image Formation Model
• Images are denoted by two-dimensional functions of the form f(x, y).

• The value or amplitude of f at spatial coordinates (x, y)
gives the intensity (brightness) of the image at that
point.

• As light is a form of energy, f(x, y) must be nonzero and
finite: 0 < f(x, y) < ∞.

17
• The function f(x, y) may be characterized by two
components:

(1) the amount of source illumination incident on the
scene being viewed
(2) the amount of illumination reflected by the objects
in the scene.
• These are called the illumination and reflectance
components and are denoted by i(x, y) and r(x, y),
respectively.

18
• The two functions combine as a product to
form f(x, y):

f(x, y) = i(x, y) r(x, y)

r(x, y) = 0 --- total absorption
r(x, y) = 1 --- total reflection

19
• The intensity of a monochrome image f at any
coordinates (x0, y0) is called the gray level (l) of the image at
that point.

That is, l = f(x0, y0)

l lies in the range Lmin ≤ l ≤ Lmax

20
GRAY SCALE
• The interval [Lmin, Lmax] is called the gray scale.

• Common practice is to shift this interval numerically to
the interval [0, L-1],

• where l = 0 is considered black and
l = L-1 is considered white on the gray scale.

• All intermediate values are shades of gray varying from
black to white.
21
Basic Relationships Between Pixels
• 1. Neighbors of a Pixel:
A pixel p at coordinates (x, y) has four horizontal and vertical neighbors
whose coordinates are given by (x+1, y), (x-1, y), (x, y+1), (x, y-1)

• This set of pixels, called the 4-neighbors of p, is denoted
by N4(p).

• Each pixel is a unit distance from (x, y), and some of the
neighbors of p lie outside the digital image if (x, y) is on
the border of the image.

22
ND(p) and N8(p)
• The four diagonal neighbors of p have coordinates
(x+1, y+1), (x+1, y-1), (x-1, y+1), (x-1, y-1)
and are denoted by ND(p).

• These points, together with the 4-neighbors, are called the
8-neighbors of p, denoted by N8(p).

• Some of the points in ND(p) and N8(p) fall outside the image if
(x, y) is on the border of the image.
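
The neighbor sets can be written down directly from these coordinate lists. The following is a minimal sketch; the function names and the `inside` helper that discards out-of-image coordinates are illustrative choices, not part of the slides.

```python
def n4(x, y):
    """4-neighbors of p = (x, y): horizontal and vertical neighbors."""
    return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]


def nd(x, y):
    """Diagonal neighbors of p = (x, y)."""
    return [(x + 1, y + 1), (x + 1, y - 1), (x - 1, y + 1), (x - 1, y - 1)]


def n8(x, y):
    """8-neighbors: the 4-neighbors together with the diagonal neighbors."""
    return n4(x, y) + nd(x, y)


def inside(points, rows, cols):
    """Keep only the coordinates that fall inside a rows x cols image."""
    return [(x, y) for (x, y) in points if 0 <= x < rows and 0 <= y < cols]


interior = inside(n8(2, 2), rows=5, cols=5)   # all 8 neighbors exist
corner = inside(n8(0, 0), rows=5, cols=5)     # only 3 neighbors exist
```

For a border pixel such as the corner (0, 0), most of the neighbors fall outside the image and are dropped.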

23
DIGITAL IMAGE PROCESSING
IMAGE ENHANCEMENT
Process an image so that the result will be more suitable than the original
image for a specific application:
Highlighting interesting detail in images
Removing noise from images
Making images more visually appealing

Enhancement is application specific. So, a technique for enhancement of X-ray images may not be the best for enhancement of
microscopic images.
These spatial domain processes are expressed by:

g(x, y) = T( f(x, y) )

f(x, y) is the input image and g(x, y) is the output image.
T is an operator on f defined over a neighborhood (window) of (x, y). When the
neighborhood is the single pixel (x, y), g depends only on the value of f at (x, y),
and T is called a gray-level or intensity transformation operator.

The window origin is moved from the image origin along the
1st row, then the second row, and so on.
At each location, the image pixel value is replaced by
the value obtained after applying the T operation on the
window at that location.
The neighborhood size may be different. We can
have a neighborhood size of 5 by 5, 7 by 7 and so on,
depending upon the type of the image and the type of
operation that we want to have.
25
Spatial domain techniques
Point processing:
> Contrast stretching
> Thresholding

Intensity transformations / gray level transformations:
> Image negatives
> Log transformations
> Power-law transformations

Piecewise-linear transformation functions:
> Contrast stretching
> Gray-level slicing
> Bit-plane slicing

Spatial filters:
> Smoothing filters: low pass filters, median filters
> Sharpening filters: high boost filters, derivative filters

26
27
Mask/Filter
The neighborhood of a point (x, y) can be defined by using a
square/rectangular (commonly used) or circular subimage
area centered at (x, y).
The center of the subimage is moved from pixel to pixel,
starting at the top left corner.

Spatial processing:
> Intensity transformation: works on a single pixel, for
contrast manipulation and image thresholding
> Spatial filtering (neighborhood processing): works on the
neighborhood of every pixel, for example image sharpening

28
Thresholding (a piecewise-linear transformation)

Produces a two-level (binary) image

29
IMAGE ENHANCEMENT (SPATIAL &
FREQUENCY DOMAIN)

30
Spatial domain: Image Enhancement
Three basic types of functions are used for image enhancement by point processing
(intensity or gray level transformations):
> Linear function: negative and identity transformations
> Logarithm function: log and inverse-log transformations
> Power-law function: nth power and nth root transformations
Other point processing techniques:
> Grey level slicing
> Bit plane slicing
We are dealing now with image processing methods that are based only on the
intensity of single pixels.
31
Image Negatives
Here, we consider a digital image with L intensity levels, represented from 0 to L-1 in steps of 1.

The negative of a digital image is obtained by the transformation function

s = T(r) = L - 1 - r
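
As a minimal sketch of the negative transformation, assuming NumPy and an 8-bit image (L = 256); the function name and the tiny test image are made up for illustration.

```python
import numpy as np


def negative(image, levels=256):
    """Image negative: s = (L - 1) - r for every pixel r."""
    return (levels - 1) - image


# Hypothetical 2x2 8-bit image.
img = np.array([[0, 64], [128, 255]], dtype=np.uint8)
neg = negative(img)          # black becomes white and vice versa
restored = negative(neg)     # applying the negative twice restores the image
```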

32
Logarithmic Transformations
The general form of the log transformation is
s = c * log(1 + r)
where c is a constant and r is assumed to be ≥ 0.

The log transformation maps a narrow range of low input grey level values
into a wider range of output values. The inverse log transformation
performs the opposite transformation.
We usually set c to 1, giving s = log(1 + r). Grey levels must be in the range [0.0, 1.0].
Identity Function
Output intensities are identical to input intensities.
It is included in the graph only for completeness.
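
A short sketch of the log transformation, assuming NumPy. The choice of c that maps the top level L-1 back to L-1 is one common convention and an assumption here; the slides themselves use c = 1 with levels in [0.0, 1.0].

```python
import numpy as np


def log_transform(r, c=1.0):
    """s = c * log(1 + r), defined for r >= 0."""
    return c * np.log1p(r)


def scaled_log(image, levels=256):
    """Log transform with c chosen (an assumption) so that L-1 maps to L-1."""
    c = (levels - 1) / np.log(levels)
    return log_transform(np.asarray(image, dtype=np.float64), c)


r = np.array([0, 1, 10, 100, 255])   # hypothetical input levels
s = scaled_log(r)                    # low levels are spread out, high ones compressed
```

With this scaling the dark input level 10 is mapped above 100, which illustrates how a narrow range of low values is expanded.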
Power Law Transformations
Why are power laws popular?
A cathode ray tube (CRT), for example, converts a video signal to light in a nonlinear way. The
light intensity is proportional to a power (γ) of the source voltage VS. For a computer
CRT, γ is about 2.2.

Viewing images properly on monitors requires γ-correction.

Power law transformations have the following form:
s = c * r^γ, where c and γ are positive constants.
We usually set c to 1, giving s = r^γ. Grey levels must be in the range [0.0, 1.0].
Gamma correction is used for display improvements.
Sometimes it is also written as s = c (r + ε)^γ, and this offset is to provide a measurable
output even when input values are zero.

33
s = c r^γ
c and γ are positive constants.
Power-law curves with fractional values of γ
map a narrow range of dark input values into a
wider range of output values, with the opposite
being true for higher values of input levels.

c = γ = 1 gives the identity function.
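
The behaviour of the power-law curves can be sketched as follows, assuming NumPy and grey levels normalised to [0, 1]. The function name and the sample values are illustrative; the display gamma of 2.2 is the CRT figure quoted earlier.

```python
import numpy as np


def power_law(r, gamma, c=1.0, eps=0.0):
    """s = c * (r + eps) ** gamma for grey levels r in [0, 1]."""
    return c * (np.asarray(r, dtype=np.float64) + eps) ** gamma


r = np.linspace(0.0, 1.0, 5)                   # 0, 0.25, 0.5, 0.75, 1
brighter = power_law(r, gamma=0.5)             # fractional gamma expands dark values
darker = power_law(r, gamma=2.2)               # gamma > 1 compresses dark values
corrected = power_law(darker, gamma=1 / 2.2)   # gamma correction undoes a 2.2 response
identity = power_law(r, gamma=1.0)             # c = gamma = 1
```

Applying the exponent 1/2.2 before display cancels a display response of 2.2, which is the idea behind gamma correction.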

34
35
Effect of decreasing gamma
When γ is reduced too much, the image loses contrast to the point where it starts
to have a very slight "washed-out" look, especially in the background.

(a) The image has a washed-out appearance; it needs a compression of gray levels, so it needs γ > 1
(b) Result after power-law transformation with γ = 3.0 (suitable)
(c) Transformation with γ = 4.0 (suitable)
(d) Transformation with γ = 5.0 (high contrast; the image has areas that are too dark, and some detail is lost)

36
37
Piecewise Linear Transformation Functions

Piecewise functions can be arbitrarily complex
• A disadvantage is that their specification requires significant user input
• Example functions:
– Contrast stretching
– Intensity-level slicing
– Bit-plane slicing
Contrast Stretching
Low contrast images often occur due to poor or nonuniform lighting conditions, or due to nonlinearity
or small dynamic range of the imaging sensor.
The purpose of contrast stretching is to process such images so that their dynamic range becomes
high and the details of the objects present in the image are clearly visible. Contrast
stretching expands the dynamic range of intensity levels in an image so that it spans the
full intensity range of the recording medium or display device.

38
Control points (r1,s1) and (r2,s2) control the shape of the transform T(r)
• If r1=s1 and r2=s2, the transformation is linear and produces no changes in
intensity levels
• r1=r2, s1=0 and s2=L-1 yields a thresholding function that creates a binary
image
• Intermediate values of (r1,s1) and (r2,s2) produce various degrees of
spread in the intensity levels
In general, r1≤r2 and s1≤s2 is assumed so that the function is single
valued and monotonically increasing.
If (r1,s1)=(rmin,0) and (r2,s2)=(rmax,L-1), where rmin and rmax are the
minimum and maximum levels in the image, the transformation
stretches the levels linearly from their original range to the full range [0, L-1].
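
A minimal sketch of contrast stretching through two control points, assuming NumPy. It uses `np.interp` for the piecewise-linear curve and therefore assumes 0 < r1 < r2 < L-1 (the thresholding case r1 = r2 is not handled); the sample image is hypothetical.

```python
import numpy as np


def contrast_stretch(image, r1, s1, r2, s2, levels=256):
    """Piecewise-linear T(r) through (0,0), (r1,s1), (r2,s2), (L-1,L-1).

    Assumes 0 < r1 < r2 < L-1 so the breakpoints are strictly increasing.
    """
    xp = [0, r1, r2, levels - 1]
    fp = [0, s1, s2, levels - 1]
    return np.interp(image, xp, fp)


# Hypothetical low-contrast image occupying only levels 50..150.
img = np.array([[50, 75], [100, 150]])
stretched = contrast_stretch(img, img.min(), 0, img.max(), 255)  # full-range stretch
unchanged = contrast_stretch(img, 50, 50, 150, 150)              # r1=s1, r2=s2
```

Setting (r1,s1)=(rmin,0) and (r2,s2)=(rmax,L-1) spreads the levels over the full range, while r1=s1 and r2=s2 leaves the image unchanged.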

39
Two common approaches to intensity-level (gray-level) slicing
– Set all pixel values within a range of
interest to one value (white) and all
others to another value (black)
• Produces a binary image
That means: display a high value for the
range of interest, else a low value
("discard background")
– Brighten (or darken) pixel values in a
range of interest and leave all others
unchanged. That means: display a high
value for the range of interest, else the original
value ("preserve background")
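
Both slicing approaches can be sketched with a boolean mask, assuming NumPy. The function names, the range of interest [100, 200] and the sample image are illustrative assumptions.

```python
import numpy as np


def slice_discard_background(image, low, high, levels=256):
    """High value inside [low, high], low value elsewhere (binary result)."""
    in_range = (image >= low) & (image <= high)
    return np.where(in_range, levels - 1, 0)


def slice_preserve_background(image, low, high, levels=256):
    """High value inside [low, high], original value elsewhere."""
    in_range = (image >= low) & (image <= high)
    return np.where(in_range, levels - 1, image)


img = np.array([[10, 120], [150, 240]])        # hypothetical image
binary = slice_discard_background(img, 100, 200)
kept = slice_preserve_background(img, 100, 200)
```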

40
Bit Plane Slicing
By isolating particular bits of the pixel values in an image we can highlight interesting aspects of
that image.
Higher-order bits contain most of the significant visual information. Lower-order bits contain subtle details.

Reconstruction is obtained by:

$I(i, j) = \sum_{n=1}^{N} 2^{n-1} I_n(i, j)$

where $I_n$ is the n-th bit plane and N is the number of bits per pixel.
Levels 0 to 127 can be mapped as 0, and levels 128 to 255 can be mapped as 1.
For an 8 bit image, the above forms a binary image (the most significant bit plane).
This occupies less storage space.
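
Bit-plane slicing and reconstruction can be sketched with shifts and masks, assuming NumPy and an 8-bit image. Note that the code indexes planes from 0 (least significant) to 7, so plane n carries weight 2^n; the names and the test image are illustrative.

```python
import numpy as np


def bit_planes(image, bits=8):
    """Return the list of binary bit planes, least significant first."""
    return [(image >> n) & 1 for n in range(bits)]


def reconstruct(planes):
    """Weighted sum of the planes: plane n contributes 2**n."""
    return sum(plane.astype(np.int64) << n for n, plane in enumerate(planes))


img = np.array([[0, 127], [128, 255]], dtype=np.uint8)   # hypothetical image
planes = bit_planes(img)
msb = planes[7]                      # 1 exactly where the level is 128 or more
rebuilt = reconstruct(planes)        # all 8 planes give back the image
coarse = reconstruct([np.zeros_like(p) for p in planes[:6]] + planes[6:])  # top 2 planes only
```

Keeping only the two highest planes reduces 255 to 192, which shows how much of the visual information the high-order bits carry.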

41
Image Dynamic Range, Brightness and Contrast
The dynamic range of an image is the exact subset of gray values (0, 1, 2, ..., L-1) that are present in the
image. The image histogram gives a clear indication of its dynamic range.
When the dynamic range of the image is concentrated on the lower side of the gray scale, the image will
be a dark image.
When the dynamic range of an image is biased towards the high side of the gray scale, the
image will be a bright or light image.
An image with low contrast has a dynamic range that is narrow and concentrated
in the middle of the gray scale. Such images have a dull or washed-out look.
When the dynamic range of the image is significantly broad, the image will have high
contrast and the distribution of pixels will be near uniform.

42
Histogram equalization (histogram linearisation)
requires construction of a
transformation function sk = T(rk)
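
The slide does not give the expression for sk. The sketch below assumes the usual textbook form, in which sk is (L-1) times the cumulative normalised histogram up to level k, and uses NumPy; the random dark test image is made up.

```python
import numpy as np


def equalize(image, levels=256):
    """Histogram equalization with the assumed mapping s_k = (L-1) * cdf(r_k)."""
    hist = np.bincount(image.ravel(), minlength=levels)   # n_k for each level
    cdf = np.cumsum(hist) / image.size                    # cumulative p_r(r_k)
    mapping = np.round((levels - 1) * cdf).astype(image.dtype)
    return mapping[image]                                 # look up s_k per pixel


rng = np.random.default_rng(0)
dark = rng.integers(0, 64, size=(64, 64)).astype(np.uint8)   # levels 0..63 only
equalized = equalize(dark)
```

The dark image uses only the lowest quarter of the gray scale; after equalization its levels are spread over almost the whole range.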

43
44
HISTOGRAM EQUALISATION IS NOT ALWAYS DESIRED.
Some applications need a histogram specified to their requirements.
This is called histogram specification or histogram matching.

It is a two-step process:
- perform histogram equalization on the image
- perform a gray-level mapping using the inverse of the desired cumulative histogram

45
Arithmetic operations

46
Addition:
Image averaging reduces noise. Images must be registered before adding. An
important application of image averaging is in the field of astronomy, where
imaging with very low light levels is routine, causing sensor noise frequently to
render single images virtually useless for analysis.

g(x, y) = f(x, y) + η(x, y)

As the number K of averaged images increases, the variability (noise) of the pixel values at each
location (x, y) decreases.
In practice, the images gi(x, y) must be registered (aligned) in order to avoid the
introduction of blurring and other artifacts in the output image.

47
Subtraction
A frequent application of image subtraction is in the enhancement of differences
between images. Black (0 values) in the difference image indicates the locations where
there is no difference between the images.
One of the most commercially successful and beneficial uses of image subtraction
is in the area of medical imaging called mask mode radiography.
g(x, y) = f(x, y) - h(x, y)

Example: digital angiography, using a mask image and a live image taken with fluid injected.
The difference is useful for identifying blocked fine blood vessels.
The difference of two 8 bit images can range from -255 to 255, and the sum of two
images can range from 0 to 510.
Given an image f(x, y), fm = f - min(f) creates an image whose minimum value is
zero.
fs = k [fm / max(fm)]
fs is a scaled image whose values range from 0 to k. For an 8 bit image,
k = 255.

[Figure: (a) mask image; (b) an image (taken after injection of a contrast
medium (iodine) into the bloodstream) with the mask subtracted out.]
48
NOT (complement): the output pixels are the set of elements
not in A. All elements in A become 0
and the others become 1.

AND: the output is the set of
coordinates common to A
and B.

OR: the output pixels belong to
either A or B or both.

Exclusive OR: the output pixels
belong to either A or B but not to
both.

49
Image multiplication and division

Image multiplication and division are used in shading correction.

g(x, y) = f(x, y) × h(x, y)
g(x, y) is the sensed image
f(x, y) is the perfect image
h(x, y) is the shading function.
If h(x, y) is known, the sensed image can be multiplied by the inverse of h(x, y) to get
f(x, y), that is, g(x, y) is divided by h(x, y).

Another use of multiplication is Region Of Interest (ROI) masking: the given
image is multiplied by a mask image that has 1s in the ROI and 0s elsewhere. There can be more
than one ROI in the mask image.
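
Both uses of multiplication can be sketched in a few lines, assuming NumPy. The constant "perfect" image, the linear shading function and the ROI position are all invented for illustration, and the shading function is assumed to be nonzero everywhere.

```python
import numpy as np

f = np.full((4, 4), 100.0)                    # hypothetical perfect image
h = np.linspace(0.5, 1.0, 4).reshape(1, 4)    # hypothetical shading, darker on the left
g = f * h                                     # sensed (shaded) image

recovered = g / h                             # shading correction: divide by h

mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1                            # ROI: the central 2x2 block
roi = f * mask                                # pixels outside the ROI become 0
```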

50
51

52
Let g(x, y) denote an image corrupted by adding noise η(x, y) to a noiseless image f(x, y):

$g(x, y) = f(x, y) + \eta(x, y)$

The noise has zero mean value: $E[z_i] = 0$
At every pair of coordinates $z_i = (x_i, y_i)$ the noise is uncorrelated: $E[z_i z_j] = 0$
The noise effect is reduced by averaging a set of K noisy images. The new image is

$\bar{g}(x, y) = \frac{1}{K} \sum_{i=1}^{K} g_i(x, y)$
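
The effect of averaging K noisy images can be simulated, assuming NumPy. The constant noiseless image, the Gaussian noise model with standard deviation 20 and the values of K are assumptions chosen for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(1)
f = np.full((32, 32), 100.0)        # hypothetical noiseless image


def average_noisy(K, sigma=20.0):
    """Average K images g_i = f + eta_i with zero-mean, uncorrelated noise."""
    noisy = f + rng.normal(0.0, sigma, size=(K,) + f.shape)
    return noisy.mean(axis=0)


std_1 = np.std(average_noisy(1) - f)     # residual noise of a single image
std_64 = np.std(average_noisy(64) - f)   # residual noise after averaging 64 images
```

For uncorrelated zero-mean noise the residual standard deviation falls roughly as 1/sqrt(K), so averaging 64 images should reduce it by a factor of about 8.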

53
Spatial filters are also called spatial masks, kernels, templates, or windows.
They are linear filters or nonlinear filters, based on the operation performed on the image.
Filtering means accepting (passing) or rejecting some frequencies.
Mechanics of spatial filtering:

f(x, y) → Filter → g(x, y)

54
[Figure: mechanics of spatial filtering. A neighbourhood (a small rectangle of
size 3x3, 5x5, 7x7 etc.) and a predefined operation are applied to the input
image to produce the filtered image.]
The window centre moves from the first pixel (0, 0) to the end of the first row,
then along the second row, and so on until the last pixel (M-1, N-1) of the
input image. Filtering creates a new pixel in the output image at the
window centre as it moves.

Mask coefficients and their offsets for a 3x3 window:

W1 (-1,-1)   W2 (-1,0)   W3 (-1,1)
W4 (0,-1)    W5 (0,0)    W6 (0,1)
W7 (1,-1)    W8 (1,0)    W9 (1,1)

At any point (x, y) in the image, the response g(x, y) of the filter is the sum of
products of the filter coefficients and the image pixels
encompassed by the filter.
Observe that the filter coefficient w(0,0) aligns with the pixel at location (x, y):
g(x, y) = w(-1,-1) f(x-1, y-1) + w(-1,0) f(x-1, y) + ...
+ w(0,0) f(x, y) + ...
+ w(1,1) f(x+1, y+1)
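
The sum of products can be sketched directly, assuming NumPy. The zero padding at the border is an assumption (the slides do not say how border pixels are handled), and the box mask and constant test image are illustrative.

```python
import numpy as np


def filter2d(f, w):
    """g(x, y) = sum over (s, t) of w(s, t) * f(x + s, y + t), zero padded."""
    a, b = w.shape[0] // 2, w.shape[1] // 2
    rows, cols = f.shape
    padded = np.pad(f.astype(np.float64), ((a, a), (b, b)))  # zeros outside the image
    g = np.zeros((rows, cols), dtype=np.float64)
    for s in range(-a, a + 1):
        for t in range(-b, b + 1):
            # Shifted copy of the image weighted by the coefficient w(s, t).
            g += w[s + a, t + b] * padded[a + s:a + s + rows, b + t:b + t + cols]
    return g


f = np.full((5, 5), 9.0)            # hypothetical constant image
box = np.ones((3, 3)) / 9.0         # 3x3 averaging mask
g = filter2d(f, box)
```

In the interior the average of a constant image is unchanged, while at a corner only 4 of the 9 mask positions cover image pixels, so the zero padding lowers the response.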
55
56
Simply move the filter mask from point to point in an image.
At each point (x, y), the response of the filter at that point is calculated
using a predefined relationship.

57
Smoothing (blurring) filters: low pass filters (integration), linear
> Types: ideal LF, Butterworth LF, Gaussian LF
> Purpose: noise reduction by removing sharp edges and sharp intensity transitions
> Side effect: this will blur sharp edges
> Average filter: R = 1/9 [Σ zi], i = 1 to 9
> Box filter (if all coefficients are equal)
> Weighted average: the mask will have different coefficients

Order statistic filters: nonlinear
> 1. Median filter (50th percentile): salt and pepper noise or impulse noise removal
> 2. Max filter (100th percentile): finds bright objects
> 3. Min filter (zero percentile)

Sharpening filters (differentiation): highlight sharp intensity transitions
> Second derivative filter (Laplacian filter) for image sharpening: linear
> Differentiation or gradient is used for edge detection
> Unsharp masking and high boost filtering for image sharpening
> First order derivatives (gradient magnitude) for image sharpening: nonlinear

58
Smoothing Linear Filter or averaging filters or Low pass filters

The output (response) of a smoothing, linear spatial filter is simply the average of the pixels
contained in the neighborhood of the filter mask. These filters are sometimes called
averaging filters. They are also referred to as lowpass filters.

Weighted average mask: the central pixel usually has a higher weight. The weight is
inversely proportional to the distance of the pixel from the centre of the mask.

The general implementation for filtering an M x N image with a weighted averaging
filter of size m x n (m and n odd) uses m = 2a+1 and
n = 2b+1, where a and b are nonnegative integers. An important application of spatial
averaging is to blur an image for the purpose of getting a gross representation of objects
of interest, such that the intensity of smaller objects blends with the background;
the larger objects can then be detected after filtering and thresholding.

59
60
Examples of Low Pass Masks ( Local Averaging)

61
Popular techniques for lowpass spatial filtering

Uniform filtering
The most popular masks for low pass filtering are masks with all their coefficients
positive and equal to each other, for example a 3 x 3 mask with every coefficient equal to 1/9. Moreover,
they sum up to 1 in order to maintain the mean of the image.

Gaussian filtering
The two dimensional Gaussian mask has values that attempt to approximate the
continuous function. In theory, the Gaussian distribution is non-zero everywhere,
which would require an infinitely large convolution kernel, but in practice it is
effectively zero more than about three standard deviations from the mean, and so we
can truncate the kernel at this point. A suitable integer-valued
convolution kernel can approximate a Gaussian with a σ of 1.0.

62
Order-Statistics ( non linear )Filters

The best-known example in this category is the median filter, which, as its name implies,
replaces the value of a pixel by the median of the gray levels in the neighborhood of that pixel
(the original value of the pixel is included in the computation of the median).

Order statistic (nonlinear) filter / median filter. Objective: replace the value of the pixel by
the median of the intensity values in the neighbourhood of that pixel.

Although the median filter is by far the most useful order-statistics filter in image processing, it
is by no means the only one. The median represents the 50th percentile of a ranked set of
numbers, but the reader will recall from basic statistics that ranking lends itself to many other
possibilities. For example, using the 100th percentile results in the so-called max filter,
which is useful in finding the brightest points in an image. The response of a 3x3 max filter is
given by R = max{zk | k = 1, 2, ..., 9}

The 0th percentile filter is the min filter, used for the opposite purpose.
Example nonlinear spatial filters:
– Median filter: computes the median gray-level value of the
neighborhood. Used for noise reduction.
– Max filter: used to find the brightest points in an image. R = max{zk | k = 1, 2, ..., 9}
– Min filter: used to find the dimmest points in an image. R = min{zk | k = 1, 2, ..., 9}
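
The three order-statistic filters differ only in which ranked value they pick, which the sketch below makes explicit. It assumes NumPy; the edge-replicating border handling, the function name and the test image with a single "salt" pixel are assumptions for illustration.

```python
import numpy as np


def order_filter(f, reducer, size=3):
    """Apply reducer (e.g. np.median, np.max, np.min) to every size x size window."""
    a = size // 2
    padded = np.pad(f, a, mode="edge")        # replicate border pixels (an assumption)
    g = np.empty_like(f)
    for x in range(f.shape[0]):
        for y in range(f.shape[1]):
            g[x, y] = reducer(padded[x:x + size, y:y + size])
    return g


img = np.full((5, 5), 100, dtype=np.uint8)
img[2, 2] = 255                               # one impulse ("salt") pixel

median = order_filter(img, np.median)         # removes the impulse
brightest = order_filter(img, np.max)         # spreads the bright point
darkest = order_filter(img, np.min)           # keeps only the dark level
```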

63
[Figure: nonlinear median filter example, f(x, y) → g(x, y). The values of a 3x3
neighbourhood of f (101, 86, 99; 100, 106, 103; 91, 102, 109) are ranked and
the centre pixel of g is set to their median, 101.]

64
High pass filter example

A high pass filtered image may be computed as the difference between the original
image and a lowpass filtered version of that image as follows:

High pass = Original – Low pass

Multiplying the original by an amplification factor A yields a highboost or
high-frequency-emphasis filter:

High boost = A × Original – Low pass = (A – 1) × Original + High pass

65
Highpass filter example: Unsharp masking

[Figure: high-boost filtering results for A = 1 (highpass filter), A = 1.1,
A = 1.15 and A = 1.2.]

66
The high-boost filtered image looks more like the original with a degree of
edge enhancement, depending on the value of A.
A determines the nature of the filtering.
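
High pass and high-boost filtering can be sketched on a small step edge, assuming NumPy. The 3x3 box blur used as the low pass filter, the edge-replicating border and the test image are assumptions; the high-boost form A × Original minus Low pass is the standard one and is assumed here.

```python
import numpy as np


def box_blur(f, size=3):
    """Low pass: mean of each size x size neighbourhood (edge-replicated border)."""
    a = size // 2
    rows, cols = f.shape
    padded = np.pad(f.astype(np.float64), a, mode="edge")
    g = np.zeros((rows, cols))
    for s in range(size):
        for t in range(size):
            g += padded[s:s + rows, t:t + cols]
    return g / (size * size)


def high_boost(f, A=1.0):
    """A * original - low pass; A = 1 gives the plain high pass image."""
    return A * f - box_blur(f)


step = np.tile(np.array([10.0] * 4 + [50.0] * 4), (4, 1))   # vertical step edge
highpass = high_boost(step, A=1.0)
boosted = high_boost(step, A=1.2)
```

The high pass image is zero in the flat regions and nonzero only around the edge; with A = 1.2 a fraction of the original is added back, so the result resembles the original with the edge emphasised.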

67
Sharpening Spatial Filters

Since averaging is analogous to integration, it is logical to conclude
that sharpening could be accomplished by spatial differentiation.

This section deals with various ways of defining and implementing operators for
image sharpening by digital differentiation.

Fundamentally, the strength of the response of a derivative operator is
proportional to the degree of discontinuity of the image at the point at which
the operator is applied. Thus, image differentiation enhances edges and
other discontinuities (such as noise) and deemphasizes areas with slowly varying
gray-level values.

68
Use of first derivatives for Image Sharpening (Non linear) (EDGE
enhancement)

About two dimensional high pass spatial filters

An edge is the boundary between two regions with relatively distinct grey level
properties. The idea underlying most edge detection techniques is the computation of
a local derivative operator.
The magnitude of the first derivative calculated within a neighborhood around
the pixel of interest can be used to detect the presence of an edge in an image.
First derivatives in image processing are implemented using the magnitude of the
gradient.
For a function f(x, y), the gradient of f at coordinates (x, y) is defined as the
two-dimensional column vector

$\nabla f = \left[ \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right]^{T}$

69
The magnitude M(x, y) of this vector, generally referred to simply as the gradient, is

$\nabla f(x, y) = \mathrm{mag}(\nabla f(x, y)) = \left[ \left( \frac{\partial f(x, y)}{\partial x} \right)^{2} + \left( \frac{\partial f(x, y)}{\partial y} \right)^{2} \right]^{1/2}$

M(x, y) is the same size as the original image. It is common practice to refer to this
image as the gradient image or simply as the gradient.

Common practice is to approximate the gradient with absolute values, which is simpler to
implement, as follows.

$\nabla f(x, y) \approx \left| \frac{\partial f(x, y)}{\partial x} \right| + \left| \frac{\partial f(x, y)}{\partial y} \right|$

70
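A basic definition of the first-order derivative of a one-dimensional
function f(x) is the difference

$\frac{\partial f}{\partial x} = f(x + 1) - f(x)$

Similarly, we define a second-order derivative as the difference

$\frac{\partial^{2} f}{\partial x^{2}} = f(x + 1) + f(x - 1) - 2 f(x)$

In the x and y directions of a two-dimensional image:

$\frac{\partial^{2} f}{\partial x^{2}} = f(x + 1, y) + f(x - 1, y) - 2 f(x, y)$
$\frac{\partial^{2} f}{\partial y^{2}} = f(x, y + 1) + f(x, y - 1) - 2 f(x, y)$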

71
Sharpening Spatial Filters

Any definition of a first derivative
(1) must be zero in flat segments (areas of constant gray-level values);

(2) must be nonzero at the onset of a gray-level step or ramp; and

(3) must be nonzero along ramps.

Similarly, any definition of a second derivative
(1) must be zero in flat areas;

(2) must be nonzero at the onset and end of a gray-level step or ramp;

(3) must be zero along ramps of constant slope.

72
Laplacian operator (for enhancing fine details)
The Laplacian of a 2-D function f(x, y) is a second order derivative defined as

$\nabla^{2} f(x, y) = \frac{\partial^{2} f(x, y)}{\partial x^{2}} + \frac{\partial^{2} f(x, y)}{\partial y^{2}}$

The digital implementation of the two-dimensional Laplacian is obtained by
summing the second-order differences in the x and y directions.
In practice it can also be implemented using a 3x3 mask (written here with the
sign convention in which the centre coefficient is positive):

$\nabla^{2} f \approx 4 z_5 - (z_2 + z_4 + z_6 + z_8)$

The main disadvantage of the Laplacian operator
is that it produces double edges
73
(a) Filter mask used to implement the digital Laplacian

(b) Mask used to implement an extension
that includes the diagonal neighbors.

(c), (d) Two other implementations of the Laplacian

74
LAPLACIAN + ADDITION WITH ORIGINAL IMAGE DIRECTLY

75
76

78
79
DERIVATIVE OPERATORS

Roberts operator
The gradient magnitude can be approximated at point z5 of a 3 x 3 neighbourhood
(z1 to z9, numbered row by row) in a number of ways. The
simplest is to use the difference (z5 - z8) in the x direction and (z5 - z6) in the y
direction. This approximation is known as the Roberts operator, and is expressed
mathematically as follows:

$\nabla f \approx |z_5 - z_8| + |z_5 - z_6|$

Another approach for approximating the equation is to use cross differences:

$\nabla f \approx |z_5 - z_9| + |z_6 - z_8|$

80
Above Equations can be implemented by using the following masks.
The original image is convolved with both masks separately and the
absolute values of the two outputs of the convolutions are added.

81
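Prewitt operator

$\nabla f(x, y) \approx \left| \frac{\partial f(x, y)}{\partial x} \right| + \left| \frac{\partial f(x, y)}{\partial y} \right|$

Another approximation to the above equation, but using a 3 x 3 neighbourhood, is:

$\nabla f \approx |(z_7 + z_8 + z_9) - (z_1 + z_2 + z_3)| + |(z_3 + z_6 + z_9) - (z_1 + z_4 + z_7)|$

• The difference between the first and third rows approximates the derivative in the x
direction
• The difference between the first and third columns approximates the derivative in the y
direction
• The Prewitt operator masks may be used to implement the above approximation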

82
83
The summation of coefficients in all masks equals 0, indicating that
they would give a response of 0 in an area of constant gray level.
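
The zero-sum property can be checked on the Prewitt masks, assuming NumPy. The masks follow the convention in these slides that x runs along rows (third row minus first row) and y along columns; the helper names and the two 3x3 test regions are illustrative.

```python
import numpy as np

# Prewitt masks: rows differ for the x direction, columns for the y direction.
PREWITT_X = np.array([[-1, -1, -1],
                      [0, 0, 0],
                      [1, 1, 1]], dtype=float)
PREWITT_Y = PREWITT_X.T


def response(region, mask):
    """Sum of products of a 3x3 region with a 3x3 mask."""
    return float(np.sum(region * mask))


def prewitt_gradient(region):
    """Approximate gradient magnitude: |Gx| + |Gy| at the region centre."""
    return abs(response(region, PREWITT_X)) + abs(response(region, PREWITT_Y))


flat = np.full((3, 3), 7.0)                                  # constant gray level
edge = np.array([[0, 0, 0], [0, 0, 0], [9, 9, 9]], float)    # change along x (rows)

grad_flat = prewitt_gradient(flat)
grad_edge = prewitt_gradient(edge)
```

Because the coefficients of each mask sum to zero, the constant region gives no response, while the region containing the edge gives a large one.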

84
85
Filtering in the Frequency Domain

Filters in the frequency domain can be divided in four groups:

Low pass filters (IMAGE BLUR)
• Remove frequencies away from the origin
• Commonly, the frequency response of these filters is symmetric around the origin
• The largest amount of energy is concentrated in the low frequencies, but it represents just
the image luminance and a visually less important part of the image

High pass filters (EDGE DETECTION)
• Remove signal components around the origin and keep those further away from it
• The small energy at high frequencies corresponds to visually very important image features
such as edges and details. Sharpening = boosting high frequency components

Band pass filters
• Allow frequencies in the band between the lowest and the highest frequencies

Stop band filters
• Remove a frequency band
• To remove certain frequencies, set their corresponding F(u) coefficients to zero

86
Low pass filters (smoothing filters)
> Ideal low pass filters (ILPF)
> Butterworth low pass filters (BLPF)
> Gaussian low pass filters (GLPF)
High pass filters (sharpening filters)
> Ideal high pass filters (IHPF)
> Butterworth high pass filters (BHPF)
> Gaussian high pass filters (GHPF)
> Laplacian in frequency domain
> High boost, high frequency emphasis filters

Homomorphic filters (using the natural logarithm, ln)

f(x, y) = i(x, y) r(x, y)
F[ln f(x, y)] = F[ln(i(x, y) r(x, y))] = F[ln i(x, y)] + F[ln r(x, y)]

87
88
89
IMAGE ENHANCEMENT III (Fourier)

90
Filtering in the Frequency Domain
• Basic Steps for zero padding
Zero pad the input image f(x, y) to P = 2M-1 and Q = 2N-1 if the arrays are of the same size.
If the functions f(x, y) and h(x, y) are of size M x N and K x L, respectively, pad both with zeros to at least:
P ≥ M + K - 1
Q ≥ N + L - 1
• Zero-pad h and f to the same size P x Q
• Radix-2 FFT requires a power of 2.
For example, if M = N = 512 and K = L = 16, then P = Q = 1024
• Results in linear convolution
• Extract center MxN
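
The padding rule can be checked against a direct convolution, assuming NumPy's FFT routines. The helper names, the random test arrays and the way the centre M x N block is cropped are illustrative choices.

```python
import numpy as np


def next_pow2(n):
    """Smallest power of 2 that is >= n (for radix-2 FFTs)."""
    return 1 << (n - 1).bit_length()


def fft_linear_convolve(f, h):
    """Linear convolution via zero-padded FFTs; returns (full, centre M x N)."""
    M, N = f.shape
    K, L = h.shape
    P, Q = next_pow2(M + K - 1), next_pow2(N + L - 1)
    F = np.fft.fft2(f, s=(P, Q))            # fft2 zero-pads to P x Q
    H = np.fft.fft2(h, s=(P, Q))
    full = np.real(np.fft.ifft2(F * H))[:M + K - 1, :N + L - 1]
    top, left = (K - 1) // 2, (L - 1) // 2
    return full, full[top:top + M, left:left + N]


def direct_full(f, h):
    """Reference: full linear convolution computed by shifting and adding."""
    M, N = f.shape
    K, L = h.shape
    out = np.zeros((M + K - 1, N + L - 1))
    for s in range(K):
        for t in range(L):
            out[s:s + M, t:t + N] += h[s, t] * f
    return out


rng = np.random.default_rng(0)
f = rng.random((6, 5))
h = rng.random((3, 3))
full, centre = fft_linear_convolve(f, h)
```

With enough zero padding the circular convolution computed by the DFT has no wraparound and equals the linear convolution; for the 512 and 16 example the padded size is 1024.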

Practical implementation: overlap-add partitions the image into I x J smaller blocks,
pads each block and the filter h to the same size, filters each
block separately, and recombines the results.

91
Filtering in the Frequency Domain
• Basic steps for filtering in the frequency domain:

1. Multiply the padded input image by (-1)^(x+y) to center the transform.
2. Compute F(u,v), the DFT of the image from (1).
3. Multiply F(u,v) by a filter function H(u,v).
4. Compute the inverse DFT of the result in (3).
5. Obtain the real part of the result in (4).
6. Multiply the result in (5) by (-1)^(x+y).

Given the filter H(u,v) (also called the filter transfer function or filter function) in the
frequency domain, the Fourier transform of the output (filtered) image is given by:

G(u,v) = H(u,v) F(u,v)    Step (3) is array (element-wise) multiplication

The filtered image g(x,y) is simply the inverse Fourier transform of G(u,v):

g(x,y) = F^-1[G(u,v)] = F^-1[H(u,v) F(u,v)]    Step (4)

F, H and g are arrays of the same size as the (padded) input image. F^-1 is the IDFT.
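
The six steps can be written almost line for line in NumPy. The sketch below is illustrative only: the function names are hypothetical, padding is assumed to have been done already, and the Gaussian low pass transfer function is just one possible choice of H(u,v).

```python
import numpy as np

def filter_frequency_domain(f, H):
    """Apply a centered transfer function H (same shape as f) to image f."""
    x, y = np.indices(f.shape)
    sign = (-1.0) ** (x + y)          # step 1: center the transform
    F = np.fft.fft2(f * sign)         # step 2: DFT
    G = H * F                         # step 3: element-wise product
    g = np.real(np.fft.ifft2(G))      # steps 4 and 5: inverse DFT, real part
    return g * sign                   # step 6: undo the centering

def gaussian_lowpass(shape, d0):
    """Centered Gaussian low pass transfer function with cutoff d0 (assumed form)."""
    P, Q = shape
    u, v = np.indices(shape)
    d2 = (u - P // 2) ** 2 + (v - Q // 2) ** 2
    return np.exp(-d2 / (2.0 * d0 ** 2))
```

With H equal to 1 everywhere the image comes back unchanged; with the Gaussian the mean is preserved (H is 1 at the center) while the variance drops.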

Low pass filters attenuate high frequencies while "passing" low frequencies.

High pass filters attenuate low frequencies while "passing" high frequencies.

Correspondence between filtering in spatial and frequency domains

Filtering in the frequency domain is the multiplication of the filter by the Fourier
transform of the input image:

G(u,v) = H(u,v) F(u,v)

Let us find the spatial domain equivalent of the frequency domain filter H(u,v).

Consider f(x,y) = δ(x,y); we know that F(u,v) = 1.

Then the filtered output is F^-1[H(u,v) F(u,v)] = F^-1[H(u,v)].

This is the inverse transform of the frequency domain filter, which is nothing but the
filter in the spatial domain.
Conversely, if we take the forward Fourier transform of a spatial filter, we get its
Fourier domain representation.

Therefore, the two filters form a transform pair:

h(x,y) ⇔ H(u,v)

Since h(x,y) can be obtained from the response of a frequency domain filter to an
impulse, the spatial filter h(x,y) is sometimes referred to as the finite impulse response
(FIR) filter of H(u,v).

f(x,y) * h(x,y) ⇔ F(u,v) H(u,v)

For small filter masks, spatial domain processing is generally faster than frequency
domain processing.
In some cases, it is desirable to generate a spatial domain mask
that approximates a given frequency domain filter.
The following procedure is one way to create these masks in a
least square error sense.
Recall that filter processing in the frequency domain, which is the
product of filter and function, becomes convolution of function
and filter in the spatial domain.

Consider the following filter transfer function:

H(u,v) = 0 at the origin of the frequency plane (the location of the DC term F(0,0)), and 1 otherwise.

This filter sets F(0,0) to zero and leaves all the other frequency components unchanged.
Such a filter is called a notch filter, since it is a constant function with a hole
(notch) at the origin.
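
As a small illustration, zeroing the DC term forces the average value of the output image to zero. The sketch assumes an uncentered DFT, where F(0,0) sits at array index [0, 0]; the function name `remove_dc` is hypothetical.

```python
import numpy as np

def remove_dc(f):
    """Notch out the DC term; the result has zero mean."""
    F = np.fft.fft2(f)
    H = np.ones(f.shape)
    H[0, 0] = 0.0                      # the notch at the origin
    return np.real(np.fft.ifft2(H * F))
```

The output equals the input minus its mean, so negative values appear and a display would normally need rescaling.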

HOMOMORPHIC FILTERING
An image can be modeled mathematically in terms of illumination and reflectance as follows:
f(x,y) = i(x,y) r(x,y)

Note that:
F{f(x,y)} ≠ F{i(x,y)} F{r(x,y)}

To accomplish separability, first map the model to the natural log domain and
then take its Fourier transform:
z(x,y) = ln{f(x,y)} = ln{i(x,y)} + ln{r(x,y)}
Then,
F{z(x,y)} = F{ln i(x,y)} + F{ln r(x,y)}
or
Z(u,v) = I(u,v) + R(u,v)

Now, if we process Z(u,v) by means of a filter function H(u,v), then
S(u,v) = H(u,v) Z(u,v) = H(u,v) I(u,v) + H(u,v) R(u,v)
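
The whole chain (log, DFT, filter, inverse DFT, exponential) can be sketched as below. The Gaussian-shaped high frequency emphasis filter and its parameters `gamma_l`, `gamma_h`, `d0` and `c` are assumptions chosen for illustration; the slides do not specify a particular H(u,v).

```python
import numpy as np

def homomorphic_filter(f, gamma_l=0.5, gamma_h=2.0, d0=0.1, c=1.0):
    """Homomorphic filtering of a strictly positive image f."""
    z = np.log(f)                                  # z = ln i + ln r
    Z = np.fft.fft2(z)
    u = np.fft.fftfreq(f.shape[0])[:, None]        # frequencies in cycles/sample
    v = np.fft.fftfreq(f.shape[1])[None, :]
    d2 = u ** 2 + v ** 2
    # Gain gamma_l at low frequencies (illumination), gamma_h at high (reflectance).
    H = (gamma_h - gamma_l) * (1.0 - np.exp(-c * d2 / d0 ** 2)) + gamma_l
    s = np.real(np.fft.ifft2(H * Z))
    return np.exp(s)                               # undo the logarithm
```

Choosing gamma_l < 1 and gamma_h > 1 compresses the illumination range while boosting reflectance detail.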

UNIT-III

IMAGE RESTORATION

Model for the image degradation/restoration process
The objective of restoration is to obtain an estimate of the original image f(x,y) from its
degraded version g(x,y), given some knowledge about the
degradation function H and the additive noise η(x,y).
– Additive noise: g(x,y) = f(x,y) + η(x,y)
– Linear blurring: g(x,y) = f(x,y) * h(x,y)
– Linear blurring and additive noise: g(x,y) = f(x,y) * h(x,y) + η(x,y)

The degraded image in the spatial domain is
g(x,y) = h(x,y) * f(x,y) + η(x,y)
where * denotes convolution.
Therefore, in the frequency domain it is
G(u,v) = H(u,v) F(u,v) + N(u,v)

Restoration in the presence of noise only – Spatial filtering

• Mean filters, order-statistic filters

Get more data! Capture N images of the same scene:

g_i(x,y) = f(x,y) + n_i(x,y)
– Average them to obtain a new image g_ave(x,y) = f(x,y) + n_ave(x,y)
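
A minimal sketch of frame averaging, assuming the noise is zero-mean and independent from frame to frame (an assumption not stated on the slide). Under that assumption the noise standard deviation falls roughly as 1/√N.

```python
import numpy as np

def average_frames(frames):
    """Pixel-wise mean of N noisy captures of the same scene."""
    return np.mean(np.stack(frames), axis=0)

# Usage: 16 synthetic captures of a flat scene with Gaussian noise (sigma = 10).
rng = np.random.default_rng(0)
scene = np.full((64, 64), 100.0)
frames = [scene + rng.normal(0.0, 10.0, scene.shape) for _ in range(16)]
g_ave = average_frames(frames)
```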
Estimation of Noise
Consists of finding an image (or subimage) that contains only noise, and then
using its histogram for the noise model
• Noise only images can be acquired by aiming the imaging device (e.g.
camera) at a blank wall
If we cannot find "noise-only" images, select a portion of the image
that has a known histogram and subtract the known values from the
histogram; what is left is our noise model.
• To develop a valid model, many sub-images need to be evaluated
Noise pdfs
1. Gaussian (normal)

Radar range and velocity images typically contain noise that
can be modeled by the Rayleigh distribution
The gray level values of the noise are evenly distributed across a
specific range
• Quantization noise has an approximately uniform distribution
Three principal methods of estimating the degradation function for
image restoration (blind deconvolution, because the restored image will
be only an estimate):
1) Observation, 2) Experimentation, 3) Mathematical modeling

Estimation by observation:
Original image f(x,y) (unknown) → f(x,y) * h(x,y) → degraded image g(x,y)
Observed subimage g_s(x,y) → DFT → G_s(u,v)
Restoration by estimation gives the reconstructed subimage f̂_s(x,y) → DFT → F̂_s(u,v)
Estimated transfer function: H(u,v) ≈ H_s(u,v) = G_s(u,v) / F̂_s(u,v)
This case is used when we know only g(x,y) and cannot repeat the experiment.

Estimation by Mathematical Modeling:

Sometimes the environmental conditions that cause the degradation
can be modeled by a mathematical formulation.
For example, atmospheric turbulence is commonly modeled as
H(u,v) = exp(-K (u² + v²)^(5/6)), where K is a constant that depends on the nature of the turbulence.

If the value of K is large, the turbulence is very strong, whereas if the
value of K is very low, the turbulence is not that strong.

Inverse Filtering (unconstrained):

Even if H(u,v) is known exactly, perfect reconstruction
may not be possible because N(u,v) is not known:
F̂(u,v) = G(u,v) / H(u,v) = F(u,v) + N(u,v) / H(u,v)
Moreover, if H(u,v) is near zero, N(u,v)/H(u,v) will dominate
the estimate F̂(u,v).
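
A hedged sketch of inverse filtering follows. Zeroing the estimate wherever |H| is below a small `eps` is a common workaround for the near-zero problem, but it is an assumption here, not something the slides prescribe; the function name is hypothetical.

```python
import numpy as np

def inverse_filter(G, H, eps=1e-3):
    """Estimate F = G / H, skipping frequencies where |H| <= eps."""
    F_hat = np.zeros_like(G, dtype=complex)
    mask = np.abs(H) > eps
    F_hat[mask] = G[mask] / H[mask]    # plain division where H is safely non-zero
    return F_hat
```

In the noise-free case with H non-zero everywhere, the inverse DFT of the estimate reproduces the original image exactly (up to rounding).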
Minimum Mean Square Error ( Wiener ) Filtering
Least Square Error Filter
Wiener filter (constrained)
Direct Method (Stochastic Regularization)

• Degradation model:
g(x,y) = h(x,y) * f(x,y) + η(x,y)
• Wiener filter: a statistical approach that seeks an
estimate f̂ minimizing the mean square error:
e² = E{(f − f̂)²}

Mean Square Error =

UNIT-IV

IMAGE SEGMENTATION &


MORPHOLOGICAL IMAGE PROCESSING

IMAGE SEGMENTATION

The input is an image; the output is features of the image.
Segmentation is an approach to feature extraction in an image.
Features of an image: points, lines, edges, corner points, regions.
Attributes of features:
Geometrical attributes: orientation, length, curvature, area, diameter, perimeter, etc.
Topological attributes: overlap, adjacency, common end point, parallel, vertical, etc.
Image segmentation refers to the process of
partitioning an image into groups of pixels
that are homogeneous with respect to some criterion.

APPROACHES TO SEGMENTATION

HOUGH TRANSFORM

Point detection:

Detection of lines:
Apply all 4 masks to the image.

There will be four responses R1, R2, R3, R4.

Suppose that at a particular point,
|R1| > |Rj|, where j = 2, 3, 4.

Then that point is most likely on a horizontal line.

In this way, we can decide which point is associated with which line direction.
Line detection is a low level feature extraction.
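
The decision rule can be sketched for a single 3 x 3 neighborhood. The four masks below are the commonly used line-detection masks and are an assumption here, since the slide's own mask figure did not survive; the names in `MASKS` are illustrative.

```python
import numpy as np

# Commonly used 3x3 line masks (assumed; coefficients of each mask sum to zero).
MASKS = {
    "horizontal":    np.array([[-1, -1, -1], [2, 2, 2], [-1, -1, -1]]),
    "vertical":      np.array([[-1, 2, -1], [-1, 2, -1], [-1, 2, -1]]),
    "diagonal":      np.array([[2, -1, -1], [-1, 2, -1], [-1, -1, 2]]),
    "anti_diagonal": np.array([[-1, -1, 2], [-1, 2, -1], [2, -1, -1]]),
}

def line_direction(patch):
    """Return the mask name with the largest |R| for one 3x3 neighborhood."""
    responses = {name: abs(int(np.sum(m * patch))) for name, m in MASKS.items()}
    return max(responses, key=responses.get)
```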

Detection of an edge in an image:
What is an edge?
An ideal edge can be defined as a set of connected pixels, each
of which is located at an orthogonal step transition in gray level.

Calculation of derivatives of edges:

There are various ways in which these first derivative operators can
be implemented:
Prewitt edge operator
Sobel edge operator (takes care of noise through smoothing)

Each operator has a horizontal and a vertical mask, giving the gradient components Gx and Gy.

The direction of the gradient vector ∇f is
α(x,y) = tan^-1(Gy / Gx)

The direction of an edge at a pixel (x,y) is orthogonal to
the direction α(x,y).
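
As an illustration, the Sobel responses and gradient direction for one 3 x 3 neighborhood can be computed as below. The axis convention is an assumption (here Gx responds to left-to-right intensity change and Gy to top-to-bottom change); textbooks differ on which mask is called Gx.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])   # change along columns
SOBEL_Y = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]])   # change along rows

def sobel_at(patch):
    """Return (Gx, Gy, magnitude, alpha in degrees) for a 3x3 patch."""
    gx = float(np.sum(SOBEL_X * patch))
    gy = float(np.sum(SOBEL_Y * patch))
    magnitude = float(np.hypot(gx, gy))
    alpha = float(np.degrees(np.arctan2(gy, gx)))   # arctan2 handles Gx = 0
    return gx, gy, magnitude, alpha
```

For a vertical step edge the gradient points horizontally (alpha = 0), and the edge itself runs orthogonal to it.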

Edge Linking

We have to detect the position of an edge, and from this we expect to obtain the
boundary of a particular segment.

For this there are two approaches: one is local processing.

The second approach is global processing (Hough transform).

EDGE LINKING BY LOCAL PROCESSING
Consider a point (x,y) in an image that has already been processed by the Sobel
edge operator. T is a threshold.

In the edge image, take two points (x,y) and (x',y') and decide whether to link
them using a similarity measure:
the first measure is the strength (magnitude) of the gradient;
the second is the direction of the gradient,
both given by the Sobel edge operator.

If the two points are similar, they are linked
together; this operation is repeated for
every other point in the edge-detected image.

HOUGH TRANSFORM (global processing)
The Hough transform is a mapping from the spatial
domain to a parameter space. For a particular straight line y = mx + c,
the values of m and c are constant.

Spatial domain → Parameter space

This straight line maps to a single point in the parameter space.

So, we have seen 2 cases.
Case one: a straight line in the xy plane is mapped to a
point in the mc plane.
Case two: a point in the xy plane is mapped
to a straight line in the mc plane.

This is the basis of the Hough transform, with which
we can link the different edge points that are present in the
image domain.
Image Space → Parameter Space
Lines → Points
Points → Lines
Collinear points → Intersecting lines

To implement the Hough transform, the entire mc space has to be subdivided into a
number of accumulator cells.
• At each point of the parameter space, count how many lines
pass through it. This array of counts is called the accumulator.
• A cell through which many lines pass is a "bright" point in
the parameter image. It can be found by thresholding.

When the straight line tends to be vertical, the slope m tends to
infinity. To solve this, use the normal representation
of a straight line:

ρ = x cos θ + y sin θ

ρ = length of the perpendicular dropped from the origin to the line
θ = angle the perpendicular makes with the x-axis

A point in image space is now represented as a sinusoid in (ρ, θ) space.

So, unlike the previous case where the parameters were the
slope m and intercept c, the parameters now become ρ and θ.
• Use the parameter space (ρ, θ)
• The new space is FINITE
• 0 < ρ < D, where D = √(M² + N²) is the
image diagonal and M x N is the image size
• 0 < θ < π (or θ = ±90 deg)
• In (ρ, θ) space:
point in image space == sinusoid in (ρ, θ) space;
where sinusoids overlap, the accumulator is maximal;
maxima still correspond to lines in image space
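
A minimal accumulator sketch in (ρ, θ) space is shown below. The one-degree angular resolution, the θ range of -90 to +90 degrees, and the signed ρ axis are implementation choices assumed for illustration; `hough_lines` is a hypothetical name.

```python
import numpy as np

def hough_lines(edge, n_theta=180):
    """Vote in (rho, theta) space for every non-zero pixel of a binary edge map."""
    rows, cols = edge.shape
    diag = int(np.ceil(np.hypot(rows, cols)))          # largest possible |rho|
    thetas = np.deg2rad(np.linspace(-90.0, 90.0, n_theta, endpoint=False))
    acc = np.zeros((2 * diag + 1, n_theta), dtype=int)  # rho index is offset by diag
    ys, xs = np.nonzero(edge)
    for x, y in zip(xs, ys):
        # Each edge point traces one sinusoid through the accumulator.
        rho = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        acc[rho + diag, np.arange(n_theta)] += 1
    return acc, thetas, diag

# Usage: a vertical line at x = 7 in a 20 x 20 edge image.
edge = np.zeros((20, 20), dtype=bool)
edge[:, 7] = True
acc, thetas, diag = hough_lines(edge)
peak_rho_idx, peak_theta_idx = np.unravel_index(acc.argmax(), acc.shape)
```

The accumulator peak collects one vote per collinear pixel, so thresholding the accumulator recovers the line parameters.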

Global thresholding: the threshold value depends only on the pixel intensities in the
image.

Dynamic or adaptive thresholding: the threshold depends on the pixel
value and the pixel position, so the threshold for different pixels
in the image will be different.

Optimal thresholding: estimate how much error is
incurred if we choose a particular threshold, then
choose the threshold value for which the
average error is minimized.

Region based segmentation operations:
thresholding,
region growing, and
region splitting and merging techniques

THRESHOLDING
For a bimodal histogram, there are two peaks.
The simplest form of segmentation is to choose a threshold value T
in the valley region:
if a pixel at location (x,y) has intensity f(x,y) ≥ T, then we say
that the pixel belongs to the object,
whereas if f(x,y) < T, the pixel belongs to the background.
Thresholding in a multimodal histogram

The basic aim of the thresholding operation
is to create a thresholded image g(x,y), which is a
binary image containing pixel values of either 0 or 1, depending
on whether the intensity f(x,y) at location (x,y) is greater than
T or is less than or equal to T.
This is called global thresholding.
Automatic Thresholding
1. Choose an initial value of the threshold T.
2. With this threshold T, segregate the pixels into two groups G1
and G2.
3. Find the mean values of G1 and G2. Let the means be μ1
and μ2.
4. Choose a new threshold as the
average of the means:
T_new = (μ1 + μ2)/2
5. With this new threshold, segregate the two groups again
and repeat the procedure: if |T − T_new| > ∆T, go back to step 2.
Otherwise stop.
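
The iteration can be sketched as follows. Using the image mean as the initial threshold and returning early when one group is empty are assumptions added to make the sketch self-contained; `delta` plays the role of ∆T.

```python
import numpy as np

def iterative_threshold(img, delta=0.5):
    """Automatic global threshold by repeatedly averaging the two group means."""
    t = float(img.mean())                       # step 1: initial threshold (assumed)
    while True:
        g1 = img[img > t]                       # step 2: split into two groups
        g2 = img[img <= t]
        if g1.size == 0 or g2.size == 0:        # degenerate case: constant image
            return t
        t_new = 0.5 * (g1.mean() + g2.mean())   # steps 3 and 4
        if abs(t - t_new) <= delta:             # step 5: stop once the change is small
            return float(t_new)
        t = t_new
```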

Basic adaptive thresholding:
– Divide the image into sub-images and use local thresholds.
In case of non-uniform illumination, finding a global
threshold that is applicable over the entire image is very
difficult.
So, if the scene illumination is non-uniform, a global
threshold is not going to give a good result. Instead,
we subdivide the image into a
number of sub-regions, find a threshold
value for each sub-region, and segment that sub-region
using this estimated threshold value. Because
the threshold value is position dependent (it
depends on the location of the sub-region), this kind of
thresholding is called
adaptive thresholding.

Basic Global and Local Thresholding
Simple thresholding schemes compare each pixel's gray level
with a single global threshold. This is referred to as global
thresholding.

If T depends on both f(x,y) and a local property p(x,y), then this is referred to
as local thresholding.

Adaptive thresholding (local thresholding)
Adaptive thresholding:
– Divide the image into sub-images and use local thresholds.
Criteria based on local properties (e.g., statistics) can be used for
adapting the threshold.

Statistically examine the intensity values of the local
neighborhood of each pixel. The statistic that is most
appropriate depends largely on the input image. Simple and
fast functions include the mean of the local intensity
distribution:
T = mean, T = median, T = (max + min)/2
You can simulate the effect with the following steps:
1. Convolve the image with a suitable statistical operator,
i.e. the mean or median.
2. Subtract the original from the convolved image.
3. Threshold the difference image with C.
4. Invert the thresholded image.
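
The four steps map onto a few NumPy lines. This sketch assumes the mean as the local statistic, a square window, and edge replication at the borders; none of these choices comes from the slides, and `adaptive_threshold` is a hypothetical name.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def adaptive_threshold(img, size=3, C=5.0):
    """Mean-based adaptive threshold; True marks pixels kept as foreground."""
    pad = size // 2
    padded = np.pad(img.astype(float), pad, mode="edge")
    # Step 1: local mean over a size x size window around every pixel.
    local_mean = sliding_window_view(padded, (size, size)).mean(axis=(-2, -1))
    diff = local_mean - img          # step 2: convolved image minus original
    dark = diff > C                  # step 3: threshold the difference with C
    return ~dark                     # step 4: invert
```

A pixel ends up False only when it is darker than its neighborhood mean by more than C, so slow illumination changes do not affect the result.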

OPTIMAL THRESHOLDING
Our aim in this case is to
determine a threshold T that minimizes the
average segmentation error.

The overall probability of error is given by:
E(T) = P2 E1(T) + P1 E2(T)
To minimize this error, set
∂E(T)/∂T = 0
Assuming Gaussian probability density
functions, the value of T is
given by the solution of the quadratic equation
AT² + BT + C = 0
A = σ1² − σ2²
B = 2(μ1 σ2² − μ2 σ1²)
C = (μ2² σ1² − μ1² σ2²) + 2 σ1² σ2² ln(σ2 P1 / σ1 P2)

If the variances are equal, σ² = σ1² = σ2², the optimal threshold is

T = (μ1 + μ2)/2 + [σ² / (μ1 − μ2)] ln(P2/P1)

If P1 and P2 are equal,
the value of T is simply (μ1 + μ2)/2, that is, the mean
of the average intensities of the foreground region and the
background region.
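
A short numerical sketch of the quadratic follows. The logarithm argument σ2 P1 / (σ1 P2) is the standard form and is assumed here because the slide's own expression is incomplete; with equal variances it reduces to the closed-form threshold above.

```python
import math

def optimal_threshold(mu1, mu2, sigma1, sigma2, p1, p2):
    """Real roots of A*T^2 + B*T + C = 0 for two Gaussian classes."""
    a = sigma1 ** 2 - sigma2 ** 2
    b = 2.0 * (mu1 * sigma2 ** 2 - mu2 * sigma1 ** 2)
    c = (sigma1 ** 2 * mu2 ** 2 - sigma2 ** 2 * mu1 ** 2
         + 2.0 * sigma1 ** 2 * sigma2 ** 2 * math.log(sigma2 * p1 / (sigma1 * p2)))
    if a == 0:                       # equal variances: the equation is linear
        return [-c / b]
    disc = b * b - 4.0 * a * c
    if disc < 0:                     # no real threshold exists
        return []
    root = math.sqrt(disc)
    return sorted([(-b - root) / (2.0 * a), (-b + root) / (2.0 * a)])
```

With unequal variances two roots can appear; the one lying between the two means is normally the useful threshold.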

Boundary characteristics for histogram thresholding
Use of boundary characteristics for histogram improvement
and local thresholding

Region growing:
Starting from a particular seed pixel, grow the region
based on connectivity (adjacency) and similarity.

Group pixels or sub-regions into larger regions:
– Start from a set of 'seed' pixels and append pixels with
similar properties
• Selection of similarity criteria: color,
descriptors (gray level + moments)
• Stopping rule

Basic formulation
– Every pixel must be in a region
– Points in a region must be connected
– Regions must be disjoint
– A logical predicate holds for each region and distinguishes between
regions
Region splitting & merging –
Quadtree decomposition

Let R denote the full image.
If all the pixels in the image are similar,
leave it as it is.
If they are not similar,
break the image into quadrants, i.e.
make 4 partitions of the image.
Then check whether each partition is similar.
If a partition is not similar, partition that particular region again.
Suppose that all the pixels in R are
not similar (say, the variance is large).
Now say R1 is not uniform,
so partition region R1 again into R10, R11, R12 and R13.
Go on partitioning until you reach
the smallest permissible partition size or
a situation where the partitions have become uniform
and cannot be partitioned any more.
In the process of doing this, we
obtain a quadtree representation of the image.

In the quadtree representation, with root node R, the initial
partition gives 4 nodes: R0, R1, R2 and R3. Then R1
gives R10, R11, R12 and R13. Once such
partitioning is completed, check
all the adjacent partitions to see if they are similar.
If they are similar, merge them together to form a bigger
segment. For example, if R12 and R13 are similar, merge them.

This is the concept of the splitting and merging technique for
segmentation.
At the end, stop when no more partitioning is possible, i.e. when
a minimum partition size has been reached or every partition has become
uniform;
then look for adjacent partitions that can be combined
to give a bigger segment.
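
The splitting half of the procedure can be sketched recursively. The variance test, the `max_var` and `min_size` parameters, and the leaf format (row, column, height, width) are assumptions for illustration; the merge pass over adjacent leaves is omitted.

```python
import numpy as np

def quadtree_split(img, r0=0, c0=0, min_size=2, max_var=1.0):
    """Return the leaves (row, col, height, width) of a quadtree decomposition."""
    h, w = img.shape
    # Stop when the block is uniform enough or cannot be split further.
    if img.var() <= max_var or h <= min_size or w <= min_size:
        return [(r0, c0, h, w)]
    hh, hw = h // 2, w // 2
    quadrants = (
        (0, 0, img[:hh, :hw]), (0, hw, img[:hh, hw:]),
        (hh, 0, img[hh:, :hw]), (hh, hw, img[hh:, hw:]),
    )
    leaves = []
    for dr, dc, block in quadrants:
        leaves += quadtree_split(block, r0 + dr, c0 + dc, min_size, max_var)
    return leaves
```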

UNIT-V

IMAGE COMPRESSION

Data redundancy is the central concept in image
compression and can be mathematically defined.
Data Redundancy
Because various amounts of data can be used to represent the
same amount of information, representations that contain
irrelevant or repeated information are said to contain redundant
data.
• If n1 and n2 denote the number of information-carrying units in two data sets
that represent the same information, the relative data redundancy RD of the first data set, n1, is
defined by:
RD = 1 − 1/CR
where CR is the compression ratio, defined by:
CR = n1/n2
Compression is also commonly reported in bits per pixel (bpp).

If n1 = n2, then CR = 1 and RD = 0, indicating that the first
representation of the information contains no redundant
data.
Coding Redundancy:

• Code: a list of symbols (letters, numbers, bits, bytes etc.)
• Code word: a sequence of symbols used to represent a piece
of information or an event (e.g., gray levels).
• Code word length: number of symbols in each code word

Ex: 101 is the binary code for 5; code length 3; symbols 0, 1

The gray level histogram of an image can be used in the
construction of codes to reduce the data used to represent it.
Given the normalized histogram of a gray level image,

p_r(r_k) = n_k / n,  k = 0, 1, ..., L−1

r_k is the pixel value defined in the interval [0,1] and p_r(r_k) is the
probability of occurrence of r_k. L is the number of gray levels, n_k is
the number of times that the kth gray level appears in the image, and n
is the total number of pixels (n = M x N).
An 8 gray level image has the following gray level distribution.

The average number of bits used for a fixed 3-bit code:

Inter pixel Redundancy or Spatial Redundancy

The gray level of a given pixel can be predicted from its neighbors, and
the difference is used to represent the image; this type of
transformation is called mapping.
Run-length coding can also be employed to exploit inter pixel
redundancy in image compression.
Removing inter pixel redundancy is lossless.

Irrelevant information
One of the simplest ways to compress a set of data is to remove
superfluous data. For images, information that is ignored by the human
visual system or is extraneous to the intended use of an image is an
obvious candidate for omission. The "gray" image, since it appears as
a homogeneous field of gray, can be represented by its average
intensity alone – a single 8-bit value. Therefore, the compression
would be

Psychovisual Redundancy (the eye can resolve only about 32 gray levels)
The eye does not respond with equal sensitivity to all visual
information. The method used to remove this type of redundancy is called
quantization, which means the mapping of a broad range of input values to
a limited number of output values.

Fidelity criteria
Fidelity criteria are used to measure information loss and
can be divided into two classes.
1) Objective fidelity criteria (a mathematical expression is used):
the amount of error in the reconstructed data is measured mathematically.
2) Subjective fidelity criteria: measured by human
observation.

Objective fidelity criteria:
When information loss can be expressed as a
mathematical function of the input and output of the
compression process, it is based on an objective fidelity
criterion. For instance, a root-mean-square (rms) error
between two images.
Let f(x,y) be an input image and f'(x,y) be an
approximation of f(x,y)
resulting from compressing and decompressing the input image.
The mean-square signal-to-noise ratio of the output image
is defined as

Subjective fidelity criteria:

• A decompressed image is presented to a cross section of
viewers and their evaluations are averaged.
• It can be done by using an absolute rating scale, or
• by means of side by side comparisons of f(x,y) and f'(x,y).
• Side by side comparison can be done with a scale such as
{-3, -2, -1, 0, 1, 2, 3}
to represent the subjective valuations
{much worse, worse, slightly worse, the same, slightly
better, better, much better} respectively. One possible absolute
rating scale: excellent, fine, average, poor.
The general compression model has three stages: mapper, quantizer and symbol encoder.
If error-free compression is needed, the quantizer is omitted.
In a predictive compression system, the quantizer and mapper are combined into a single block.

• Mapper: transforms the image into an array of coefficients, reducing inter pixel
redundancies. This is a reversible process and is not lossy. Run-length coding
is an example of mapping. In video applications, the mapper uses previous (and
future) frames to facilitate removal of temporal redundancy.

• Quantizer: reduces the accuracy of the mapper's output and hence the
psychovisual redundancies of a given image. This process is irreversible and
therefore lossy.

• Symbol encoder: removes coding redundancy by assigning the shortest codes
to the most frequently occurring output values.

Huffman Coding (source reduction):

Entropy of the source:

H = −[0.4 log2(0.4) + 0.3 log2(0.3) + 0.1 log2(0.1) + 0.1 log2(0.1) +
0.06 log2(0.06) + 0.04 log2(0.04)] = 2.14 bits/symbol
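
The entropy figure can be checked in a few lines, assuming base-2 logarithms so that the result is in bits per symbol.

```python
import math

def entropy(probs):
    """First-order entropy in bits per symbol."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

source = [0.4, 0.3, 0.1, 0.1, 0.06, 0.04]
print(round(entropy(source), 2))   # 2.14
```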
Huffman Coding: Note that the shortest codeword (1) is given
for the symbol/pixel with the highest probability (a2). The
longest codeword (01011) is given for the symbol/pixel with the
lowest probability (a5). The average length of the code is
given by:
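
The average length is L_avg = Σ l(a_k) p(a_k). As a sketch, the code lengths can be derived with a small Huffman construction. Only a2 (highest probability) and a5 (lowest) are identified in the text above; the assignment of the remaining probabilities to a1, a3, a4 and a6 is an assumption made for illustration.

```python
import heapq

def huffman_lengths(probs):
    """Code word length per symbol from repeated merging of the two least probable nodes."""
    # Heap items: (probability, unique tie-breaker, symbols in this subtree).
    heap = [(p, i, [s]) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(probs, 0)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:              # every merge adds one bit to these symbols
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, counter, s1 + s2))
        counter += 1
    return lengths

probs = {"a1": 0.1, "a2": 0.4, "a3": 0.06, "a4": 0.1, "a5": 0.04, "a6": 0.3}
lengths = huffman_lengths(probs)
avg_length = sum(probs[s] * lengths[s] for s in probs)
```

For these probabilities the average comes to about 2.2 bits/symbol, slightly above the 2.14 bits/symbol entropy. Individual code lengths can differ with tie-breaking, but the average does not.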

Transform Coding (lossy image compression)
In digital images the spatial frequencies are important as they correspond
to important image features. High frequencies are a less important part of
the images. This method uses a reversible transform (e.g. Fourier or
cosine transform) to map the image into a set of transform coefficients, which are
then quantized and coded.

Transform selection: the system is based on discrete 2D transforms. The
choice of a transform in a given application depends on the amount of
reconstruction error that can be tolerated and the computational resources
available.
• Consider an N x N subimage f(x,y), where the forward discrete transform
T(u,v) is given by:

General scheme • For u, v = 0, 1, 2, ..., N-1

• The transform has to decorrelate
the pixels, or
compact as much information as
possible into the smallest number of transform coefficients.
• The quantization selectively eliminates or more coarsely quantizes the less
informative coefficients.
A Transform Coding System

DCT

α(u) = √(1/N) if u = 0
α(u) = √(2/N) if u = 1, 2, ..., N-1
Same for α(v)

Inverse DCT

JPEG Standard
JPEG exploits spatial redundancy.

The objective of image compression standards is to enhance the
interoperability and compatibility among compression
systems from different vendors.

JPEG corresponds to the ISO/IEC international standard 10918-1,
digital compression and coding of continuous-tone still images.
JPEG uses the DCT.

JPEG became a Draft Standard in 1991 and an
International Standard (IS) in 1992.

JPEG Standard
The JPEG standard provides different modes, such as sequential, progressive and
hierarchical, and options such as lossy and lossless coding.
JPEG supports the following modes of encoding:

Sequential: the image is encoded in the order in which it is
scanned. Each image component is encoded in a single left-to-right,
top-to-bottom scan.

Progressive: the image is encoded in multiple passes (useful for web
browsers). The DCT coefficients are grouped into several spectral
bands; the low-frequency DCT coefficients are sent first and
the higher-frequency DCT coefficients next.

Hierarchical: the image is encoded at multiple resolutions to
accommodate different types of displays.
JPEG (Joint Photographic Experts Group)
Applications: color FAX, digital still camera, multimedia
computer, internet
The JPEG standard consists of
Algorithm: DCT + quantization + variable length coding

Steps in JPEG compression
• Divide the image into 8 x 8 blocks.
• Apply the DCT: transform the pixel information from the spatial
domain to the frequency domain with the Discrete Cosine
Transform.
• Divide each value in the spectrum by the matching value in
the quantization table, and round the result to the nearest
integer.
• Convert the modified spectrum from an 8 x 8 array into a linear
sequence by reading the resulting coefficients in zigzag order.
• Run-length encode the coefficients ordered in
this manner.
• Follow with Huffman coding.
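
The DCT, quantization and zigzag steps for a single block can be sketched as below. The flat quantization table in the usage lines is hypothetical (real JPEG tables vary per frequency), the level shift by 128 is an assumption taken from common JPEG practice, and the run-length and Huffman stages are left out.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix C, so that T = C @ block @ C.T."""
    x = np.arange(n)
    C = np.sqrt(2.0 / n) * np.cos((2 * x[None, :] + 1) * x[:, None] * np.pi / (2 * n))
    C[0, :] = np.sqrt(1.0 / n)         # alpha(0) differs from alpha(u > 0)
    return C

def zigzag_indices(n=8):
    """(row, col) pairs in zigzag order, starting at the DC coefficient."""
    def key(rc):
        s = rc[0] + rc[1]              # index of the anti-diagonal
        return (s, rc[0] if s % 2 else -rc[0])
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)

def encode_block(block, qtable):
    """DCT, quantize and zigzag-scan one square block."""
    n = block.shape[0]
    C = dct_matrix(n)
    coeffs = C @ (block - 128.0) @ C.T               # level shift, then 2-D DCT
    quantized = np.round(coeffs / qtable).astype(int)
    return [int(quantized[r, c]) for r, c in zigzag_indices(n)]

# Usage with a flat, hypothetical quantization table.
block = np.full((8, 8), 136.0)
sequence = encode_block(block, np.full((8, 8), 16.0))
```

A constant block produces a single non-zero DC value followed by 63 zeros, which is exactly the kind of sequence run-length coding compresses well.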
The coefficient with zero frequency is called the DC coefficient,
and the remaining 63 coefficients are the AC coefficients.
For JPEG decoding, the reverse process is applied.

Applications of JPEG-2000 and their requirements

• Internet
• Color facsimile
• Printing
• Scanning
• Digital photography
• Remote Sensing
• Mobile
• Medical imagery
• Digital libraries and archives
• E-commerce

Each application area has some requirements which the standard should fulfill.
Improved low bit-rate performance: it should give acceptable quality
below 0.25 bpp. Networked image delivery and remote sensing
applications have this requirement.

Progressive transmission: the standard should allow progressive
transmission that allows images to be reconstructed with
increasing pixel accuracy and resolution.
Region of interest coding: it should preferentially allocate more bits to
the regions of interest (ROIs) as compared to the non-ROI ones.

Content based description: finding the desired image in a large
archive of images is a challenging task. This has applications in medical
images, forensics, digital libraries etc. These issues are being addressed
by MPEG-7.
• Image security: digital images can be protected using
watermarking, labeling, stamping, encryption etc.
2D DWT for image compression:
2D discrete wavelet transform → quantization → entropy coding

The 2D discrete wavelet transform
(1D DWT applied alternately in the
horizontal and vertical directions,
line by line) converts images into
"sub-bands". The upper left is the DC
coefficient;
the lower right are the higher frequency
sub-bands.

Image decomposition, scale 1
4 subbands: LL1, HL1, LH1, HH1

LL1 HL1
LH1 HH1

Image decomposition, scale 2
4 subbands: LL2, HL2, LH2, HH2 (in place of LL1)

LL2 HL2 | HL1
LH2 HH2 |
LH1     | HH1

Image Decomposition

Post-Compression Rate-Distortion (PCRD)

Embedded Block Coding with Optimized Truncation of the bit-stream
(EBCOT) can be applied to wavelet packets and
offers both resolution scalability and SNR scalability.

Each sub-band is partitioned into small non-overlapping blocks of
samples, known as code blocks. EBCOT generates an
embedded bit-stream for each code block. The bit-stream
associated with each code block may be truncated to any of a
collection of rate-distortion optimized truncation points.

Steps in JPEG2000
Tiling:
The image is split into tiles, which are smaller non-overlapping rectangular
regions of the image. Tiles can be any size.
Dividing the image into tiles is advantageous in that the decoder
needs less memory to decode the image and can opt to decode only
selected tiles to achieve a partial decoding of the image.

Wavelet transform: either the CDF 9/7 or the CDF 5/3 bi-orthogonal wavelet
transform.

Quantization: scalar quantization

All operations, such as component mixing, DWT,
quantization and
entropy coding, are done independently for each tile.

MPEG-1 (MOVING PICTURE EXPERTS GROUP)
MPEG exploits temporal redundancy. It is prediction based.

Each frame of a sequence is compared with its predecessor, and only pixels that
have changed are updated.
The MPEG-1 standard is for storing and retrieving video information on digital
storage media.
The MPEG-2 standard supports digital video broadcasting and HDTV systems.
H.261 is a standard for telecommunication applications.

MPEG-1 COMPRESSION ALGORITHM: MPEG digital video technology

Temporal compression algorithm: relies on the
similarity between successive pictures, using prediction with motion compensation.
Spatial compression algorithm: relies on redundancy within small areas of a
picture and is based on the DCT, quantization and entropy coding
techniques.
MPEG-1 was designed for up to 1.5 Mbit/s. MPEG-2 typically runs over 4 Mbit/s but can be up to
80 Mbit/s.
MPEG-1 (ISO/IEC 11172), MPEG-2 (ISO/IEC 13818), MPEG-4 (ISO/IEC 14496)