Introduction To Image Processing

Image analysis: This involves manipulating the image data to determine exactly the information necessary to help in solving a computer imaging problem. Image analysis is primarily a data reduction process, used to achieve a low-order amount of image data. It is used in both computer vision and image processing applications, and is typically part of a larger process. It may answer some questions, such as:
1) do we need color information?
2) do we need some transformations to the frequency domain (FFT, DCT and WLT)?
3) do we need segmentation to find object information?
4) what are the important features in the image?

In image processing applications, image analysis methods can be used to determine the degradation function for an image restoration procedure, to develop an enhancement algorithm, or to determine exactly what information is visually important for an image compression method. These are all image analysis tasks.

System model:
The image analysis process can be broken into three primary stages:

input image → Preprocessing → Data Reduction → Feature Analysis

First stage - Preprocessing: It is used to remove noise and to eliminate visually unnecessary information (such as noise or blurring).
Preprocessing is also used if we want to investigate more closely a specific portion within the image, called a region of interest (ROI). To do this we need some image geometry operations. These operations include crop, zoom, enlarge, shrink, translate, and rotate.
1) Cropping Process: It is the process of selecting a small portion of the image (a sub-image) and cutting it away from the rest of the image. AND and OR operations can be used for such purposes. A white square ANDed with an image will allow only the image portion within that square to appear in the output, with the background turned black. A black square ORed with an image will allow only that portion of the image corresponding to the square to appear in the output, with the rest of the image being white. This process is called image masking.

2) Zooming Process: After we have cropped a sub-image from the original one, we can zoom in on it by enlarging it. The zooming process is typically performed using one of two methods:
i) Zero-order hold: This is done by repeating previous pixel values, or equivalently by convolution with the hold matrix (convolution mask), thus creating a "blocky effect" while extending the image size. The convolution result is put in the pixel location corresponding to the upper-left corner of the mask.
Ex: The image
I(r,c) =
3 2 4
5 3 6
9 4 2
is a 3*3 sized array. Extend it to a 6*6 sized array.

Iext(r,c) =
3 3 2 2 4 4
3 3 2 2 4 4
5 5 3 3 6 6
5 5 3 3 6 6
9 9 4 4 2 2
9 9 4 4 2 2

(blocky effect)
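To make the procedure concrete, here is a minimal zero-order hold sketch in Python/NumPy. This is an illustration, not part of the original notes, and the function name is mine:

import numpy as np

def zero_order_hold(image, factor=2):
    """Enlarge an image by repeating each pixel `factor` times
    along both rows and columns (the 'blocky effect')."""
    image = np.asarray(image)
    out = np.repeat(image, factor, axis=0)   # repeat rows
    out = np.repeat(out, factor, axis=1)     # repeat columns
    return out

I = np.array([[3, 2, 4],
              [5, 3, 6],
              [9, 4, 2]])
print(zero_order_hold(I))   # reproduces the 6*6 Iext above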
ii) First-order hold: This is performed by finding linear interpolation between adjacent pixels, i.e., convolving the image with a mask which slides across the extended image and performs a simple arithmetic operation at each pixel. The convolution process requires us to overlay the mask on the image, multiply the coincident values, and sum all these results; this is equivalent to finding the vector inner product of the mask with the underlying sub-image. The existing image values should not be changed while sliding the mask over by one pixel and repeating the process. Note that each interpolated value is the average of the two existing neighbors. This process continues until we get to the end of the row, each time placing the convolution result in the location corresponding to the mask center. When the end of the row is reached, the mask is moved down one row and the process is repeated until it covers the entire image. If the image array is I(r,c) and the convolution mask is M(r,c), then the convolution is given by:

I'(r,c) = Σx Σy I(r − x, c − y) M(x, y)

where the sums are taken over all mask coordinates (x, y).
Ex: Applying the first-order hold to the 3*3 image

I(r,c) =
4 8 4
8 4 8
2 8 2

gives the 5*5 result:

Iext(r,c) =
4 6 8 6 4
6 6 6 6 6
8 6 4 6 8
5 5.5 6 5.5 5
2 5 8 5 2
The above method allows us to enlarge an N*N sized image to a size of (2N−1)*(2N−1), and the process can be repeated as desired.
Note that this method will only enlarge an image to (2N−1) per side. To enlarge it by another factor K, we take two adjacent values and linearly interpolate more than one value between them (a code sketch follows the example below):
1) define an enlargement factor K.
2) subtract the two adjacent values.
3) divide the result by K.
4) add that result to the smaller value, and keep adding it till finding all the (K−1) intermediate pixel values.
5) Do this for every pair of adjacent pixels, first along the rows and then along
the columns.
The above process will allow us to enlarge the image to any size of the form [K(N−1)+1], where K is an integer and N*N is the image size. Since N is usually very large and K << N, the enlargement factor can be approximated as [KN].
Ex: If an image is enlarged to three times its original size, what are the interpolated values between the two adjacent values 125 and 140?
Sol: 140 − 125 = 15 is the difference between the two adjacent values.
15/K = 15/3 = 5 is the new difference between interpolated values. Thus, the values are: 125, 130, 135, 140 (the K − 1 = 2 interpolated values are 130 and 135).
Shrinking process:
It is the opposite of the enlarging process, and is used to reduce the amount of data that needs to be processed. This is usually done by quantization: the process of reducing the image data by removing some of the detail information and mapping groups of data points to a single point. The quantization can be applied to:
(a) The pixel values themselves, I(r,c); this method is referred to as gray-level reduction; or to
(b) The spatial coordinates (r,c); this is called spatial reduction.
(a) Gray-level reduction: This process is accomplished by the following methods:
(i) Thresholding (Th): we first specify a threshold Th; pixels above Th are set to 1, and those below Th are set to 0. The result is a binary image. This is a preprocessing step in the extraction of object features such as shape, area and perimeter.
(ii) Reducing the number of bits per pixel: This is efficiently done by masking the lower bits via an AND operation; the number of bits that remain determines the number of gray levels available.

Ex: We want to reduce 8-bit information (containing 256 possible gray-level values) down to 32 possible values.
Sol: Each 8-bit value is ANDed with the string [11111000]. This is equivalent to dividing by 8 = 2³ (a right shift of three bits) and then multiplying by 2³ (a left shift of three bits). The gray-level ranges are mapped to quantized values as follows:
0 - 7   --------> 0
8 - 15  --------> 8
16 - 23 --------> 16
...
Alternatively, ORing each 8-bit value with the string [00000111] maps each range to the top of the range instead:
0 - 7   --------> 7
8 - 15  --------> 15
16 - 23 --------> 23
...
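A minimal NumPy sketch of the thresholding and bit-masking reductions described above (the array and variable names are mine):

import numpy as np

img = np.arange(256, dtype=np.uint8)   # stand-in for an 8-bit image

# (i) thresholding -> binary image
Th = 128
binary = (img > Th).astype(np.uint8)

# (ii) reduce 256 gray levels to 32 by masking the lower three bits
reduced_and = img & 0b11111000   # maps 0-7 -> 0, 8-15 -> 8, ...
reduced_or  = img | 0b00000111   # maps 0-7 -> 7, 8-15 -> 15, ...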
(b) Spatial reduction: This is done by quantization of the spatial coordinates, which reduces the actual size of the image. It is accomplished by taking a group of spatially adjacent pixels (a matrix) and mapping them to one pixel. This can be done in one of three ways:
(i) Decimation (or sub-sampling): simply done by eliminating some of the data.
(ii) Averaging: we take all the pixels in each group and find the average gray-level by summing their values and dividing by the no. of pixels.
(iii) Median: we sort all pixel values from lowest to highest and then select a middle value.
Ex: Reduce the following 6*6 image to a 2*2 image using the three methods above.

3 2 4 5 7 6
6 5 4 2 3 1
1 3 5 7 9 2
2 5 7 3 4 6
2 6 8 9 1 0
3 5 6 8 7 9

Sol: 1- By decimation; take every third row and column (the upper-left pixel of each 3*3 group), which results in the 2*2 sized image:

3 5
2 3

Note: An anti-aliasing pre-filter (an averaging/mean filter) is applied as a preprocessing step to improve the image quality when using decimation.

2- By averaging; each 3*3 group is replaced by its average gray-level:

3.667 4.667
4.889 5.222

3- By median; each 3*3 group is replaced by its median:

4 5
5 6
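All three reduction methods can be sketched with a single block-reshaping helper in NumPy. The helper is mine; it reproduces the 2*2 results of the example above:

import numpy as np

def spatial_reduce(image, block=3, how="mean"):
    """Map each block*block group of pixels to a single pixel by
    decimation (upper-left pixel), averaging, or the median."""
    a = np.asarray(image, dtype=float)
    n, m = a.shape[0] // block, a.shape[1] // block
    blocks = a[:n * block, :m * block].reshape(n, block, m, block)
    if how == "decimate":
        return blocks[:, 0, :, 0]                 # one pixel per group
    if how == "mean":
        return blocks.mean(axis=(1, 3))           # block average
    if how == "median":
        return np.median(blocks, axis=(1, 3))     # block median
    raise ValueError(how)

I = np.array([[3,2,4,5,7,6],[6,5,4,2,3,1],[1,3,5,7,9,2],
              [2,5,7,3,4,6],[2,6,8,9,1,0],[3,5,6,8,7,9]])
print(spatial_reduce(I, 3, "decimate"))  # [[3 5] [2 3]]
print(spatial_reduce(I, 3, "mean"))      # [[3.667 4.667] [4.889 5.222]]
print(spatial_reduce(I, 3, "median"))    # [[4 5] [5 6]]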
Translation process:
A translation moves the original image I(r,c) to a new location according to:
r' = r + r0
c' = c + c0
where (r',c') are the new coordinates, (r,c) are the original coordinates, and (r0, c0) are the translations of the image.
Rotation process:
A rotation by an angle θ maps the original coordinates (r,c) to:
r̂ = r cos θ + c sin θ , and ĉ = −r sin θ + c cos θ
Combining translation with rotation gives:
r'' = (r + r0) cos θ + (c + c0) sin θ , and c'' = −(r + r0) sin θ + (c + c0) cos θ
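A short sketch of the combined transform, operating on coordinates rather than pixel data (the helper name is mine):

import numpy as np

def translate_rotate(r, c, theta, r0=0, c0=0):
    """Translate a pixel coordinate by (r0, c0), then rotate it by
    theta (radians) about the origin, per the equations above."""
    r, c = r + r0, c + c0
    r_new = r * np.cos(theta) + c * np.sin(theta)
    c_new = -r * np.sin(theta) + c * np.cos(theta)
    return r_new, c_new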
Addition operations: They are used to combine the information of two images.
Example:

2 4 7   4 6 2   6 10 9
3 5 1 + 5 3 6 = 8 8 7
2 6 8   1 2 4   3 8 12

Addition of an image I(r,c) with noise N(r,c) will result in a corrupted image of the form I(r,c) + N(r,c).

Subtraction operations: They can be used to detect motion by subtracting two frames of a sequence.
Example:

Original image =
3 2 4
5 2 6
7 8 3

Image with motion =
3 2 4
5 6 6
7 8 3

3 2 4   3 2 4   0 0 0
5 6 6 − 5 2 6 = 0 4 0   (blank image with the motion location)
7 8 3   7 8 3   0 0 0
Multiplication and division operations: They are used to adjust the brightness of the image, e.g., multiplying I(r,c) by a constant S to obtain S·I(r,c).
AND and OR operations: They are used to combine the information of two images, for finding a region of interest (ROI), as in masking, or for quantization.
NOT operation: It creates the negative of the original image by inverting each bit within each pixel value: (1 → 0), (0 → 1).
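A NumPy sketch of the AND, OR and NOT operations on 8-bit images (the arrays and the 8*8 toy image are mine):

import numpy as np

img  = np.random.randint(0, 256, (8, 8), dtype=np.uint8)  # toy image
mask = np.zeros_like(img)
mask[2:6, 2:6] = 255           # white square over the ROI

roi      = img & mask          # AND: keeps the ROI, background black
rest     = img | ~mask         # OR with the black square: rest white
negative = ~img                # NOT: inverts every bit (image negative)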
Spatial Filters:
They are used typically for noise removal or for image enhancement. Three types will be discussed
here
1. Mean Filter: It adds a softer look to the image and removes noise. It is a linear filter applied with a convolution mask.
a- If the coefficients of the mask sum to one, the average brightness of the original image will be retained.
Example: For a 3*3 mask, all coefficients are 1/9:

1/9 1/9 1/9
1/9 1/9 1/9
1/9 1/9 1/9

It is an averaging filter: it operates on a local 3*3 group of pixels, called a neighborhood, and replaces the center pixel with the average of the pixels in the neighborhood. The result retains the original brightness but with blurring, because all elements are positive.
b- If the coefficients of the mask sum to zero, the average brightness will be lost, as with the mask:

 1/2   1  -1/2
  -1   0    1
 1/2  -1  -1/2
2. Median Filter: It is used for noise removal. It is a non-linear filter: the center pixel of a neighborhood is replaced by the median of the pixel values in that neighborhood.
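A sketch of the mean and median filters using SciPy's ndimage module (assuming SciPy is available; variable names are mine):

import numpy as np
from scipy import ndimage

img = np.random.randint(0, 256, (64, 64)).astype(float)  # toy image

mean_mask = np.full((3, 3), 1 / 9)               # coefficients sum to one
mean_out  = ndimage.convolve(img, mean_mask)     # softer look, blurring

median_out = ndimage.median_filter(img, size=3)  # non-linear; good for
                                                 # salt-and-pepper noise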
3. Enhancement Filters:
a- The Laplacian-type filters, for example:

 0 -1  0        1 -2  1
-1  5 -1   and -2  5 -2
 0 -1  0        1 -2  1

The Laplacian-type filters will enhance the details (enhance the high frequencies) in all directions.
b- The difference filters: They will enhance the details in a specific direction, depending on the mask selected, for example:

 0  1  0      0  0  0      0  0  1
 0 -1  0      1  0  0     -1  0  0
 0  0  0      0  0 -1      0  0  0
Questions:
(Q.1) Enlarge the following 3*3 sized image to 9*9 sized image using two typical methods. Discuss
the effect of such enlarging operations on the resulting images in both methods.
3 2 1
I(r,c) = 5 3 2
4 7 9
(Q.2) Specify the 8-bit OR string used to reduce "256" gray-levels down to "16" gray-levels. Give the mapping process.
(Q.3) Perform the quantization of "256" gray-levels down to "16" levels and shift the values down to the middle of the range. Specify the ORing and ANDing strings used to do such a job.
(Q.4) Define the arithmetic mean filter and the geometric mean filter, giving their properties of operation.
Edge/Line Detection
Many of the edge and line detection operators are implemented with convolution
masks, and most are based on discrete approximations to differential operators. A large
change in Image brightness over a short spatial distance indicates the presence of an
edge.
Edge detection operators are based on the idea that edge information in an image is
found by looking at the relationship a pixel has with its neighbors. If a pixel's gray-level
value is similar to those around it, there is probably not an edge at that point. However, if a pixel has neighbors with widely varying gray levels, it may represent an edge point. Ideally, an edge separates two distinct objects. In practice, apparent edges are caused by (1) changes in color, (2) changes in texture, or (3) the specific lighting conditions present during the image acquisition process.
Let us first consider a one-dimensional gray-level function f(x) (along a row). If the function f(x) is differentiated, the first derivative, ∂f/∂x, measures the gradient of the transition, and would be a positive-going or negative-going spike at a transition. The direction of the intensity transition (either increasing or decreasing) can be identified by the polarity of the first derivative.
For a two-dimensional function f(x, y), the direction of the gradient is given by:

φ(x, y) = tan⁻¹[ (∂f/∂y) / (∂f/∂x) ]
Sobel Operator
The Sobel edge detection masks look for edges in both the horizontal and vertical
directions and then combine this information into a single metric. The masks are as
follows:
Row Mask:
−1 −2 −1
 0  0  0
 1  2  1

Column Mask:
−1 0 1
−2 0 2
−1 0 1

Labeling the 3*3 neighborhood of a pixel as
a1 a2 a3
a4 a5 a6
a7 a8 a9
the derivatives are approximated by
∂f/∂r ≈ S1 = (a7 + 2a8 + a9) − (a1 + 2a2 + a3)   (row mask)
∂f/∂c ≈ S2 = (a3 + 2a6 + a9) − (a1 + 2a4 + a7)   (column mask)
These masks are each convolved with the image. At each pixel location, we now
have two numbers: S1, corresponding to the result from the row mask, and S2, from the column
mask. We use these numbers to compute two metrics, the edge magnitude and the
edge direction, which are defined as follows:
Edge Magnitude = √(S1² + S2²)   and   Edge Direction = tan⁻¹(S1 / S2)
The edge direction is perpendicular to the edge itself because the direction
specified is the direction of the gradient, along which the gray levels are changing.
Operators implementing first derivatives will tend to enhance noise. However, the Sobel operator has a smoothing action.
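A sketch of the Sobel computation in Python (the function is mine; the Prewitt operator below differs only in the mask weights):

import numpy as np
from scipy import ndimage

def sobel_edges(img):
    """Convolve with the Sobel row and column masks, then combine
    the results into edge magnitude and edge direction."""
    row_mask = np.array([[-1, -2, -1],
                         [ 0,  0,  0],
                         [ 1,  2,  1]], dtype=float)
    col_mask = row_mask.T                   # transpose gives the column mask
    s1 = ndimage.convolve(img.astype(float), row_mask)   # S1
    s2 = ndimage.convolve(img.astype(float), col_mask)   # S2
    magnitude = np.hypot(s1, s2)            # sqrt(S1^2 + S2^2)
    direction = np.arctan2(s1, s2)          # tan^-1(S1 / S2)
    return magnitude, direction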
Prewitt Operator
The Prewitt is similar to the Sobel, but with different mask coefficients. The masks
are defined as follows:
Row Mask:
−1 −1 −1
 0  0  0
 1  1  1

Column Mask:
−1 0 1
−1 0 1
−1 0 1

Derivatives are approximated (with the same neighborhood labeling as above) by
∂f/∂r ≈ P1 = (a7 + a8 + a9) − (a1 + a2 + a3)
∂f/∂c ≈ P2 = (a3 + a6 + a9) − (a1 + a4 + a7)
These masks are each convolved with the image. At each pixel location we find two numbers: P1, corresponding to the result from the row mask, and P2, from the column mask. We use these results to determine two metrics, the edge magnitude and the edge direction, which are defined as follows:

Edge Magnitude = √(P1² + P2²)   and   Edge Direction = tan⁻¹(P1 / P2)
As with the Sobel edge detector, the direction lies 90° from the apparent direction
of the edge.
Laplace Operator
The Laplace second-order derivative is defined as

∇²f = ∂²f/∂x² + ∂²f/∂y²

and is approximated (with the neighborhood labeling above) by

∇²f = 4a5 − (a2 + a4 + a6 + a8)

which can be obtained with the following single mask:

 0 −1  0
−1  4 −1
 0 −1  0
Hough Transform
If the edge points found by the above edge detection methods are sparse, the resulting edge image may consist of individual points rather than straight lines or curves. Thus, in order to establish a boundary between the regions, it might be necessary to fit a line
to those points. This can be a time consuming and computationally inefficient process,
especially if there are many such edge points. One way of finding such boundary lines is
by use of the Hough Transform, which is useful for detecting straight lines in images.
The Hough transform (Hough is pronounced Huff) is designed to find lines in images, but
it can be easily varied to find other shapes. The idea is simple. Suppose (x , y) is a point
in the image (which we shall assume to be binary). We can write y = a x + b , and
consider all pairs (a ,b) which satisfy this equation, and plot them into an "accumulator
array" or "accumulator matrix" . The (a ,b) array is the "transform array".
Each point in the image is mapped onto a line in the transform. The points in the
transform corresponding to the greatest number of intersections correspond to the
strongest line in the image. For example, suppose we consider an image with five
points: (1,0), (1,1), (2,1), (4,1) and (3,2). Each point (x , y) corresponds to the line b = −x a + y in the transform, as follows:
(1,0): b = −a
(1,1): b = −a + 1
(2,1): b = −2a + 1
(4,1): b = −4a + 1
(3,2): b = −3a + 2
Each of these lines appears in the transform as shown in the following figure:
The dots in the transform indicate places where there are maximum intersections of lines: at each dot three lines intersect. The coordinates of these dots are (a ,b) = (0,1) and (a ,b) = (1,−1). These values correspond to the lines
y = 1 and y = x − 1.
These two lines are shown on the image in the following figure:
These are indeed the "strongest" lines in the image in that they contain the greatest
number of points.
There is a problem with this implementation of the Hough transform: it can't find vertical lines. We can't express a vertical line in the form y = mx + c, as m represents the gradient, and a vertical line has infinite gradient. We need another parameterization of lines.
Clearly any line can be described in terms of the two parameters r and θ, where r is the perpendicular distance from the line to the origin, and θ is the angle of the line's perpendicular to the x-axis. In this parameterization, vertical lines are simply those which have θ = 0. If we allow r to have negative values, we can restrict θ to the range −π/2 ≤ θ ≤ π/2. But since the gradient of the line's perpendicular is tan θ, the gradient of the line itself must be m = −1/tan θ. Putting these two expressions for the gradient together produces:

m = −cos θ / sin θ
Given this parameterization, we need to be able to find the equation of the line. First
note that the point (p , q) where the perpendicular to the line meets the line is (p , q) =
(r cos θ, r sin θ). Also note that the gradient of the perpendicular is tan θ = sin θ / cos θ.
Now let (x , y) be any point on the line. The gradient of the line is

m = (y − q) / (x − p) = (y − r sin θ) / (x − r cos θ)

Setting this equal to m = −cos θ / sin θ and multiplying out the fractions, we obtain:

x cos θ + y sin θ = r
To implement the Hough transform, we choose a discrete set of values of r and θ to use. For each pixel (x , y) in the image, we compute x cos θ + y sin θ for each value of θ, and place the result in the appropriate position in the (r , θ) array. At the end, the values of (r , θ) with the highest values in the array will correspond to the strongest lines in the image.
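A minimal accumulator sketch in NumPy (the function and its parameters are mine, assuming a binary edge image as input):

import numpy as np

def hough_lines(binary, n_theta=180):
    """Accumulate r = x*cos(theta) + y*sin(theta) over a discrete
    set of angles for every nonzero (edge) pixel."""
    ys, xs = np.nonzero(binary)
    thetas = np.linspace(-np.pi / 2, np.pi / 2, n_theta)
    r_max = int(np.ceil(np.hypot(*binary.shape)))     # largest possible |r|
    acc = np.zeros((2 * r_max + 1, n_theta), dtype=int)
    cols = np.arange(n_theta)
    for x, y in zip(xs, ys):
        r = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        acc[r + r_max, cols] += 1     # offset rows so negative r fits
    # the (r, theta) cells with the highest counts are the strongest lines
    return acc, thetas

The strongest line can then be read off with np.unravel_index(acc.argmax(), acc.shape), which gives the winning (r, θ) cell.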
An example will clarify this: consider the small image shown in the following figure:
The accumulator array (matrix) contains the number of times each value of (r , θ) appears in the above table (the number of repetitions). In practice this matrix will be very large, and can be displayed as an image.
Hough transform. (a) original “Tower” image; (b) edge map; (c) lines detected by Hough Transform
The implementation of the HT for lines, HT Line, is given in the code below. It is important to observe that the equation c = −xm + y is not suitable for direct implementation, since the parameters can take an infinite range of values. In order to handle the infinite range for c, we use two arrays in the implementation. When the slope m is between −45° and 45°, c does not take a large value; for other values of m, the intercept c can take a very large value. Thus, we consider an accumulator for each case. In the second case, we use an array that stores the intercept with the x-axis. This only solves the problem partially, since we cannot guarantee that the value of c will be small when the slope m is between −45° and 45°.
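Since the original HT Line listing is not reproduced here, the following is a sketch of the two-accumulator idea just described: one accumulator holds the y-intercept for gentle slopes, the other the x-intercept for steep slopes. All names are mine, assuming a binary edge image:

import numpy as np

def ht_line_two_acc(binary, n_m=64):
    """Hough transform in (m, c) form with two accumulators:
    acc_y holds the y-intercept c = y - m*x for slopes |m| <= 1,
    acc_x holds the x-intercept c = x - k*y, with k = 1/m, for |m| > 1."""
    ys, xs = np.nonzero(binary)
    h, w = binary.shape
    ms = np.linspace(-1.0, 1.0, n_m)   # slopes between -45 and 45 degrees
    span = h + w                       # intercepts outside [0, span) dropped
    acc_y = np.zeros((n_m, span), dtype=int)
    acc_x = np.zeros((n_m, span), dtype=int)
    for x, y in zip(xs, ys):
        c1 = np.round(y - ms * x).astype(int)   # y = m*x + c
        c2 = np.round(x - ms * y).astype(int)   # x = k*y + c'
        for i in range(n_m):
            if 0 <= c1[i] < span:
                acc_y[i, c1[i]] += 1
            if 0 <= c2[i] < span:
                acc_x[i, c2[i]] += 1
    return acc_y, acc_x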
Exercises:
1. Enter the following sub-image matrix f(x,y) into Matlab, and use imfilter to apply each of the Roberts, Prewitt, Sobel, and Laplacian edge-finding methods to the sub-image. For the two-mask methods (Roberts, Prewitt, Sobel), apply each of the row and column filters separately, and then join the results. Apply thresholding if necessary to obtain a binary sub-image showing only the edges. Which method seems to produce the best results?
Hint:
f(r, c) = [sub-image matrix; the values are illegible in this copy]
2. Apply each of the Roberts, Prewitt, Sobel, and Laplacian edge-finding methods and find the resulting edges.
3. Use the Hough transform to detect the strongest line in the binary image shown below. Use the form r = x cos θ + y sin θ, with θ in discrete steps over its range, and place the results in an accumulator array.