CV Module 1 | PDF | Cartesian Coordinate System | Computer Vision

CV Module 1

Uploaded by

Yogesh Garg

COMPUTER VISION

Course Objectives

01 Recognize and describe both the theoretical and practical aspects of computing with images. Connect issues from Computer Vision to Human Vision.
02 Describe the foundation of image formation and image analysis. Understand the basics of 3D Computer Vision.
03 Become familiar with the major technical approaches involved in computer vision. Describe various methods used for registration, alignment, and matching in images.
04 Get an exposure to advanced concepts leading to object and scene categorization from images.
05 Build computer vision applications.
Course Outcomes
• To implement fundamental image processing techniques required for computer vision.
• Understand the image formation process and generate 3D models from images.
• Extract features from images and do analysis of images.
• To develop applications using computer vision techniques.
• Understand video processing, motion computation and 3D vision and geometry.
Module 1
Digital Image Formation And Low Level Processing:
• Overview and State-of-the-art
• Fundamentals of Image Formation
• Transformation
• Orthogonal, Euclidean, Affine, Projective
• Fourier Transform
• Convolution and Filtering
• Image Enhancement
• Restoration
• Histogram Processing
Module 2
Depth Estimation And Multi-Camera Views:
• Perspective
• Binocular Stereopsis
• Camera and Epipolar Geometry
• Homography
• Rectification
• DLT
• RANSAC
• 3-D reconstruction framework
• Auto-calibration
Module 3
Feature Extraction And Image Segmentation:
Feature Extraction:
• Edges - Canny, LOG, DOG
• Line detectors (Hough Transform)
• Corners - Harris and Hessian Affine
• Orientation Histogram, SIFT, SURF, HOG, GLOH
• Scale-Space Analysis - Image Pyramids and Gaussian derivative filters
• Gabor Filters and DWT
Image Segmentation:
• Region Growing
• Edge Based approaches to segmentation
• Graph-Cut
• Mean-Shift
• MRFs
• Texture Segmentation
• Object detection
Module 4
Pattern Analysis And Motion Analysis:
Pattern Analysis:
• Clustering: K-Means, K-Medoids, Mixture of Gaussians
• Classification: Discriminant Function, Supervised, Un-supervised, Semi-supervised
• Classifiers: Bayes, KNN, ANN models
• Dimensionality Reduction: PCA, LDA, ICA
• Non-parametric methods
Motion Analysis:
• Background Subtraction and Modeling
• Optical Flow
• KLT
• Spatio-Temporal Analysis
• Dynamic Stereo
• Motion parameter estimation
Module 5
Shape From X:
• Light at Surfaces
• Phong Model
• Reflectance Map
• Albedo estimation
• Photometric Stereo
• Use of Surface Smoothness Constraint
• Shape from Texture, color, motion and edges


TEXTBOOKS
1. Richard Szeliski, Computer Vision: Algorithms and Applications, Springer-Verlag London Limited, 2011.
2. D. A. Forsyth and J. Ponce, Computer Vision: A Modern Approach, Pearson Education, 2003.
REFERENCE BOOKS
1. Richard Hartley and Andrew Zisserman, Multiple View Geometry in Computer Vision, Second Edition, Cambridge University Press, March 2004.
2. K. Fukunaga, Introduction to Statistical Pattern Recognition, Second Edition, Academic Press, Morgan Kaufmann, 1990.
3. R. C. Gonzalez and R. E. Woods, Digital Image Processing, Addison-Wesley, 1992.
MODULE 1
Digital Image Formation
And Low Level Processing
Computer vision is a field of artificial intelligence (AI) that
enables computers and systems to derive meaningful
information from digital images, videos and other visual
inputs — and take actions or make recommendations based
on that information.

AI enables computers to think.

Computer vision enables them to see, observe and understand.
Computer
Vision
• Make computers understand
images and video.

• What kind of scene?

• Where are the cars?

• How far is the building?

• …
Using Computer Vision: Facial Expression

Detecting faces allows the devices to identify the presence of faces apart from
the task of recognizing them.

In this video, Masha, a college student, experiments with how the computer judges a face and reports the mood through the color of the dragon on the screen.

http://www.youtube.com/watch?v=7tD1KlTkunM&feature=player_embedded
Using Computer Vision: Facial Expressions
• Here are pictures of people and their expressions. As you can see below the faces, the camera can sense where the main features of the face change.
Camera Mouse
o The Camera Mouse can detect your head’s motions and
move along on the computer screen.
o “Instead of using a mouse, a webcam or built-in camera
looks at you and tracks a spot on your face. If you move
your head to the left, the mouse moves to the left. If
you hold the pointer over the spot, a click is issued.
Anything you can do with a mouse, you can do
with Camera Mouse.” – Professor Gips
o In June 2007, Camera Mouse was made available free of charge through Internet download.
o According to Gips, 100,000 copies were downloaded in the first 31 months, and another 100,000 in the year following that. More recently, 100,000 were downloaded in just one month.
Computer Vision to the
rescue !!
• Computer Vision can also be used to help people in need
• Such as those who can’t use certain body parts
to communicate.
• Jordan, the girl above, can’t communicate using her hands to
move the mouse on a computer. But with the Camera Mouse
that recognizes where she wants to click on she can move the
mouse where she wants using her head.
Eagle Eyes
o Eagle Eyes allows people who can only move their eyes to use the computer by having five electrodes attached to their head in spots that can detect head and eye movement.

o “Eagle Eyes and Camera Mouse do more than provide the disabled a means to access and use the computer; they now have a means of communicating and connecting that their body has denied from them for years.”
Computer Vision :
Speaking with Eyes
• The computer senses your eyes and
notices the eye movements. When
someone blinks the computer would click
something.
• Looking to the side or raising the eyebrows are some ways to communicate with the computer using your eyes.
• There is also eye gaze detection, which detects where you are trying to move to.
Facebook tagging !
\(^_^\)
• Facebook also has face
recognition.
• It scans you and your friends'
photos for recognizable faces
and suggests nametags for
the faces by matching them
with their profile photos and
other tagged photos.
Enables computers to see, identify and process images as human vision does and then provide appropriate output.

•Extract high dimensional data from the real world

•Examples: image recognition, visual recognition, and facial recognition.
Computer Vision Vs Human Vision
Computer Vision and Nearby Fields

Computer Graphics: Models to Images
Comp. Photography: Images to Images
Computer Vision: Images to Models
Vision is really hard

• Vision is an amazing feat of natural intelligence
– Visual cortex occupies about 50% of Macaque brain
– More human brain devoted to vision than anything else

Is that a
queen or a
bishop?
Why computer vision matters

Safety Health Security

Comfort Fun Access


Ridiculously brief history of computer vision
• 1966: Minsky assigns computer vision as an undergrad summer project
• 1960’s: interpretation of synthetic worlds
• 1970’s: some progress on interpreting selected images
• 1980’s: ANNs come and go; shift toward geometry and increased mathematical rigor
• 1990’s: face recognition; statistical analysis in vogue
• 2000’s: broader recognition; large annotated datasets available; video processing starts
[Figures: Guzman ‘68; Ohta Kanade ‘78; Turk and Pentland ‘91]


How vision is used now

• Examples of state-of-the-art
Optical character recognition (OCR)
Technology to convert scanned docs to text
• If you have a scanner, it probably came with OCR software

[Figure: Digit recognition, AT&T labs, http://www.research.att.com/~yann/]
[Figure: License plate readers, http://en.wikipedia.org/wiki/Automatic_number_plate_recognition]
Face detection

• Many new digital cameras now detect faces


– Canon, Sony, Fuji, …
Smile detection

Sony Cyber-shot® T70 Digital Still Camera


3D from thousands of images
Object recognition (in supermarkets)

LaneHawk by EvolutionRobotics
“A smart camera is flush-mounted in the checkout lane, continuously
watching for items. When an item is detected and recognized, the
cashier verifies the quantity of items that were found under the basket,
and continues to close the transaction. The item can remain under the
basket, and with LaneHawk,you are assured to get paid for it… “
Vision-based biometrics

“How the Afghan Girl was Identified by Her Iris Patterns” Read the story
wikipedia
Login without a password…

Fingerprint scanners on many new laptops, other devices

Face recognition systems now beginning to appear more widely, http://www.sensiblevision.com/
Object
recognition (in
mobile phones)
• Point & Find, Nokia
• Google Goggles
Special effects: shape capture

The Matrix movies, ESC Entertainment, XYZRGB, NRC


Special effects: motion capture

Pirates of the Caribbean, Industrial Light and Magic


Sports

• Sportvision first down line
• Nice explanation on www.howstuffworks.com
• http://www.sportvision.com/video.html
Smart cars Slide content courtesy of Amnon Shashua

• Mobileye
– Vision systems currently in high-end BMW, GM,
Volvo models
– By 2010: 70% of car manufacturers.
Google cars

http://www.nytimes.com/2010/10/10/science/10google.html?ref=artificialintelligence
Interactive Games: Kinect

• Object Recognition: http://www.youtube.com/watch?feature=iv&v=fQ59dXOo63o
• Mario: http://www.youtube.com/watch?v=8CTJL5lUjHg
• 3D: http://www.youtube.com/watch?v=7QrnwoO1-8A
• Robot: http://www.youtube.com/watch?v=w8BmgtMKFbY
Vision in space

NASA'S Mars Exploration Rover Spirit captured this westward view from atop
a low plateau where Spirit spent the closing months of 2007.

Vision systems (JPL) used for several tasks


• Panorama stitching
• 3D terrain modeling
• Obstacle detection, position tracking
• For more, read “Computer Vision on Mars” by Matthies et al.
Industrial robots

Vision-guided robots position nut runners on wheels


Mobile robots

NASA’s Mars Spirit Rover


http://en.wikipedia.org/wiki/Spirit_rover http://www.robocup.org/

Saxena et al. 2008


STAIR at Stanford
Medical imaging

3D imaging: MRI, CT
Image guided surgery: Grimson et al., MIT
CV in Image Processing

•Computer vision deals with theories and algorithms for automating the process of visual perception
•Involves tasks such as noise removal, smoothing, and sharpening of edges
•Segmentation of images to isolate object regions
•Description of the segmented regions.
Computer Graphics Vs Computer Vision

Computer graphics is a term used to describe a way to render images using computers.

Computer vision is a term used to describe a way to understand images that are rendered using computers.
CG Vs IP

Computer Graphics: Synthesize pictures from mathematical or geometrical models.

Image Processing: Analyze pictures to derive descriptions (often in mathematical or geometrical forms) of the objects appearing in the pictures.
Image Processing
Vs
Computer Vision
Comparisons
Human Vision Vs Computer Vision
Applications
Types of Digital Images
Digital Image Processing
Components of General Image
Processing System
Digital Image Processing

Processing of digital images by a digital computer

Why do we need to process images?

Three major application areas motivated the field:
1) Improvement of pictorial information for human perception
2) Image processing for autonomous machine applications
3) Efficient storage and transmission

27-Mar-22 59
Human Perception

Methods are employed that enhance pictorial information for human interpretation and analysis.

Applications include:
• Noise filtering
• Content enhancement
• Remote sensing
• Area of medicine
• Terrain mapping
• Atmospheric studies
• Astronomical studies
Noise filtering
Content enhancement
Content enhancement - deblurring
Remote sensing
Area of medicine
Terrain mapping Weather forecast
Atmospheric studies - Ozone hole
Machine Applications

• Industrial machine vision for product assembly and inspection
• Automated target detection and tracking
• Fingerprint recognition
• Machine processing of aerial and satellite imagery for weather prediction, crop assessment, etc.
Image

Image – projection of a 3D scene onto a 2D plane.

An image is a 2D function f(x, y), where x and y are spatial coordinates and the value of f at any pair of coordinates (x, y) is called the intensity or gray level of the image at that point.
Depending on the values of x, y and f, images can be classified into
• Analog image
• Digital image
Analog – x, y and f(x, y) take a continuous range of values.
Digital – x, y and f(x, y) are all finite, discrete quantities.
The purpose of converting an analog image to a digital image with the help of digital computers is to store and transmit it efficiently.

A digital image contains a finite number of elements.
Each element has a particular location and value.
These elements are called picture elements, image elements, pels or pixels.

Advantages:
1) Fast processing
2) Cost effective
3) Effective storage
4) Effective transmission
5) Versatile image manipulation

Disadvantages:
High memory for good quality images and hence requires fast processing
Digital Image Processing

• Image processing is a method to convert an image into digital form and perform some operations on it, in order to get an enhanced image or to extract some useful information from it.
• It is a type of signal processing in which the input is an image (video frame or photograph) and the output may be an image or characteristics associated with that image.
• Usually an image processing system treats images as two-dimensional signals and applies predefined signal processing methods to them.
Low Level processes – involve primitive operations such as image pre-processing. Both input and output are images.
Mid Level processes – involve tasks such as segmentation, description and classification. Inputs are generally images but the outputs are attributes extracted from the images.
High Level processes – involve making sense of an ensemble of recognized objects.
Purpose of Image processing

The purposes of image processing fall into 5 groups:
• Visualization - Observe objects that are not visible.
• Image sharpening and restoration - Create a better image.
• Image retrieval - Seek the image of interest.
• Measurement of pattern - Measure various objects in an image.
• Image recognition - Distinguish the objects in an image.
Components of Digital Image Processing

Two elements are required to acquire digital images:
1. Sensor
2. Digitizer
FUN FACTS

01 Our eyes recognize each wavelength by a different colour. Red has the longest wavelength and violet has the shortest wavelength.
02 Cones in human eyes work as receivers for these small visible light waves.
03 Some other creatures can see parts of the spectrum that are not visible to us. For example, some insects can see UV light.
04 Red, yellow and blue are known as primary colours and are used to create all the colours that we see. Orange, purple and green are called secondary colours.
ACTIVITY
Applications of Computer Vision

One-page writeup on any 5 applications of computer vision that we see in the real world
Transformations
• Process – converts an object into another form
• Change in form, nature or appearance

Geometric Transformations
• Change the orientation, size and shape of objects
• Alter the coordinate descriptions of objects
Linear transformation
• Converts one type of data to another type of data
• It is of the form y = A x
• A is called the standard matrix
• If |A| = 0, then A is called singular
• If |A| ≠ 0, then A is called non-singular or regular
• Inverse transformation: x = A⁻¹ y (exists only when A is non-singular)
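The form y = A x and the singular / non-singular test can be tried numerically. The following is a minimal sketch; the matrix A and the vector x are made-up example values, not taken from the slides.

```python
import numpy as np

# Hypothetical standard matrix A and input vector x (illustrative values only).
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
x = np.array([1.0, 2.0])

y = A @ x                    # forward transformation y = A x
det_A = np.linalg.det(A)     # |A| decides singular vs non-singular

if np.isclose(det_A, 0.0):
    x_back = None            # singular: no inverse transformation exists
else:
    # Same result as A^{-1} y, but solved without forming the inverse explicitly.
    x_back = np.linalg.solve(A, y)

print(y, det_A, x_back)
```

Since |A| is non-zero here, the inverse transformation recovers the original x.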


Inner Product
• Consider two functions f(x) and g(x) on the interval [a, b]
• Then the inner product of f and g is written (f, g) and is given by
$$(f, g) = \int_a^b f(x)\, g(x)\, dx$$
• If (f, g) = 0, we say f and g are orthogonal
• Example: {1 + x, x − x²} are orthogonal on [−2, 2]
• Two functions are orthogonal in nature ⇔ their inner product is zero
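The orthogonality example can be checked by integrating the product exactly. This sketch uses NumPy's `Polynomial` class; the helper name `inner_product` is an illustrative choice, not something from the slides.

```python
from numpy.polynomial import Polynomial

def inner_product(f, g, a, b):
    """Exact integral of f(x) * g(x) over [a, b] for polynomial f and g."""
    antiderivative = (f * g).integ()
    return antiderivative(b) - antiderivative(a)

f = Polynomial([1, 1])       # 1 + x      (coefficients in increasing degree)
g = Polynomial([0, 1, -1])   # x - x^2

value = inner_product(f, g, -2, 2)
print(value)                 # zero (up to rounding) means f and g are orthogonal
```

The product simplifies to x − x³, an odd function, which is why the integral over the symmetric interval [−2, 2] vanishes. On a non-symmetric interval such as [0, 2] the same two functions are not orthogonal.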
Orthogonal Transformation

• It is a linear transformation T:V→V which preserves a symmetric inner product
• It preserves the lengths of vectors and the angles between vectors
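As a small numerical illustration, a 2D rotation matrix is one example of an orthogonal transformation. The sketch below uses an arbitrary angle and arbitrary vectors to check that lengths and angles are preserved.

```python
import numpy as np

theta = np.deg2rad(30.0)     # arbitrary example angle
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

u = np.array([3.0, 4.0])     # arbitrary example vectors
v = np.array([-1.0, 2.0])

def angle(p, q):
    """Angle between two vectors, from the inner product."""
    return np.arccos(p @ q / (np.linalg.norm(p) * np.linalg.norm(q)))

is_orthogonal = np.allclose(Q.T @ Q, np.eye(2))   # Q^T Q = I
same_length = np.isclose(np.linalg.norm(Q @ u), np.linalg.norm(u))
same_angle = np.isclose(angle(Q @ u, Q @ v), angle(u, v))
print(is_orthogonal, same_length, same_angle)
```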
Euclidean Transformation

• The Euclidean transformations are the most commonly used transformations.
• A Euclidean transformation is either a translation, a rotation, or a reflection.
Translations and Rotations on the xy-Plane

• We intend to translate a point in the xy-plane to a new place by adding a vector <h, k>.
• Between a point (x, y) and its new place (x', y'), we have x' = x + h and y' = y + k.
• Let us use a form similar to homogeneous coordinates. That is, a point becomes a column vector whose third component is 1. Thus, point (x, y) becomes $(x, y, 1)^T$.
• Then the relationship between (x, y) and (x', y') can be put into matrix form:
$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & h \\ 0 & 1 & k \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$
• Therefore, if a line has the equation Ax + By + C = 0, plugging in x = x' − h and y = y' − k gives the new equation
Ax' + By' + (−Ah − Bk + C) = 0
• If a point (x, y) is rotated by an angle a about the coordinate origin to become a new point (x', y'), the relationship is x' = x cos a − y sin a and y' = x sin a + y cos a, or in matrix form:
$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} \cos a & -\sin a & 0 \\ \sin a & \cos a & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$
• Thus, rotating a line Ax + By + C = 0 about the origin by an angle a brings it to the new equation:

(A cos a − B sin a)x' + (A sin a + B cos a)y' + C = 0


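Combination of translation and rotation

• Translations and rotations can be combined into a single equation.
• Rotating the point (x, y) by an angle a about the coordinate origin and then translating the rotated result in the direction of (h, k) gives
$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} \cos a & -\sin a & h \\ \sin a & \cos a & k \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$
• If the translation (h, k) is applied first, followed by a rotation of angle a (about the coordinate origin), we get
$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} \cos a & -\sin a & h\cos a - k\sin a \\ \sin a & \cos a & h\sin a + k\cos a \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$

Therefore, rotation and translation are not commutative!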
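The order dependence is easy to see with concrete numbers. The sketch below assumes the standard 3-by-3 homogeneous forms of the 2D translation and rotation matrices; the function names and the example point are illustrative.

```python
import numpy as np

def translation(h, k):
    """Homogeneous 2D translation by the vector <h, k>."""
    return np.array([[1.0, 0.0, h],
                     [0.0, 1.0, k],
                     [0.0, 0.0, 1.0]])

def rotation(a):
    """Homogeneous 2D rotation by angle a (radians) about the origin."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

a, h, k = np.pi / 2, 2.0, 0.0
p = np.array([1.0, 0.0, 1.0])            # the point (1, 0)

# The matrix written closest to p is applied first.
rotate_then_translate = translation(h, k) @ rotation(a) @ p
translate_then_rotate = rotation(a) @ translation(h, k) @ p
print(rotate_then_translate[:2], translate_then_rotate[:2])
```

Rotating (1, 0) by 90 degrees and then shifting by <2, 0> lands on (2, 1), while shifting first and rotating afterwards lands on (0, 3).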


Translations and Rotations in Space

• A translation in space moves points by adding a vector <p, q, r>.
• Rotations in space are more complex, because we can rotate about the x-axis, the y-axis or the z-axis.
• When rotating about the z-axis, only the x- and y-coordinates change and the z-coordinate stays the same. In effect, it is exactly a rotation about the origin in the xy-plane.
• Rotation about the z-axis by 90 degrees: the x-axis rotates to the y-axis and the y-axis rotates to the negative direction of the original x-axis.
• Rotation about the x-axis by 90 degrees: the y-axis rotates to the z-axis and the z-axis rotates to the negative direction of the original y-axis.
• Rotation about the y-axis by 90 degrees: the x-axis rotates to the negative direction of the z-axis and the z-axis rotates to the original x-axis.
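The three axis statements can be checked against the usual right-handed rotation matrices. The slides' own matrices are not available in this text, so the 3-by-3 forms below are the standard ones and should be read as an assumption.

```python
import numpy as np

def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1.0, 0.0, 0.0],
                     [0.0,   c,  -s],
                     [0.0,   s,   c]])

def rot_y(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[  c, 0.0,   s],
                     [0.0, 1.0, 0.0],
                     [ -s, 0.0,   c]])

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[  c,  -s, 0.0],
                     [  s,   c, 0.0],
                     [0.0, 0.0, 1.0]])

ex, ey, ez = np.eye(3)       # unit vectors along the x-, y- and z-axes
q = np.pi / 2                # 90 degrees

print(rot_z(q) @ ex, rot_z(q) @ ey)   # x-axis -> y-axis, y-axis -> -x-axis
print(rot_x(q) @ ey, rot_x(q) @ ez)   # y-axis -> z-axis, z-axis -> -y-axis
print(rot_y(q) @ ex, rot_y(q) @ ez)   # x-axis -> -z-axis, z-axis -> x-axis
```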
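• A rotation matrix and a translation matrix can be combined into a single matrix as follows,
$$\begin{bmatrix} r_{11} & r_{12} & r_{13} & p \\ r_{21} & r_{22} & r_{23} & q \\ r_{31} & r_{32} & r_{33} & r \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
where the r's in the upper-left 3-by-3 matrix form a rotation and p, q and r form a translation vector.
• This matrix represents a rotation followed by a translation.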
Euclidean Transformations

• Euclidean transformations preserve length and angle measure.
• Moreover, the shape of a geometric object will not change.
• That is, lines transform to lines, planes transform to planes, circles transform to circles, and ellipsoids transform to ellipsoids.
• Only the position and orientation of the object will change.


Affine Transformations

• Affine transformations are generalizations of Euclidean transformations.
• Under affine transformations, lines transform to lines, but circles become ellipses.
• Length and angle are not preserved.
Projective transformations

• Projective transformations are the most general "linear" transformations and require the use of homogeneous coordinates.
• Given a point in space in homogeneous coordinates (x,y,z,w) and its image under a projective transform (x',y',z',w'), a projective transform has the following form:
$$\begin{bmatrix} x' \\ y' \\ z' \\ w' \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \\ a_{41} & a_{42} & a_{43} & a_{44} \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix}$$
• The 4-by-4 matrix must be non-singular (i.e., invertible). Projective transformations are more general than affine transformations because the fourth row does not have to contain 0, 0, 0 and 1.
• A projective transformation can bring finite points to infinity and points at infinity to a finite range.
Scaling

• Scaling can be applied to all axes, each with a different scaling factor.
• For example, if the x-, y- and z-axes are scaled with scaling factors p, q and r, respectively, the transformation matrix is:
$$\begin{bmatrix} p & 0 & 0 & 0 \\ 0 & q & 0 & 0 \\ 0 & 0 & r & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
Shearing

• The effect of a shear transformation looks like ``pushing'' a geometric object in a direction parallel to a coordinate plane (3D) or a coordinate axis (2D).
[Figure: a red cylinder obtained by applying a shear transformation to a yellow cylinder]
• How far the object is pushed is determined by a shearing factor.
• On the xy-plane, one can push in the x-direction, positive or negative, and keep the y-direction unchanged.
• Or, one can push in the y-direction and keep the x-direction fixed.
• The following is a shear transformation in the x-direction with shearing factor a:
$$\begin{bmatrix} 1 & a & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$
• The shear transformation in the y-direction with shearing factor b is
$$\begin{bmatrix} 1 & 0 & 0 \\ b & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$
• In space, one can push in two coordinate axis directions and keep the third one fixed.
• The following is the shear transformation in both x- and y-directions with shearing factors a and b, respectively, keeping the z-coordinate the same:
$$\begin{bmatrix} 1 & 0 & a & 0 \\ 0 & 1 & b & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
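To see what a 2D shear does, it can be applied to the corners of a unit square. The matrices below are the standard homogeneous shear forms, taken here as an assumption because the slide figures are not available; the shearing factors are example values.

```python
import numpy as np

def shear_x(a):
    """Push points in the x-direction by a * y; y stays unchanged."""
    return np.array([[1.0,   a, 0.0],
                     [0.0, 1.0, 0.0],
                     [0.0, 0.0, 1.0]])

def shear_y(b):
    """Push points in the y-direction by b * x; x stays unchanged."""
    return np.array([[1.0, 0.0, 0.0],
                     [  b, 1.0, 0.0],
                     [0.0, 0.0, 1.0]])

# Unit square corners (0,0), (1,0), (1,1), (0,1) as homogeneous columns.
square = np.array([[0.0, 1.0, 1.0, 0.0],
                   [0.0, 0.0, 1.0, 1.0],
                   [1.0, 1.0, 1.0, 1.0]])

sheared_x = shear_x(0.5) @ square
sheared_y = shear_y(2.0) @ square
print(sheared_x[:2])
print(sheared_y[:2])
```

With a = 0.5 the top edge of the square slides half a unit to the right while the bottom edge stays put, turning the square into a parallelogram.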
Projective transformations

• Projective transformations are so general that little information is necessarily preserved by them.
• Hyperplanes still map to hyperplanes.
• Parallelism is not preserved (this is what we would expect, since a perspective transformation can map squares to trapezoids, with 'horizontal' lines meeting as they recede with distance).
• Ratios of lengths are not preserved, although cross-ratios are.
• In 2D, we associate the point (a,b) with the set of points in 3D homogeneous coordinates (wa,wb,w) for all non-zero reals w.
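The correspondence between (a, b) and (wa, wb, w), and the way a projective matrix can exchange finite points and points at infinity, can be sketched as follows. The matrix H is a made-up example chosen only because its third row is not (0, 0, 1); the helper names are illustrative.

```python
import numpy as np

def to_homogeneous(a, b, w=1.0):
    """Represent the 2D point (a, b) as (w*a, w*b, w) for a non-zero w."""
    return np.array([w * a, w * b, w])

def from_homogeneous(p):
    """Return the 2D point, or None when p is a point at infinity (w = 0)."""
    if np.isclose(p[2], 0.0):
        return None
    return p[:2] / p[2]

# Hypothetical projective matrix (non-singular, last row differs from 0 0 1).
H = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])

same_point = from_homogeneous(to_homogeneous(2.0, 3.0, w=5.0))
infinite_to_finite = from_homogeneous(H @ np.array([0.0, 1.0, 0.0]))
finite_to_infinite = from_homogeneous(H @ to_homogeneous(0.0, -1.0))
print(same_point, infinite_to_finite, finite_to_infinite)
```

The point at infinity in the y-direction, (0, 1, 0), is mapped to the finite point (0, 1), while the finite point (0, −1) is sent to infinity.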
Image is represented in 2D Matrix

Neighbourhood of a pixel
- 4 neighbourhood
- 8 neighbourhood
- diagonal neighbourhood

• Pixel P at (x,y) has two horizontal and two vertical neighbours: (x+1, y), (x−1, y), (x, y+1), (x, y−1)
• This set of 4 pixels is called the 4-neighbours of P, N4(P)
• If P is a boundary pixel then it will have fewer neighbours
• A pixel also has 4 diagonal neighbours: (x+1, y+1), (x+1, y−1), (x−1, y+1), (x−1, y−1)
• Denoted as ND(P)
• The points of N4(P) and ND(P) together are called the 8-neighbours of P, N8(P)
• If P is a boundary pixel then both N4(P) and ND(P) will have fewer pixels
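The three neighbourhoods can be written as small functions that drop any neighbour falling outside the image, which is how boundary pixels end up with fewer neighbours. This sketch assumes (x, y) indexes (row, column) of a rows-by-cols image; the function names are illustrative.

```python
def n4(x, y, rows, cols):
    """4-neighbours of pixel (x, y) that lie inside a rows x cols image."""
    candidates = [(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]
    return [(i, j) for i, j in candidates if 0 <= i < rows and 0 <= j < cols]

def nd(x, y, rows, cols):
    """Diagonal neighbours of pixel (x, y) that lie inside the image."""
    candidates = [(x - 1, y - 1), (x - 1, y + 1), (x + 1, y - 1), (x + 1, y + 1)]
    return [(i, j) for i, j in candidates if 0 <= i < rows and 0 <= j < cols]

def n8(x, y, rows, cols):
    """8-neighbours: the union of N4(P) and ND(P)."""
    return n4(x, y, rows, cols) + nd(x, y, rows, cols)

print(n4(1, 1, 3, 3))   # interior pixel: four neighbours
print(n8(0, 0, 3, 3))   # corner pixel: only three of the eight remain
```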
• Some more processing is required to say whether these pixels belong to the same object or not, so we do a grouping operation.
• For grouping, we have to identify which pixels are connected and which are not.
• Connectivity between pixels is an important property used to establish object boundaries, find the area of an object, and find descriptors used to recognize the object.
• Two pixels are said to be connected if
• - they are adjacent (they are neighbours), and
• - their intensity values are similar
• Binary image: two points p and q are connected if q belongs to N(p) or p belongs to N(q), and B(p)==B(q)
• Since it is a binary image, the intensity value is either 0 or 1
• Connectivity property in a gray level image:
• Let V be the set of gray levels used to define connectivity
• For two points p and q whose gray levels f(p) and f(q) both belong to V, three types of connectivity are defined
• 4 connectivity
• 8 connectivity
• m connectivity
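The difference between 4- and 8-connectivity shows up on a diagonal line of pixels. The sketch below is a plain breadth-first search over neighbours whose gray levels are in V; it covers 4- and 8-connectivity only (m-connectivity is left out), and the function and variable names are illustrative.

```python
from collections import deque

OFFSETS_4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
OFFSETS_8 = OFFSETS_4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]

def connected(image, p, q, V, offsets):
    """True if p and q are joined by a path of neighbours whose values are in V."""
    rows, cols = len(image), len(image[0])
    if image[p[0]][p[1]] not in V or image[q[0]][q[1]] not in V:
        return False
    seen, queue = {p}, deque([p])
    while queue:
        x, y = queue.popleft()
        if (x, y) == q:
            return True
        for dx, dy in offsets:
            n = (x + dx, y + dy)
            inside = 0 <= n[0] < rows and 0 <= n[1] < cols
            if inside and n not in seen and image[n[0]][n[1]] in V:
                seen.add(n)
                queue.append(n)
    return False

# Binary image with a diagonal line of 1s.
image = [[1, 0, 0],
         [0, 1, 0],
         [0, 0, 1]]

print(connected(image, (0, 0), (2, 2), {1}, OFFSETS_4))  # diagonal steps not allowed
print(connected(image, (0, 0), (2, 2), {1}, OFFSETS_8))  # diagonal steps allowed
```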
IMAGE ENHANCEMENT
Improves the quality of the image
Highlights the interesting details in the image
Removes noise
Makes the image more appealing

Methods:
1) Spatial domain
2) Frequency domain
3) Combination method
SPATIAL DOMAIN
Refers to the image plane itself

Direct manipulation of pixel values

Categories
1) Intensity transformation
2) Spatial filtering

Spatial domain techniques operate directly on the pixels of the image

Denoted as g(x,y) = T{ f(x,y) }, where f is the input image, g is the output image and T is an operator on f defined over a neighbourhood of (x,y)
[Figure: image plane with the origin, the x and y axes, and a pixel at (x,y)]
In spatial domain processing, the process consists of
• Moving the origin of the neighborhood from pixel to pixel, and
• Applying the operator T to the pixels in the neighborhood to yield the output at that location

Simplest form - when the neighborhood is of size 1x1

Then g depends only on the value of f at (x,y), and T becomes a gray level transformation function

T – gray level or intensity or mapping transformation

Denoted as S=T(r), where r is the gray level of f(x,y) and S is the gray level of g(x,y)
Contrast Stretching and Point Processing

• Figure a) – effect of a transformation that produces a higher contrast image than the original image
• The image is darkened for gray levels below m and brightened for gray levels above m
• This technique is called contrast stretching
• Figure b) produces a two-level image, which is known as a binary image
• This transformation is known as a threshold function
• Simple but powerful processing approach
• The transformation at any point depends only on the gray level at that point. It is called point processing
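The threshold function of Figure b) is the simplest point operation. In this sketch the choice to send gray levels equal to m to white is an assumption, and the sample values are made up.

```python
import numpy as np

def threshold(image, m, L=256):
    """Point processing: gray levels below m become 0, all others become L-1."""
    return np.where(np.asarray(image) >= m, L - 1, 0)

row = np.array([10, 100, 128, 200])   # made-up gray levels
binary = threshold(row, m=128)
print(binary)
```

Each output value depends only on the input value at the same position, which is what makes this a point operation.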
The value of r below m will be darker

The value of r above m will be lighter

For larger neighborhoods, the principal approach is to use a mask or filter

Mask – a small 2D array, for example of order 3x3

The mask coefficients determine the nature of the process that is applied to the image

This is called mask processing or filtering
Basic Intensity transformation function

Also known as gray level transformation
Represented as S=T(r)
Values of the transformation function are stored in a one-dimensional array
Mapping of r to S is implemented through a lookup table
8 bit environment – lookup table consists of 256 entries (0 to 255)
CLASSIFICATION OF TRANSFORMATION FUNCTIONS
• Linear transformation function (image negative & image identity)
• Logarithmic transformation function (log function & inverse log function)
• Power law transformation function (nth power and nth root)
IMAGE IDENTITY
Rarely useful in digital image processing
Output image is the same as the input image
It is also called the identity transformation
The transformation is a straight line, S = r
Image negative
Intensity level – 0 to L-1

Represented as S=L-1-r

Intensity value or pixel value is reversed to produce the equivalent of a photographic negative.
0 -> black and L-1 -> white

For r=0: S = L-1-r = L-1
For r=L-1: S = L-1-r = L-1-L+1 = 0
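The negative S = L-1-r can be implemented as a 256-entry lookup table, so each pixel is transformed with a single table access. This is a minimal sketch for an 8-bit image with made-up pixel values.

```python
import numpy as np

L = 256                                          # 8-bit image: gray levels 0 .. L-1
lut = (L - 1 - np.arange(L)).astype(np.uint8)    # lut[r] = L - 1 - r

image = np.array([[0, 64],
                  [128, 255]], dtype=np.uint8)   # made-up pixel values
negative = lut[image]                            # index the table with every pixel
print(negative)
```

Black (0) maps to white (255) and white maps to black, as in the derivation above.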
LOG TRANSFORMATION

Represented as S = c log(1+r)

c is a constant and r>=0

(1+r) is used because log(0) is undefined; with (1+r), r=0 gives log(1)=0

Using the logarithmic transformation, we can compress or expand the gray levels

Output – a higher contrast or lower contrast image, depending on the function that we apply
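A small sketch of S = c log(1+r) follows. The choice of c, picked so that r = L-1 maps back to S = L-1, is an assumption for illustration; the slides only say that c is a constant.

```python
import numpy as np

def log_transform(image, L=256):
    """S = c * log(1 + r), with c chosen so that r = L-1 maps to S = L-1."""
    c = (L - 1) / np.log(L)                 # assumed scaling constant
    return c * np.log1p(np.asarray(image, dtype=np.float64))

samples = np.array([0, 15, 255])            # made-up gray levels
stretched = log_transform(samples)
print(stretched)
```

The dark input value 15 is lifted to the middle of the output range, which shows how the log curve expands low gray levels and compresses high ones.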
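INVERSE LOG TRANSFORMATION

A wide range of low input pixel values is mapped to a narrow range of output pixel values, while the high input values are spread over a wider output range

The curve lies below the identity line, so mid-range input values map to lower output values

Regions 1 and 2 in the figure are the parts that are unaffected by the logarithmic functions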
POWER LAW TRANSFORMATION
Consists of nth power and nth root

Represented as S = c r^γ

γ > 1: it acts as an nth power

γ < 1: it acts as an nth root

It is also known as gamma correction

It is similar to the logarithmic transformations, but a family of curves is obtained for different values of γ
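The effect of γ can be tried on a few gray levels. This sketch normalises r to [0, 1] and takes c = 1 on that normalised scale, which is an assumption made for illustration.

```python
import numpy as np

def gamma_transform(image, gamma, L=256):
    """S = (L-1) * (r / (L-1)) ** gamma, i.e. c = 1 on a normalised [0, 1] scale."""
    r = np.asarray(image, dtype=np.float64) / (L - 1)
    return (L - 1) * r ** gamma

levels = np.array([0.0, 127.5, 255.0])
darker = gamma_transform(levels, 2.0)      # gamma > 1 pushes mid-tones down
brighter = gamma_transform(darker, 0.5)    # gamma < 1 pushes them back up
print(darker, brighter)
```

Applying γ = 2 and then γ = 0.5 returns the original values, since the square root undoes the square.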
PIECEWISE LINEAR TRANSFORMATION FUNCTION
Three types
•Contrast stretching
•Gray level slicing
•Bit plane slicing
CONTRAST STRETCHING

Contrast – difference between the highest gray level and the lowest gray level of an image

A low contrast image can result from poor illumination

Basic idea – increase the contrast of an image by making the darker portions darker and the brighter portions brighter
ANALYSIS
1) r1=S1 and r2=S2: the function acts as a linear (identity) transformation
2) r1=r2, S1=0 and S2=L-1: the function acts as thresholding. It produces a binary image with only the values 0 and L-1
3) Intermediate values of (r1,S1) and (r2,S2) produce various degrees of spread in the gray values
4) Generally r1<=r2 and S1<=S2. In this case the function is single valued and monotonically increasing
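Cases 1 and 3 can be reproduced with a piecewise linear mapping through the control points (0, 0), (r1, S1), (r2, S2) and (L-1, L-1). The sketch uses `np.interp`, which needs strictly increasing breakpoints, so it assumes 0 < r1 < r2 < L-1 and does not cover the thresholding case r1 = r2. The control point values are made up.

```python
import numpy as np

def contrast_stretch(image, r1, s1, r2, s2, L=256):
    """Piecewise linear mapping through (0,0), (r1,s1), (r2,s2), (L-1,L-1).

    Assumes 0 < r1 < r2 < L-1 so that the breakpoints are strictly increasing.
    """
    return np.interp(image, [0, r1, r2, L - 1], [0, s1, s2, L - 1])

levels = np.array([0, 64, 128, 192, 255])
stretched = contrast_stretch(levels, r1=64, s1=32, r2=192, s2=224)
unchanged = contrast_stretch(levels, r1=64, s1=64, r2=192, s2=192)
print(stretched, unchanged)
```

With S1 below r1 and S2 above r2, the dark end is pushed darker and the bright end brighter, while the middle segment becomes steeper.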
[Figure: thresholding function, S versus r; S1 = 0 (black) below r1 = r2 and S2 = L-1 (white) above]
GRAY LEVEL SLICING
Used to highlight a specific range of gray levels

Implemented using two approaches

I approach: Display a high value for all gray levels in the range of interest and a low value for all other gray levels
II approach: Brighten the desired range of gray levels but preserve the gray levels of all other pixels unchanged
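Both slicing approaches can be sketched with `np.where`. The range [A, B], the high and low output values and the sample gray levels are illustrative choices.

```python
import numpy as np

def slice_binary(image, A, B, high=255, low=0):
    """Approach I: high value inside [A, B], low value everywhere else."""
    image = np.asarray(image)
    return np.where((image >= A) & (image <= B), high, low)

def slice_preserve(image, A, B, high=255):
    """Approach II: high value inside [A, B], other gray levels unchanged."""
    image = np.asarray(image)
    return np.where((image >= A) & (image <= B), high, image)

levels = np.array([10, 100, 150, 200])    # made-up gray levels
print(slice_binary(levels, 100, 150))
print(slice_preserve(levels, 100, 150))
```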
[Figures: T(r) versus r for the two gray level slicing approaches, with the range of interest between A and B]
BIT PLANE SLICING
Highlights the contribution made to the total image appearance by specific bits

8 bit image – the image consists of 8 one-bit planes, ranging from bit plane 0 to bit plane 7

Bit plane 0 is the LSB and bit plane 7 is the MSB

Separating a digital image into bit planes is useful for analysing the relative importance played by each bit

Used in image compression
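Extracting a bit plane is a shift followed by a mask. The sketch below slices a tiny made-up image into its 8 planes and rebuilds it, which also shows how much of each value the most significant plane carries on its own.

```python
import numpy as np

def bit_plane(image, k):
    """Return bit plane k (0 = LSB, 7 = MSB) as an array of 0s and 1s."""
    return (np.asarray(image, dtype=np.int64) >> k) & 1

image = np.array([[5, 200]])               # 5 = 00000101, 200 = 11001000
planes = [bit_plane(image, k) for k in range(8)]

# Weighting every plane by 2**k and adding them restores the image.
rebuilt = sum(plane << k for k, plane in enumerate(planes))
msb_only = planes[7] << 7                  # contribution of bit plane 7 alone
print(planes[0], planes[7], rebuilt, msb_only)
```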
Histogram, Image
restoration, Convolution,
Filtering , Fourier transform
Histogram Equalization
• Graphical representation of any data
• Used to represent the data related to the digital image
• Representation of the relative frequency of occurrence of the various gray levels

Example image:
6 6 7 7 6
5 2 2 3 4
3 3 4 4 5
5 7 3 6 2
7 6 5 5 4
Gray levels   0 1 2 3 4 5 6 7
No. of pixels 0 0 3 4 4 5 5 4
[Figure: histogram of the example image, gray levels 0 to 7 on the horizontal axis]
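The pixel counts for the 5x5 example image can be reproduced with `np.bincount`, and dividing by the number of pixels gives the relative frequencies. A minimal sketch:

```python
import numpy as np

image = np.array([[6, 6, 7, 7, 6],
                  [5, 2, 2, 3, 4],
                  [3, 3, 4, 4, 5],
                  [5, 7, 3, 6, 2],
                  [7, 6, 5, 5, 4]])

counts = np.bincount(image.ravel(), minlength=8)   # pixels per gray level 0..7
relative = counts / image.size                     # relative frequency of each level
print(counts)
print(relative)
```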
Histogram Equalization
• Dark image – histogram concentrated near 0
• Bright image – histogram concentrated toward 255
• Low contrast image – histogram concentrated at the center
• High contrast image – histogram spread over the entire range (flat profile)
• Used for various image processing applications
• The quality of the image can be controlled by normalizing the histogram to a flat profile
Input image:
4 4 4 4 4
3 4 5 4 3
3 5 5 5 3
3 4 5 4 3
4 4 4 4 4

Image after histogram equalization:
6 6 6 6 6
2 6 7 6 2
2 7 7 7 2
2 6 7 6 2
6 6 6 6 6
Gray levels   0 1 2 3 4 5 6 7
No. of pixels 0 0 0 6 14 5 0 0
[Figure: histogram of the input image, with 6, 14 and 5 pixels at gray levels 3, 4 and 5]
[Figure: histogram after equalization, gray levels on the horizontal axis]
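The worked example can be reproduced with the usual discrete equalization rule: each gray level is mapped to round((L-1) × cumulative relative frequency). That this is the exact rule used on the slides is an assumption, although it does reproduce the output image shown there.

```python
import numpy as np

def equalize(image, L=8):
    """Map each gray level to round((L-1) * CDF), with the CDF taken from the image."""
    counts = np.bincount(image.ravel(), minlength=L)
    cdf = np.cumsum(counts) / image.size
    mapping = np.round((L - 1) * cdf).astype(int)
    return mapping[image]

image = np.array([[4, 4, 4, 4, 4],
                  [3, 4, 5, 4, 3],
                  [3, 5, 5, 5, 3],
                  [3, 4, 5, 4, 3],
                  [4, 4, 4, 4, 4]])

equalized = equalize(image)
print(equalized)
```

The cumulative frequencies for levels 3, 4 and 5 are 6/25, 20/25 and 25/25, which scale to 1.68, 5.6 and 7 and round to 2, 6 and 7.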
Image Restoration
Objective of image restoration
Convolution

Convolution is a simple mathematical operation that is fundamental to many common image processing operators.

Convolution provides a way of `multiplying together' two arrays of numbers of different sizes, but of the same dimensionality, to produce a third array of numbers of the same dimensionality.

This can be used in image processing to implement operators whose output pixel values are simple linear combinations of certain input pixel values.
Convolution

• A convolution is an operation that combines two arrays by multiplying and summing their elements. The arrays can be of different sizes; the only condition is that both have the same number of dimensions.
• Convolution is used for many things, such as calculating derivatives, detecting edges and applying blurs, and all of this is done using a "convolution kernel". A convolution kernel is a very small matrix in which each cell holds a number, and which has an anchor point.
• For example, if you want to highlight borders in an image (edge enhancement or sharpening), convolution is the operation involved. As another example, if you want to remove noise under some specifications, you can use convolution as well.
• The anchor point is used to know the position of the kernel with respect to the image. It starts at the top left corner of the image and moves over each pixel sequentially. At each position the kernel overlaps a few pixels of the image. Each overlapped pixel is multiplied by the corresponding kernel value and the products are added. The sum is set as the output value at the current position.
Pseudo code to describe the convolution process:

For each image row in input image:
    For each pixel in image row:
        Set accumulator to zero
        For each kernel row in kernel:
            For each element in kernel row:
                If element position corresponds to a pixel position then
                    Multiply element value by the corresponding pixel value
                    Add result to accumulator
                Endif
        Set output image pixel to accumulator
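The pseudo code can be turned into a short NumPy function. This is a sketch under some assumptions that the slides do not state: the kernel has odd width and height, its anchor is the centre, positions outside the image are treated as zero, and the kernel is flipped by 180 degrees so that the operation is a true convolution rather than a correlation.

```python
import numpy as np

def convolve2d(image, kernel):
    """Same-size 2D convolution with zero padding (odd-sized kernel assumed)."""
    image = np.asarray(image, dtype=np.float64)
    kernel = np.asarray(kernel, dtype=np.float64)
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)))   # zeros outside the image
    flipped = kernel[::-1, ::-1]                   # rotate the kernel by 180 degrees
    output = np.zeros_like(image)
    for i in range(image.shape[0]):                # for each image row
        for j in range(image.shape[1]):            # for each pixel in the row
            window = padded[i:i + kh, j:j + kw]    # pixels overlapped by the kernel
            output[i, j] = np.sum(window * flipped)  # multiply and accumulate
    return output

image = np.array([[1, 2, 3],
                  [4, 5, 6]])
identity = np.array([[0, 0, 0], [0, 1, 0], [0, 0, 0]])
shift = np.array([[0, 0, 0], [0, 0, 1], [0, 0, 0]])   # 1 to the right of the anchor
box = np.ones((3, 3))

print(convolve2d(image, identity))
print(convolve2d(image, shift))
print(convolve2d(np.ones((3, 3)), box))
```

The identity kernel leaves the image unchanged, the off-centre kernel shifts it one pixel to the right, and the box kernel sums each pixel's neighbourhood, giving smaller sums at the border where part of the kernel falls on padding.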
• In an image processing context, one of the input arrays is normally just a graylevel image.
• The second array is usually much smaller, is also two-dimensional, and is known as the kernel.
Filtering
Filtering is a technique for modifying or enhancing an image.

Image processing operations implemented with filtering include smoothing, sharpening, and edge enhancement.
Filtering is a neighborhood operation.

A pixel's neighborhood is some set of pixels, defined by their locations relative to that pixel.

Linear filtering is filtering in which the value of an output pixel is a linear combination of the values of the pixels in the input pixel's neighborhood.

Linear filtering of an image is accomplished through an operation called convolution.

Convolution is a neighborhood operation in which each output pixel is the weighted sum of neighboring input pixels.

The matrix of weights is called the convolution kernel, also known as the filter.

A convolution kernel is a correlation kernel that has been rotated 180 degrees.
Fourier Transformations

• The Fourier transform is widely used in image processing.
• The Fourier transform decomposes the intensity variation of the image into its frequency components, giving a representation in the frequency domain.
• Low-frequency components can be removed using filters in the FT domain (high-pass filtering).
• When an image is high-pass filtered in the FT domain, mainly the edges of the image remain.
• After the inverse FT back to the spatial domain, the image therefore contains mainly edges.
• The Fourier transform thus gives a simple technique by which the edges of an image can be found.
• To process an image in the frequency domain, we first need to convert it into the frequency domain using the Fourier transform, and we have to take the inverse transform of the output to convert it back into the spatial domain.
• That is why both the Fourier series and the Fourier transform have two formulas:
• one for the conversion and one for converting back to the spatial domain
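The forward transform, filtering in the frequency domain and the inverse transform can be sketched with NumPy's FFT routines. The ideal (hard cut-off) high-pass filter and the radius parameter are illustrative choices; practical filters are usually smoother.

```python
import numpy as np

def highpass(image, radius):
    """Zero all frequencies within `radius` of the centre (DC) and transform back."""
    F = np.fft.fftshift(np.fft.fft2(image))        # forward FT, DC moved to the centre
    rows, cols = image.shape
    y, x = np.ogrid[:rows, :cols]
    distance = np.hypot(y - rows // 2, x - cols // 2)
    F[distance <= radius] = 0                      # remove the low frequencies
    return np.real(np.fft.ifft2(np.fft.ifftshift(F)))  # inverse FT to spatial domain

flat = np.full((8, 8), 5.0)                        # no edges at all
step = np.zeros((8, 8))
step[:, 4:] = 1.0                                  # one vertical edge

roundtrip = np.real(np.fft.ifft2(np.fft.fft2(step)))
flat_filtered = highpass(flat, radius=0)
step_filtered = highpass(step, radius=0)
print(np.abs(flat_filtered).max(), step_filtered[0])
```

With radius 0 only the DC term is removed, so the flat image becomes all zeros and the step image simply loses its mean. Larger radii remove more of the slowly varying content and leave mostly the edge.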