SIFT: Scale-Invariant Feature Transform
Acknowledgements
● Richard Szeliski, Computer Vision: Algorithms and Applications, 2nd ed.
● Lowe, D.G. Distinctive Image Features from Scale-Invariant Keypoints. International Journal of Computer Vision 60, 91–110 (2004). https://doi.org/10.1023/B:VISI.0000029664.99615.94
Features / Interest points
Features are specific locations in the images, such as mountain peaks, building corners, doorways, or interestingly shaped patches of snow.
They are usually described by the appearance of patches of pixels surrounding the point location.
Fig : Two pairs of images. What could be the features / interest points one might use to establish a set of correspondences between these images?
Fig : Image pair with extracted patches. Which patches are worthy enough to be used as features, and why?
Desirable feature characteristics
● Scale invariance
● Rotation invariance
● Illumination invariance
● Viewpoint invariance
Fig : Example image pairs illustrating scale, rotation, viewpoint, and illumination changes
Feature detection and matching pipeline
● Feature detection: Each image is searched for locations that are likely to match well in other images.
● Feature description: Each region around a detected interest point is converted into a more compact and stable descriptor that can be matched against other descriptors.
● Feature matching: For every feature in one image, we efficiently search for likely matching features in the other image (see the sketch below).
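As a concrete illustration, here is a minimal sketch of the detect, describe, and match stages using OpenCV's SIFT implementation. The filenames are placeholders, and the 0.75 ratio-test threshold is a common choice rather than part of the pipeline definition.

```python
import cv2

imgA = cv2.imread("imageA.png", cv2.IMREAD_GRAYSCALE)
imgB = cv2.imread("imageB.png", cv2.IMREAD_GRAYSCALE)

# Detection + description in one call: keypoints and 128-D descriptors
sift = cv2.SIFT_create()
kpA, desA = sift.detectAndCompute(imgA, None)
kpB, desB = sift.detectAndCompute(imgB, None)

# Matching: for each descriptor in A, find the 2 nearest descriptors in B
bf = cv2.BFMatcher(cv2.NORM_L2)
pairs = bf.knnMatch(desA, desB, k=2)

# Keep a match only if it is clearly better than the runner-up (ratio test)
good = [m for m, n in pairs if m.distance < 0.75 * n.distance]
print(f"{len(good)} good matches")
```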
How to achieve scale invariance?
Extract features at a variety of scales by computing the Laplacian of Gaussian at multiple resolutions in a pyramid, and then match features at the same level.
For computational efficiency, the Laplacian of Gaussian is approximated by the Difference of Gaussians.
Fig : The initial image is repeatedly convolved with Gaussians to produce a set of scale images.
Fig : Gaussian kernel used to create the scale space.
Fig : The Laplacian of Gaussian and the Difference of Gaussians are nearly the same.
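A minimal sketch of one octave of a DoG scale space with OpenCV and NumPy. The sigma schedule (sigma0 = 1.6) and the number of levels are illustrative assumptions, not the exact constants of the original paper.

```python
import cv2
import numpy as np

def dog_pyramid(image, num_scales=5, sigma0=1.6):
    """Return the DoG images for a single octave."""
    k = 2 ** (1.0 / (num_scales - 2))       # scale multiplier between levels
    gaussians = []
    for i in range(num_scales):
        sigma = sigma0 * (k ** i)
        # ksize=(0, 0) lets OpenCV derive the kernel size from sigma
        gaussians.append(cv2.GaussianBlur(image, (0, 0), sigma))
    # Each DoG level is the difference of two adjacent Gaussian levels,
    # which approximates the Laplacian of Gaussian.
    return [gaussians[i + 1] - gaussians[i] for i in range(num_scales - 1)]

img = cv2.imread("image.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)
dogs = dog_pyramid(img)
```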
Locating the extrema of the DoG (Difference of Gaussians)
Extrema (minima/maxima) of the Laplacian of Gaussian (LoG) / Difference of Gaussians (DoG) function serve as interest point locations.
Fig : Maxima and minima of the difference-of-Gaussian images are detected by comparing a pixel (marked with X) to its 26 neighbours in 3x3 regions at the current and adjacent scales (marked with circles).
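A minimal NumPy sketch of the 26-neighbour test, assuming the DoG levels of an octave are stacked into a single 3-D array. Border handling and the sub-pixel refinement of the original paper are omitted.

```python
import numpy as np

def is_extremum(dogs, s, y, x):
    """True if dogs[s, y, x] is the max or min of its 3x3x3 neighbourhood.

    dogs: array of shape (num_levels, H, W); s, y, x must not lie on a border.
    """
    value = dogs[s, y, x]
    # 27 values: the pixel itself plus its 26 neighbours across 3 scales
    cube = dogs[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2]
    return value >= cube.max() or value <= cube.min()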
How to achieve rotation invariance?
A dominant orientation estimate can be computed by creating a histogram of all the gradient orientations around the keypoint and then finding the significant peak in the distribution.
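A simplified NumPy sketch of this step: a 36-bin histogram of magnitude-weighted gradient orientations over a patch around the keypoint, with the peak taken as the dominant orientation. The Gaussian weighting used by the full algorithm is omitted, and the patch size is left to the caller.

```python
import numpy as np

def dominant_orientation(patch, num_bins=36):
    """patch: square float32 array centred on the keypoint."""
    gy, gx = np.gradient(patch)                     # image gradients
    magnitude = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx)) % 360.0
    # Each pixel votes for its orientation bin, weighted by gradient magnitude
    hist, _ = np.histogram(angle, bins=num_bins, range=(0, 360),
                           weights=magnitude)
    peak = np.argmax(hist)
    return (peak + 0.5) * (360.0 / num_bins)        # bin centre in degrees
```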
Feature descriptor
SIFT features are formed by computing the gradient at each pixel in a 16x16 window around the detected keypoint, using the appropriate level of the Gaussian pyramid at which the keypoint was detected.
In each 4x4 quadrant, a gradient orientation histogram is formed by adding the weighted gradient value to one of eight orientation histogram bins, giving a 4x4x8 = 128-dimensional vector.
To reduce the effects of contrast or gain, the 128-D vector is normalized to unit length.
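A simplified sketch of assembling the 128-D descriptor from a 16x16 patch: 4x4 cells with 8 orientation bins each, followed by unit-length normalization. The trilinear interpolation and Gaussian weighting of the full algorithm are omitted.

```python
import numpy as np

def sift_descriptor(patch16):
    """patch16: 16x16 float32 patch around the keypoint, already rotated
    to the keypoint's dominant orientation."""
    gy, gx = np.gradient(patch16)
    magnitude = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx)) % 360.0
    descriptor = np.zeros((4, 4, 8), dtype=np.float32)
    for cy in range(4):
        for cx in range(4):
            # One 4x4 cell contributes an 8-bin orientation histogram
            cell_mag = magnitude[cy*4:(cy+1)*4, cx*4:(cx+1)*4]
            cell_ang = angle[cy*4:(cy+1)*4, cx*4:(cx+1)*4]
            hist, _ = np.histogram(cell_ang, bins=8, range=(0, 360),
                                   weights=cell_mag)
            descriptor[cy, cx] = hist
    vec = descriptor.ravel()                     # 4x4x8 = 128 dimensions
    return vec / (np.linalg.norm(vec) + 1e-7)    # contrast/gain invariance
```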
Feature matching
For each feature in image A, find its corresponding feature in image B.
Feature descriptor 1: fd1 = [45, 78, 92, 167]
Feature descriptor 2: fd2 = [67, 89, 119, 23]
We calculate either the Manhattan or the Euclidean distance to quantify the similarity between feature 1 and feature 2:
Manhattan distance of fd1 and fd2 = |67 - 45| + |89 - 78| + |119 - 92| + |23 - 167| = 22 + 11 + 27 + 144 = 204
Euclidean distance of fd1 and fd2 = √((67 - 45)^2 + (89 - 78)^2 + (119 - 92)^2 + (23 - 167)^2) = √22070 ≈ 148.56
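A quick NumPy check of the two distances above:

```python
import numpy as np

fd1 = np.array([45, 78, 92, 167], dtype=np.float64)
fd2 = np.array([67, 89, 119, 23], dtype=np.float64)

manhattan = np.abs(fd1 - fd2).sum()      # L1 distance -> 204.0
euclidean = np.linalg.norm(fd1 - fd2)    # L2 distance -> ~148.56
print(manhattan, euclidean)
```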
Code walkthrough
How SIFT is used in a fingerprint verification system
About the dataset
Dataset name: Sokoto Coventry Fingerprint Dataset (SOCOFing)
Dataset link: https://www.kaggle.com/datasets/ruizgara/socofing?resource=download
It is a biometric fingerprint database for academic research purposes.
It is made up of 6,000 fingerprint images: 10 fingerprint images for each of 600 people.
In addition to the originals, the dataset includes altered fingerprint images for testing purposes.
Given two fingerprint images, how similar are they?
Each image is passed through SIFT: from image 1 we get feature points (kp1) and their corresponding feature vectors (des1), and from image 2 we get feature points (kp2) and their corresponding feature vectors (des2).
The two descriptor sets are then fed to a BF (brute-force) matcher or a FLANN matcher, and a similarity score is calculated based on the feature matches (see the sketch below).
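A minimal sketch of this scoring step with OpenCV. Summing the distances of the k best brute-force matches is one illustrative way to turn matches into a score (lower = more similar), not the only possibility.

```python
import cv2

def match_score(path1, path2, k=20):
    """Distance-based similarity score between two fingerprint images."""
    img1 = cv2.imread(path1, cv2.IMREAD_GRAYSCALE)
    img2 = cv2.imread(path2, cv2.IMREAD_GRAYSCALE)
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(img1, None)   # keypoints + descriptors
    kp2, des2 = sift.detectAndCompute(img2, None)
    # Brute-force matcher with L2 norm, the appropriate norm for SIFT
    bf = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True)
    matches = sorted(bf.match(des1, des2), key=lambda m: m.distance)
    # Lower total distance over the best k matches = more similar prints
    return sum(m.distance for m in matches[:k])
```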
Fingerprint verification system
Problem: to check whether the fingerprint of the target person is present in the database or not.
Fig : A target fingerprint is compared against every fingerprint in the database.
CASE: the target is present in the database
Fig : Matching the target against the database yields example scores of 290.78, 350.80, 27.82, and 320.71; the score distribution shows one clear low outlier (27.82), the matching fingerprint.
CASE: the target is not present in the database
Fig : Matching the target against the database yields example scores of 290.78, 350.80, 280.10, and 320.71; the score distribution has no low outlier, so no match is found.
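A minimal sketch of the final decision, assuming the match_score() helper from the previous sketch (lower score = more similar). The threshold of 100 is an illustrative value sitting between the example match score (27.82) and the non-match scores (280+).

```python
def verify(target_path, database_paths, threshold=100.0):
    """Return (matched_path, score) if the target is in the database, else (None, best_score)."""
    scores = {p: match_score(target_path, p) for p in database_paths}
    best_path = min(scores, key=scores.get)
    if scores[best_path] < threshold:
        return best_path, scores[best_path]   # target found in the database
    return None, scores[best_path]            # no sufficiently close print
```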
Thank you