https://www.analyticsvidhya.com/blog/2019/10/detailed-guide-powerful-sift-technique-image-matching-python/
A Detailed Guide to the Powerful SIFT Technique for Image
Matching (with Python code)
Machines are super flexible, and we can teach them to identify images at an almost human level.
This is one of the most exciting aspects of working in computer vision!
Table of Contents
1. Introduction to SIFT
2. Constructing a Scale Space
1. Gaussian Blur
2. Difference of Gaussian
3. Keypoint Localization
1. Local Maxima/Minima
2. Keypoint Selection
4. Orientation Assignment
1. Calculate Magnitude & Orientation
2. Create Histogram of Magnitude & Orientation
5. Keypoint Descriptor
6. Feature Matching
Introduction to SIFT
SIFT, or Scale-Invariant Feature Transform, is a feature
detection algorithm in computer vision. It helps locate the local features of an image, commonly known as keypoints.
These keypoints are scale- and rotation-invariant and can be used for various computer
vision applications, like image matching, object detection, scene detection, etc.
The major advantage of SIFT features over edge features or HOG features is that
they are not affected by the size or orientation of the image.
For example, here is another image of the Eiffel Tower along with its smaller version.
The keypoints of the object in the first image are matched with the keypoints found in the
second image. The same goes for two images when the object in the other image is
slightly rotated. Amazing, right?
The entire process can be divided into four parts:
Constructing a Scale Space: To make sure that features are scale-independent
Keypoint Localisation: Identifying the suitable features or keypoints
Orientation Assignment: Ensure the keypoints are rotation invariant
Keypoint Descriptor: Assign a unique fingerprint to each keypoint
Finally, we can use these keypoints for feature matching!
Constructing the Scale Space
We need to identify the most distinct features in a given image while ignoring any
noise. Additionally, we need to ensure that the features are not scale-dependent.
So, for every pixel in an image, Gaussian Blur calculates a value based on its
neighboring pixels. Below is an example of an image before and after applying
Gaussian Blur. As you can see, the texture and minor details are removed from the
image, and only the relevant information, like the shape and edges, remains:
Gaussian Blur has successfully removed the noise from the image and highlighted
its important features.
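The blur step above can be sketched in plain NumPy. This is a minimal illustration of what a Gaussian blur does, not the article's code; the helper names and the 3-sigma kernel radius are my own choices:

```python
import numpy as np

def gaussian_kernel(sigma):
    # 1-D Gaussian kernel, truncated at 3 sigma and normalized to sum to 1
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return k / k.sum()

def gaussian_blur(img, sigma):
    # A Gaussian blur is separable: convolve every row, then every column.
    # Edge padding keeps the output the same size as the input.
    k = gaussian_kernel(sigma)
    pad = len(k) // 2
    padded = np.pad(img, pad, mode="edge")
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)
```

Each output pixel is a weighted average of its neighbors, with weights falling off with distance, which is exactly why fine texture disappears while large shapes and edges survive.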
Now, we need to ensure that these features are not scale-dependent.
This means we will search for these features at multiple scales, by creating a 'scale
space'.
Scale space is a collection of images having different scales,
generated from a single image.
To create a new set of images of different scales, we will take the original image and
reduce the scale by half. For each new image, we will create blurred versions as we saw
above.
But how many times do we need to scale the image, and how many subsequent blur images
need to be created for each scaled image? The ideal number of octaves is
four, and for each octave, the number of blur images should be five.
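Putting the two ideas together, a scale space can be sketched as nested loops: one over octaves (each half the resolution of the last) and one over progressively stronger blurs within an octave. This is an illustrative NumPy sketch, not the article's code; the base sigma of 1.6 and the √2 multiplier between blurs are common choices from the SIFT literature, not values stated in the article:

```python
import numpy as np

def gaussian_blur(img, sigma):
    # minimal separable Gaussian blur (kernel truncated at 3 sigma)
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2 * sigma ** 2))
    k /= k.sum()
    padded = np.pad(img, radius, mode="edge")
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)

def build_scale_space(img, n_octaves=4, blurs_per_octave=5, sigma0=1.6):
    # Each octave holds progressively more blurred copies of the image;
    # the next octave starts from a half-resolution version.
    octaves = []
    current = img.astype(float)
    k = 2 ** 0.5  # sigma multiplier between successive blurs (a common choice)
    for _ in range(n_octaves):
        layers = [gaussian_blur(current, sigma0 * k ** i)
                  for i in range(blurs_per_octave)]
        octaves.append(layers)
        current = current[::2, ::2]  # downsample by 2 for the next octave
    return octaves
```

With the article's recommended numbers, a single image yields 4 octaves × 5 blur levels = 20 images in the scale space.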
Difference of Gaussian
Next, we will try to enhance the features using a technique called Difference of
Gaussians or DoG.
Difference of Gaussian is a feature enhancement algorithm that
involves the subtraction of one blurred version of an original
image from another, less blurred version of the original.
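The DoG definition above translates directly into code: blur the image at a sequence of increasing sigmas, then subtract each blurred version from the next, more-blurred one. This is an illustrative NumPy sketch, not the article's code; the sigma sequence is a hypothetical example chosen so that five blur levels (as recommended above) yield four DoG images:

```python
import numpy as np

def gaussian_blur(img, sigma):
    # minimal separable Gaussian blur (kernel truncated at 3 sigma)
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-(x ** 2) / (2 * sigma ** 2))
    k /= k.sum()
    padded = np.pad(img, radius, mode="edge")
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)

def difference_of_gaussians(img, sigmas=(1.6, 2.26, 3.2, 4.53, 6.4)):
    # Blur at each sigma, then subtract each blurred image from the
    # next, more-blurred one: D_i = L(sigma_{i+1}) - L(sigma_i)
    layers = [gaussian_blur(img, s) for s in sigmas]
    return [more - less for less, more in zip(layers[:-1], layers[1:])]
```

The subtraction cancels out the smooth, slowly varying regions that both blurred images share, so what remains are the edges and blobs, which is why DoG acts as a feature enhancer.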