DIP Notes
1. Working of an Image Sensor
The main purpose of an image sensor is to convert incoming light (photons)
into an electrical signal that can be viewed, analyzed, or stored. Image sensors are
solid-state devices and serve as one of the most important components inside
a machine vision camera.
2. Simple Image Model
When an image is generated by a physical process, its values are proportional to
the energy radiated by a physical source. In a simple model of image formation,
light from a source falls on a (coloured) object's surface, and a fraction of the
(coloured) light is reflected towards the eye or camera. Digital colour cameras
have fallen dramatically in price, making them affordable for ubiquitous use in
many applications.
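A common way to write this model is f(x, y) = i(x, y) . r(x, y), where i(x, y) is the
illumination incident on the scene (0 < i(x, y) < infinity) and r(x, y) is the
reflectance of the surface (0 for total absorption, 1 for total reflection); the
recorded intensity f(x, y) is their product.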
3. Unsharp masking filter
An unsharp masking filter, despite its name, is used to sharpen images by
enhancing edges and fine details, often employed in image processing and
photography.
● What it does:
It works by creating a blurred copy of the image, subtracting it from the
original, and then adding the difference (the "mask") back to the original,
effectively boosting the high-frequency components and sharpening the edges.
● Origin and Usage:
This technique originated in darkroom photography but is now widely used in
digital image processing software.
● How it works:
o A blurred version of the original image is created.
o This blurred version (the "mask") is subtracted from the original.
o The resulting difference (the mask) is added back to the original image,
enhancing the edges.
o By manipulating parameters like radius, amount, and threshold, you can
control the level of sharpening and the effects on specific details and
areas.
● Benefits:
o Enhances details: Reveals fine details that might otherwise be obscured.
o Improves contrast: Increases the contrast between colors defining an
edge.
o Versatile: Effective in a variety of images and applications.
o Good for images with fine details: Especially useful for sharpening
images with a lot of fine detail.
● Potential drawbacks:
o Can introduce noise: Sharpening filters can amplify existing noise in the
image.
o Can create halos: Over-sharpening can lead to artifacts, including halos
around edges.
o Not a solution for blurry images: It cannot correct poor focus or motion
blur; it is meant to enhance existing clarity.
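Putting the steps together, a minimal sketch in Python with OpenCV and NumPy
(the file name, sigma, and amount are illustrative values, not fixed choices):

import cv2
import numpy as np

img = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)
blurred = cv2.GaussianBlur(img, (0, 0), 3)      # the "radius" parameter ~ Gaussian sigma
mask = img - blurred                            # the high-frequency "mask"
amount = 1.5                                    # sharpening strength
sharpened = np.clip(img + amount * mask, 0, 255).astype(np.uint8)

A threshold step would simply zero out small values of the mask before adding it
back, so that flat, noisy regions are not amplified.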
4. Importance of Visual perception in DIP
Visual perception, the brain's interpretation of visual information, is crucial in
image processing as it dictates how humans understand and interact with images,
impacting applications like image recognition, quality assessment, and content
understanding.
Here's why visual perception is so important in image processing:
● Human-Centric Applications:
Many image processing applications are designed to be viewed and understood
by humans, so understanding how people perceive images is crucial. For
example, in image compression or enhancement, algorithms should preserve or
even enhance the perceived visual quality.
● Image Understanding and Interpretation:
Visual perception allows humans to recognize objects, interpret scenes, and
understand the meaning conveyed by images. Image processing aims to mimic
or assist this human ability, for instance, in image recognition or object
detection, where algorithms attempt to identify objects and their relationships
in an image.
● Perceptual Quality Assessment:
Visual perception is at the core of how humans evaluate the quality of
images. In areas such as image compression or restoration, the goal is to
minimize or eliminate perceptible artifacts, ensuring that the reproduced
images retain the fidelity of the original.
● Color Perception:
Humans perceive color differently based on factors like brightness, saturation,
and hue, which is crucial in image processing to create realistic and visually
appealing outputs.
5. Undersampled Images
When an image is undersampled, it was captured with too few data points (pixels)
per unit area. Fine details are lost, and visible artifacts such as "aliasing" can
appear, where high-frequency details show up as incorrect low-frequency patterns,
making the image look blurry, pixelated, or distorted, especially when zoomed in.
Essentially, important information about the image is discarded during the
sampling process.
Key points about undersampling:
● Aliasing:
The most common issue with undersampling, where high-frequency details in
the original image are misinterpreted as lower frequencies, creating jagged
edges or moiré patterns.
● Loss of detail:
Since there are not enough pixels to capture fine features, important details in
the image are lost, leading to a blurry appearance.
● Relevance in photography and imaging:
o Camera sensor size: If the camera sensor has large pixels relative to the
lens focal length, it can lead to undersampling, especially when trying to
capture fine details.
o Zooming in: When zooming in on an undersampled image, the pixelation
and aliasing artifacts become more noticeable.
Example scenarios:
● Astrophotography:
Using a telescope with a short focal length combined with a camera with large
pixels can result in undersampled images of stars, making them appear as
blurry blobs instead of sharp points.
● Medical imaging:
If an MRI scan is undersampled, fine details within the tissue might be lost,
impacting diagnosis accuracy.
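A minimal NumPy sketch of aliasing: a 9 Hz sine sampled at only 10 Hz (below its
18 Hz Nyquist rate) is indistinguishable from a 1 Hz sine:

import numpy as np

fs = 10.0                                  # sampling rate, Hz
t = np.arange(0, 2, 1 / fs)
samples = np.sin(2 * np.pi * 9.0 * t)      # 9 Hz signal, undersampled
alias = np.sin(2 * np.pi * 1.0 * t)        # 1 Hz alias
print(np.allclose(samples, -alias))        # True: the high frequency masquerades as a low one

The same effect in two dimensions turns fine image texture into jagged edges and
moiré patterns.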
6. Pixel resolution
The word pixel was coined from "picture element". Pixels are the smallest unit in a
digital display; up to millions of pixels make up an image or video on a device's
screen. Each pixel comprises subpixels that emit red, green and blue (RGB) light
at different intensities. The RGB color components make up the gamut of different
colors that appear on a display or computer monitor.
When referencing the resolution of a display, numbers like 1920 x 1080 refer to
the number of pixels.
How do pixels work, and how are they used?
The number of pixels determines the resolution of a computer monitor or TV
screen, and generally the more pixels, the clearer and sharper the image. The
resolution of the newest 8K full ultra-high-definition TVs on the market is
approximately 33 million pixels -- or 7680 x 4320.
The number of pixels is calculated by multiplying the horizontal and vertical pixel
measurements. For example, HD has 1,920 horizontal pixels and 1,080 vertical
pixels, which totals 2,073,600. It's normally shown as 1920 x 1080 or just as
1080p, where the p stands for progressive scan. 4K video resolution, for example,
has four times as many pixels as full high definition (1080p), and 8K has 16 times
as many.
Other common display resolutions include the following:
● 480p, which is standard definition, is 640 x 480 and is often used for
small mobile devices;
● 720p, which is HD, is 1280 x 720;
● 1440p, which is 2560 x 1440 and considered Quad HD (QHD), is often
used for PC gaming monitors; and
● 4K video resolution, which is ultra HD, is 3840 x 2160 pixels.
7. Image Averaging and its Applications
Image averaging involves calculating the average pixel intensity value across
multiple images taken of the same scene from the same location, effectively
reducing noise and enhancing image quality, especially in low-light conditions.
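For zero-mean noise of standard deviation sigma, the average of N aligned frames
has noise standard deviation sigma / sqrt(N). A minimal sketch (assuming the
frames are already registered):

import numpy as np

def average_frames(frames):
    # frames: list of N aligned exposures of the same static scene (uint8 arrays)
    stack = np.stack([f.astype(np.float32) for f in frames], axis=0)
    return np.clip(stack.mean(axis=0), 0, 255).astype(np.uint8)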
Applications:
● Photography:
o Long Exposures: Image averaging can be used to achieve effects similar
to long exposure times without physically holding the camera still or
needing a tripod.
o Low-Light Conditions: It helps reduce the noise that can be very
prominent in images taken in low light conditions.
● Astronomy:
o Stellar Photography: In astronomy, image averaging is crucial for
capturing faint, low-light targets, such as stars and galaxies.
o Reducing Sensor Noise: Averaging helps mitigate the noise present in
astronomical images caused by sensor limitations.
● Microscopy:
o Enhancing Image Quality: Image averaging is useful to reduce noise and
improve the clarity of microscope images, especially in applications like
cell and tissue imaging.
8. Spatial Resolution
Spatial resolution is the amount of detail in an image, i.e. the size of the smallest
feature that can be distinguished. It's also known as pixel size or cell size.
How is spatial resolution measured?
● Pixel scale: The size of the area on the ground represented by a single pixel
● Number of pixels: The number of pixels used to create an image
How does spatial resolution affect image quality?
● A higher spatial resolution means more pixels, which allows for finer details
to be reproduced
● A lower spatial resolution means fewer pixels, which results in less detail
Examples of spatial resolution
● NASA's Landsat satellites collect panchromatic imagery at 15-meter resolution,
meaning each pixel represents a 15 m by 15 m square on the ground
● High-resolution imagery is used in disaster management, urban
development, and climate studies
Other uses of spatial resolution
● In medical imaging, spatial resolution describes the ability of a system to
depict microstructures
● In remote sensing, spatial resolution describes the ability to identify and
divide two close objects in an image
9. Bit Depth
Bit depth, also known as color depth, defines the number of bits used to
represent the color information of each pixel in a digital image or video. It's a key
factor that influences the precision and quality of colors, shades, and overall visual
clarity. In both video and audio, bit depth directly affects how detailed and
smooth the output will appear or sound. A higher bit depth means richer color
gradations, smoother transitions, and a broader range of brightness levels,
offering a more immersive experience. This concept plays an equally vital role in
audio, impacting the dynamic range and sound clarity.
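As a worked example: 8 bits per channel give 2^8 = 256 intensity levels per
channel, so a 24-bit RGB pixel (8 bits x 3 channels) can represent 2^24 ≈ 16.7
million colors; a 10-bit channel gives 1,024 levels, which is why 10-bit video
shows visibly smoother gradients.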
10. Thresholding in DIP
In image processing, thresholding is a technique used to simplify an image by
converting it into a binary image (black and white) by setting a threshold value,
where pixels above the threshold are considered "foreground" (usually white) and
pixels below are considered "background" (usually black), effectively isolating
objects of interest from the background within an image; it's a basic form of image
segmentation used for tasks like object detection, character recognition, and
feature extraction.
Key points about thresholding:
● Binary image creation:
The primary function of thresholding is to transform a grayscale or color image
into a binary image by assigning each pixel a value of either black or white
based on a set threshold value.
● Image segmentation:
By separating pixels based on intensity, thresholding acts as a basic image
segmentation technique, allowing you to isolate specific objects or regions
within an image.
● Applications:
Thresholding is widely used in various applications like document analysis (text
extraction), medical imaging (tumor detection), industrial inspection (defect
detection), and video surveillance (motion detection).
● Threshold selection:
The effectiveness of thresholding depends heavily on choosing an appropriate
threshold value, which can be set manually or determined automatically using
algorithms like Otsu's method.
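A minimal OpenCV sketch of both options (the file name and the fixed threshold
of 127 are illustrative):

import cv2

gray = cv2.imread("page.png", cv2.IMREAD_GRAYSCALE)
# Fixed threshold: pixels above 127 become white (foreground), the rest black.
_, binary = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
# Otsu's method picks the threshold automatically from the image histogram.
otsu_t, binary_otsu = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)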
11. Contrast Stretching
Contrast stretching enhances an image by increasing the difference between light
and dark areas, making details more visible by "stretching" the range of intensity
values within the image. The result is a more pronounced contrast between the
darkest and lightest pixels. It is a common image enhancement technique used to
improve the visibility of features in an image with low contrast or poor lighting
conditions.
Key points about contrast stretching:
● Increases dynamic range:
By mapping the intensity values of an image to a wider range, contrast
stretching effectively expands the dynamic range of gray levels, making the
image appear more detailed.
● Linear transformation:
Unlike histogram equalization, contrast stretching typically applies a linear
transformation to the pixel values, which means the relative intensity
relationships between pixels are preserved while enhancing overall contrast.
● Application areas:
This technique is particularly useful in medical imaging, satellite imagery, and
microscopy where subtle details might be difficult to discern in low contrast
images.
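The basic min-max stretch maps the input range [r_min, r_max] linearly onto the
full output range, s = (r - r_min) . 255 / (r_max - r_min) for an 8-bit image.
A minimal sketch:

import numpy as np

def stretch(img):
    # Linearly map [r_min, r_max] onto [0, 255]; assumes r_max > r_min.
    r = img.astype(np.float32)
    r_min, r_max = r.min(), r.max()
    return ((r - r_min) * 255.0 / (r_max - r_min)).astype(np.uint8)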
12. Image Transformation Techniques
Power-law transformations, also known as gamma correction, are used to adjust
the brightness of an image. This is a mathematical transformation that raises the
pixel values of an image to a certain power.
Other image brightness transformation techniques:
● Intensity transformation: An algorithm that transforms each input
brightness value to a corresponding output brightness value.
● Grayscale transformation: Modifies the brightness of the pixel regardless of
the position of the pixel.
● Histogram specification: Transforms an image's histogram to match some
other histogram you specify.
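The power-law transform itself is s = c . r^gamma. A minimal sketch on
intensities normalised to [0, 1] (with c = 1):

import numpy as np

def gamma_correct(img, gamma):
    # gamma < 1 brightens shadows and mid-tones; gamma > 1 darkens them.
    r = img.astype(np.float32) / 255.0
    return (255.0 * r ** gamma).astype(np.uint8)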
13. Log transformation in image processing
The primary purpose of log transformation in image processing is to enhance the
visibility of details in low-intensity areas of an image by compressing the dynamic
range of high-intensity pixels, essentially making dark areas appear brighter while
minimizing changes to bright areas, which is particularly useful for images with a
large contrast range like medical scans or astronomical photographs.
Key points about log transformation:
● Compressing high-intensity values:
When applying a log function to pixel values, the higher intensity values are
compressed, while lower intensity values are expanded, effectively stretching
out the dark areas of an image.
● Improving visibility in low-light conditions:
This makes log transformation particularly useful for images with a large
dynamic range where details in low-light regions might be obscured.
● Applications:
Fields like medical imaging, astronomy, and photography often use log
transformation to better visualize details in dark areas of an image.
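The transform is s = c . log(1 + r), with c chosen so the output still spans
[0, 255] for 8-bit input. A minimal sketch:

import numpy as np

def log_transform(img):
    r = img.astype(np.float32)
    c = 255.0 / np.log(1.0 + 255.0)     # rescale so the maximum maps to 255
    return (c * np.log1p(r)).astype(np.uint8)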
14. Fourier Transform in image processing.
The Fourier Transform (FT) is vital in image processing because it transforms
images from the spatial domain to the frequency domain, allowing for the
analysis and manipulation of images based on their frequency components,
which enables tasks like noise reduction, edge detection, and compression.
● What it does:
The FT decomposes an image into its constituent frequencies, representing
it as a sum of sinusoids with varying amplitudes, frequencies, and
phases. This decomposition allows for a frequency-based analysis and
manipulation of the image.
● Why it's important:
o Feature Extraction: The FT helps identify repeating patterns and
structural information within an image in terms of frequencies.
o Image Enhancement: By manipulating the frequency components, images
can be sharpened, blurred, or have noise reduced. Enhancing high
frequencies can sharpen edges, while filtering them can blur the image.
o Image Compression: By identifying the most important frequency
components, image compression can be achieved by discarding or
quantizing less important frequencies.
o Image Analysis: The frequency spectrum reveals information about the
image's structure and content, enabling tasks like motion detection and
object recognition.
o Filtering: The Fourier Transform enables the design of filters in the
frequency domain, which can then be applied to the image in the spatial
domain to enhance or modify certain features.
● In summary:
The Fourier Transform provides a powerful toolset for analyzing and
manipulating images in a frequency-based domain, enabling various image
processing techniques that would otherwise be difficult or impossible.
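A minimal NumPy sketch of moving an image into the frequency domain and
viewing its spectrum (the file name is illustrative):

import numpy as np
import cv2

gray = cv2.imread("input.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)
F = np.fft.fftshift(np.fft.fft2(gray))     # 2-D FFT, zero frequency moved to the centre
magnitude = 20 * np.log1p(np.abs(F))       # log scale makes the spectrum visible
# Zeroing coefficients far from the centre and inverting with np.fft.ifft2
# gives a low-pass (blurred) image; zeroing the centre gives a high-pass one.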
15. Types of Restoration Filters
Restoration filters operate on a noisy or degraded image to estimate the clean,
original image. Restoration may model the blurring process itself or apply the
reverse (inverse) of the blur. The filters used in restoration differ from those used
in enhancement.
There are three main types of restoration filters: the Inverse Filter, the
Pseudo-Inverse Filter, and the Wiener Filter. These are explained below.
1. Inverse Filter: Inverse filtering is the process of recovering the input of a
system from its output. It is the simplest approach to restoring the original image
once the degradation function is known. It can be defined as:
H'(u, v) = 1 / H(u, v)
Let,
F'(u, v) -> Fourier transform of the restored image
G(u, v) -> Fourier transform of the degraded image
H(u, v) -> Estimated or derived or known degradation function
then F'(u, v) = G(u, v)/H(u, v)
where, G(u, v) = F(u, v).H(u, v) + N(u, v)
and so F'(u, v) = F(u, v) + N(u, v)/H(u, v)
Because the noise term N(u, v)/H(u, v) blows up wherever H(u, v) is small,
inverse filtering is not regularly used in its original form.
2. Pseudo-Inverse Filter: The pseudo-inverse filter is a modified, stabilized
version of the inverse filter. Pseudo-inverse filtering gives better results than
inverse filtering, but both are sensitive to noise. It is defined as:
H'(u, v) = 1/H(u, v), H(u, v)!=0
H'(u, v) = 0, otherwise
3. Wiener Filter (Minimum Mean Square Error Filter): The Wiener filter executes
an optimal trade-off between inverse filtering and noise smoothing. It removes the
additive noise and inverts the blurring simultaneously. The Wiener filter is real
and even. It minimizes the overall mean square error
e^2 = E{(f - f')^2}
where, f -> original image
f' -> restored image
E{.} -> expected (mean) value of the argument
The restoration filter is
H'(u, v) = H*(u, v) / (|H(u, v)|^2 + Sn(u, v)/Sf(u, v))
where, H(u, v) -> transform of the degradation function
H*(u, v) -> complex conjugate of H(u, v)
Sn(u, v) -> power spectrum of the noise
Sf(u, v) -> power spectrum of the undegraded original image
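A minimal frequency-domain sketch of Wiener deconvolution, assuming the
degradation H is known and Sn/Sf is approximated by a single constant K (a
common simplification):

import numpy as np

def wiener_deconvolve(g, h, K=0.01):
    # g: degraded image; h: point spread function zero-padded to g's shape;
    # K: scalar estimate of Sn/Sf, assumed constant over all frequencies.
    G = np.fft.fft2(g)
    H = np.fft.fft2(h)
    H_w = np.conj(H) / (np.abs(H) ** 2 + K)    # the Wiener filter H'(u, v)
    return np.real(np.fft.ifft2(H_w * G))

Setting K = 0 reduces this to plain inverse filtering, with all its noise
amplification.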
Drawbacks of Restoration Filters:
● Not very effective when the restored image is judged by the human eye.
● Cannot handle the common case of non-stationary signals and noise.
● Cannot handle a spatially variant (position-dependent) point spread function.
16. Inverse Filtering, Least Mean Square Filtering (Wiener
Filter), Constrained Least Mean Square Filtering
Inverse, Least Mean Square (LMS), and Constrained Least Mean Square
(CLMS) filtering are methods used to restore or estimate the original
signal from a degraded or corrupted version, often in image or signal
processing. Inverse filtering tries to directly invert the degradation,
while LMS and CLMS algorithms minimize the error between the
estimated and original signals, with CLMS adding constraints to the
estimation process.
Here's a more detailed explanation:
1. Inverse Filtering:
● Concept:
Inverse filtering aims to reverse the effect of a filter or degradation
process that has been applied to a signal or image. It's like "undoing"
the blurring or distortion.
● How it works:
It involves dividing the Fourier transform of the degraded signal by the
Fourier transform of the degradation function (the filter that caused the
degradation).
● Pros:
Simple to implement, computationally efficient.
● Cons:
Can amplify noise, especially when the degradation function is close to
zero, leading to artifacts.
2. Least Mean Square (LMS) Filtering:
● Concept:
LMS filtering, also known as adaptive filtering, is an algorithm that
iteratively adjusts filter coefficients to minimize the mean squared error
between the desired output and the filter output.
● How it works:
It uses a gradient descent approach to update filter weights, gradually
converging to the optimal weights that minimize the error.
● Pros:
Adaptive to changing environments, efficient for real-time processing,
can handle noise and non-stationary signals.
● Cons:
Can be slow to converge, may be sensitive to initial conditions.
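A minimal sketch of the LMS update rule w <- w + mu . e . x for a
finite-impulse-response filter (the tap count and step size are illustrative):

import numpy as np

def lms_filter(x, d, n_taps=8, mu=0.01):
    # x: input signal; d: desired signal; mu: step size (learning rate).
    w = np.zeros(n_taps)
    y = np.zeros(len(x))
    for n in range(n_taps, len(x)):
        window = x[n - n_taps:n][::-1]     # most recent samples first
        y[n] = w @ window                  # current filter output
        e = d[n] - y[n]                    # error against the desired signal
        w += mu * e * window               # gradient-descent weight update
    return y, w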
3. Constrained Least Mean Square (CLMS) Filtering:
● Concept:
CLMS filtering builds upon LMS filtering by adding constraints to the
optimization process. These constraints can be based on prior
knowledge about the desired signal or its characteristics.
● How it works:
It uses the method of Lagrange multipliers or other optimization
techniques to find the optimal filter weights while satisfying the
specified constraints.
● Pros:
More robust to noise and non-stationary signals than LMS, can enforce
desired signal properties.
● Cons:
More computationally complex than LMS, requires knowledge or
estimation of constraints.
In summary:
● Inverse filtering is a simple but potentially unstable method for
restoration.
● LMS filtering provides a more robust and adaptive solution,
especially for real-time applications.
● CLMS filtering offers further improvements in stability and
accuracy by incorporating prior knowledge through constraints.
These techniques are used in various applications, including:
● Image restoration: Removing blur, noise, and other degradations
from images.
● Signal processing: Cleaning up noisy signals, canceling echoes, and
improving speech quality.
● Adaptive filtering: Adjusting filters to changing conditions, like in
adaptive noise cancellation.
17. Blind Image Restoration
Blind image restoration (BIR) aims to recover the original image from a
degraded or blurred version, without knowing the exact nature of the
degradation process. Unlike traditional image restoration, which
assumes knowledge of the degradation model, BIR faces the challenge
of inferring both the degradation and the original image
simultaneously. This makes it a highly complex and ill-posed problem.
Elaboration:
● The Challenge:
BIR is difficult because the degradation process is unknown. This means
the algorithm must learn both what the image looked like originally and
how it was degraded, all from the degraded image itself.
● Methods:
Various techniques are used for BIR, including:
● Diffusion-based models: These models, like DiffBIR and
InstantIR, leverage the generative capabilities of pre-trained
diffusion models to restore images, often in a two-stage
process (degradation removal and information
regeneration).
● Convolutional Neural Networks (CNNs): Traditional
CNN-based methods have been used for blind IR, but they
can sometimes struggle to produce realistic and detailed
results.
● Prompt Learning: Methods like PromptCIR use prompts to
encode degradation information, allowing the network to
dynamically adapt to different types and levels of
degradation.
● Adaptive Methods: Approaches like ABAIR and InstantIR are
designed to adapt to various degradation types and levels,
even those unseen during training.
● Zero-Shot Methods: Methods like Invert2Restore aim to
restore images without requiring any training or specific
knowledge of the degradation model.
● Applications:
BIR is crucial in various domains where accurate image restoration is
essential, including medical imaging, remote sensing, and surveillance.
● Ongoing Research:
Researchers are continuously exploring new methods and architectures
to improve the accuracy, efficiency, and robustness of blind image
restoration techniques.
18. Singular Value Decomposition (SVD)
Singular Value Decomposition (SVD) is a fundamental matrix
decomposition technique in linear algebra. It allows any real or complex
matrix to be broken down into three matrices: a left singular vector
matrix (U), a diagonal matrix of singular values (Σ), and a right singular
vector matrix (V). This decomposition exposes underlying structures
and properties of the original matrix.
Key Concepts:
● Matrices U and V:
These matrices contain orthonormal vectors, meaning they are
orthogonal (dot product is zero) and have a length of 1.
● Matrix Σ:
This is a diagonal matrix where the diagonal elements are the singular
values of the original matrix. The singular values are non-negative real
numbers, often ordered in descending order.
● Decomposition:
The original matrix (A) can be represented as the product of these three
matrices: A = UΣVᵀ.
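A minimal NumPy illustration, including the best rank-k approximation obtained
by keeping only the k largest singular values (the matrix size and k are
illustrative):

import numpy as np

A = np.random.rand(100, 80)
U, s, Vt = np.linalg.svd(A, full_matrices=False)   # A = U @ diag(s) @ Vt
k = 10
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]        # best rank-k approximation of A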
Applications:
● Data Compression:
SVD can be used to reduce the dimensionality of data by retaining only
the most significant singular values and vectors, effectively compressing
the data.
● Noise Filtering:
In applications like image processing, SVD can be used to remove noise
by attenuating the effects of small singular values.
● Low-Rank Approximation:
SVD provides a method for finding the best low-rank approximation to a
matrix, which can be useful in various applications like recommendation
systems.
● Dimensionality Reduction:
SVD is a core technique in Principal Component Analysis (PCA), allowing
for dimensionality reduction of high-dimensional datasets.
● Machine Learning:
SVD is used in various machine learning algorithms, including
recommendation systems, text analysis, and image processing.
In essence, SVD offers a powerful way to analyze and manipulate
matrices, providing valuable insights into their structure and properties
while enabling practical applications in various fields.
Unit 5
Color Transformation, Smoothing and Sharpening, Color Segmentation.
Color transformation, smoothing/sharpening, and color segmentation
are core techniques in digital image processing. Color transformation
manipulates color values within an image, like adjusting brightness or
saturation. Smoothing and sharpening enhance image clarity by either
reducing noise (smoothing) or emphasizing edges and details
(sharpening). Color segmentation divides an image into regions based
on color, which is useful for object recognition and analysis.
Color Transformation:
● Purpose: Alters color values to change the image's appearance or
highlight specific features.
● Methods:
o Brightness Adjustment: Brightens or darkens the entire image.
o Contrast Adjustment: Increases the difference between light and dark areas.
o Saturation Adjustment: Controls the intensity of colors.
o Hue Adjustment: Shifts colors within a specific range.
o Color Space Conversion: Changes the color representation (e.g., from RGB to HSV).
● Example: Adjusting brightness to improve visibility in a low-light
image.
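A minimal OpenCV sketch of two of these transformations (alpha, beta, and the
file name are illustrative):

import cv2

img = cv2.imread("photo.png")                  # OpenCV loads images in BGR order
# Linear point transform: output = alpha * input + beta
# (alpha > 1 raises contrast, beta > 0 raises brightness).
adjusted = cv2.convertScaleAbs(img, alpha=1.2, beta=30)
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)     # color space conversion to HSV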
Smoothing and Sharpening:
● Smoothing:
● Purpose: Reduces noise and blurs out small details, resulting
in a smoother appearance.
● Methods:
● Averaging Filters: Compute the average color value of
a pixel and its neighbors.
● Gaussian Filters: Apply a weighted average based on a
Gaussian distribution, giving more weight to closer
pixels.
● Example: Reducing graininess in a noisy image.
● Sharpening:
● Purpose: Enhances edges and details, making the image
appear sharper and crisper.
● Methods:
● Laplacian Filter: Detects edges and emphasizes them.
● Unsharp Masking: Subtracts a blurred copy from the original
and adds the resulting high-frequency (edge) detail back, as
described in topic 3.
● Example: Improving the visibility of fine details in a blurry
image.
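A minimal OpenCV sketch of the smoothing filters named above (kernel sizes are
illustrative):

import cv2

img = cv2.imread("photo.png")
averaged = cv2.blur(img, (5, 5))               # averaging (box) filter
gaussian = cv2.GaussianBlur(img, (5, 5), 0)    # Gaussian filter; sigma derived from size
laplacian = cv2.Laplacian(img, cv2.CV_64F)     # Laplacian edge response used in sharpening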
Color Segmentation:
● Purpose: Separates an image into different regions based on color,
allowing for analysis and object recognition.
● Methods:
o Thresholding: Divides pixels into foreground and background based on a
color threshold.
o Region Growing: Identifies connected regions with similar color properties.
o K-means Clustering: Groups pixels into clusters based on their color, then
uses the cluster centers to define regions.
o HSV Color Space: Uses the Hue, Saturation, and Value components to
distinguish between colors.
● Example: Identifying a specific colored object in an image.
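A minimal sketch of HSV-based color segmentation (the hue bounds, chosen here
to cover green, are illustrative; OpenCV hue runs 0-179):

import cv2
import numpy as np

img = cv2.imread("scene.png")
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
lower = np.array([35, 50, 50])
upper = np.array([85, 255, 255])
mask = cv2.inRange(hsv, lower, upper)          # white where the color falls in range
segmented = cv2.bitwise_and(img, img, mask=mask)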
Relationship:
● Simultaneous Approaches:
Some methods attempt to combine smoothing and sharpening, such as
guided filters, to achieve both noise reduction and edge enhancement.
● Opposite Nature:
Smoothing and sharpening have opposite effects, so simultaneous
approaches are often challenging.
● Color Space:
Color space transformations (e.g., from RGB to HSV) can help with
segmentation and color analysis.
Applications:
● Computer Vision: Object recognition, scene understanding, image
analysis.
● Image Processing: Noise reduction, edge enhancement, feature
extraction.
● Medical Imaging: Enhancement of medical images, segmentation
of tissues and organs.
● Remote Sensing: Analysis of satellite images, identification of land
cover types.
Unit 6
Opening and Closing are dual operations used in digital image processing,
typically to clean up a distorted or eroded image. Opening is generally used to
restore or recover the original image to the maximum possible extent. Closing is
generally used to smooth the contours of the distorted image and fuse back narrow
breaks and long thin gulfs; it is also used to get rid of small holes in the image.
The combination of opening and closing is generally used to clean up artifacts in a
segmented image before using the image for digital analysis.
Morphological Algorithms: Boundary Extraction,
Region Filling.
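The two algorithms named in this heading have standard set-based definitions.
Boundary extraction: beta(A) = A - (A ⊖ B), i.e. erode the object A by a
structuring element B and subtract the result from A, leaving only the object's
outline. Region (hole) filling: starting from a seed point X0 inside the hole,
iterate Xk = (Xk-1 ⊕ B) ∩ A^c until Xk = Xk-1; the intersection with the
complement A^c stops the dilation from leaking out of the hole, and the filled
result is Xk ∪ A.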
Morphological operations are a set of techniques used in image processing to analyze and
modify the shape and structure of objects within an image. The morphological operations are
based on the mathematical theory of morphology, which studies the properties of shapes and
patterns.
● Morphological operations are typically applied to binary images, although some
operations can also be applied to grayscale images. These operations are used to
remove noise, separate or extract specific shapes, and perform various other tasks in
image analysis. Here are some of the most common morphological operations:
● Erosion
● Dilation
● Opening
● Closing
● Top-hat Transform
● Skeletonization
● Thinning
Structuring Elements
A structuring element is a matrix used in morphological operations to define the
neighborhood or pattern of pixels that are processed together. The shape and size of
the structuring element determine how the morphological operation interacts with the
pixels in the image.
Below are the common types of structuring elements that are typically used.
Square/Rectangle
This element is a matrix of ones in a square or rectangular pattern. It uniformly
affects all pixels in the neighborhood and is often used for basic operations due to
its simplicity.
Disk or Circle
A disk or circle structuring element is a circular pattern where the center pixel is
surrounded by a neighborhood of pixels that form a circle. It provides isotropic
effects, meaning uniform changes in all directions, and is ideal for processing
circular objects or when a uniform impact in all directions is needed.
Cross
A cross structuring element is shaped like a cross, typically derived from a square by
retaining only the central row and column. It emphasizes changes along the
horizontal and vertical directions and is useful for operations that require
preservation of linear features or lines.
Ellipse
An elliptical structuring element is like the disk but elongated in one direction. It
has anisotropic effects, meaning more influence in one direction, and is useful for
processing elongated structures or features that are stretched in one direction.
Diamond
A diamond-shaped structuring element is often a variation of a cross, but with
diagonal influence. It combines the properties of a cross and a square, affecting
pixels diagonally as well, and is useful for operations that require a balanced
influence between diagonal and orthogonal directions.
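In OpenCV, the first three shapes can be created directly; a diamond has no
built-in constant but can be written by hand as a 0/1 NumPy array (sizes are
illustrative):

import cv2

rect = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))        # square/rectangle
ellipse = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (7, 5))  # disk/ellipse
cross = cv2.getStructuringElement(cv2.MORPH_CROSS, (5, 5))      # cross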
Different Morphological Operations
1. Erosion
Erosion is a fundamental morphological operation that reduces the size of objects in a binary
image. It works by removing pixels from the boundaries of objects.
● Purpose: To remove small noise, detach connected objects, and erode boundaries.
● How it Works: The structuring element slides over the image, and for each position, if all
the pixels under the structuring element match the foreground, the pixel in the output
image is set to the foreground. Otherwise, it is set to the background.
2. Dilation
Dilation is the opposite of erosion and is used to increase the size of objects in an image.
● Purpose: To join adjacent objects, fill small holes, and enhance features.
● How it Works: The structuring element slides over the image, and for each position, if
any pixel under the structuring element matches the foreground, the pixel in the output
image is set to the foreground.
3. Opening
Opening is a compound operation that involves erosion followed by dilation.
● Purpose: To remove small objects or noise from the image while preserving the shape
and size of larger objects.
● How it Works: First, the image undergoes erosion, which removes small objects and
noise. Then, dilation is applied to restore the size of the remaining objects to their
original dimensions.
4. Closing
Closing is another compound operation that consists of dilation followed by erosion.
● Purpose: To fill small holes and gaps in objects while preserving their overall shape.
● How it Works: First, dilation is applied to the image, filling small holes and gaps. Then,
erosion is applied to restore the original size of the objects.
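A minimal OpenCV sketch of these four operations on a binary image (the file
name and kernel size are illustrative):

import cv2

binary = cv2.imread("mask.png", cv2.IMREAD_GRAYSCALE)
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
eroded = cv2.erode(binary, kernel)                          # shrinks foreground regions
dilated = cv2.dilate(binary, kernel)                        # grows foreground regions
opened = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)   # erosion then dilation
closed = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, kernel)  # dilation then erosion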
5. Hit-or-Miss Transform
The hit-or-miss transform is used to find specific patterns or shapes in a binary image.
● Purpose: To detect specific configurations or shapes in the image.
● How it Works: It uses a pair of structuring elements: one for the foreground and one for
the background. The operation looks for places where the foreground structuring
element matches the foreground in the image, and the background structuring element
matches the background in the image.
6. Morphological Gradient
The morphological gradient is the difference between the dilation and erosion of an image.
● Purpose: To highlight the boundaries or edges of objects in the image.
● How it Works: By subtracting the eroded image from the dilated image, the boundaries
of the objects are emphasized.
7. Top-Hat Transform
The top-hat transform is used to extract small elements and details from an image.
● Purpose: To enhance bright objects on a dark background or vice versa.
● How it Works: There are two types: white top-hat (original image minus the result of
opening) and black top-hat (result of closing minus the original image).
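Both the gradient and the top-hat variants are one-liners in OpenCV (the file
name and kernel size are illustrative, as in the sketch above):

import cv2

binary = cv2.imread("mask.png", cv2.IMREAD_GRAYSCALE)
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
gradient = cv2.morphologyEx(binary, cv2.MORPH_GRADIENT, kernel)  # dilation minus erosion
tophat = cv2.morphologyEx(binary, cv2.MORPH_TOPHAT, kernel)      # image minus opening
blackhat = cv2.morphologyEx(binary, cv2.MORPH_BLACKHAT, kernel)  # closing minus image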
8. Skeletonization
Skeletonization reduces objects in a binary image to their skeletal form.
● Purpose: To simplify objects to their essential structure while preserving their
connectivity.
● How it Works: It iteratively erodes the image until the objects are reduced to a minimal,
skeletal form.
9. Pruning
Pruning removes small spurs or extraneous branches from the skeleton of an image.
● Purpose: To clean up the skeleton by removing unwanted artifacts.
● How it Works: It identifies and removes small, irrelevant branches from the skeletonized
image.