Quick Reference Guide
On
Digital Image Processing
Regional Remote Sensing Centre-North
National Remote Sensing Centre
Indian Space Research Organisation
Index
1. Basics of Digital Image Processing
   Jayant Singhal
2. Digital Image Processing Techniques
   Neetu
3. Image Enhancement
   Akash Goyal
4. Image Classification
   Vinod Kumar Sharma
5. Automatic Information Extraction
   Khushboo Mirza
6. SAR Image Processing
   Abhinav Kumar Shukla
7. Hyperspectral Image Analysis
   Prabhjot Kaur
Chapter 1
Basics of Digital Image Processing
1.1 Overview of Remote Sensing
Remote sensing is the science of measuring, detecting, and monitoring the physical properties of a target object or area without coming into direct contact with it, usually by measuring the reflected or emitted radiation at a distance. Remote sensing can be classified as active or passive (Figure 1.1). Active remote sensing transmits its own electromagnetic radiation and measures the portion reflected back from the target, while passive remote sensing only measures the radiation naturally reflected or emitted by the target.
Figure 1.1: a. Passive remote sensing b. Active remote sensing
Based on the number of bands available in the sensor, the data may be classified as multispectral or hyperspectral. Multispectral remote sensing gathers data from a few discrete spectral bands that are broad and cover a wide range of wavelengths or frequencies. Hyperspectral remote sensing, on the other hand, collects data from hundreds or even thousands of narrow, adjacent spectral bands.
The quality and nature of remote sensing data can be assessed on the basis of four types of resolution:
1. Radiometric Resolution: Radiometric resolution refers to the sensitivity of the sensor towards incoming electromagnetic radiation. It is usually expressed as the number of bits into which the recorded energy is quantized.
2. Spatial Resolution: Spatial resolution refers to the size of the smallest object on the ground that can be resolved by the sensor. It is usually represented by a single value equal to the length of the square on the ground that a pixel represents.
3. Spectral Resolution: Spectral resolution is the number and the size of
bands in the electromagnetic spectrum that a remote sensing system is able
to capture.
4. Temporal Resolution: Temporal resolution refers to how often the
satellite/sensor acquires the image of a particular target on the ground.
1.2 Digital Image
With the invention of modern computers, researchers quickly realised the benefits of storing data in a digitized format for processing. Digital means any representation of a signal, data or information in quantized form, i.e. in terms of digits. A digital image is a pictorial representation of an object or a scene by a group of discrete cells (pixels) organised in a definite grid, each holding a quantized value of the average intensity of the part of the target object or scene it covers.
1.2.1 Pixel
The individual cells that make up a digital image are called picture elements, or pixels for short (Figure 1.2). A pixel is the smallest unit of a digital image or graphic that can be displayed on a digital display device.
Figure 1.2: Image grid
1.2.2 Grid
The regular grid in which pixels combine to form the digital image is called
the image grid (Figure 1.3). In the case of remote sensing images or other
forms of geospatial imagery, these image grids have location information
appended with them, hence each pixel represents some specific region on the
Earth that it is tied to.
Figure 1.3: Remote sensing image and its respective pixel values in the image
grid
Each pixel in the image grid is represented by a number. These digital numbers
are almost always stored in binary format. A higher pixel value represents a higher intensity at that pixel in the image, and a lower pixel value represents a lower intensity in the image grid.
1.3 Digital Image Processing
The use of a digital computer to process digital images, specifically remote
sensing images in our context, through an algorithm or series of algorithms to
achieve a desired outcome is known as digital image processing (Figure 1.4).
Figure 1.4: Process flow of capturing data to digital image processing to
derivation of meaningful outcomes
As the name suggests, these algorithms are specifically designed to work on
digital images. Remote sensing images are captured from different platforms (satellite, aerial, etc.) by various sensors (multispectral sensors, digital cameras, Synthetic Aperture Radars (SAR), etc.) and stored in a digital format.
These image datasets are processed using a digital computer by running
various algorithms to derive meaningful information from them.
1.4 Remote Sensing Data formats
Remote sensing image datasets usually consist of more than one band (multi-band images). In such datasets, the location of each pixel in the image grid is defined by its row number, column number and band number. The most commonly used formats are discussed in this section and summarised in Table 1.
Table 1: Commonly used remote sensing data formats

1. Flat Binary: Data is stored in binary files; can be arranged as BSQ, BIL or BIP; metadata is stored in a separate file.
2. GeoTIFF: An extension of the regular TIFF format; uses a small set of reserved TIFF tags to store georeferencing information.
3. HDF: Used to store images, tables, text and data arrays; self-describing in nature.
4. NetCDF: An interface for array-oriented data access; mostly used for atmospheric models, marine geophysics data, etc.
Flat binary multi-band formats
Flat binary is the most basic form of storing raster data. It is usually accompanied by a metadata file (ASCII or XML). Multi-band image datasets are arranged in one of the following three layouts (Figure 1.5):
1. BSQ format (Band Sequential)
In BSQ format, the complete image for each band is stored separately, one band after the other.
2. BIL format (Band interleaved by line)
In BIL format, data is stored line by line; within each line, the values are arranged band by band, and this pattern repeats for every line.
3. BIP format (Band interleaved by pixel)
In BIP format, the values of all bands for each pixel are stored together, pixel by pixel along each line.
Figure 1.5: Illustration of multi band raster image with nine pixels and three
bands arranged in the three different types of formats.
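As an illustration of how these layouts differ, the following sketch computes the position of a single sample within a flat binary file for each arrangement. The dimensions, file name and single-byte data type are hypothetical and only serve to show the indexing pattern.

import numpy as np

# Hypothetical image dimensions (not from any real product)
lines, samples, bands = 512, 512, 4        # rows, columns, spectral bands
target_line, target_sample, target_band = 100, 200, 2   # the sample to locate (0-based)

# Offset (in samples from the start of the file) of that value in each layout
bsq_offset = target_band * lines * samples + target_line * samples + target_sample
bil_offset = target_line * bands * samples + target_band * samples + target_sample
bip_offset = target_line * samples * bands + target_sample * bands + target_band

# Reading a whole headerless 8-bit BSQ file into a (bands, lines, samples) array:
# data = np.fromfile("image.bsq", dtype=np.uint8).reshape(bands, lines, samples)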
GeoTIFF format
TIFF stands for Tag Image File Format, and it is a very popular format for storing raster image data. It is a lossless format that supports high bit depths, which means that it can store high-quality images without any information being lost. Many users therefore started embedding georeferencing information with satellite imagery stored in TIFF format, which later evolved into the GeoTIFF format (Figure 1.6).
GeoTIFF format fully complies with the TIFF 6.0 specifications, and its
extensions. It uses a small set of reserved TIFF tags to store a broad range of
georeferencing information, catering to geographic as well as projected
coordinate system’s needs. Numerical codes are used in GeoTIFF format to
describe projection types, coordinate systems, datums, ellipsoids, etc. The
projection, datums and ellipsoid codes are derived from the EPSG (European
Petroleum Survey Group) list compiled by the Petrotechnical Open Software
Corporation (POSC). Images in GeoTIFF format can be easily viewed using remote sensing and GIS software like QGIS, ArcGIS, ERDAS IMAGINE, etc.
Figure 1.6: Satellite image of Delhi airport in GeoTIFF format being displayed
in QGIS software
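A minimal sketch of reading such a file programmatically is given below, assuming the open-source rasterio library is installed; the file name is hypothetical. The georeferencing stored in the reserved TIFF tags is exposed directly as the coordinate reference system and the affine transform.

import rasterio

# Open a GeoTIFF (hypothetical file name) and inspect its georeferencing
with rasterio.open("cartosat_delhi.tif") as src:
    print(src.count, src.width, src.height)   # number of bands, columns, rows
    print(src.crs)         # coordinate reference system read from the GeoTIFF tags
    print(src.transform)   # affine transform mapping pixel (row, col) to map coordinates
    band1 = src.read(1)    # first band as a NumPy array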
HDF format
The National Centre for Supercomputing Applications (NCSA) designed the
Hierarchical Data Format (HDF) as a data file format to aid users in storing
and manipulating scientific data on different operating systems and machines.
HDF is capable of supporting various data types, such as scientific data arrays,
tables, and text annotations, as well as different types of raster images and their
associated color palettes. There are two main versions of HDF: HDF (version 4 and earlier) and HDF5.
HDF offers a range of features, including the ability for programs to retrieve
data information from the data file instead of an external source. It also
standardizes the format and descriptions of frequently used data sets, such as
scientific data and raster images. Additionally, HDF is platform-independent,
meaning it can be utilized on a variety of computers, regardless of the
operating system. Both the HDF development team and users can add new data
models to HDF.
NetCDF format
NetCDF (Network Common Data Form) comprises machine-independent data
formats and software libraries that facilitate the creation, access, and sharing of
array-oriented scientific data. It is a recognized community standard for
sharing scientific data. The netCDF programming interfaces for C, C++, Java,
and Fortran are supported and maintained by the Unidata Program Center,
while interfaces for Python, IDL, MATLAB, R, Ruby, and Perl are also
available. Data in netCDF format is self-describing and contains information
about the data it includes. It is portable and accessible to computers that store
integers, characters, and floating-point numbers differently. NetCDF interfaces
enable efficient access to small subsets of large datasets in various formats,
even from remote servers. Additionally, data can be appended to a correctly
structured netCDF file without copying the dataset or redefining its structure.
One writer and multiple readers can access the same netCDF file simultaneously. Open-source software like Panoply can be used to read NetCDF files (Figure 1.7).
Figure 1.7: Rainfall data (GEFS) over India as on 26/10/2015 visualised using
Panoply
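A minimal sketch of reading such a file with the netCDF4 Python interface follows; the file name and the variable name are hypothetical and depend entirely on the dataset being used.

from netCDF4 import Dataset

# Open a NetCDF file (hypothetical name) and list its self-describing contents
ds = Dataset("rainfall_gefs.nc", "r")
print(ds.variables.keys())           # variables stored in the file
rain = ds.variables["precip"]        # hypothetical variable name
print(rain.dimensions, rain.shape)   # dimension names and sizes
data = rain[:]                       # read the array (a masked array honouring fill values)
ds.close()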
1.5 Digital image display
Several major technologies are currently in use for displaying digital images (Figure 1.8):
1. Cathode Ray Tubes (CRTs)
2. Liquid Crystal Displays (LCDs)
3. Light Emitting Diodes (LEDs)
4. Organic Light Emitting Diode (OLED) displays
5. Plasma displays
Figure 1.8: The 3 major types of digital image display technologies
a. Cathode Ray Tube b. Liquid Crystal Displays c. Light Emitting Diodes
Digital image displays consist of a grid of individual display units, each capable of producing colours independently. Each display unit is then assigned to produce a specific colour based on the portion of the digital image chosen to be displayed and its zoom level.
1.6 Human Vision
The colours produced by a digital image display are observed by the human eye (Figure 1.9). The retina in the human eye has two types of photoreceptor cells:
1. Rod cells: Rod cells work better in dim light and are responsible for scotopic vision.
2. Cone cells: Cone cells work better in bright light and are responsible for photopic vision.
Cone cells are mainly responsible for perception of colour. There are three
types of cone cells present in the retina and each one is sensitive to different
parts of the electromagnetic spectrum:
1. L(Rho): sensitivity peaks between 564–580 nm.
2. M(Gamma): sensitivity peaks between 534–545 nm.
3. S(Beta): sensitivity peaks between 420–440 nm.
Figure 1.9: Sensitivity of different cone cells to different parts of the EM
spectrum (Stockman and Sharpe, 2000).
Our perception of colours is based on the complex response and sensitivity of these three types of cells. The range of colour perceptions has been standardised by the CIE in what is called the chromaticity diagram (Figure 1.10). Since the chromaticity diagram spans the entire range of colours perceivable by the human eye, digital image displays try to cover the entire span of the diagram but are limited in practice.
Figure 1.10: CIE 1931 Chromaticity Diagram
1.7 Look up table
In an image being displayed, each individual display unit is linked with a look up table in the graphics adapter for colour generation. The table has three channels corresponding to the three primary colours used for colour formation. The value for each channel is scaled using the look up table to generate the colour required by the pixel (Figure 1.11).
Figure 1.11: RGB look up table
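The scaling performed by a look up table can be sketched in a few lines of NumPy: an 8-bit LUT is simply a 256-entry array indexed by the stored pixel value, and the same lookup is applied to each of the three channels. The gamma value and the placeholder image below are only illustrative.

import numpy as np

# Build an example 8-bit LUT that applies a gamma adjustment (gamma chosen arbitrarily)
gamma = 0.5
lut = (255.0 * (np.arange(256) / 255.0) ** gamma).astype(np.uint8)

# 'image' is an 8-bit single-channel array; indexing with it applies the LUT per pixel
image = np.random.randint(0, 256, (4, 4), dtype=np.uint8)   # placeholder image
displayed = lut[image]   # repeated per channel for R, G and B in a colour display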
1.8 Colour space
Perception of colour in our eyes comes from three different types of cells, hence the colour space defined by them is three dimensional in nature. It can be visualised as a cube with black at the origin (all three cell types giving zero response) and the three axes representing the responses of the three types of cells. The corner opposite to black is white, where all the cell types produce their maximum response. This cube has also been standardised by the CIE (Figure 1.12).
Figure 1.12: CIE RGB colour cube
1.9 Access to remote sensing datasets
Remote sensing datasets are provided by different space agencies around the
world through their own portals. Various remote sensing datasets and value-
added products are also available through different websites that aggregate these
products.
The most common ones in use are as follows:
Indian Portals:
1. Bhoonidhi portal
The Bhoonidhi portal is the primary data provider for Indian remote sensing satellites (https://bhoonidhi.nrsc.gov.in/).
2. MOSDAC (Meteorological & Oceanographic Satellite Data Archival
Centre)
MOSDAC is the satellite data repository for ISRO's satellite missions dealing with meteorology, oceanography and tropical water cycles (https://www.mosdac.gov.in/).
Global Portals:
1. USGS Earthexplorer
USGS Earthexplorer is the primary data provider for the remote
sensing missions conducted by NASA
(https://earthexplorer.usgs.gov/).
2. Copernicus Open Access Hub
Copernicus Open Access Hub is the primary data provider for
remote sensing missions conducted by ESA
(https://scihub.copernicus.eu/).
3. AppEEARS
The Application for Extracting and Exploring Analysis Ready
Samples (AppEEARS) is an American service that provides a
straightforward and effective approach to obtaining and
converting geospatial data from several federal data archives
(https://appeears.earthdatacloud.nasa.gov/).
4. Google Earth Engine
With planetary-scale analysis capabilities, Google Earth Engine
merges a multi-petabyte catalog of satellite imagery and
geospatial datasets (https://code.earthengine.google.com/).
Remote sensing satellite imagery from these websites comes in different file formats, with separate files or layers corresponding to the different bands of a particular sensor on a particular satellite. The images can be easily viewed in remote sensing and GIS software. Examples of different land cover features as captured by a remote sensing satellite are shown in Figure 1.13. Sometimes a single pass of the satellite cannot cover the entire target region; in such cases multiple images are mosaicked to cover the entire area. Multiple bands can also be stacked together, and three bands can be scaled to generate an RGB colour composite of your choice (Figure 1.14).
Figure 1.13: Cartosat-3 Satellite image of a. Agricultural fields b. Forests c.
Built up areas d. River.
Figure 1.14: Cartosat-3 Satellite data of a part of Delhi. (a) Band 1, (b) Band 2,
(c) Band 3, (d) Band 4, (e) True color composite and (f) False color composite
Summary
Concepts of remote sensing data and digital image processing are explained in this chapter. Different remote sensing satellite data sources and their formats are discussed in detail with examples. Band combinations of Indian and foreign satellite datasets were showcased by applying basic digital image processing. Overall, this chapter provides an introduction to digital image processing.
Chapter 2
Digital Image Processing Techniques
2.1 Introduction
Digital image processing is the class of methods that deal with manipulating digital images through the use of various algorithms. It is an essential pre-processing step in many applications, specifically for remote sensing images in our context (Jain, 1989; Kenneth, 1996; Rafael, 2001). Image processing techniques are used to manipulate, analyse and enhance the information in digital images. Image processing has become an essential field due to the availability of vast amounts of digital data and images, creating the need for efficient and effective processing techniques for analysing digital images (Wayne, 1985).
Chapter 1 covers an overview of remote sensing and the basics of acquired images, including the commonly used data formats, image display technologies, colour images, and various sources for downloading imagery, along with the applications of digital images in remote sensing.
This chapter highlights the most commonly used processing techniques applied after image acquisition. Multi-temporal remote sensing images can provide a wealth of information, depending on the application, which is essential for interpretation and timely management. Image processing techniques have vast application areas in remote sensing, including agriculture, forestry, hydrology, urban studies, disaster management, etc. Acquired remote sensing images need to be pre-processed through steps such as band selection, radiometric correction, image rectification and geometric correction, image enhancement, image filtering and image thresholding. A flow chart of the major steps in digital image processing for remote sensing is shown in Figure 2.1.
The detailed description of various image enhancement techniques is provided
in Chapter 3. The various steps of pre-processing of microwave data in remote
sensing are covered in Chapter 6. After pre-processing of the images, the
image may be interpreted to extract the relevant information as per the requirement. Features can be extracted based on spectral signatures or backscatter values in the case of optical and microwave data, respectively. After the
segmentation and feature extraction, the images can be classified into the
relevant information using various methods. The several methods for image
classification including unsupervised and supervised classification are briefed
in Chapter 4.
In recent years, satellite images have generally been made available with radiometric correction, atmospheric correction and orthorectification already applied. In this chapter, the basics of image rectification and geometric correction, image filtering, image thresholding, image segmentation, feature extraction, and resampling techniques are discussed briefly as follows:
Figure 2.1: Flow chart of the major steps involved in remote sensing data processing
2.2 Image Rectification and Geometric Correction
Image rectification transforms an image to align with a reference coordinate
system, such as a map projection. This is typically done by identifying and
matching control points between the image and the reference system and then
applying a geometric transformation to the image. The objective is to ensure that the two images share the same image coordinate system and that corresponding points lie on the same horizontal line. Image rectification is essential for improving the accuracy of feature extraction, tracking algorithms, and object matching.
Apart from image rectification, geometric correction is the process of
removing geometric distortions from an image or map. This is done to correct
for distortions caused by the camera angle, lens distortion, or other factors. The
goal of geometric correction is to represent the scene's geometry or location
accurately. Geometric correction is essential in remote sensing, aerial
photography, and satellite imagery, where accurate spatial information is
required for analysis and interpretation.
The most common geometric corrections include:
2.2.1 Orthorectification: Orthorectification is an image processing technique
used to remove geometric distortions and project an image onto a planar
surface or a map. The process involves correcting for the effects of terrain
relief, sensor orientation, and atmospheric conditions to produce an accurate
and georeferenced image that can be used for mapping, analysis, and
visualization purposes.
Orthorectification is particularly important in remote sensing applications,
where images are often acquired from a high-altitude platform, such as an
aircraft or satellite, and have a distorted perspective due to the sensor position,
terrain relief, and atmospheric conditions. Orthorectification corrects these
distortions and makes the image usable for quantitative analysis and geospatial
applications.
The orthorectification process involves several steps, including:
2.2.1.1 Sensor model creation: A sensor model describes the geometric
relationship between the sensor and the Earth's surface. This model includes
information about the sensor's position, orientation, and field of view.
2.2.1.2 Terrain correction: A digital elevation model (DEM) is used to correct
the effects of terrain relief on the image (Figure 2.2). The DEM provides
information about the elevation of the terrain, which is used to calculate the
ground position of each pixel in the image.
Figure 2.2: Shuttle Radar Topography Mission (SRTM) digital elevation
dataset, DEM (left) and Slope (right) over the regions of San Francisco, USA
2.2.1.3 Atmospheric correction: Atmospheric effects, such as scattering and
absorption, can cause image distortions. Atmospheric correction algorithms are
used to remove these effects and produce a more accurate image.
2.2.1.4 Image projection: The corrected image is projected onto a planar
surface or a map using a cartographic projection. The projection method
depends on the image's intended use and the area being mapped. Map
projections are used to represent the curved surface of the Earth on a flat map.
Geometric correction techniques convert satellite images and aerial
photographs to a map projection, which is then used for spatial analysis and
mapping (Figure 2.3).
Figure 2.3: Example of Geometric Correction
Source: http://wiki.awf.forst.unigoettingen.de/wiki/index.php/File:Geometric_correction.png
Orthorectification is a complex and computationally intensive process that
requires accurate sensor and terrain data and advanced image processing
techniques. However, the resulting orthorectified image provides a valuable
tool for geospatial analysis, mapping, and visualization.
2.2.2 Georeferencing: Georeferencing is the process of assigning spatial
coordinates to an image or a map. Georeferencing aims to align the image or
map with a geographic coordinate system, allowing it to be accurately located
and integrated with other spatial data.
Georeferencing is critical in many geospatial applications, such as mapping,
remote sensing, and geographic information systems (GIS). The process
involves several steps, including:
2.2.2.1 Selecting a coordinate system: The first step in georeferencing is
selecting a coordinate system that defines the geographic reference system for
the image or map. This system can be based on various global or local
coordinate systems, such as latitude/longitude or Universal Transverse
Mercator (UTM).
2.2.2.2 Identifying control points: Control points are known points on the
image or map that can be identified on both the image and the reference data,
such as ground control points (GCPs) or features readily identifiable in both
the image and the reference data.
2.2.2.3 Assigning spatial coordinates: The spatial coordinates for each control
point are assigned by matching the locations of the control points on the image
to their known locations in the reference data. This process is typically
performed using software that allows the user to select the control points and
assign their coordinates.
2.2.2.4 Transforming the image: Once the spatial coordinates for the control
points have been assigned, the image is transformed to align with the
coordinate system of the reference data. The transformation is based on a
mathematical model that relates the coordinates of the control points in the
image to their coordinates in the reference data.
2.2.2.5 Checking accuracy: The accuracy of the georeferenced image is
checked by comparing the locations of additional control points in the image to
their known locations in the reference data. Any discrepancies are corrected by
adjusting the transformation model or selecting additional control points.
Georeferencing is a critical step in many geospatial applications, and the
accuracy of the georeferencing process is essential for ensuring the accuracy of
subsequent analyses and visualizations.
Geometric corrections are essential in remote sensing, cartography, and GIS
applications. Accurate spatial information is necessary for making informed
decisions in environmental monitoring, land use planning, and resource
management.
2.3 Image Filtering
Filtering is a technique used in image processing to enhance or modify an
image. It involves the use of a mathematical operation known as a filter to
transform an image. Filters can be used for various purposes, such as noise
reduction, image smoothing, edge detection, and feature extraction.
The most commonly used filters in image processing are:
2.3.1 Gaussian filter: A Gaussian filter, also known as a Gaussian smoothing
filter, is a commonly used image processing technique for reducing noise and
smoothing an image. It is based on the Gaussian distribution, a bell-shaped
probability density function.
The Gaussian filter works by convolving an image with a Gaussian kernel. The
Gaussian kernel is a two-dimensional array of numbers that defines the shape
of the filter. The values in the kernel are calculated using the Gaussian
function, which gives more weight to the central pixels and less weight to the
pixels farther away from the center.
The amount of smoothing applied to the image is determined by the standard
deviation of the Gaussian function and the size of the Gaussian kernel. A larger
kernel size and a higher standard deviation result in more smoothing, while a
smaller kernel size and a lower standard deviation result in less smoothing.
The Gaussian filter is widely used in image processing for edge detection,
feature extraction, and image segmentation applications. It is also commonly
used as a pre-processing step for other image-processing techniques.
However, it is essential to note that the Gaussian filter can also blur important
image details, especially if the kernel size and standard deviation are too high.
Therefore, the selection of the appropriate filter parameters should be carefully
considered depending on the specific application and image characteristics.
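A minimal sketch of Gaussian smoothing using SciPy is shown below; the sigma value and the placeholder image are only illustrative and should be tuned to the data as discussed above.

import numpy as np
from scipy.ndimage import gaussian_filter

image = np.random.rand(256, 256)               # placeholder single-band image
smoothed = gaussian_filter(image, sigma=2.0)    # larger sigma gives stronger smoothing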
2.3.2 Median filter: A median filter is a nonlinear image processing technique used to remove noise from an image (Figure 2.4). Unlike linear filters, such as
the Gaussian filter, which perform a weighted average of pixel values in a local
neighbourhood, the median filter replaces each pixel value with the median
value of its neighbouring pixels.
The median filter works by sliding a window over the image, and for each
pixel within the window, it sorts the neighbouring pixel values and replaces the
pixel value with the median value. The size of the window determines the size
of the neighbourhood used to calculate the median value. The median filter is particularly effective at removing impulse noise, such as salt-and-pepper noise, which randomly changes pixel values to the maximum or minimum possible intensity values.
Figure 2.4: True Color Image (left) and after median filtering image (right)
using Landsat 8 OLI February 2020, over the regions of San Francisco, USA
Compared to linear filters, the median filter can better preserve edges and other
sharp details in an image since it does not blur the pixel values in the same way
as a weighted average. However, the median filter can also introduce some
blurring if the window size is too large, and it may be less effective at removing other types of noise, such as Gaussian noise.
The median filter is commonly used in various image-processing applications,
including medical imaging, satellite imagery, and digital photography. It is a
simple and effective method for removing impulse noise, and it can be easily
implemented on digital platforms with low computational requirements.
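A corresponding sketch with SciPy's median filter is given below; the window size, the placeholder image and the synthetic salt-and-pepper noise are only illustrative.

import numpy as np
from scipy.ndimage import median_filter

image = np.full((128, 128), 0.5)      # placeholder image of constant brightness
noise = np.random.rand(128, 128)
image[noise < 0.02] = 0.0             # add "pepper" noise
image[noise > 0.98] = 1.0             # add "salt" noise
denoised = median_filter(image, size=3)   # 3x3 window replaces each pixel with the local median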
2.3.3 Sobel filter: A Sobel filter is an image processing filter that is used for
edge detection in images. It is a spatial filter that calculates the gradient of the
image intensity values in the horizontal and vertical directions separately.
The Sobel filter works by convolving the image with two separate 3x3 kernels,
one for the horizontal gradient and one for the vertical gradient. The kernels are
designed to approximate the first derivative of the image intensity with respect to the spatial coordinates. The filtered image is obtained by combining the
horizontal and vertical gradient images using the square root of the sum of their
squared values.
The Sobel filter is a commonly used edge detection filter due to its simplicity
and effectiveness. It can detect edges in an image with high accuracy and low
computational cost. It is also robust to noise, as the filter has a smoothing
effect due to the averaging of neighbouring pixels in the convolution process.
In addition to edge detection, the Sobel filter is often used for other image
processing tasks such as feature extraction, image segmentation, and object
recognition. It is a fundamental tool in computer vision and is widely used in
various applications such as autonomous driving, robotics, and medical
imaging.
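A short sketch of this gradient computation with SciPy is given below; the placeholder image is illustrative, and the two directional gradients are combined into a magnitude image exactly as described above.

import numpy as np
from scipy.ndimage import sobel

image = np.random.rand(256, 256)    # placeholder single-band image
gx = sobel(image, axis=1)           # horizontal gradient
gy = sobel(image, axis=0)           # vertical gradient
magnitude = np.hypot(gx, gy)        # sqrt(gx**2 + gy**2), the combined edge strength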
2.3.4 Laplacian filter: A Laplacian filter is an image processing filter for edge
detection and image sharpening. It is a second-order derivative filter, which
calculates the second derivative of the image intensity with respect to the spatial coordinates.
The Laplacian filter works by convolving an image with a kernel that
approximates the second derivative of the image intensity. The kernel is a 3x3
or 5x5 matrix, and the filter operation involves subtracting the sum of the
surrounding pixels from the value of the central pixel. This process highlights
the regions in the image where the intensity changes rapidly, which typically
correspond to the edges or boundaries between different objects or regions in
the image.
The Laplacian filter can be used for edge detection by thresholding the output
image to identify regions with high second-derivative values corresponding to strong edges. It can also be used for image sharpening by adding the filtered
image to the original image, which enhances the contrast and detail in the
edges of the image.
However, the Laplacian filter is sensitive to noise in the image and can amplify
the noise if the noise is not removed before applying the filter. Therefore, it is
often used in conjunction with other image processing techniques, such as
smoothing filters or noise reduction methods.
Overall, the Laplacian filter is a powerful tool for detecting edges and
enhancing the detail in an image. It has a wide range of applications in image
processing, computer vision, and machine learning.
2.3.5 Canny filter: The Canny filter, also known as the Canny edge detection
algorithm, is an image processing technique used for edge detection. It was
developed by John F. Canny in 1986 and is considered one of the most
accurate and widely used edge detection filters.
The Canny filter works by first smoothing the image using a Gaussian filter to
remove noise. Then, it calculates the image intensity gradient using a Sobel
filter, which gives the direction and strength of the edge at each pixel. Next, it
applies non-maximum suppression to thin the edges by only keeping the pixels
with the highest gradient value in the direction of the edge. Finally, it applies
hysteresis thresholding to determine the final edge map, keeping edges above a high threshold together with connected edges that are above a lower threshold.
The Canny filter is known for its ability to detect edges accurately while minimizing false positives and false negatives. It is robust to noise and can detect
edges with low contrast and variable lighting conditions. Additionally, the
parameters used in the Canny filter can be adjusted to fine-tune the sensitivity
and specificity of edge detection.
The Canny filter is widely used in various applications such as object
recognition, tracking, and segmentation in computer vision and image
processing. It is a powerful tool for detecting edges in images and plays a
fundamental role in many algorithms and applications.
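A common way to apply the Canny detector is through OpenCV, as sketched below; the two hysteresis thresholds are example values, the placeholder image is random, and the input must be an 8-bit single-channel image.

import cv2
import numpy as np

image = np.random.randint(0, 256, (256, 256), dtype=np.uint8)   # placeholder 8-bit image
edges = cv2.Canny(image, 100, 200)   # low and high hysteresis thresholds (example values)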
In image processing, filters can be applied in the spatial or frequency domain.
Spatial filtering involves applying the filter directly to the image pixels, while
frequency filtering involves converting the image to its Fourier transform and
applying the filter in the frequency domain.
2.4 Image Thresholding
Thresholding is a common technique used in image processing to convert a
grayscale or color image into a binary image. The process involves selecting a
threshold value, which is used to segment the image into two categories:
foreground and background. Pixels with intensity values above the threshold
are classified as foreground, while those below the threshold are classified as
background.
Thresholding is helpful in many applications, such as image segmentation,
edge detection, and object recognition. There are several types of thresholding
techniques, including:
2.4.1 Global thresholding: Global thresholding is a simple technique for image
segmentation that separates an image into foreground and background regions
based on a single threshold value. The threshold value is chosen based on the
image's histogram, which shows the distribution of pixel intensities.
The global thresholding technique works by comparing each pixel in the image
to the threshold value. If the pixel intensity is greater than or equal to the
threshold value, it is assigned to the foreground region; otherwise, it is
assigned to the background region.
Global thresholding can be performed using different methods, such as Otsu's
method, which automatically calculates the optimal threshold value by
maximizing the between-class variance of the image intensity distribution.
Another method is the triangle method, which draws a line from the peak of the histogram to its far end and selects the threshold at the point where the histogram is farthest from that line.
Global thresholding is a simple and efficient technique for image segmentation,
and it works well for images with a bimodal intensity distribution, where there
are apparent intensity differences between foreground and background regions.
However, it may not be suitable for images with complex or uneven intensity
distributions, where multiple threshold values may be required for accurate
segmentation. More advanced segmentation techniques such as clustering,
region growing, or deep learning methods may be necessary in such cases.
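A compact NumPy sketch of Otsu's method is given below: it searches over all candidate thresholds of an 8-bit image for the one that maximizes the between-class variance, as described above. The placeholder image is illustrative only.

import numpy as np

def otsu_threshold(image):
    # Histogram of an 8-bit image, normalised to probabilities
    hist, _ = np.histogram(image, bins=256, range=(0, 256))
    p = hist.astype(float) / hist.sum()
    best_t, best_var = 0, 0.0
    for t in range(1, 256):
        w0, w1 = p[:t].sum(), p[t:].sum()            # class weights
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (np.arange(t) * p[:t]).sum() / w0      # mean of background class
        mu1 = (np.arange(t, 256) * p[t:]).sum() / w1 # mean of foreground class
        between = w0 * w1 * (mu0 - mu1) ** 2         # between-class variance
        if between > best_var:
            best_var, best_t = between, t
    return best_t

image = np.random.randint(0, 256, (128, 128), dtype=np.uint8)   # placeholder image
t = otsu_threshold(image)
binary = image >= t   # foreground mask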
2.4.2 Adaptive thresholding: Adaptive thresholding is an image processing
technique used for image segmentation that allows for more accurate
segmentation of images with non-uniform illumination or varying contrast.
Unlike global thresholding, where a single threshold value is applied to the
entire image, adaptive thresholding calculates the threshold value for each
pixel based on its local neighbourhood.
Adaptive thresholding works by dividing the image into small, overlapping
regions and calculating the threshold value for each region based on its local
statistics, such as the mean or median intensity. Each pixel within the region is
subjected to the threshold value to determine whether it belongs to the
foreground or background region. Adaptive thresholding is a powerful
technique for image segmentation, especially in cases where the illumination or
contrast varies across the image. It is widely used in various applications such
as document image analysis, medical imaging, and computer vision.
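A simple local-mean variant of this idea can be sketched with SciPy: each pixel is compared with the mean of its neighbourhood, minus a small offset. The window size, the offset and the placeholder image are illustrative assumptions.

import numpy as np
from scipy.ndimage import uniform_filter

image = np.random.rand(256, 256)               # placeholder image with varying illumination
local_mean = uniform_filter(image, size=35)    # mean over a 35x35 neighbourhood
binary = image > (local_mean - 0.02)           # offset lowers the local threshold slightly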
2.4.3 Multi-thresholding: Multi-thresholding, also known as multiple thresholding, is an image processing technique used for image segmentation, where an image is divided into multiple segments based on several threshold values. It is a more advanced technique than global thresholding, where only one threshold value is used to segment the image.
Multi-thresholding works by dividing the image into multiple intensity
intervals, where each interval corresponds to a specific segment or region of
the image. A threshold value is selected for each interval to separate it from the
adjacent intervals. The threshold values can be determined using various
methods, such as histogram-based, clustering, or optimization techniques.
One popular method for multi-thresholding is Otsu's method, which selects
threshold values to maximize the between-class variance of the image. Another
method is the k-means clustering algorithm, which groups pixels into clusters
based on their intensity values, and the number of clusters corresponds to the
number of segments desired.
Multi-thresholding is particularly useful in cases where the image has more
than two regions with different intensity levels or when the objects of interest
have different shades or colors. It has numerous applications in various fields,
such as medical imaging, remote sensing, and industrial inspection.
However, multi-thresholding may not be suitable for images with complex or
overlapping regions, where more advanced segmentation techniques such as
region growing or deep learning methods may be necessary.
Thresholding can be performed using mathematical operations, such as logical
operations, histogram analysis, or statistical methods. The choice of the
thresholding technique depends on the image characteristics and the
application requirements.
2.5 Image Segmentation
The process of dividing an image into multiple segments or regions, with each
segment representing a meaningful object or part of an object in the image, is
known as image segmentation (Michael, et al, 2000). The purpose of image
segmentation is to simplify the representation of an image and make it more
meaningful and easier to analyse. It can be performed using various techniques,
including thresholding, clustering, edge detection, region growing, and more
advanced methods such as deep learning. Each of these techniques has its
strengths and weaknesses, and the choice of method depends on the specific
application and image characteristics. Standard techniques used for image
segmentation include thresholding, edge detection, and feature extraction.
2.6 Feature Extraction
Feature extraction is the process of selecting and transforming relevant features
from raw data to improve the performance of a machine-learning model. The
goal is to reduce the dimensionality of the data and to highlight the most
critical aspects of the input data that are relevant to the problem at hand.
Feature extraction can be done manually or automatically. In manual feature
extraction, domain experts select the relevant features based on their
knowledge of the problem domain. In automatic feature extraction, machine
learning algorithms identify the most informative features.
Examples of feature extraction techniques include principal component
analysis (PCA), linear discriminant analysis (LDA), and wavelet transforms.
PCA is a technique for reducing the dimensionality of data by projecting it
onto a lower-dimensional space while preserving the most critical information.
LDA is a technique for finding the linear combinations of features that best
separate different classes in the data. Wavelet transforms are used to decompose
signals into different frequency bands and to extract features representative of
different aspects of the signal.
Feature extraction is a critical step in many image processing applications, and
the choice of feature extraction method depends on the specific task and the
characteristics of the image data. Once features have been extracted, they can
be used for subsequent analysis and processing, such as classification, object
recognition, or detection.
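As a sketch of PCA-based feature extraction on a multi-band image, each pixel can be treated as a feature vector of band values and projected onto the leading principal components. The band count, image size and number of retained components below are placeholder assumptions.

import numpy as np

bands, rows, cols = 6, 128, 128
cube = np.random.rand(bands, rows, cols)            # placeholder multi-band image

X = cube.reshape(bands, -1).T                       # pixels x bands matrix
X = X - X.mean(axis=0)                              # centre each band
cov = np.cov(X, rowvar=False)                       # bands x bands covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)              # eigendecomposition (ascending order)
order = np.argsort(eigvals)[::-1]                   # sort components by explained variance
components = eigvecs[:, order[:3]]                  # keep the top 3 principal components
pcs = (X @ components).T.reshape(3, rows, cols)     # principal-component images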
2.7 Resampling Techniques
Resampling techniques refer to methods used in digital image processing to
change an image's spatial resolution or size. The most common resampling
techniques include the following:
2.7.1 Nearest Neighbor: Nearest Neighbor is a simple algorithm used in many
data analysis fields, including image processing, machine learning, and data
mining. In image processing, the nearest neighbour algorithm is often used for
tasks such as image classification, image registration, and image segmentation.
The nearest neighbour algorithm works by finding the closest data point in a
training dataset to a given input data point. This is done using a distance metric
such as Euclidean distance or Manhattan distance. Once the closest data point
is found, the algorithm assigns the class label of that data point to the input
data point.
In image processing, the nearest neighbour algorithm can be used for tasks
such as image classification, where the algorithm assigns a class label to an
input image based on the closest training image. It can also be used for image
registration, where the algorithm finds the closest match between two images
based on the pixel values. Finally, it can be used for image segmentation,
where the algorithm assigns each pixel in an image to a class based on the
closest training data point.
While the nearest neighbor algorithm is simple and easy to implement, it has
several limitations (Figure 2.5). For example, it can be sensitive to noise in the
input data, and it may not be suitable for high-dimensional datasets with many
features. Additionally, it may only perform well if the training dataset is
representative of the data distribution. As a result, more advanced algorithms
such as decision trees, random forests, and deep learning networks are often
used in image processing applications.
Figure 2.5: Example of Nearest Neighbor Interpolation in Image
Resampling
Image Source: http://wiki.awf.forst.unigoettingen.de/wiki/index.php/File:Interpolation_NN.png
2.7.2 Bilinear Interpolation: Bilinear interpolation is a method of estimating
the value of a function at a point within a rectangular grid based on the values
of the function at the four nearest grid points. Bilinear interpolation is
commonly used in image processing and computer graphics to resize or rescale
images.
When an image is rescaled, the original pixels are moved to new locations,
resulting in gaps between the original pixels. Bilinear interpolation fills in
these gaps by computing the weighted average of the four nearest pixel values.
Specifically, the value at a new pixel location is computed as a weighted
average of the pixel values at the four closest grid points, with the weights
determined by the distance between the new pixel location and each of the four
grid points.
2.7.3 Bicubic Interpolation: Bicubic interpolation involves using a weighted average of a larger number of neighbouring pixels to assign a value to the new pixel in the resized image. This technique is more computationally
intensive than bilinear interpolation but produces smoother results with less
loss of image detail.
Bilinear interpolation is preferred over other interpolation methods, such as
nearest neighbor interpolation and bicubic interpolation, because it provides a
good balance between computation time and image quality. Nearest neighbor
interpolation is the fastest method, but it can result in pixelated images, while
bicubic interpolation provides smoother images but requires more computation
time.
One drawback of bilinear interpolation is that it can result in image artifacts
such as blurring and aliasing if the image is scaled too much or if the
interpolation is performed multiple times. More advanced interpolation
methods, such as Lanczos interpolation or spline interpolation, can be used to
avoid these artifacts, but these methods are more computationally intensive.
Resampling techniques are used in various applications, including image
scaling, image registration, and image mosaicking. Choosing the appropriate
resampling technique depends on the specific application and the desired level
of image quality.
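The three interpolation approaches can be compared with SciPy's zoom function as sketched below; the scale factor and placeholder image are illustrative. Order 0 corresponds to nearest neighbour, order 1 to bilinear, and order 3 to a cubic spline comparable to bicubic resampling.

import numpy as np
from scipy.ndimage import zoom

image = np.random.rand(100, 100)        # placeholder image
nearest = zoom(image, 2.0, order=0)     # nearest neighbour
bilinear = zoom(image, 2.0, order=1)    # bilinear
bicubic = zoom(image, 2.0, order=3)     # cubic spline interpolation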
2.8 Commonly used Vegetation Indices
Vegetation indices are mathematical formulas that use spectral bands from
remote sensing data to estimate vegetation health, vigor, and productivity. To
maximize the sensitivity of the vegetation characteristics while minimizing
factors such as atmospheric and soil reflectance effects, various vegetation
indices are utilized. Some commonly used vegetation indices are listed below:
2.8.1 Normalized Difference Vegetation Index (NDVI): NDVI is a widely
used vegetation index that measures the difference between the reflectance of
near-infrared (NIR) and visible red (VIS) light. NDVI values range from -1 to
+1, with higher values indicating healthier vegetation.
NDVI = (ρ_NIR − ρ_RED) / (ρ_NIR + ρ_RED)
where ρ_NIR and ρ_RED represent the spectral reflectance in the NIR and Red regions, respectively.
NDVI is the most commonly used vegetation index mainly used for assessing
the vegetation cover, vegetation monitoring over time, estimation of crop
acreage, productivity, and plant stress or disease in agricultural applications.
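A minimal NumPy sketch of the NDVI computation follows; the reflectance arrays are placeholders, and a small epsilon guards against division by zero. The same pattern applies to the other band-ratio indices in this section.

import numpy as np

nir = np.random.rand(256, 256)    # placeholder NIR reflectance
red = np.random.rand(256, 256)    # placeholder Red reflectance

ndvi = (nir - red) / (nir + red + 1e-10)   # values fall in the range -1 to +1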
2.8.2 Enhanced Vegetation Index (EVI): EVI is another popular vegetation
index that is more sensitive to changes in vegetation cover than NDVI. EVI
takes into account the blue band and corrects for atmospheric influences. EVI
is used for optimizing the vegetation signal with improved sensitivity in
regions of high biomass.
EVI = G × (ρ_NIR − ρ_RED) / (ρ_NIR + C1 × ρ_RED − C2 × ρ_BLUE + X)
where ρ_NIR and ρ_RED are the spectral reflectance in the NIR and Red regions, ρ_BLUE is the surface reflectance in the blue band, C1 and C2 are the coefficients of the aerosol resistance term, X is the canopy background adjustment factor, and G is a gain factor.
2.8.3 Soil Adjusted Vegetation Index (SAVI): SAVI is similar to NDVI, but it
also considers the amount of soil cover in the area. SAVI can be helpful in
areas with sparse vegetation cover.
SAVI = (1 + C) × (ρ_NIR − ρ_RED) / (ρ_NIR + ρ_RED + C)
The spectral reflectance in the NIR and Red regions is denoted by ρ_NIR and ρ_RED, respectively, with the constant C introduced to reduce the impact of soil brightness. C varies from zero to infinity depending on the canopy density; SAVI is equivalent to NDVI when C = 0.
2.8.4 Leaf Area Index (LAI): LAI measures the amount of leaf material of the vegetation in an area. It is
calculated as the total surface area of leaves per unit of the ground area covered
by vegetation. LAI is typically expressed as a dimensionless ratio or as a unit
of area (e.g., m2 leaf area per m2 ground area). LAI can be measured using
various techniques, including direct measurements of leaf area, indirect
methods such as optical sensors, and remote sensing techniques such as
satellite or airborne imagery.
LAI = leaf area (m2) / ground area (m2)
LAI values can vary widely depending on factors such as vegetation type, plant
density, and environmental conditions. LAI is an essential parameter in many
applications, such as crop growth modeling, carbon cycle studies, and climate
change research. It provides a measure of the amount of photosynthetic surface
area available for plant growth and can be used to estimate plant productivity
and water use efficiency.
2.8.5 Normalized Difference Water Index (NDWI): NDWI is the spectral
index commonly used to estimate the presence and quantity of water in
vegetation or soil. It measures the difference between the reflectance of NIR
and shortwave infrared (SWIR) light.
NDWI = (ρ_NIR − ρ_SWIR) / (ρ_NIR + ρ_SWIR)
where ρ_NIR and ρ_SWIR represent the spectral reflectance in the NIR and SWIR regions, respectively. The NIR band is sensitive to vegetation, while the SWIR band is sensitive to water, so by comparing the two bands NDWI can distinguish between water and vegetation. Its values range between -1 and +1, with higher values representing a higher presence of water.
This Vegetation index is used in many applications in environmental science,
including hydrology, agriculture, and forestry. NDWI may be used to detect
water stress in plants, which can help farmers optimize irrigation practices. It
can also use to monitor water bodies, such as lakes and rivers, and to estimate
water content in crops and soil.
2.8.6 Chlorophyll Index (CI): CI is a spectral index used to estimate the
amount of chlorophyll in plants. Chlorophyll is the green pigment responsible
for photosynthesis, and its concentration in leaves is an essential indicator of
plant health and productivity. It is calculated as the difference between the
reflectance of NIR and red edge (RE).
CI = (ρ_NIR − ρ_RE) / (ρ_NIR + ρ_RE)
The NIR band is sensitive to leaf structure, and the RE band is sensitive to
chlorophyll absorption. Chlorophyll Index can estimate the chlorophyll
concentration in plant leaves by comparing the difference between these two
bands. CI values range from -1 to 1, with higher values indicating a greater
chlorophyll concentration. Chlorophyll Index has many applications in
agriculture, forestry, and environmental science. It can monitor plant health
and productivity, estimate crop yields, and detect plant stress caused by
nutrient deficiencies or water scarcity.
These are just a few examples of the many vegetation indices available for
remote sensing analysis. The choice of index depends on the specific research
question, data availability, and the characteristics of the study area.
Summary
This chapter first provides an overview of multi-temporal remote sensing data processing, from image acquisition through to image classification. Secondly, widely used image processing techniques, including image rectification and geometric correction, image filtering, image thresholding, image segmentation, edge detection, feature extraction, and resampling, are discussed briefly. The details of image acquisition, including satellite data products, formats and widely used data sources, are provided in Chapter 1. The pre-processing steps for microwave data are discussed in Chapter 6. The widely used classification techniques in the remote sensing field are discussed in Chapter 4. This chapter focuses only on image processing techniques in remote sensing.
Chapter 3
Image Enhancement
3.1 Introduction
Image enhancement is the process of accentuating image characteristics such as contrast and brightness by applying various techniques to improve the visual quality of an image. It is
a tool rendering clearer and more informative images for better visual
interpretation and analysis. Remote sensing involves collection of data in the
form of stored images from sensors mounted on aerial vehicles or satellites.
These images supply a wealth of knowledge regarding the types of vegetation,
surface water and land use patterns on Earth's surface. Images captured by
remote sensing sensors are often affected by various types of noise and
distortion that reduce the clarity and usefulness of the data. For more accurate
identification and interpretation of the Earth's features, there is a constant need for enhancement of data quality.
sensing has many practical advantages. For example, enhanced images are
applied in detection and monitoring changes in land use patterns, such as
deforestation or urbanization. They can also be used to identify and map land
cover, which is essential for monitoring agricultural productivity and forest
health (Jain 1989). The enhanced imagery can also be used to identify and map
water bodies, which is important for water resource management and
monitoring changes in hydrological systems. Remote sensing imagery captured by satellites or aircraft is often affected by atmospheric and environmental conditions such as haze, cloud cover and lighting conditions. These conditions can affect the quality and accuracy of the images, making them difficult to interpret. Therefore, image enhancement in remote sensing is necessary to improve image quality and visual appearance. The benefits of enhanced remote sensing imagery are:
3.1.1 Improved Image Quality:
By removing noise, distortions and other artifacts, enhancement techniques drastically improve the quality of the image. This improves the visual appearance of the images and makes them easier to interpret.
3.1.2 Better Feature Extraction:
Image enhancement techniques help to accurately extract features and details regarding land use and land cover from the data, which is essential in various fields such as agriculture, land management, and urban planning.
3.1.3 Increased Accuracy:
Image enhancement in remote sensing helps to increase the accuracy of image
interpretation by providing clearer and more detailed images. This accuracy is
critical in fields such as environmental management, where accurate
measurements are necessary for decision making.
3.2 Techniques of Image Enhancement
The various techniques of image enhancement are given below (Li et al., 2019; Jensen, 2016; Lillesand et al., 2014):
3.2.1 Contrast Stretching
Contrast stretching is a technique used to enhance the contrast of an image. It involves adjusting the brightness and contrast of an image to make it more appealing and sharper, increasing the range of brightness values for efficient display. The ratio of the maximum intensity to the minimum intensity of an image defines its contrast ratio (CR), which is valuable for comprehending the resolution and edges of an image: the greater the ratio, the easier the interpretation. The contrast difference (CD) is the difference between the maximum grey level (Imax) and the minimum grey level (Imin) of an image. The contrast index (CI) is a useful tool for assessing the contrast level of the image.
CD = Imax – Imin
CR = Imax / Imin
CI = (Imax − Imin) / (Imax + Imin)
Where, Imax is the maximum grey level of the image and Imin is the minimum
grey level of the image
There are two types of contrast stretching: linear and non-linear.
3.2.2 Linear Stretching
Linear stretching is a type of contrast stretching that involves adjusting the
brightness and contrast of an image in a linear way. This means that the same adjustment is applied to all parts of the image, regardless of their
original brightness or contrast.
3.2.2.1 Min-Max Linear Stretch
Min-max stretching is a type of linear stretching used in image processing to improve image contrast (Khan et al., 2018). This technique adjusts the contrast and brightness of an image so that the minimum and maximum pixel values of the image match the minimum and maximum values of the display range. In simple terms, min-max linear stretching scales the pixel values of an image so that the darkest pixel becomes the darkest possible shade in the display range and the lightest pixel becomes the lightest possible shade, as shown in Figure 3.1. This results in images with improved contrast and enhanced visual appeal. The technique is particularly useful for low-contrast images, where the range of pixel values is compressed and the image appears dull.
Figure 3.1: Min-Max Linear Stretch using Cartosat-3 data of a Part of Delhi
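A NumPy sketch of the min-max stretch to an 8-bit display range is shown below; the placeholder low-contrast image and its value range are only illustrative.

import numpy as np

image = np.random.rand(256, 256) * 0.3 + 0.2   # placeholder low-contrast image
imin, imax = image.min(), image.max()
stretched = ((image - imin) / (imax - imin) * 255).astype(np.uint8)   # map [imin, imax] to [0, 255]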
3.2.3 Non - Linear Stretching
Nonlinear stretching, on the other hand (Figure 3.2), involves adjusting the
brightness and contrast of an image in a nonlinear way (Chan et al., 2020; Wei
et al., 2020). This means applying different amounts of adjustments to different
parts of the image, depending on their original brightness or contrast. It helps
bring out more details in an image and creates a more visually appealing end
product. It is often preferred for enhancing the colour contrast between closely related classes and subclasses.
Square root: A square root stretch transforms the pixel values of an image by taking the square root of each value. This type of stretching is useful for
increasing contrast in low dynamic range images.
Square: In square stretching, pixel values are transformed by squaring each
value. This type of stretching is useful for increasing contrast in high dynamic
range images.
Logarithmic: In logarithmic stretching, pixel values are transformed using a
logarithmic function. This type of stretching is useful for improving contrast in
unevenly lit or high dynamic range images.
Exponential: In exponential stretching, pixel values are transformed using an
exponential function. This type of stretching is useful for increasing contrast in
low dynamic range images.
Histogram equalization: In histogram equalization, the pixel values of an image are transformed so that they are distributed more evenly across the available range. This type of stretching is useful for increasing contrast in images with low dynamic range or uneven lighting.
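The non-linear stretches listed above can be sketched in a few lines of NumPy; the functions below are illustrative only and assume an 8-bit single-band image held in the array band.

```python
import numpy as np

def log_stretch(band):
    """Logarithmic stretch: expands dark values, compresses bright ones."""
    band = band.astype(np.float64)
    return (255 * np.log1p(band) / np.log1p(band.max())).astype(np.uint8)

def sqrt_stretch(band):
    """Square root stretch: a milder expansion of the dark end."""
    band = band.astype(np.float64)
    return (255 * np.sqrt(band / band.max())).astype(np.uint8)

def histogram_equalize(band, levels=256):
    """Histogram equalization via the cumulative distribution function."""
    hist, _ = np.histogram(band.ravel(), bins=levels, range=(0, levels))
    cdf = hist.cumsum().astype(np.float64)
    cdf = (cdf - cdf.min()) / (cdf.max() - cdf.min())   # normalise to 0..1
    return (cdf[band] * (levels - 1)).astype(np.uint8)

band = np.random.randint(40, 90, size=(200, 200), dtype=np.uint8)  # dull image
equalized = histogram_equalize(band)
```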
Figure 3.2: Non-Linear Histogram equalization using Cartosat-3 data of a Part
of Delhi
3.3 Spatial Filtering
Spatial frequency is another element of an image. For any given part of the image, it is the number of changes in brightness value per unit distance. Gradual changes in brightness value are called low frequency, while dramatic changes over short distances are called high frequency. The spatial filtering process involves separating the image into its constituent spatial frequencies and selectively modifying certain spatial frequencies to accentuate particular features. Spatial filtering can also be described as a process in which an image is modified by applying a filter, or mathematical operation, to it. The filter is a matrix of numerical values that is moved over the image, and each pixel in the image is replaced with a new value that is a function of the pixel values in the neighbourhood covered by the filter.
Figure 3.3: Type of Spatial Filtering
There are two main types of spatial filtering: low pass and high pass as shown
in Figure 3.3.
3.3.1 Low Pass Filtering
Low pass filtering is a type of spatial filtering that removes high-frequency information from an image, leaving only the low-frequency components. Low-frequency components are parts of an image that change slowly, such as the overall brightness or contrast of an image, as shown in Figure 3.4. It can be used to smooth images, reduce noise, or blur images.
Figure 3.4: Low pass filtered image of Cartosat-3 data
3.3.2 High Pass Filtering
High-pass filtering, on the other hand, is a type of spatial filtering that removes
low-frequency information from an image, leaving only the high-frequency
components as shown in Figure 3.5.
Figure 3.5: High pass filtered image of Cartosat-3 data
High-frequency components refer to parts of the image that change rapidly,
such as edges or image detail. High pass filtering can be used to sharpen
images or detect edges and boundaries.
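Both filter types reduce to a convolution of the image with a small kernel. The snippet below is a minimal SciPy sketch, with a 3x3 mean kernel standing in for a low-pass filter and a Laplacian-style kernel standing in for a high-pass filter; the random array is only a placeholder for a real image band.

```python
import numpy as np
from scipy import ndimage

# 3x3 low-pass (mean) kernel: smooths the image and suppresses high frequencies
low_pass_kernel = np.ones((3, 3)) / 9.0

# 3x3 high-pass (Laplacian-style) kernel: emphasises edges and fine detail
high_pass_kernel = np.array([[-1, -1, -1],
                             [-1,  8, -1],
                             [-1, -1, -1]])

image = np.random.rand(256, 256)          # placeholder for a single image band
smoothed = ndimage.convolve(image, low_pass_kernel, mode="reflect")
edges = ndimage.convolve(image, high_pass_kernel, mode="reflect")
```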
3.4 Image Fusion
Image fusion, also known as data fusion, is a process that combines
information from multiple satellite images or different sensor bands to create a
composite image that contains more detailed and comprehensive information
than any individual image. The goal of image fusion is to exploit the
complementary strengths of different data sources to improve the
interpretation, analysis, and understanding of the scene being observed.
There are several methods commonly used for satellite image fusion:
Pan-Sharpening: Pan-sharpening combines high-resolution panchromatic imagery with lower-resolution multispectral imagery, which provides lower spatial resolution but richer spectral information. By fusing these two sources, a single image is created that has both high spatial and high spectral resolution.
Brovey Transform: The Brovey transform is a simple ratio-based fusion
method. It assigns more weight to the panchromatic band than to the
multispectral bands, preserving the spectral information while enhancing the
spatial details.
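As a rough sketch, a commonly cited form of the Brovey transform rescales each multispectral band by the ratio of the panchromatic band to the sum of the multispectral bands; the example below assumes the three bands have already been resampled to the panchromatic grid, and the band arrays are placeholders.

```python
import numpy as np

def brovey_fusion(red, green, blue, pan):
    """Brovey (ratio) fusion: scale each multispectral band by pan / (R + G + B)."""
    red, green, blue, pan = (b.astype(np.float64) for b in (red, green, blue, pan))
    total = red + green + blue
    total[total == 0] = 1e-6                 # avoid division by zero
    ratio = pan / total
    return red * ratio, green * ratio, blue * ratio

# Placeholder bands on a common (already resampled) grid
red, green, blue, pan = (np.random.rand(512, 512) for _ in range(4))
fused_r, fused_g, fused_b = brovey_fusion(red, green, blue, pan)
```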
Intensity-Hue-Saturation (IHS) Transform: The IHS transform separates the
intensity (or brightness), hue, and saturation components of an image. It fuses
the high-resolution panchromatic image with the intensity component of the
multispectral image and then transforms the fused image back to the RGB
color space.
Principal Component Analysis (PCA): PCA is a statistical technique that
transforms the original multispectral bands into a new set of uncorrelated
variables called principal components. The high-resolution panchromatic
image is then combined with selected principal components to create a fused
image.
Wavelet Transform: The wavelet transform decomposes an image into different frequency components. In image fusion, the low-frequency components of the multispectral image are combined with the high-frequency detail components extracted from the high-resolution panchromatic image, so that spectral content is preserved while spatial detail is enhanced.
Sparse Representation-based Methods: Sparse representation-based methods
aim to represent the multispectral image using a linear combination of atoms
from a dictionary that is learned from the panchromatic image. This allows for
the extraction of high-frequency details from the panchromatic image and their
fusion with the multispectral image.
These methods are just a few examples of satellite image fusion techniques.
The choice of fusion method depends on factors such as the characteristics of
the satellite data, the level of detail required, and the specific application or
analysis being conducted. Each method has its strengths and limitations, and
researchers continue to explore new approaches to improve the quality and
accuracy of fused satellite images.
3.5 Band Math
Band math is a technique used in remote sensing to extract useful information
from satellite images. It involves combining two or more spectral bands, which
are different wavelengths of light, to create a new image. This new image can
reveal features that may not be easily visible in the original image and can be
used for various purposes such as vegetation monitoring, land use mapping,
and environmental monitoring.
The basic concept of band math is to combine bands in a way that enhances the information content of the image. For example, combining the near-infrared and red bands can highlight vegetation, as plants reflect more strongly in the near-infrared range than other objects. Similarly, combining the green and near-infrared bands can help to identify water bodies, as water reflects in the green but absorbs strongly in the near-infrared.
There are several techniques for band math, including addition, subtraction,
multiplication, and division (Zhou et al.,2019). The choice of technique
depends on the specific application and the desired outcome. For example,
adding two bands together can help to enhance the contrast between different
features, while subtracting two bands can help to isolate specific features such
as urban areas or vegetation.
Band math requires knowledge of the characteristics of different spectral bands
and their interactions with the objects and surfaces being observed. It also
requires careful calibration and validation to ensure accurate results. The
results obtained through band math should always be interpreted in conjunction
with ground truth data and other sources of information, to ensure that they are
meaningful and useful for the intended purpose.
3.6 Indices
Indices in remote sensing are mathematical formulas that use combinations of
spectral bands to calculate specific properties of objects or surfaces. These
indices can provide valuable information about vegetation, water bodies, urban
areas, and other features in an image.
The NDVI (Normalized Difference Vegetation Index) is used for monitoring vegetation health and productivity (Figure 3.6). It is computed with a simple formula that subtracts the reflectance of the red band from that of the NIR band and divides by the sum of the two (Gonzalez et al., 2008). NDVI values lie between -1 and +1, with higher positive values indicating healthier vegetation, as shown in Figure 3.6. The NDVI image can further be classified into different classes based on NDVI values, as shown in Figure 3.7.
Figure 3.6: NDVI Image
Figure 3.7: Classified NDVI Image
The NDBI (Normalized Difference Built-Up Index) is an index used to detect urban areas and the built environment. It is computed by subtracting the reflectance of the NIR band from that of the SWIR band and dividing by the sum of these two bands. NDBI values lie between -1 and +1, with values near +1 indicating densely built-up areas and values near -1 indicating water bodies, as shown in Figure 3.8.
Figure 3.8: NDBI Image
NDSI (Normalized Difference Snow Index) is an index used to track snow and ice cover. It has several variants, but the most widely used one subtracts the reflectance of the SWIR band from that of the green band and divides by the sum of these two bands. NDSI values lie between -1 and +1, with values above about 0.4 generally indicating snow or ice.
NDWI (Normalized Difference Water Index) is an index used to locate water bodies. It is computed by subtracting the reflectance of the near-infrared band from that of the green (or blue) band and dividing by the sum of these two bands. NDWI values lie between -1 and +1, with higher values indicating open water or higher moisture content, as shown in Figure 3.9.
Figure 3.9: NDWI Image
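All of the indices above share the same normalized-difference form, so a single helper function covers them. The sketch below assumes reflectance bands scaled to 0–1 on a common grid; the band arrays are placeholders.

```python
import numpy as np

def normalized_difference(band_a, band_b):
    """Generic normalized difference index: (A - B) / (A + B)."""
    a = band_a.astype(np.float64)
    b = band_b.astype(np.float64)
    denom = a + b
    denom[denom == 0] = 1e-6
    return (a - b) / denom

# Placeholder reflectance bands (values 0..1) on a common grid
nir, red, green, swir = (np.random.rand(512, 512) for _ in range(4))

ndvi = normalized_difference(nir, red)      # vegetation
ndwi = normalized_difference(green, nir)    # water
ndbi = normalized_difference(swir, nir)     # built-up areas
ndsi = normalized_difference(green, swir)   # snow and ice
```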
Summary
Image enhancement techniques play a critical role in remote sensing
applications, as they improve the quality of images acquired by remote sensors.
The application of these techniques has become increasingly important due to
the wide range of remote sensing applications, including environmental
monitoring, agriculture, and urban planning.
There are various techniques available for image enhancement in remote
sensing, including filtering, contrast stretching, and image fusion. These
techniques can be applied to different types of remote sensing data, including
optical, radar, and thermal images, to enhance their spatial, spectral, and
radiometric properties.
However, it is imperative to note that the choice of technique should be made based on the specific requirements of the remote sensing application. For example, low-pass filtering is effective in reducing noise in optical images and high-pass filtering enhances edge details, while image fusion techniques such as principal component analysis and wavelet-based fusion are useful for combining different types of remote sensing data to produce a more informative image.
Moreover, it is essential to ensure that the image enhancement techniques used
do not result in the loss of important information in the remote sensing data.
Over-enhancement can lead to the distortion of the image and the loss of
important details, making the remote sensing data less useful.
In addition, there are advanced image enhancement techniques, such as
machine learning-based approaches, which are becoming increasingly popular
in remote sensing applications. These techniques have shown promising results
in various applications, including land cover classification and change
detection. However, it is important to note that the use of these advanced
techniques requires a significant amount of computational resources, which
may not be readily available to everyone. As such, simpler techniques such as
contrast stretching and filtering remain relevant and useful for basic image
enhancement needs in remote sensing applications.
Image enhancement techniques are essential tools for improving the quality of
remote sensing data. The choice of technique is dependent on the specific
requirements of the exercise, and care should be taken to avoid over-
enhancement that can result in the loss of important information. While
advanced techniques such as machine learning-based approaches are
promising, simpler techniques remain relevant and useful for basic image
enhancement needs in remote sensing applications. For most display purposes, linear stretching is the preferred method. Non-linear enhancements always enhance some pixels at the cost of others; for example, a logarithmic stretch enhances dark pixels at the expense of brighter pixels, and histogram equalization gives a sharp output but can reduce the number of distinguishable classes. In summary, contrast stretching, spatial filters, band math, and indices are all important image processing techniques that can be used to enhance and analyse satellite or aerial imagery, and the end use must be kept in mind when processing the data.
Chapter 4
Image Classification
4.1 Introduction
Remote sensing satellite data refers to the data acquired by sensors onboard
satellites that orbit the Earth, which can be used to study and monitor the
Earth's surface and atmosphere. There are several types of remote sensing
satellite data, including:
Optical data: This type of data is acquired by sensors that detect
visible and near-infrared light, such as the Cartosat, Landsat and
Sentinel satellites. Optical data can be used to map land cover and
vegetation, monitor changes in land use, and detect natural disasters.
Radar data: This type of data is acquired by sensors that emit and
receive microwave radiation, such as the Synthetic Aperture Radar
(SAR) on EOS-4(RISAT), Sentinel-1 and the Advanced Land
Observing Satellite (ALOS) PALSAR. Radar data can penetrate
clouds and vegetation, allowing for the detection of changes in
topography, soil moisture, and land cover.
Thermal data: This type of data is acquired by sensors that detect the
amount of thermal radiation emitted by the Earth's surface, such as the
VNIR sensor on IMS-1 and Thermal Infrared Sensor (TIRS) on
Landsat 8. Thermal data can be used to study urban heat islands,
monitor volcanic activity, and detect wildfires.
LiDAR data: This type of data is acquired by sensors that emit laser pulses toward the target and measure the time taken for each pulse to return to the sensor. LiDAR data can be used to generate high-resolution topographic maps and measure forest structure and biomass.
Remote sensing satellite data can be used for a wide range of applications, such
as natural resource management, planning of land use, environment
monitoring, and disaster management activities. The availability and
accessibility of remote sensing data have greatly increased in recent years,
allowing for more efficient and effective monitoring of the Earth's surface and
atmosphere.
4.2 Remote Sensing Image Classification
To extract meaningful information from the vast amounts of data generated by
remote sensing sensors, remote sensing satellite data needs to be classified into
different categories. Some of the main reasons why remote sensing image
classification is necessary include:
Land cover mapping: Remote sensing image classification can be
used in identification and land cover mapping, such as forests, crops,
water bodies, and urban areas. This information is essential for land
use planning, resource management, and environmental monitoring.
Change detection: Remote sensing image classification can be used
to detect changes in land cover over time, like deforestation,
urbanization, and natural disasters. This information is crucial for
assessing the impacts of human activities on the environment and for
developing strategies to mitigate these impacts.
Environmental monitoring: Remote sensing image classification is used to monitor environmental parameters such as water quality, air pollution, and soil moisture. This information is essential for assessing
the health of ecosystems and for identifying potential environmental
risks.
Disaster management: Remote sensing image classification can be
used to assess the impacts of natural disasters, such as floods,
wildfires, and earthquakes. This information is crucial for developing
effective response strategies and for planning for future events.
Overall, remote sensing image classification is a powerful tool for studying and
monitoring the Earth's surface and atmosphere, and it is essential for different
applications in environmental science, natural resource management, and
disaster management.
Remote sensing satellite image classification is the process of assigning land cover categories to different pixels or groups of pixels in an image acquired by remote sensing sensors, such as satellite or airborne sensors. Its goal is to extract meaningful information from the input image, which can then be used for different applications, such as land use planning, natural resource management, and environmental monitoring. Figure 4.1 shows the image classification process. The classified map shows classes A and B depicting different land features of the input surface.
Figure 4.1: Digital image and classified image representation
(Source: Canada, Natural Resources. Image Classification and Analysis. 29
Jan. 2008, https://natural-resources.canada.ca)
4.3 Remote sensing image classification procedure
Remote sensing satellite image classification is used to determine land use and land cover. Land cover in LULC refers to the type of material present on the landscape, such as crops, water, wetland, forest, and man-made materials. Land use in LULC refers to how humans utilize the land surface, such as for agriculture, commerce, or settlement.
The various steps involved in extracting thematic land-cover information from
remote sensing satellite data (Gong and Howarth 1990) include:
a. Scheme of image classification: It includes information classes such as
urban, agriculture, forest areas, etc. field survey data and other ancillary
data of the study area.
b. Pre-processing of the input satellite image (atmospheric, radiometric, topographic and geometric corrections), image enhancement, and initial clustering of the image.
c. Selection of representative areas of the input image and their analysis, based on initial clustering results or the generation of training signatures.
d. Running of image classification techniques and algorithms.
e. Post-processing: geometric correction, filtering and refinement of the classified image.
f. Accuracy assessment: It includes comparison of classification results with
field studies.
4.3.1 Image classification scheme
The first step for the analyst is to identify the ROI (geographic region of
interest) to conduct hypothesis testing. Next, a classification scheme is
carefully developed to define the specific classes of interest for examination.
Depending on the classes chosen, the analyst may opt for producing hard or
fuzzy output products and decide whether to use per-pixel or object-oriented
classification logic. It is crucial to meticulously select and define all classes of
interest to successfully classify remotely sensed data for land-use, land-cover
information. This involves using a classification scheme with accurately
defined classes, organized based on logical criteria. For a hard (crisp)
classification, the classification system should ideally have mutually exclusive,
and hierarchical classes. While fuzzy classification systems are more flexible,
they may not be easily transferable to different environments; hence, hard
classification schemas are typically preferred for classification purposes.
4.3.2 Acquisition of appropriate remote sensing data, initial ground reference data, and pre-processing
Afterward, the analyst acquires the suitable digital remote sensor data,
considering both the sensor system's capabilities and environmental limitations.
Whenever possible, ground reference information is collected simultaneously
with the remote sensing data acquisition. Subsequently, the remote sensor data
undergoes radiometric and geometric corrections to prepare it for further
analysis.
4.3.3 Training data generation
Training data is one of the deciding factors in supervised classification.
Preparation of the training data depends on various factors and the key
characteristics of training in classification of remote sensing images are:
Spectral bands: Remote sensing images usually have multiple
spectral bands, which provide information about the reflectance of
different wavelengths of electromagnetic radiation. The choice of
spectral bands is critical to the performance of the algorithm, as they
determine the information available to distinguish between different
land cover types.
Spatial resolution: Remote sensing images can have varying spatial
resolutions, which affects the size of the smallest feature that can be
detected. Higher spatial resolution images provide more detailed
information but can be computationally intensive to process.
Training sample selection: The selection of training samples is
critical to the accuracy of the classification algorithm. Training
samples should be representative of the land cover classes present in
the image and should be spread throughout the image.
Class imbalance: Remote sensing images often have imbalanced
class distributions, where some land cover classes are much more
common than others. This can affect the performance of the
classification algorithm, as it may not be able to accurately predict the
less common classes.
Overall, the data requirements for classification of remote sensing images call for careful consideration of the spectral bands, spatial resolution, training sample selection, and class imbalance to achieve accurate land cover classification. In addition to the above parameters, shape, location, number, placement and uniformity also need to be taken care of while preparing the training datasets for remote sensing image classification.
Shape: When preparing the training data, it is essential to consider shapes that
minimize the number of vertices, such as rectangles or polygons.
Location: The training areas should be strategically positioned to facilitate
accurate and easy transfer of their outlines from maps to the digital image.
Number: The ideal number of training areas depends on the number of classes
to be categorized. Usually, representing each class with 5-10 training areas
ensures the adequate representation of spectral properties for each category.
Placement: The placement of training areas within the image is crucial,
allowing for precise location with respect to distinct features and boundaries
between different features on the image.
Uniformity: Within each training area, the data should demonstrate a unimodal
frequency distribution for every spectral band to be utilized.
4.4 Image classification techniques
A variety of classification methods have been developed and extensively
employed to create land cover maps (Figure 4.2). These methods differ in their
logic, including supervised and unsupervised, parametric and nonparametric, or
hard and soft (fuzzy) classification, as well as per-pixel, sub-pixel, and per-field
approaches. In the processing of remote sensing images, two primary types of
classification procedures are commonly utilized: supervised classification and
unsupervised classification. Although they can be used as independent
approaches, they are often combined into hybrid methodologies, leveraging
multiple methods simultaneously.
Methods and examples of remote sensing image classification:
- Parametric: Maximum Likelihood classification, Unsupervised classification, etc.
- Non-Parametric: Nearest-neighbour classification, Fuzzy classification, Neural networks, Support Vector Machines, etc.
- Non-metric: Rule-based Decision tree classification
- Supervised: Maximum Likelihood, Minimum Distance, and Parallelepiped classification, etc.
- Unsupervised: ISODATA, K-means, etc.
- Hard (parametric): Supervised and Unsupervised classifications
- Soft (non-parametric): Fuzzy Set classification logic
- Per-pixel, Object-oriented, and Hybrid approaches
Figure 4.2: Remote sensing image classification techniques
(Mehmood et al. 2022)
4.4.1 Supervised Classification
Supervised classification is a type of machine learning algorithm which involves training a model on a labelled dataset to predict the class of new, unlabelled data. In supervised classification, the algorithm is given a set of training data consisting of input features and their corresponding output labels. The algorithm then learns to map the input features to their correct output labels by minimizing a loss function, such as cross-entropy or mean squared error.
The trained model can then be used to predict the class of new, unlabelled data
based on its input features. Supervised classification is widely used in various
fields such as image recognition, natural language processing, and fraud
detection. Common supervised classification algorithms include logistic
regression, decision trees, random forests, and support vector machines.
The basic steps involved in supervised classification are:
a. Data collection and pre-processing: The first step is to collect and
pre-process the dataset. This may involve cleaning and filtering the
data, removing outliers, and converting the data into a suitable format
for the algorithm.
b. Data splitting: The dataset is divided into distinct subsets, the
training set and the testing set. The training set is utilized to train the
algorithm, whereas the testing set is applied to assess the performance
of the trained model.
c. Feature extraction: Features are extracted from the data that can be
used to train the classification algorithm. Feature extraction involves
selecting the most relevant attributes of the data and transforming
them into a suitable format.
d. Model training: The algorithm is trained on the training set to predict the output label from the input features. By minimizing a loss function, the algorithm learns to accurately map the input features to their corresponding output labels.
e. Model evaluation: The performance of the trained model is evaluated
using the testing set. The evaluation metrics used may vary depending
on the application, but commonly used metrics include accuracy,
precision, recall, and F1-score.
f. Model deployment: The model can be deployed in a real-world
application to predict the class of new, unlabelled data after training
and evaluation.
Overall, supervised classification involves the iterative process of selecting and
pre-processing the dataset, extracting relevant features, training the model,
evaluating its performance, and its deployment in a real-world application.
There are several supervised classification algorithms available for assigning
an unknown pixel to one of the possible classes (m). The selection of a specific
classifier or decision rule depends on the characteristics of the input data and
the desired output, making it a crucial decision. Parametric classification algos
assume that the measured vectors (Xc) for each class in each spectral band
during the training phase of supervised classification follow a Gaussian or
normal distribution (Schowengerdt, 2007). In contrast, nonparametric
classification algorithms do not make such assumptions (Lu and Weng, 2007).
Some examples of nonparametric classification algorithms are:
- Parallelepiped
- Minimum distance
- Nearest-neighbour
- Neural network and expert system analysis
Among the widely used parametric classification algorithms is the maximum
likelihood algorithm.
4.4.1.1 Parallelepiped Classifier:
The parallelepiped classification approach is a computationally simple approach. For instance, the digital number (DN) values of two bands are plotted in a scatter diagram, similar to the minimum distance to mean classifier. In this method, a rectangular box is created for each class, defined by the maximum and minimum values of each band, as depicted in Figure 4.3. The classification of a pixel is determined by whether or not it falls inside any rectangular box, called a parallelepiped decision region. If a pixel falls within a parallelepiped, it is assigned to that class. However, if a pixel falls within the boundaries of more than one class, it is labelled as the overlap class. If the pixel does not fit into any parallelepiped, it is assigned to the null class.
Figure 4.3: Parallelepiped classification strategy
(Source: RS&GA: Lesson 12 Image Classification. http://ecoursesonline.iasri.
res.in/mod/page/view.php?id=2065. Accessed 15 June 2023)
In the example shown in Figure 4.3, three unknown pixels (A, B, and C) are considered. Pixel A will be classified as the water class since it falls within the parallelepiped of the water class. However, pixel B will be labelled as an unknown class, and pixel C will be labelled as the overlap class. Overlapping occurs due to high correlation or covariance between bands. Covariance refers to the tendency of spectral values to vary similarly across bands. This method is less effective because spectral response patterns are often highly correlated, resulting in significant covariance.
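A minimal sketch of the parallelepiped decision rule is shown below; the class boxes, pixel values and the labels used for the null and overlap cases are assumptions made for illustration.

```python
import numpy as np

def parallelepiped_classify(pixels, class_boxes, null_label=-1, overlap_label=-2):
    """Assign each pixel to the class whose min/max box (per band) contains it.
    pixels      : (n_pixels, n_bands) array of DN values
    class_boxes : {class_id: (min_vector, max_vector)} derived from training data
    """
    labels = np.full(len(pixels), null_label)
    for i, px in enumerate(pixels):
        hits = [cid for cid, (lo, hi) in class_boxes.items()
                if np.all(px >= lo) and np.all(px <= hi)]
        if len(hits) == 1:
            labels[i] = hits[0]
        elif len(hits) > 1:
            labels[i] = overlap_label      # falls in more than one box
    return labels

# Hypothetical two-band example with boxes learned from training samples
boxes = {0: (np.array([10, 10]), np.array([40, 50])),    # e.g. water
         1: (np.array([35, 45]), np.array([90, 95]))}    # e.g. vegetation
pixels = np.array([[20, 30], [60, 70], [38, 48], [200, 210]])
print(parallelepiped_classify(pixels, boxes))   # -> [0, 1, -2, -1]
```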
4.4.1.2 Minimum Distance to Mean Classifier:
The minimum distance to mean classifier is a simple and commonly used classifier. It involves plotting the DN values of the training sets in a scattergram (Figure 4.4).
Figure 4.4: Minimum distance to means classification strategy
(Source: RS&GA: Lesson 12 Image Classification.
http://ecoursesonline.iasri.res.in/mod/page/view.php?id=2065. Accessed 15
June 2023)
The process involves plotting DN (Digital Number) values of various training
sets for different classes and calculating their means. When dealing with an
unknown pixel A, it is classified or assigned to a specific class by computing
the distance between the mean of each class and pixel A. Pixel A is then
assigned to the class whose mean value is closest to it. In Figure 4.4, for
instance, the unknown pixel A would be assigned to the sand class. This
process is applied to all pixels in the image, resulting in the classification of
various land use and land cover classes. For an n-Dimensional multispectral
data, an n-D scatter diagram is plotted and the mean of every class is
determined, and finally the image is classified based on the class with the
shortest distance. Euclidean distance is a commonly used method for
calculating the distance.
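The minimum distance rule can be sketched directly with NumPy; the class mean vectors and pixel values below are hypothetical two-band examples.

```python
import numpy as np

def minimum_distance_classify(pixels, class_means):
    """Assign each pixel to the class with the nearest mean (Euclidean distance).
    pixels      : (n_pixels, n_bands)
    class_means : (n_classes, n_bands) mean vectors computed from training areas
    """
    # Distance from every pixel to every class mean
    dists = np.linalg.norm(pixels[:, None, :] - class_means[None, :, :], axis=2)
    return np.argmin(dists, axis=1)

means = np.array([[30.0, 40.0],    # class 0 (e.g. water)
                  [80.0, 60.0],    # class 1 (e.g. sand)
                  [50.0, 120.0]])  # class 2 (e.g. vegetation)
pixels = np.array([[32.0, 45.0], [78.0, 65.0], [55.0, 110.0]])
print(minimum_distance_classify(pixels, means))   # -> [0 1 2]
```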
4.4.1.3 Nearest neighbor classifiers
Nearest neighbor classifiers, K-nearest neighbor (KNN) classifiers, and K-
nearest neighbor distance-weighted classifiers are all related algorithms that
rely on finding the nearest neighbors of a given data point in the training set to
make predictions. Nearest neighbor classifiers make predictions by finding the
single nearest neighbor to a given data point in the training set and using its
label to predict the label of the new data point.
K-nearest neighbor classifiers find the k nearest neighbors to a given data point
(in the training set) and use a majority vote of their labels to predict the label of
the new data point. The value of k is a hyperparameter. It can be tuned to
optimize performance.
K-nearest neighbor distance-weighted classifiers are similar to KNN
classifiers, but they give more weight to the labels of the nearest neighbors that
are closer to the new data point. This means that the algorithm considers the
distances between the new data point and its nearest neighbors when making
predictions.
The main advantage of these algorithms is their simplicity and ease of implementation, as they don't require a training phase or any assumptions about the underlying data distribution. However, they can be computationally expensive and may suffer from the curse of dimensionality when the number of features is large. In addition, they may not perform well on imbalanced datasets or datasets with noisy or irrelevant features.
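As an illustration, scikit-learn's KNeighborsClassifier implements both the plain and the distance-weighted variants; the training pixels and class labels below are hypothetical.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical training samples: rows are pixels, columns are spectral bands
X_train = np.array([[0.10, 0.05, 0.40],   # water
                    [0.12, 0.06, 0.35],   # water
                    [0.25, 0.45, 0.30],   # vegetation
                    [0.28, 0.50, 0.28]])  # vegetation
y_train = np.array([0, 0, 1, 1])

# Distance-weighted k-NN: closer neighbours get a larger vote
knn = KNeighborsClassifier(n_neighbors=3, weights="distance")
knn.fit(X_train, y_train)

X_new = np.array([[0.11, 0.05, 0.38], [0.26, 0.48, 0.29]])
print(knn.predict(X_new))   # -> [0 1]
```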
4.4.1.4 Maximum likelihood classification
In maximum likelihood classification, it is assumed that the statistics for each input class in each band follow a normal distribution. The algorithm calculates the probability that a particular pixel belongs to a specific class. Unless the user sets a probability threshold, all pixels are classified. Each pixel is assigned to the class with the highest probability, which is why the method is known as maximum likelihood classification. If the highest probability is lower than the threshold, the pixel remains unclassified.
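A commonly used form of the Gaussian maximum likelihood discriminant function, written with the symbols defined below, is

gi(x) = ln p(ωi) – (1/2) ln |Σi| – (1/2) (x – mi)^T Σi^-1 (x – mi)

and each pixel with measurement vector x is assigned to the class i for which gi(x) is largest.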
Where:
i = indicated class
x = n-dimensional data
p(ωi) = probability that class ωi occurs in the image, assumed to be the same for all classes
|Σi| = determinant of the covariance matrix of the data in class ωi
Σi^-1 = inverse of the covariance matrix
mi = mean vector of class ωi
Figure 4.5 : LISS 4 data (2016) over Delhi (LULC classification
using Maximum Likelihood Classification over Delhi)
(Source: Balha and Chander Kumar Singh, 2023)
The maximum likelihood method offers advantages from a probability theory
perspective, but certain considerations should be taken into account:
(1) Ground truth data must be sampled to estimate the mean vector and
variance-covariance matrix of the population.
(2) In cases of high correlation between two bands, or when the ground truth data is highly homogeneous, the inverse of the variance-covariance matrix may become unstable. In such situations, it is advisable to reduce the number of bands using principal component analysis.
(3) The maximum likelihood method is not applicable when the population
distribution deviates from the normal distribution.
Figure 4.5 illustrates the application of the Maximum Likelihood classification
to the LISS 4 data over Delhi.
4.4.1.5 Feature Selection
After systematically gathering the training statistics from each band for the
classes of interest, the next step is to decide which bands are most effective in
distinguishing each class from the others. This procedure is known as feature
selection and can be accomplished using graphical or statistical methods.
Graphical methods of feature selection in remote sensing satellite data
involve visualizing the relationship between different spectral bands and the
target variable to identify the most informative bands. Here are some common
graphical methods of feature selection in remote sensing:
Spectral signature plots: Spectral signature plots are used to visualize
the spectral response of different land cover types in a particular
scene. A spectral signature plot of each band can be used to identify
the bands that show the strongest response for different land cover
types, indicating that the band is informative for classification.
Scatter plots: Scatter plots may be used to view the relationship between two continuous variables. A scatter plot of each band against the target variable can be used to identify the bands that show the strongest correlation with the target variable.
Histograms: Histograms are used to visualize the distribution of a
continuous variable. Histograms of each band for each class of the
target variable can be used to identify bands that show significant
differences between classes, indicating that the band is informative for
classification.
Color composites: Color composites are used to visualize multiple
bands as a single RGB image. Color composites can be used to
identify bands that show the strongest contrast between different land
cover types, indicating that the band is informative for classification.
Statistical methods of feature selection in remote sensing satellite data
involve evaluating the statistical significance of each band and selecting the
most informative bands based on a statistical criterion. Here are some common
statistical methods of feature selection in remote sensing:
Correlation analysis: Correlation analysis is used to evaluate the
strength of the linear relationship between two variables. A correlation
matrix can be computed between each spectral band and the target
variable, and bands with high correlation coefficients can be selected
as informative.
Mutual information: It is a statistical measure that evaluates the
amount of information that one variable provides for another variable.
Mutual information can be used to evaluate the information gain of
each spectral band with respect to the target variable. Bands with high
mutual information can be selected as informative.
Principal Component Analysis (PCA) is a widely employed statistical
technique for feature selection and for reducing the dimensionality. It
works by transforming the original features into a new set of
uncorrelated variables called principal components, which capture the
maximum variance in the data. The first principal component accounts
for the most variance, then the second principal component, and so on.
PCA can be utilized for feature selection by identifying the principal components that explain the most variance in the data. For instance, if the first few principal components account for a significant portion of the total variance, the remaining ones can be discarded, as they contain relatively little essential information.
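A brief scikit-learn sketch of PCA-based dimensionality reduction is shown below; the simulated pixel matrix (six correlated bands driven by two underlying signals) is an assumption made so that the reduction is visible.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(42)

# Simulate 1,000 pixels with 6 correlated bands driven by 2 underlying signals
latent = rng.random((1000, 2))
mixing = rng.random((2, 6))
pixels = latent @ mixing + 0.01 * rng.random((1000, 6))

pca = PCA()
pca.fit(pixels)
print(pca.explained_variance_ratio_)        # variance captured by each component

# Keep only the components needed to explain ~99% of the total variance
pca_reduced = PCA(n_components=0.99)
transformed = pca_reduced.fit_transform(pixels)
print(transformed.shape)                    # far fewer columns than the 6 bands
```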
4.4.1.6 Logistic regression technique for image classification
Logistic regression is a supervised algorithm which can be used for image classification tasks. In image classification using logistic regression, the input
image is represented as a vector of pixel values. The logistic regression model
then applies a linear transformation to this vector and applies the sigmoid
function to obtain a probability score for each category. The category with the
maximum probability is chosen as the predicted label for the image.
The model parameters, including the weights and biases, are learned from a
labelled training dataset using a loss function such as binary cross-entropy or
categorical cross-entropy. The optimization process involves updating the
parameters using gradient descent to minimize the loss function.
One disadvantage of logistic regression for image classification is that it may not be able to capture complex nonlinear relationships between the input features and the output labels. In such cases, more complex models such as
neural networks may be more suitable. However, logistic regression can be a
good starting point for simple image classification tasks, and it is relatively
easy to interpret and implement.
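A minimal scikit-learn example of logistic regression as a per-pixel classifier is sketched below; the spectral values and the two classes are simulated for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Simulated training pixels (3 bands) for two classes: 0 = water, 1 = vegetation
water = rng.normal([0.08, 0.06, 0.02], 0.01, size=(50, 3))
veg = rng.normal([0.05, 0.10, 0.45], 0.02, size=(50, 3))
X_train = np.vstack([water, veg])
y_train = np.array([0] * 50 + [1] * 50)

clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)

new_pixels = np.array([[0.07, 0.06, 0.03], [0.06, 0.11, 0.40]])
print(clf.predict(new_pixels))          # -> [0 1]
print(clf.predict_proba(new_pixels))    # per-class probability scores
```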
4.4.1.7 Decision tree for image classification
In a decision tree for remote sensing image classification, each pixel in the
image is represented as a vector of spectral values, such as the intensity values
for different bands of the image. The decision tree algorithm then applies a
sequence of binary decisions based on thresholds to classify each pixel into one
of the predefined categories.
The decision tree algorithm constructs the tree by recursively splitting the training data based on the input feature that provides the maximum information gain, which measures the reduction in entropy or uncertainty when a feature is used to split the data. The splitting process continues until a stopping criterion is reached.
One advantage of decision trees for remote sensing image classification is that they can handle non-linear relationships between the input features and the output labels. They can also handle mixed data types, such as categorical and
continuous features. Decision trees are also relatively easy to interpret and can
provide insights into the decision-making process.
However, decision trees can be prone to overfitting if the tree is too deep or if the training data is biased or noisy. Ensemble methods, such as random forests or boosting, can help to mitigate these issues and improve the accuracy of the classification.
4.4.1.8 Support Vector Machine for classification
Support Vector Machines are a popular machine learning algorithm that can be
used for land use land cover classification tasks. SVMs are a type of binary
classifier that can be extended to handle multi-class classification problems.
In SVMs for land use land cover classification, each pixel in a satellite or aerial
image is represented as a vector of spectral values, such as the intensity values
for different bands of the image.
The SVM algorithm learns the optimal hyperplane by solving a quadratic programming problem. The algorithm also uses regularization to minimise overfitting and ensure generalization to new data.
One advantage of SVMs for land use land cover classification is their ability to
handle high dimensional data and nonlinear relationships between the input
features and the output labels. SVMs are also less prone to overfitting
compared to decision trees.
However, SVMs can be computationally expensive, particularly for large datasets, and may require careful tuning of hyperparameters to achieve optimal performance (Neetu and Ray, 2019). SVMs can also be sensitive to the choice of kernel function and regularization parameters (Figure 4.6).
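The following is an illustrative scikit-learn sketch of an RBF-kernel SVM trained on per-pixel spectral values; the simulated band statistics, class count and hyperparameter values are assumptions, not the settings used in the study cited above.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(1)

# Simulated training pixels (4 bands) for three hypothetical crop classes
X_train = np.vstack([rng.normal(m, 0.03, size=(60, 4))
                     for m in ([0.1, 0.2, 0.4, 0.3],
                               [0.2, 0.3, 0.2, 0.5],
                               [0.4, 0.1, 0.3, 0.2])])
y_train = np.repeat([0, 1, 2], 60)

# RBF-kernel SVM; scaling the bands first usually helps the optimisation
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10, gamma="scale"))
svm.fit(X_train, y_train)

new_pixels = rng.normal([0.2, 0.3, 0.2, 0.5], 0.03, size=(3, 4))
print(svm.predict(new_pixels))   # most likely all class 1
```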
Figure 4.6: Sentinel 2 (Feb 2019) derived False Color Composite Image (a)
and crop classification done using (b) CART, (c) Random Forest, and (d) SVM
of IARI Farm New Delhi, India (Source: Neetu and Ray, 2019)
4.4.2 Unsupervised Classification
This method involves clustering similar pixels into groups based on their
spectral properties without any prior knowledge of the land cover classes. The
user then assigns these clusters to land cover classes based on their spectral
characteristics.
4.4.2.1 Unsupervised classification using chain method
The chain method is a widely used unsupervised classification technique for
remote sensing data analysis. It is an iterative approach that combines several
unsupervised classification algorithms to produce a more accurate and reliable
classification result.
The chain method typically involves the following steps:
1. Initial clustering: The first step is to perform initial clustering on the pre-processed satellite data using one of the unsupervised classification algorithms, such as K-means, ISODATA, or hierarchical clustering. This step produces an initial classification result, which serves as the starting point for the subsequent iterations.
2. Feature selection: The next step is to select the relevant features for
classification based on the initial clustering results. This step involves
analyzing the spectral characteristics of the clusters and selecting the
features that can discriminate between the different land cover classes.
3. Refinement: The third step involves refining the initial clustering result
using one or more unsupervised classification algorithms such as fuzzy C-
means, neural networks, or Markov random fields. This step helps to
improve the classification accuracy by incorporating spatial information
and reducing the effects of mixed pixels.
4. Validation: The final step is to validate the classification result using
ground truth data and statistical metrics (Like overall accuracy, kappa
coefficient, etc.).
The chain method is a powerful technique for unsupervised classification of
remote sensing data, as it combines the strengths of different algorithms to
produce a more accurate and reliable result. However, it can also be time-
consuming and computationally intensive, particularly for large datasets.
4.4.2.2 ISO Data clustering algorithm
The ISO cluster algorithm can be used for image classification by assigning a label to each cluster based on the characteristics of the pixels in the cluster. The steps to use the ISO cluster algorithm for image classification are:
1. Convert the image to a grayscale image if it is a color image.
2. Define the number of clusters you want to create. This is the only
parameter you need to set for the algorithm.
3. Randomly select initial cluster centers (Figure 4.7 & 4.8).
4. Calculate the distance between each pixel and each cluster center
using a distance metric such as Euclidean distance.
5. Assign each pixel to the cluster whose center is closest to it.
6. Recalculate the cluster centers based on the pixels assigned to them.
Repeat steps 4-6 until the algorithm converges, which means that the cluster
centers no longer move or move less than a predefined threshold (Figure 4.8 a
& b).
Figure 4.7: ISO data arbitrary clusters
(Source: GISRSSTUDY. “What Is ISODATA - ISODATA Clustering
Method.” GISRSStudy, 14 June 2022, https://gisrsstudy.com/isodata/.)
Figure 4.8 a: ISO data first pass, b. ISO data second pass
(Source: GISRSSTUDY. “What Is ISODATA – ISODATA Clustering
Method.” GISRSStudy, 14 June 2022, https://gisrsstudy.com/isodata/.)
Once the algorithm converges, the clusters can be used for image classification. Each cluster represents a group of pixels that are similar to each other. The user can assign a label to each cluster based on the characteristics of the pixels in the cluster. For example, when classifying satellite images, labels such as "forest", "water", or "urban" can be assigned to each cluster. To classify a new image, the same clustering algorithm can be used, and the labels assigned to each cluster can then be used to classify the new image; the label of the cluster to which the majority of the pixels in the new image belong can be assigned as the label of the new image.
It is important to note that the ISO cluster algorithm is a basic clustering algorithm and may not perform as well as more advanced algorithms for image classification. Users may need to experiment with different parameters and distance metrics to achieve the desired results.
4.4.2.3 K-means unsupervised classification
The K-Means algorithm is a type of unsupervised classification used to
partition data into distinct clusters. It begins by initializing class means that are
evenly distributed inside the data space. The pixels are then iteratively assigned
to the nearest class based on a minimum distance technique. After each
iteration, the class means are recalculated, and pixels are reclassified
accordingly. Generally, all pixels are assigned to the nearest class unless specific criteria, such as a standard deviation or distance threshold, are specified. In such cases, some pixels might remain unclassified if they fail to meet the given criteria. The algorithm continues iterating until the number of pixels in each class changes by less than the designated pixel-change threshold, or until the maximum number of iterations is reached (refer to Figure 4.9).
Figure 4.9 Multispectral images of Ahmedabad city (Resourcesat , 2015) (a)
Image differencing (b) Image ratio (c) watershed segmentation (d) K-mean
clustering (Yellow colour - changes) (Source: Mewada et al. 2020)
It is important to note that the performance of the algorithm depends on the
initial values of the centroids, and the algorithm may converge to a local
minimum instead of the global minimum. Therefore, multiple initializations
with different centroid values are recommended to improve the robustness of
the algorithm.
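A compact scikit-learn sketch of K-means clustering applied to image pixels is shown below; the random 3-band image and the choice of five clusters are placeholders, and the n_init parameter re-runs the algorithm from several random starting centroids, which addresses the local-minimum issue noted above.

```python
import numpy as np
from sklearn.cluster import KMeans

# Placeholder 3-band image, reshaped to (n_pixels, n_bands)
image = np.random.rand(200, 200, 3)
pixels = image.reshape(-1, 3)

# 5 spectral clusters; n_init=10 random initializations for robustness
kmeans = KMeans(n_clusters=5, n_init=10, random_state=0)
labels = kmeans.fit_predict(pixels)

classified = labels.reshape(200, 200)   # unsupervised class map
print(np.bincount(labels))              # pixel count in each cluster
```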
4.4.3 Pixel based or object-based classification
Pixel-based and object-based classification are two approaches to the classification of remote sensing images. The main differences between the two approaches lie in the scale of the analysis and the features used for classification.
Pixel-Based Classification:
Pixel-based classification is the traditional approach for classification of
remote sensing images. In pixel-based classification, every pixel in the image
is assigned to a land cover class based on its spectral values. The spectral
values are derived from the spectral bands of the image, and the classification
algorithm makes a decision based on the spectral values of each pixel. Pixel-
based classification assumes that the spectral characteristics of a given land
cover class are homogeneous and can be characterized by the pixel’s spectral
values.
Object-Based Classification:
Object-based classification takes into account the spatial context of the pixels
by grouping them into objects or image segments. Image segments are created
by grouping pixels with similar spectral values and spatial proximity. Object-
based classification considers the spectral values, texture, shape, and context of
each object to make a classification decision. Object-based classification
assumes that land cover classes are not characterized solely by spectral values
but also by the spatial context of the objects.
The advantages of pixel-based classification include simplicity, speed, and the
ability to handle large datasets. However, it can be affected by mixed pixels,
noise, and variability in the spectral characteristics of land cover classes.
Object-based classification, on the other hand, can provide more accurate
classification results by taking into account the spatial context of the objects. It
can also reduce the effect of mixed pixels and improve the classification of
heterogeneous land cover classes. However, it can be computationally
intensive and require more data pre-processing and feature extraction.
Overall, the choice of the classification approach depends on the specific
requirements of the remote sensing application, such as the spatial and spectral
resolution of the image, the complexity of the land cover classes, and the
available computational resources.
4.4.4 Hybrid methods for image classification
Hybrid methods for image classification combine multiple approaches,
including supervised and unsupervised methods, to improve the accuracy and
efficiency of image classification. Some of the commonly used hybrid methods
for image classification include:
Decision tree-based classification: Decision trees can be used to
combine supervised and unsupervised classification methods by first
applying an unsupervised classification method to the data and then
using a decision tree to assign class labels to the resulting clusters.
This method can be more efficient than pure supervised classification
as it reduces the number of training samples required.
Ensemble classification: Ensemble classification combines multiple
classification models to produce a more accurate classification result.
This method involves training multiple classifiers, and then
combining their predictions using methods such as majority voting or
stacking.
Deep learning-based classification: Deep learning-based classification
combines deep neural networks with supervised or unsupervised
methods to perform image classification. This approach can be used
for both pixel-based and object-based classification and has been
shown to achieve high accuracy on large-scale datasets.
Hybrid clustering: Hybrid clustering combines the strengths of different clustering algorithms to improve the clustering results. This approach involves first applying an unsupervised clustering algorithm, such as K-means or hierarchical clustering, and then refining the results using another algorithm, such as fuzzy clustering or self-organizing maps.
Overall, hybrid methods for image classification can improve the accuracy and
efficiency of image classification by combining the strengths of different
approaches. However, the selection and combination of methods should be
based on the specific characteristics of the data and the classification
objectives.
4.5 Image fusion techniques
Image fusion is the process of combining multiple images of the same scene to create a single image that contains more information than any of the individual images. There are several types of image fusion techniques (Prasad et al., 2001), including multi-sensor image fusion, multi-resolution image fusion, multi-focus image fusion, principal component analysis (PCA) based image fusion, and wavelet-based image fusion.
4.6 Accuracy Assessment
Accuracy assessment holds great significance in remote sensing image
classification as it plays a pivotal role in evaluating the dependability and
effectiveness of classification outcomes. This process entails the comparison of
the classified image with a reference dataset to ascertain the level of accuracy
achieved. Through accuracy assessment, valuable insights can be obtained
regarding the quality of the classification, identification of errors, and potential
enhancements in algorithms and parameters.
Accuracy assessment commonly employs the confusion matrix, which serves
as a valuable tool. This matrix measures the level of agreement and
discrepancies between the classified image and the reference dataset. It
comprises four components: true positives, false positives, true negatives, &
false negatives. True positives (TP) are the count of pixels or instances
accurately classified as positive or belonging to a particular class. True
negatives (TN) denote the number of pixels or instances correctly classified as
negative or not belonging to a specific class. False positives (FP) correspond to
the tally of pixels or instances inaccurately classified as positive or belonging
to a particular class when they should not belong to that class. False negatives
(FN) indicate the quantity of pixels or instances inaccurately classified as
negative or not belonging to a specific class when they should belong to that
class. These components represent pixels that are correctly classified, correctly
unclassified, misclassified, and missed, respectively. The confusion matrix
enables the calculation of various accuracy metrics, including overall,
producer's, and user's accuracy. Overall accuracy shows the proportion of
pixels correctly classified out of the total pixels. The confusion matrix can be explained with the example below, where the rows are the classified map classes and the columns are the reference classes:

Classified \ Reference   Forest   Water   Urban
Forest                     9000     200     100
Water                       150    9500    1000
Urban                        50     300    8500
For instance, in this example the total number of pixels is 28,800 and the correctly classified pixels (the diagonal of the matrix) amount to 27,000, so the overall accuracy is about 93.8%. Producer's accuracy indicates the classification's reliability for each class by measuring how well the reference pixels of that class were identified. For the "forest" class, the producer's accuracy is computed as the correctly classified forest pixels (9,000) divided by the total number of forest reference pixels (9,000 + 150 + 50 = 9,200), giving about 97.8%. On the other hand, user's accuracy reflects the likelihood that a pixel assigned to a class actually belongs to that class. Considering the "water" class, the user's accuracy is calculated as the correctly classified water pixels (9,500) divided by the total number of pixels classified as water (150 + 9,500 + 1,000 = 10,650), yielding about 89.2%. By evaluating these accuracy metrics, users can assess classification algorithm performance, identify error sources, and refine the classification process to enhance accuracy and reliability in subsequent analyses.
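For reference, the accuracy metrics discussed above can be computed directly from the example matrix with a few lines of NumPy (rows taken as classified map classes, columns as reference classes):

```python
import numpy as np

# Confusion matrix from the example above (order: Forest, Water, Urban)
cm = np.array([[9000,  200,  100],
               [ 150, 9500, 1000],
               [  50,  300, 8500]])

total = cm.sum()
correct = np.trace(cm)

overall_accuracy = correct / total                    # ~0.938
producers_accuracy = np.diag(cm) / cm.sum(axis=0)     # per reference class (columns)
users_accuracy = np.diag(cm) / cm.sum(axis=1)         # per classified class (rows)

print(f"Overall accuracy: {overall_accuracy:.3f}")
print("Producer's accuracy (Forest, Water, Urban):", producers_accuracy.round(3))
print("User's accuracy (Forest, Water, Urban):", users_accuracy.round(3))
```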
Summary
Remote sensing image classification methods are explained in this chapter.
Supervised & unsupervised classification techniques are explained with the
help of examples. The mixed approach, i.e., the hybrid method of applying both supervised and unsupervised classification techniques, is also illustrated. These classification methods can be used to classify satellite images into different classes.
Chapter 5
Automatic Information Extraction
5.1 Introduction
Remote sensing has revolutionized the way we observe and understand the
Earth's surface. It has become a crucial tool for various applications such as
land use/cover mapping, disaster management, and natural resource
management. However, interpreting and analyzing the vast amount of data
generated by remote sensing platforms is a challenging task. The need for
automatic information extraction in remote sensing arises from the sheer
volume and complexity of the data that is collected through remote sensing
technologies. Remote sensing data can cover vast areas and provide a wealth of
information that is critical for various applications, such as environmental
monitoring, disaster response, and resource management. Manual analysis of
this data is time-consuming, expensive, and prone to errors. Automated
methods of information extraction can quickly process large amounts of data
and extract relevant information, providing real-time or near-real-time analysis
with greater accuracy and precision. Automated methods are also cost-
effective, consistent, and can handle the complexity of remote sensing data
more efficiently than manual methods. Automatic information extraction in
remote sensing is crucial for timely, accurate, and cost-effective analysis of
large and complex datasets.
5.2 Feature Extraction
Feature extraction in remote sensing involves identifying and extracting
meaningful information or features from remotely sensed images or data
collected through satellites, aircraft, UAV or other platforms. Remote sensing
data can be in the form of images, such as satellite images or aerial
photographs, or as digital data that represents the reflectance or emission of
electromagnetic radiation from the Earth's surface. The data collected can
include a wide range of information, such as the location, shape, size, and
spectral characteristics of different features on the Earth's surface.
Feature extraction in remote sensing involves using image processing
techniques to identify and extract relevant features from the raw data. These
features can include natural features such as forests, rivers, and mountains, as
well as man-made features such as buildings, roads, and agricultural fields.
There are two main approaches to feature extraction: semi-automatic and fully
automatic.
Figure 5.1 Two main approaches to feature extraction techniques
1. Semi-automatic feature extraction involves a combination of manual
and automated methods. In this approach, an analyst or expert first
identifies features of interest in the image, such as buildings, roads, or
vegetation, using their knowledge and expertise. Then, they use
specialized software tools to automatically extract these features from the
image. Semi-automatic feature extraction can be more accurate and
efficient than manual methods alone, while still allowing for human
oversight and input.
2. Fully automatic feature extraction uses entirely automated methods to
identify and extract meaningful information from satellite or aerial
imagery without human intervention. This approach typically involves the
use of advanced image processing and machine learning algorithms, such as
neural networks or decision trees, that have been trained on large datasets
of annotated images. Once trained, these algorithms can automatically
identify and extract features of interest from new images, without the
need for human input. Fully automatic feature extraction can be faster and
more scalable than semi-automatic methods, but may be less accurate in
some cases.
Both semi-automatic and fully automatic feature extraction methods have their
advantages and disadvantages, and the choice of approach will depend on the
specific application and requirements of the analysis.
Figure 5.2: Automatic Feature extraction methods
There are several approaches to automatic feature extraction in remote
sensing, which are mainly divided into pixel-based and object-based methods
(Figure 5.2). Pixel-based methods analyse each pixel individually, while
object-based methods group pixels together to form objects and then analyse
those objects to extract features.
Pixel-based Methods: Pixel-based methods involve analyzing individual
pixels in a remotely sensed image to extract information about the features on
the ground. For example, in a multispectral image, each pixel represents a
different combination of reflectance values across multiple spectral bands
(Figure 5.3). By examining the spectral values of each pixel, we can identify
and classify different features on the ground, such as land cover types,
vegetation density, and water bodies. These methods include spectral indices,
principal component analysis, and machine learning algorithms such as
decision trees and random forests. Pixel-based methods are commonly used in
remote sensing applications such as land cover mapping, environmental
monitoring, and urban planning.
Figure 5.3 A multispectral Landsat image processed to produce a land cover
classification
Object-based Methods: Object-based feature extraction methods involve the
segmentation of an image into homogeneous regions, followed by the
extraction of features from these regions to represent their spectral, spatial, and
contextual properties. The process involves the classification of pixels by
considering their spectral characteristics, shape, texture, and spatial
relationship with nearby pixels. Object-based classification methods have been
developed more recently in comparison to traditional pixel-based classification
techniques. While pixel-based classification relies solely on the spectral
information of individual pixels, object-based classification utilizes
information from a group of similar pixels known as objects or image objects.
Image objects or features are clusters of pixels that share similarities in spectral
properties (such as color), size, shape, texture, and context within their
neighbouring area. This classification approach aims to replicate the type of
analysis performed by humans during visual interpretation.
Object-based feature extraction segments an image by grouping pixels into
objects with different geometries rather than treating each pixel individually.
With suitable imagery, the resulting objects can be meaningful enough to
effectively perform the digitizing for the analyst. For example, the segmentation
results shown in Figure 5.4 delineate roads, buildings, and vegetation.
Figure 5.4: Left: Original satellite image. Right: Semantic segmentation of
roads, building and vegetation (source: Ng, Virginia & Hofmann, Daniel.
(2018). Scalable Feature Extraction with Aerial and Satellite Imagery. 145-
151. 10.25080/Majora-4af1f417-015.)
These methods typically involve the following steps:
1. Image segmentation: This is the process of dividing an image into
regions based on their spectral and spatial properties. Various
segmentation algorithms can be used such as watershed, region
growing, and mean shift.
2. Object classification: Once the objects have been defined, they are
classified into different categories based on their spectral and contextual
properties. Classification can be performed using supervised or
unsupervised methods.
3. Feature extraction: Features are extracted from each object to
represent its spectral, spatial, and contextual properties. These features
can include texture, shape, size, context, and spectral values.
4. Feature selection: This step involves selecting the most relevant
features for the analysis. Feature selection can be performed using
various techniques such as correlation analysis and principal component
analysis.
5. Object-based analysis: The extracted and selected features are used to
perform various analyses such as change detection, object recognition,
and object tracking.
Object-based feature extraction methods offer several advantages over pixel-
based methods, including the ability to incorporate contextual information,
improved accuracy, and reduced noise. However, they can be computationally
intensive and may require significant expertise to implement effectively.
Object-based feature extraction methods are commonly used for applications
such as land cover mapping, urban growth analysis, and vegetation monitoring.
5.3 Techniques used for feature extraction
There are several techniques used for feature extraction in remote sensing,
which can be broadly classified into three categories: spectral, spatial, and
temporal.
5.3.1 Spectral feature extraction: Spectral feature extraction involves the use
of spectral bands to extract information about the properties of the target. Each
spectral band represents a range of wavelengths that corresponds to a specific
region of the electromagnetic spectrum. The most commonly used spectral
bands in remote sensing are the visible, near-infrared, and thermal infrared
bands. Spectral feature extraction techniques include:
a) Principal component analysis (PCA): PCA is a widely used technique that
reduces the dimensionality of the data by transforming the original bands into a
set of uncorrelated principal components. These principal components are
ordered in such a way that the first principal component accounts for the
maximum variance in the data, the second principal component accounts for
the maximum remaining variance, and so on. The first few principal
components often contain the most relevant information about the target.
Figure 5.5 Schematic of PCA transformation. The original data space with 3 (input)
variables, shown on the left, is transformed to a component space of lower dimension,
with pc1 and pc2 as the axes of the new coordinate system (source: Ghasemi, P.; Aslani, M.; Rollins,
D.K.; Williams, R.C. Principal Component Neural Networks for Modeling, Prediction,
and Optimization of Hot Mix Asphalt Dynamics Modulus. Infrastructures 2019, 4, 53.
https://doi.org/10.3390/infrastructures4030053)
b) Spectral indices: Spectral indices are mathematical formulas that combine
the values of two or more spectral bands to extract specific information about
the target. Examples of spectral indices include the normalized difference
vegetation index (NDVI) and the soil-adjusted vegetation index (SAVI)
(Figure 5.6), which are used to estimate vegetation cover and health (a short
computational sketch is given at the end of this subsection).
Figure 5.6 Generation of Spectral Indices on Landsat Image to estimate
vegetation cover and health a) Landsat Image b) NDVI c) SAVI (source:
https://grindgis.com/blog/vegetation-indices-arcgis)
c) Endmember extraction: Endmember extraction involves the identification
of pure spectral signatures of the target, which can be used to classify the target
into different categories. Endmembers can be extracted using various
techniques, such as linear spectral unmixing and vertex component analysis.
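As a concrete illustration of the spectral-index approach described in (b) above, the short Python sketch below computes NDVI and SAVI from a red and a near-infrared band using numpy. The synthetic reflectance arrays and the soil-brightness factor L = 0.5 are illustrative assumptions only, not values taken from this guide.

```python
import numpy as np

def ndvi(nir, red, eps=1e-6):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    nir = nir.astype(np.float64)
    red = red.astype(np.float64)
    return (nir - red) / (nir + red + eps)

def savi(nir, red, L=0.5, eps=1e-6):
    """Soil-Adjusted Vegetation Index with soil-brightness factor L."""
    nir = nir.astype(np.float64)
    red = red.astype(np.float64)
    return ((nir - red) / (nir + red + L + eps)) * (1.0 + L)

# Synthetic reflectance bands standing in for real image bands
rng = np.random.default_rng(0)
red_band = rng.uniform(0.02, 0.2, size=(100, 100))
nir_band = rng.uniform(0.2, 0.6, size=(100, 100))
print("Mean NDVI:", ndvi(nir_band, red_band).mean())
print("Mean SAVI:", savi(nir_band, red_band).mean())
```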
5.3.2 Spatial feature extraction: Spatial feature extraction involves the
analysis of the spatial distribution of the target. Spatial features can provide
information about the size, shape, texture, and pattern of the target. Spatial
feature extraction techniques include:
a) Texture analysis: Texture refers to the spatial arrangement of pixels in an
image. Texture analysis involves analyzing the patterns and variations in pixel
values to identify and classify features. This can be done using techniques such
as grey-level co-occurrence matrices (GLCM), wavelet transforms, and fractal
analysis. Texture analysis can be used to distinguish between different land
cover types, such as forests, grasslands, and agricultural fields (a brief
computational sketch is given at the end of this subsection).
b) Object-based image analysis (OBIA): OBIA involves the segmentation of
the image into objects based on their spectral and spatial properties. Objects
can be classified based on their size, shape, texture, and other attributes. OBIA
can be used as a semi-automatic or fully automatic method, depending on
whether the object selection and classification are done manually or
automatically. OBIA can be used to extract features such as roads, buildings,
and water bodies.
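The texture-analysis approach described in (a) above can be sketched with the grey-level co-occurrence matrix utilities in scikit-image. The patch below is synthetic, and the chosen offsets, angles, and texture measures are illustrative assumptions; note that older scikit-image releases spell the functions greycomatrix and greycoprops.

```python
import numpy as np
from skimage.feature import graycomatrix, graycoprops

# Synthetic 8-bit single-band image patch (replace with a real image window)
rng = np.random.default_rng(1)
patch = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)

# Grey-level co-occurrence matrix for a one-pixel offset in four directions
glcm = graycomatrix(patch, distances=[1],
                    angles=[0, np.pi / 4, np.pi / 2, 3 * np.pi / 4],
                    levels=256, symmetric=True, normed=True)

# Common Haralick-style texture measures derived from the GLCM
for prop in ("contrast", "homogeneity", "energy", "correlation"):
    print(prop, graycoprops(glcm, prop).mean())
```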
Figure 5.7 Temporal change analysis for Aguiboni GP, West Bengal, using IRS LISS-IV and Cartosat VHR data
5.3.3 Temporal feature extraction: Temporal feature extraction involves the
analysis of the temporal changes in the target over time. Temporal features can
provide information about the dynamics of the target, such as growth, decay,
and disturbance. Temporal feature extraction techniques include:
a) Change detection: Change detection involves the identification of changes
in the target between two or more images taken at different times. Change
detection can be done using various techniques, such as image differencing,
image ratioing, and vegetation indices (a short computational sketch is given below).
b) Time series analysis: Time series analysis involves the analysis of a
sequence of images taken at regular intervals over time. Time series analysis
can be used to extract information about the seasonal patterns of vegetation, the
growth and development of crops, and the dynamics of land cover changes.
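As a minimal illustration of the image differencing and ratioing mentioned in (a) above, the following numpy sketch flags pixels whose difference between two co-registered acquisitions exceeds a simple two-standard-deviation threshold. The synthetic arrays and the threshold rule are assumptions for demonstration; operational change detection normally includes radiometric normalisation and more careful thresholding.

```python
import numpy as np

def difference_image(band_t1, band_t2):
    """Simple image differencing between two co-registered acquisitions."""
    return band_t2.astype(np.float64) - band_t1.astype(np.float64)

def ratio_image(band_t1, band_t2, eps=1e-6):
    """Image ratioing; values far from 1 indicate change."""
    return band_t2.astype(np.float64) / (band_t1.astype(np.float64) + eps)

# Synthetic example: flag pixels whose difference exceeds two standard deviations
rng = np.random.default_rng(2)
t1 = rng.normal(0.3, 0.05, size=(200, 200))
t2 = t1.copy()
t2[50:80, 50:80] += 0.2                      # simulated change patch
diff = difference_image(t1, t2)
threshold = 2 * diff.std()
change_mask = np.abs(diff - diff.mean()) > threshold
print("Changed pixels:", int(change_mask.sum()))
```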
Feature extraction in remote sensing involves the identification and extraction
of relevant information from the raw data. Spectral, spatial, and temporal
feature extraction techniques are commonly used in remote sensing analysis
and can be used in combination to extract comprehensive information about the
target.
Machine learning techniques: In addition to the above, machine learning
techniques have recently gained popularity for extracting information from
remote sensing data. They can be used for feature extraction in remote sensing
to automatically identify and extract relevant information from images. Some
common machine learning techniques used for feature extraction in remote
sensing are:
1. Convolutional neural networks (CNNs): CNNs are a type of deep
learning algorithm that have proven to be highly effective for feature extraction
in remote sensing. CNNs learn to identify patterns in the data by passing the
image through a series of convolutional layers, which apply filters to the image
to identify features such as edges, textures, and shapes. The output of the
convolutional layers is then passed through one or more fully connected layers
to perform classification or regression tasks. CNNs can be trained using
labeled data to learn to recognize specific features or classes in the image.
2. Random forests: Random forests are an ensemble learning algorithm that
can be used for feature extraction in remote sensing. Random forests consist of
a collection of decision trees, where each tree is trained on a random subset of
the data. The trees then vote on the final classification or regression output.
Random forests can be used to identify the most important features in the
image based on their contribution to the classification or regression task (a
short implementation sketch is given after this list).
3. Support vector machines (SVMs): SVMs are a type of supervised
learning algorithm that can be used for feature extraction in remote sensing.
SVMs work by finding the hyperplane that maximally separates the different
classes in the data. SVMs can be used for both classification and regression
tasks, and can handle high-dimensional data of remote sensing images.
4. Autoencoders: Autoencoders are a type of neural network that can be used
for unsupervised feature extraction in remote sensing. Autoencoders consist of
an encoder network that learns to compress the input image into a lower-
dimensional representation, and a decoder network that learns to reconstruct
the original image from the compressed representation. The compressed
representation can be used as a feature vector for downstream tasks such as
classification or clustering.
5. Transfer learning: Transfer learning involves using a pre-trained machine
learning model, such as a CNN, on a related task to extract features from
remote sensing images. The pre-trained model is fine-tuned on the remote
sensing data by retraining the final layers of the network for the specific
classification or regression task. Transfer learning can be effective for tasks
where there is limited labeled data available, as it can leverage the features
learned on a related task to improve performance.
Machine learning techniques can be powerful tools for feature extraction in
remote sensing, as they can learn to identify complex patterns in the data and
extract relevant information automatically.
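The sketch below illustrates the random-forest idea described in point 2 above using scikit-learn: a classifier is trained on a table of labelled pixel spectra, and its feature_importances_ attribute indicates which bands or features contribute most to the classification. The synthetic samples, the class rule, and the parameter choices are assumptions made purely for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic training table: one row per labelled pixel,
# columns are spectral bands / indices (replace with real samples)
rng = np.random.default_rng(3)
n_samples, n_features = 500, 6
X = rng.normal(size=(n_samples, n_features))
y = (X[:, 0] + 0.5 * X[:, 3] > 0).astype(int)      # two synthetic classes

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

rf = RandomForestClassifier(n_estimators=200, random_state=0)
rf.fit(X_train, y_train)

print("Test accuracy:", rf.score(X_test, y_test))
# Relative contribution of each band/feature to the classification
for i, importance in enumerate(rf.feature_importances_):
    print(f"feature {i}: {importance:.3f}")
```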
Summary:
Automatic feature extraction plays a crucial role in remote sensing by enabling
efficient and accurate analysis of large amounts of complex data. With the
growing availability of high-resolution satellite and aerial imagery, the need for
automated feature extraction has become increasingly important. Manual
feature extraction can be time-consuming, subjective, and error-prone, making
it impractical for analysing large datasets. Automated feature extraction
algorithms can quickly and objectively identify features of interest, such as
land cover types, objects of interest, and changes over time. This can provide
valuable information for a range of applications, including environmental
monitoring, urban planning, and disaster response. The accuracy and
consistency of automated feature extraction can help to improve the reliability
and efficiency of remote sensing analysis. It has the potential to significantly
reduce the time and cost involved in analysing satellite or aerial imagery,
making it an increasingly important area of research and development in the
field of remote sensing.
Chapter 6
SAR Image Processing
6.1 Introduction
Microwave remote sensing utilizes electromagnetic radiation in the
microwave range, with wavelengths between 1 cm and 1 m. These
microwaves possess the ability to penetrate through clouds, fog, and other
obstructing substances such as ash or powder coverages (e.g., during a volcanic
eruption or building collapse). This unique characteristic enables microwave
remote sensing to operate effectively in any weather condition or environment.
Microwave remote sensing systems can be categorized into two groups:
passive and active. Passive systems capture naturally emitted radiation from
the observed surface, although the amount of energy emitted at microwave
frequencies is usually minimal. These systems generally have lower spatial
resolutions.
On the other hand, active systems feature their own source (transmitter) that
illuminates the observed scene, allowing their operation during both day and
night, regardless of sunlight availability. The sensor transmits a radio signal
within the microwave spectrum and records the portion of the signal that is
backscattered by the target and returned to the sensor. By analyzing the power
and timing of the backscattered signal, different targets within the scene can be
distinguished, and the distance to the target can be measured. This operational
principle is known as RADAR (short for RAdio Detection and Ranging), and it
enables the acquisition of microwave images of the observed scene.
The most commonly employed microwave imaging sensor is Synthetic
Aperture Radar (SAR), a radar system that provides high-resolution microwave
images. SAR images have distinct characteristics compared to conventional
optical images acquired in the visible or infrared bands. Consequently, radar
and optical data can complement each other, offering different informative
contributions.
Synthetic Aperture Radar (SAR) employs different frequency bands for remote
sensing, each with distinct characteristics and applications (Figure 6.1). The X-
band (9.6 GHz) offers high-resolution imaging suitable for land cover
classification, coastal monitoring, and infrastructure analysis. The C-band (5.3
GHz) excels in urban monitoring, agriculture, and soil moisture detection due
to its ability to penetrate vegetation. The S-band (3.1 GHz) is effective for
forest mapping, biomass estimation, and flood monitoring. The L-band (1.25
GHz) penetrates vegetation and soil well, making it useful for land deformation
monitoring, subsurface imaging, and forest structure analysis. Additionally, the
P-band (0.3-0.7 GHz) has great potential for soil moisture estimation,
subsurface imaging, and ice monitoring due to its longer wavelength (Figure
6.2). The choice of SAR band depends on the specific objectives and
environmental characteristics of the remote sensing application.
Figure 6.1: Utilisation of different SAR wavelengths for various applications
Figure 6.2: Penetration of different SAR wavelengths (Source: NASA SAR
Handbook)
6.1.1 Radar Equation
The radar equation relates the power received by a radar, and hence its achievable
range, to the characteristics of the transmitter, receiver, antenna, and target, and to
the distance between the radar and the target. For a monostatic radar it can be written as

Pe = (Ps × G² × λ² × σ) / ((4π)³ × R⁴)

where R = range, λ = wavelength, Ps = transmitted power, Pe = received (reflected)
power, G = antenna gain, and σ = radar cross section.
Radar Cross Section is the ratio of the backscattered energy to the energy that
the sensor would have received if the target was an ideal isotropic reflector.
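A minimal numerical sketch of the radar equation given above is shown below; all input values (transmitted power, antenna gain, wavelength, radar cross section, and slant range) are illustrative assumptions, not parameters of any particular SAR system.

```python
import math

def received_power(p_t, gain, wavelength, sigma, r):
    """Monostatic radar equation: Pr = Pt * G^2 * lambda^2 * sigma / ((4*pi)^3 * R^4)."""
    return (p_t * gain**2 * wavelength**2 * sigma) / ((4 * math.pi)**3 * r**4)

# Illustrative values only (not from any particular sensor)
p_r = received_power(p_t=2000.0,          # transmitted power, W
                     gain=10**(35 / 10),  # 35 dB antenna gain, converted to linear
                     wavelength=0.056,    # C-band, ~5.6 cm
                     sigma=1.0,           # radar cross section, m^2
                     r=700e3)             # slant range, m
print(f"Received power: {p_r:.3e} W")
```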
6.2 SAR Data Processing:
Radiometric calibration of microwave satellite images involves the conversion
of raw digital numbers (DNs) into physically meaningful units, such as
backscatter coefficients or radar cross-section, that can be used for scientific
and operational purposes. The following are some commonly used terms in
radiometric calibration of microwave satellite images:
Digital number (DN): The raw measurement of the satellite sensor,
usually in the form of a digital number (DN), which represents the
amplitude of the received signal.
Sigma naught (σ0): The backscatter coefficient, i.e., the radar cross-section
per unit area of the ground surface; it is the most commonly used measure of
the radar reflectivity of the surface.
Beta naught (β0): The radar brightness coefficient, i.e., the radar cross-section
per unit area in the slant-range (image) plane; it requires no knowledge of the
local incidence angle.
Gamma naught (γ0): The backscatter coefficient normalized to the plane
perpendicular to the line of sight, obtained by dividing σ0 by the cosine of the
local incidence angle; it reduces the dependence of the backscatter on incidence angle.
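The exact conversion from digital numbers to σ0, β0, or γ0 depends on sensor-specific calibration constants supplied with the product metadata, so it is not reproduced here. The short sketch below only illustrates the routine conversion of a calibrated backscatter coefficient between linear power units and decibels, using a synthetic σ0 image.

```python
import numpy as np

def to_db(sigma0_linear, eps=1e-10):
    """Convert backscatter from linear power units to decibels."""
    return 10.0 * np.log10(np.maximum(sigma0_linear, eps))

def to_linear(sigma0_db):
    """Convert backscatter from decibels back to linear power units."""
    return 10.0 ** (sigma0_db / 10.0)

# Synthetic calibrated sigma naught image in linear units
rng = np.random.default_rng(4)
sigma0 = rng.gamma(shape=4.0, scale=0.005, size=(100, 100))
sigma0_db = to_db(sigma0)
print("sigma0 range (dB):", sigma0_db.min(), "to", sigma0_db.max())
```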
6.2.1 Multi-Looking
In microwave imaging, the spatial resolution is limited by the wavelength of
the radar signal, and it is usually coarser than that of optical images. The
speckle noise is caused by the interference of the electromagnetic waves
reflected from the different surface elements of the target, which creates a
grainy appearance in the image.
Multi-looking works by averaging the radar signal from adjacent pixels within
a defined window or kernel, which reduces the speckle noise and increases the
signal-to-noise ratio. The result is a smoother image with improved radiometric
quality, but at the cost of coarser spatial resolution due to the averaging.
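A simple boxcar multi-looking operation can be sketched in numpy as below: the intensity image is averaged over non-overlapping windows (here an assumed 4 range looks × 2 azimuth looks), reducing speckle at the cost of coarser pixel spacing. The speckled test image is synthetic.

```python
import numpy as np

def multilook(intensity, looks_range=4, looks_azimuth=2):
    """Boxcar multi-looking: average intensity over a range x azimuth window."""
    rows = (intensity.shape[0] // looks_azimuth) * looks_azimuth
    cols = (intensity.shape[1] // looks_range) * looks_range
    img = intensity[:rows, :cols].astype(np.float64)
    img = img.reshape(rows // looks_azimuth, looks_azimuth,
                      cols // looks_range, looks_range)
    return img.mean(axis=(1, 3))

rng = np.random.default_rng(5)
slc_intensity = rng.exponential(scale=1.0, size=(512, 512))  # speckled intensity
ml = multilook(slc_intensity, looks_range=4, looks_azimuth=2)
print("Original shape:", slc_intensity.shape, "-> multilooked:", ml.shape)
```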
6.2.2 Speckle
It is a type of noise that is commonly found in Synthetic Aperture Radar (SAR)
images, which is caused by the interference of electromagnetic waves
reflecting off of different surfaces. The presence of speckle in SAR images can
make it difficult to extract useful information from the image. Therefore,
various speckle suppression filters have been developed to reduce the noise and
improve the overall quality of the image. Some of the most common types of
speckle suppression filters used in SAR image processing are described below;
a simple implementation sketch follows the list.
Median Filter: Median filter is commonly used in Synthetic Aperture
Radar (SAR) image processing to reduce speckle noise. By replacing
each pixel's value with the median value of its neighboring pixels, the
filter effectively preserves edges and fine details while smoothing the
overall image. It improves SAR image quality and enhances
interpretability.
Lee Filter: The Lee filter is a commonly used speckle reduction filter
that applies a statistical approach to reduce speckle in SAR images.
The filter works by using a sliding window technique to calculate the
mean and variance of the image pixels. The filter then compares each
pixel to the mean value of the surrounding pixels and replaces it with
a filtered value based on this comparison. This filter is effective at
reducing speckle while preserving the edges and details in the image
(Figure 6.3).
Frost Filter: The Frost filter models speckle as multiplicative noise and
applies an adaptive, exponentially damped convolution kernel whose
weights depend on the local statistics (coefficient of variation) of the
image. This filter is effective at reducing speckle in homogeneous regions
of the image while largely preserving edges.
Gamma Map Filter: The Gamma Map filter is a speckle reduction
filter that works by modelling the statistical properties of the image.
The filter calculates a gamma map for the image, which is used to
weight the contribution of each pixel to the filtered image. The filter
effectively reduces speckle while preserving image details and edges.
Kuan Filter: The Kuan filter is a speckle reduction filter that uses a
combination of adaptive filtering and non-linear diffusion to reduce
speckle in SAR images. The filter works by dividing the image into
small blocks and using the standard deviation of the pixel values
within each block to determine the degree of filtering applied. This
filter is effective at reducing speckle while preserving the sharpness of
edges in the image.
Figure 6.3: EOS-4 image of 28 March 2023 over Kanpur. Left: sigma naught; right: after Lee filtering
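The sketch below illustrates two of the filters described above on a synthetic speckled image: a median filter from scipy and a simplified Lee filter built from local means and variances. The window sizes and the global noise-variance estimate are assumptions for demonstration and differ from the full adaptive formulations used in SAR processing software.

```python
import numpy as np
from scipy.ndimage import median_filter, uniform_filter

def lee_filter(img, size=7):
    """Simplified Lee filter: local MMSE estimate from window mean and variance."""
    img = img.astype(np.float64)
    mean = uniform_filter(img, size)
    mean_sq = uniform_filter(img * img, size)
    var = np.maximum(mean_sq - mean * mean, 0.0)
    overall_noise_var = var.mean()                 # crude global noise estimate
    weight = var / (var + overall_noise_var + 1e-12)
    return mean + weight * (img - mean)

rng = np.random.default_rng(6)
scene = np.ones((256, 256))
scene[100:150, :] = 4.0                            # a brighter strip
speckled = scene * rng.gamma(shape=1.0, scale=1.0, size=scene.shape)

median_result = median_filter(speckled, size=5)
lee_result = lee_filter(speckled, size=7)
print("Speckled std:", speckled.std(), "Lee-filtered std:", lee_result.std())
```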
6.3 Types of Image Distortions
Layover occurs when a tall object such as a building or a mountain is
imaged, and its top appears closer to the radar than its bottom. This
creates an illusion of compression, and the image appears as if the top
of the object is leaning over the bottom.
Foreshortening, on the other hand, occurs when an object is imaged
from an oblique angle, and its apparent height appears shorter than its
actual height. This distortion occurs because the radar signal takes a
longer path when it hits the top of the object than when it hits the
bottom.
Shadow is a third distortion that occurs when an object blocks the
radar signal, creating a dark area in the image. This can make it
difficult to interpret the image, and shadow detection algorithms are
used to identify and remove or minimize the impact of shadows
(Figure 6.4).
Figure 6.4: Image distortion observed in SAR
Geometric distortions such as foreshortening, layover, shadow and other
problems related to special imaging geometry of radar systems, decrease
reliability of radar imageries. Thus, radiometric and geometric corrections and
calibrations must be applied to the radar images before using them.
6.4 Types of Correction
Terrain Correction: Range Doppler Terrain Correction, Radiometric Terrain
Flattening (RTF), and Geocoding: Range Doppler Correction is a process that corrects the distortions
in SAR images caused by the motion of the satellite and the rotation of the
Earth. The motion of the satellite causes a shift in the frequency of the radar
signal, known as the Doppler shift, which leads to range compression or
expansion in the image. The rotation of the Earth causes a change in the
direction of the radar beam, leading to azimuth distortions in the image.
Range Doppler Correction compensates for these distortions by applying a
mathematical transformation to the radar signal. The transformation corrects
for the Doppler shift and azimuth distortions, and produces an image that is
free of motion-related distortions.
After the Range Doppler Correction, the RTF and Geocoding processing steps
can be applied to further improve the quality of the image. RTF removes the
effect of terrain on the radar signal and produces a terrain-flattened image,
while Geocoding transforms the image coordinates to geographic coordinates
for accurate positioning of the image in geographic space.
6.5 SAR Polarimetry and Decomposition
6.5.1 SAR Polarimetry
It is the science of acquiring, processing and analyzing the polarization state of
an electromagnetic field including the magnitude and relative phase. SAR
polarimetry is concerned with the utilization of polarimetry in radar
applications (Table 6.1).
Table 6.1: SAR Polarimetry
Transmitted polarization      Received: H or V     Received: H and V    Received: H and V with relative phase
H or V                        Single Pol [1]       Dual Pol [2]         Dual Polarimetric
H and V                       Dual Pol [2]         Quad Pol [4]         Fully Polarimetric
In SAR (Synthetic Aperture Radar) remote sensing, the decomposition of the
measured signal into different scattering mechanisms is an essential step for
understanding the physical properties of the imaged scene. There are two main
types of decomposition techniques: coherent decomposition and incoherent
decomposition.
6.5.1.1 Coherent Decomposition
Coherent decomposition operates directly on the measured scattering matrix and
therefore requires the full amplitude and phase information of the single-look
complex data (Figure 6.5). The scattering matrix is expressed as a combination of
the responses of elementary (canonical) scatterers, as in the Pauli decomposition
described later in this chapter. Coherent decomposition can be used to identify the
different scattering mechanisms in a scene, such as surface scattering, double-bounce
scattering, volume scattering, and helix scattering.
Figure 6.5: Coherent Scattering
6.5.1.2 Incoherent Decomposition
Incoherent decomposition, also known as decomposition of the covariance or
coherency matrix, is a statistical approach that works on spatially averaged
second-order statistics and does not require the absolute phase of the individual
backscattered signals (Figure 6.6). One of the most commonly used incoherent
decomposition techniques is the Freeman-Durden decomposition, which is
based on the assumption that the backscattered signal from a scene can be
decomposed into three different scattering mechanisms: surface scattering,
double-bounce scattering, and volume scattering.
Figure 6.6: Incoherent Scattering
Both coherent and incoherent decomposition techniques have their advantages
and limitations, and the choice of the decomposition technique depends on the
specific application and the physical properties of the scene being imaged.
Coherent decomposition provides a more detailed and accurate decomposition
of the scattering mechanisms, but it requires a high signal-to-noise ratio and a
stable coherent phase of the backscattered signals. In contrast, incoherent
decomposition is less sensitive to noise and phase errors but provides less
detailed information on the scattering mechanisms.
6.5.2 SAR Decomposition Process
This process of decomposition involves several steps, as outlined below:
6.5.2.1 Multi-looking: The SAR data is often multilooked to reduce speckle
noise and improve the signal-to-noise ratio. Multilooking involves averaging
the complex data over a certain number of looks in both the range and azimuth
directions.
6.5.2.2 Scattering matrix: The fully polarimetric SAR data contains the
complete scattering matrix for each pixel, which provides detailed information
on the scattering properties of objects on the ground (Figure 6.7). The
scattering matrix is a 2x2 matrix that describes the relationship between the
transmitted and received electromagnetic waves. Scattering matrix is used to
identify the scattering behaviour of objects after an interaction with
electromagnetic wave. This matrix is represented by a systematic combination
of horizontal and vertical polarization states of transmitted and received
signals.
S = [ S_HH  S_HV
      S_VH  S_VV ]
Sometimes a scattering mechanism that is not apparent in the linear (H/V) basis
can be highlighted by transforming the data from the linear basis to the Pauli
basis. The Pauli basis is defined by the sum and difference of the co-polarized
terms and twice the cross-polarized term. The feature vector in the Pauli basis is
given by

k_P = (1/√2) [ S_HH + S_VV,  S_HH − S_VV,  2·S_HV ]^T
6.5.2.3 Coherency matrix: The coherency matrix is formed from the Pauli
feature vector and is a 3x3 Hermitian matrix that contains the second-order
polarimetric information. It is given by:

T = ⟨ k_P · k_P^† ⟩

where † represents the conjugate transpose and the elements Tij are complex
values. The coherency matrix can be used to calculate various polarimetric
parameters, such as the polarimetric entropy, degree of polarization, and
polarimetric coherence.
6.5.2.4 Covariance matrix: The covariance matrix is another representation of
the polarimetric data; it is formed in the same way as the coherency matrix, but
from the lexicographic scattering vector k_L = [ S_HH, √2·S_HV, S_VV ]^T rather
than the Pauli vector. It is given by:

C = ⟨ k_L · k_L^† ⟩

where † represents the conjugate transpose and the elements Cij are complex
covariance values. The covariance matrix can be used to calculate various
polarimetric parameters, such as the co-polarization ratio, cross-polarization
ratio, and orientation angle.
6.5.3 Types of Polarimetric Decompositions
The polarimetric decomposition is a mathematical process that separates the
scattering mechanisms of the target into individual components, such as
surface scattering, double-bounce scattering, and volume scattering. Different
polarimetric decompositions can be used, such as the Freeman-Durden
decomposition, Cloude-Pottier decomposition, and Touzi decomposition.
These decompositions involve eigenvalue and eigenvector analysis of the
coherency or covariance matrix.
6.5.3.1 Pauli Decomposition: This is the most basic and widely used
polarimetric decomposition technique in SAR. It is a simple three-channel
decomposition based on the sum of the co-polarized channels (S_HH + S_VV),
their difference (S_HH − S_VV), and the cross-polarized channel (2·S_HV),
which are commonly associated with odd-bounce (surface), even-bounce
(double-bounce), and volume-like scattering, respectively. This technique is
particularly useful for identifying and visualizing the orientation of scattering
structures, such as vegetation, buildings, and ships.
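A common way to visualise the Pauli decomposition is an RGB composite of the three Pauli channels. The numpy sketch below forms such a composite from complex S_HH, S_HV, and S_VV arrays; the synthetic data, the channel-to-colour assignment, and the percentile stretch are illustrative conventions rather than a prescribed standard.

```python
import numpy as np

def pauli_rgb(s_hh, s_hv, s_vv):
    """Pauli RGB composite from complex scattering-matrix channels.
    R = |S_HH - S_VV| (double bounce), G = |2*S_HV| (volume-like),
    B = |S_HH + S_VV| (surface / odd bounce)."""
    r = np.abs(s_hh - s_vv)
    g = np.abs(2.0 * s_hv)
    b = np.abs(s_hh + s_vv)
    rgb = np.stack([r, g, b], axis=-1)
    # Simple percentile stretch for display
    p2, p98 = np.percentile(rgb, (2, 98))
    return np.clip((rgb - p2) / (p98 - p2 + 1e-12), 0, 1)

# Synthetic complex channels standing in for a quad-pol acquisition
rng = np.random.default_rng(7)
shape = (128, 128)
s_hh = rng.normal(size=shape) + 1j * rng.normal(size=shape)
s_hv = 0.3 * (rng.normal(size=shape) + 1j * rng.normal(size=shape))
s_vv = rng.normal(size=shape) + 1j * rng.normal(size=shape)
rgb = pauli_rgb(s_hh, s_hv, s_vv)
print("Pauli RGB composite shape:", rgb.shape)
```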
6.5.3.2 Freeman-Durden Decomposition: This decomposition technique is
used to distinguish between surface, double-bounce, and volume scattering
mechanisms. It separates the scattering from the surface and the volume and
provides information on the double-bounce scattering caused by the interaction
of the radar signal with two different scatterers. The Freeman-Durden
decomposition is often used for land cover classification, particularly in urban
areas. The Freeman-Durden model assumes reflection symmetry; hence T13, T23,
and their conjugates are assumed to be zero (Figure 6.7).
Total Scattering = Surface Scattering + Double-bounce + Volume Scattering
Figure 6.7: Types of scattering mechanism
Figure 6.8: Decomposition of fully polarimetric data (Radarsat-2)
6.5.3.3 Cloude-Pottier Decomposition: This is a powerful technique that
provides information on the dominant scattering mechanisms within a pixel. It
decomposes the polarimetric data into a set of eigenvalues and eigenvectors
that can be used to identify the scattering mechanism. The Cloude-Pottier
decomposition is particularly useful for identifying and characterizing the
scattering properties of man-made objects such as buildings and roads.
6.5.3.4 Yamaguchi Decomposition: This technique separates the polarimetric
data into four scattering components: surface, double-bounce, volume, and
helix scattering. The helix scattering is a unique feature of the Yamaguchi
decomposition and is caused by the interaction of the radar signal with helical
structures, such as tree branches or wires. The Yamaguchi decomposition is
used for analyzing vegetation and forest structures, as well as for identifying
man-made objects.
6.5.3.5 Touzi Decomposition: This technique is based on the coherent
decomposition of the scattering matrix and provides a detailed analysis of the
polarimetric scattering mechanisms. The Touzi decomposition separates the
polarimetric data into four parameters: the surface scattering coefficient, the
dihedral scattering coefficient, the volume scattering coefficient, and the helix
scattering coefficient. This technique is particularly useful for analyzing
complex scattering structures, such as forests and agricultural fields.
6.5.4 Interpretation:
The output of the polarimetric decomposition is a set of images that represent
the different scattering mechanisms. These images can be interpreted to
identify the different types of targets, such as vegetation, buildings, and water
bodies.
6.6 Interferometric Synthetic Aperture Radar
InSAR (Interferometric Synthetic Aperture Radar) processing is a complex
procedure (Figure 6.9) that involves several steps to extract valuable
information about ground deformation and create accurate digital elevation
models.
Figure 6.9: Working Principle of Interferometric SAR
6.6.1 Data Acquisition:
The first step in InSAR processing is to acquire pairs of SAR (Synthetic
Aperture Radar) images. These images should be taken from slightly different
positions in space and time, with a suitable time baseline between them. It is
important to ensure that the images have similar acquisition geometries and are
coregistered to sub-pixel accuracy.
6.6.2 Calibration:
Once the SAR images are acquired, they need to be calibrated to account for
various system-induced artifacts. This involves correcting for sensor-specific
effects such as antenna phase center variations, platform motion, and
radiometric calibration.
6.6.3 Coregistration:
In order to compare the two SAR images accurately, they need to be
coregistered. Coregistration involves aligning the images to sub-pixel
accuracy, compensating for any geometric differences. This step is crucial for
achieving precise interferometric measurements.
6.6.4 Interferogram Formation:
The next step is to generate an interferogram by combining the two
coregistered SAR images (Figure 6.10). This is done by pixel-by-pixel
differencing of the complex radar data. The resulting interferogram represents
the phase difference between the two acquisitions and contains information
about ground deformation.
Figure 6.10: Interferogram of the Türkiye earthquake using Sentinel-1 data
(Source:
https://www.esa.int/ESA_Multimedia/Images/2023/02/Tuerkiye_Syria_interferogram)
6.6.5 Phase Correction:
The interferometric phase contains not only the information about ground
deformation but also atmospheric and other noise-related effects. These
artifacts need to be corrected to obtain accurate deformation measurements.
System-induced errors, such as topographic variations and antenna phase
center variations, are also corrected during this stage.
6.6.6 Phase Unwrapping:
The interferometric phase is inherently wrapped within a limited range,
causing phase discontinuities. To obtain a continuous phase field, phase
unwrapping algorithms are applied (Figure 6.11). These algorithms reconstruct
the unwrapped phase by adding multiples of 2π to the wrapped phase values,
ensuring a smooth and continuous phase distribution.
Figure 6.11: Wrapped Phase vs Unwrapped Phase
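The effect of wrapping and unwrapping can be demonstrated with the 2-D phase unwrapping routine available in scikit-image, as sketched below on a synthetic phase ramp. Real interferograms additionally contain noise and discontinuities, so operational unwrapping (for example with branch-cut or minimum-cost-flow methods) is considerably more involved.

```python
import numpy as np
from skimage.restoration import unwrap_phase

# Build a synthetic, smoothly varying interferometric phase surface
y, x = np.mgrid[0:256, 0:256]
true_phase = 0.08 * x + 0.03 * y              # unwrapped phase in radians

# Wrapping maps the phase into the interval (-pi, pi]
wrapped = np.angle(np.exp(1j * true_phase))

# 2-D phase unwrapping recovers a continuous phase field
# (up to an arbitrary constant offset)
unwrapped = unwrap_phase(wrapped)
offset = true_phase.mean() - unwrapped.mean()
print("Max error after removing constant offset:",
      np.abs((unwrapped + offset) - true_phase).max())
```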
6.6.7 Interferogram Filtering:
Interferograms often contain noise and unwanted signals that can affect the
accuracy of the deformation measurements. Filtering techniques are applied to
suppress the noise and enhance the coherent deformation signal. Adaptive
filtering, multi-looking, and spatial filtering methods are commonly employed
to improve the signal-to-noise ratio and enhance the quality of the
interferogram.
6.6.8 Phase-to-Height Conversion:
The next step is to convert the interferometric phase values into meaningful
displacement measurements. This process, known as phase-to-height
conversion, requires information about the radar wavelength and the baseline,
which is the perpendicular distance between the satellite positions during
image acquisition. By combining this information, it is possible to estimate the
three-dimensional displacement of the Earth's surface.
6.6.9 DEM Generation:
InSAR processing also enables the creation of highly accurate Digital
Elevation Models (DEMs). By comparing multiple interferograms acquired
from different pairs of SAR images, it is possible to obtain a dense network of
height measurements. These measurements are then combined and adjusted to
create a detailed and precise representation of the topography.
6.6.10 Interpretation and Analysis:
Once the InSAR processing steps are completed, the resulting products can be
interpreted and analyzed for specific applications. Ground deformation
measurements can be used to monitor volcanic activity, landslides etc.
6.7 Applications of SAR:
Flood Monitoring and Mapping: SAR is instrumental in monitoring and
mapping flood events. By capturing images during and after floods, SAR
can detect changes in water bodies, identify flood extent (Figure 6.12), and
aid in flood management and response. SAR's ability to penetrate clouds
and capture data day or night makes it particularly useful during disaster
situations.
Figure 6.12: Pre-flood vs post-flood comparison using Sentinel SAR over the
Sabari River
Forestry: SAR plays a crucial role in forest monitoring and management. It
can provide information on forest structure, biomass estimation, and
deforestation detection.
Geology and Geohazards: SAR is widely used in geology and geohazard
studies. It can detect and monitor ground deformation associated with
earthquakes, volcanic activity, and landslides.
Oil Spill Detection: SAR is highly effective in detecting and monitoring
oil spills in marine environments. SAR's sensitivity to the roughness of the
sea surface and its ability to detect changes in backscatter make it a
valuable tool for oil slick monitoring and tracking (Figure 6.13).
Figure 6.13: Oil spill from the tanker Princess Empress off the coast of the
Philippines
Agriculture and Crop Monitoring: SAR is used in agriculture to monitor
crop growth, estimate biomass etc. It can provide valuable information on
soil moisture and vegetation characteristics. This data aids in optimizing
irrigation practices, predicting crop yield, and supporting agricultural
management decisions.
Glaciology and Cryosphere Studies: SAR is extensively used in
glaciology and polar studies to monitor ice sheets, glaciers, and polar
regions. It can measure ice motion, detect changes in ice cover, and
estimate ice thickness.
Summary
This chapter provides details of SAR image processing. Types of image
distortion in SAR data are explained in detail. SAR polarimetry and InSAR data
processing, and their applications in various fields, are elaborated with examples.
Chapter 7
Hyperspectral Image Analysis
7.1 Introduction
Hyperspectral remote sensing involves collecting and analyzing data from a
range of electromagnetic wavelengths in order to extract information about
objects on the Earth's surface or any other planetary surface (Camps-Valls et
al., 2011; Goetz et al., 1985; Richard and Jia, 2006). Sensors typically collect
this data on airborne or spaceborne platforms, which measure the radiance or
energy reflected or emitted by the objects. By analyzing the unique spectral
signatures of different objects, hyperspectral remote sensing is used to identify
and map different types of vegetation, minerals, water bodies, and other
features of interest. This technique is utilized in a variety of fields, such as
agriculture, environmental monitoring, geology, food technology, and
archaeology, among others.
Hyperspectral imaging, also called imaging spectroscopy, collects information
in a spectral vector with hundreds or thousands of elements from every pixel in
a given scene, providing a continuous spectrum for each image cell. The result
is a hyperspectral image (HSI), or hyperspectral data cube. It can be interpreted
as a stack of images representing the radiance in
each respective band or wavelength interval, as shown in the illustration below
(Figure 7.1).
Multispectral remote sensors such as Landsat Thematic Mapper, IKONOS,
IRS LISS-II, LISS-III, Sentinel-2, and SPOT XS, on the other hand, produce
images with a few relatively broad wavelength bands. The primary drawback
of multispectral data is that it gathers information within wide wavelength
bands, consequently restricting the extent of available spectral information.
Figure 7.1a: Concept of hyperspectral image (Source- Bioucas-Dias et al.,
2013) and spectral signature. b. EO-1 Hyperion 3D spectral cube of Udaipur in
a natural color composite. c & d show the spectral signatures of vegetation,
soil, and water extracted to demonstrate their differences in spectral properties
in hyperspectral and multispectral data (Sentinel-2).
7.2 Hyperspectral Datasets
NASA's Earth Observing-1 (EO-1) Hyperion instrument was the first ever
space-borne hyperspectral instrument, launched in the year 2000. It provided
continuous spectral information in terms of spectral profiles across the broad
electromagnetic spectrum ranging from 400 nm to 2500 nm. Hyperion has a
total of 242 spectral channels with 30 m spatial resolution. Examples of other
hyperspectral sensors are listed in Table 7.1.
Interpreting the information captured by hyperspectral images requires a good
understanding of the properties of the ground materials under measurement and
of how these properties relate to the measurements obtained by the hyperspectral sensor.
Hyperspectral images require the removal of atmospheric and terrain effects
before any interpretation, only after which the image spectra are comparable
with field or laboratory reflectance spectra. Therefore, pre-processing is critical
and an essential step before any scientific analysis.
Table 7.1: Hyperspectral sensors
PARAMETER                  AVIRIS    HYDICE   CHRIS    PRISMA   HyspIRI             EnMAP    HYPERION
Altitude (km)              20        1.6      556      614      626                 653      705
Spatial resolution (m)     20        0.75     36       5-30     60                  30       30
Spectral resolution (nm)   20        7-14     1.3–12   10       4-12                6.5–10   10
Spectral coverage (µm)     0.4–2.5   0.4–2.5  0.4–2.5  0.4–2.5  0.38–2.5 & 7.5–12   0.4–2.5  0.4–2.5
Number of bands            224       210      63       238      217                 228      220
7.3 Pre-processing of Hyperspectral Data
Hyperspectral sensors typically provide images in raw digital numbers (DN),
representing the sensor's measured radiance values. Several corrections and
calibrations must be applied to convert these raw digital numbers into surface
reflectance. Firstly, radiometric calibration is performed to convert the raw
digital numbers into at-sensor or top-of-atmosphere (TOA) radiance values,
which account for the sensor characteristics. This involves using the sensor's
characteristic gain and offset values to convert the raw digital numbers into
radiance values.
Next, atmospheric correction is applied to estimate the radiance or reflectance at
the Earth's surface, removing unwanted effects introduced by atmospheric
interference. As sunlight passes through the atmosphere, it
gets partially absorbed and scattered, which can influence the spectral values
measured by the sensor. Various atmospheric correction algorithms are applied
to estimate atmospheric conditions and remove their influence on spectral
values.
Finally, geometric and surface corrections need to be applied for the removal
of illumination, viewing angle, and the surface's structural and optical
properties effect. This involves correcting for things like shadows, topography,
and surface orientation, as well as accounting for differences in reflectance due
to the surface properties such as vegetation, water, and soil. These corrections
and calibrations are necessary to convert the raw digital numbers into valid
surface reflectance values for advanced information extraction techniques such
as classification and feature extraction.
Figure 7.2: Steps of conversion of DN values to Surface Reflectance (Source:
Bioucas-Dias et al., 2013).
Step 1. Conversion of DN to spectral radiance
The DN to radiance conversion step is based on gain and bias information in
each band of the sensor, which is provided by the calibration team (Figure 7.3).
This transformation relies on a calibration curve of DN to radiance, which is
measured pre-launch during the calibration of the sensor. As sensor accuracy
changes over time, re-calibration of the sensor is carried out periodically, and
gain and offset values are provided with the satellite data. The lower (Lmin)
and upper (Lmax) limits of the post-calibration spectral radiance range define
the gain and bias values.
Figure 7.3: Calibration curve of the sensor. Gain represents the gradient, and
Bias is the spectral radiance of the sensor for a DN of zero.
The formula to convert DN to radiance using gain and bias values is:

Lλ = DN × gain + bias    (units: mW cm-2 ster-1 µm-1)

Where:
Lλ is the cell value as radiance
DN is the cell value as digital number
gain is the gain value for a specific band
bias is the bias value for a specific band

Gain can be calculated using the equation:

Gain = (Lmax − Lmin) / 255
Step 2. Spectral radiance to reflectance conversion
The apparent reflectance, also termed top-of-atmosphere (TOA) reflectance, ρ, is
defined as the ratio of the measured radiance, L, to the solar irradiance incident
at the top of the atmosphere, and is expressed as a decimal fraction between 0 and 1:

ρ = (π × L × d²) / (ESUN × cos(SZ))
ρ = unitless reflectance (ranges 0-1)
π = 3.141593
L = Spectral radiance at sensor aperture in mW cm-2 ster-1 µm-1
d² = the square of the Earth-Sun distance in astronomical units = (1 − 0.01674
cos(0.9856 (JD − 4)))², where JD is the Julian Day (day number of the year) of
the image acquisition.
ESUN = Mean solar atmospheric irradiance in mW cm-2 µm-1.
SZ = sun zenith angle in radians when the scene was recorded.
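Steps 1 and 2 can be combined in a short numpy sketch as below. The gain, bias, ESUN, sun zenith angle, and Earth-Sun distance used here are illustrative assumptions; in practice they are taken from the sensor metadata and solar irradiance tables, and the sun zenith angle supplied in degrees is converted to radians before use.

```python
import numpy as np

def dn_to_radiance(dn, gain, bias):
    """Step 1: L = DN * gain + bias (band-specific gain and bias)."""
    return dn.astype(np.float64) * gain + bias

def radiance_to_toa_reflectance(radiance, esun, sun_zenith_deg, earth_sun_dist):
    """Step 2: rho = pi * L * d^2 / (ESUN * cos(SZ))."""
    sz = np.deg2rad(sun_zenith_deg)
    return (np.pi * radiance * earth_sun_dist**2) / (esun * np.cos(sz))

# Illustrative, not sensor-specific, values
dn = np.array([[120, 180], [60, 240]], dtype=np.uint16)
radiance = dn_to_radiance(dn, gain=0.05, bias=1.2)       # mW cm-2 ster-1 um-1
rho = radiance_to_toa_reflectance(radiance, esun=185.0,
                                  sun_zenith_deg=35.0, earth_sun_dist=1.0)
print("TOA reflectance:\n", rho)
```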
Step 3. Removal of atmospheric effects due to absorption and scattering
Atmospheric correction techniques are categorized into absolute (empirical)
and relative atmospheric corrections (Van der Meer, 1999) and are explained in
the next section.
7.4 Absolute Atmospheric Correction Techniques
In this method, a priori knowledge of the surface characteristics and
atmospheric model is not required. This method corrects the image data for
scattering and absorption of water vapor, mixed gases, and topographic effects
(AIG, 2001). Radiative transfer codes such as LOWTRAN (LOW-resolution
atmospheric TRANsmission) and MODTRAN (MODerate resolution atmospheric
TRANsmission) utilize the scattering and transmission properties of the
atmosphere to assess the variance between the radiation emitted by the Earth's
surface and the radiation detected by the sensor. It can model the scattering
effects in the atmosphere (Van der Meer, 1999). MODTRAN is coded in
FORTRAN and is licensed to U.S. Air Force. It is designed for visible to far
infrared region modeling with a spectral resolution of 100 µm. These codes are
designed to model various types of atmospheric conditions and can be applied
to a wide range of atmospheric scenarios. Their purpose is to calculate the
atmospheric radiance spectrum on a pixel-by-pixel basis.
Different atmospheric correction modules are available –
Atmospheric CORrection (ACORN) (Goetz, et. al., 2002),
ATmospheric CORrection (ATCOR2 and ATCOR 3),
Fast Line-of-sight Atmospheric Analysis of Spectral Hypercubes
(FLAASH)
For Hyperion images, FLAASH is commonly used for atmospheric correction.
FLAASH was developed by the Air Force Research Laboratory, Space Vehicles
Directorate (AFRL/VS) and is a physics-based algorithm derived from the
MODTRAN4 radiative transfer code (Felde et al., 2003). It is used to remove
unwanted atmospheric effects caused by scattering and absorption by
molecules and particulate matter from the sensor-received radiance and to
provide a reflectance image of the surface.
spectral radiance at a sensor pixel (L), specifically designed for the solar
wavelength range and flat Lambertian materials. FLAASH is available in
ENVI software and is widely used by scientific community for atmospheric
corrections (ENVI Manual, 2005).
7.5 Relative Atmospheric Correction Techniques
Relative atmospheric correction methods work directly on image brightness
values; the reflectance values of pixels are computed relative to each other (Van
der Meer, 1999). A priori knowledge of the surface characteristics and an
atmospheric model is not required in this method. Commonly used methods
include:
Logarithmic residuals
Flat field correction
Internal Average Relative Reflectance Correction,
Empirical Line Correction.
QUick Atmospheric Correction- QUAC
Logarithmic residuals, or shortly log residuals correction, account for the
illumination, reflectance, and topographic factors. The method employs the
logarithm on the resulting data obtained after dividing the radiance value of
each wavelength by the geometric mean of all channels.
Flat Field Correction - The approach assumes that a particular area in the
image exhibits spectrally neutral reflectance (minimal variation with
wavelength). The mean reflectance curve of this "flat field" is then utilized to
derive the relative reflectance spectra of all other image pixels.
Internal Average Relative Reflectance (IARR) correction allows calibration when no sensor information is
available (Kruse, 1988). This technique involves using the average reference
spectrum of an entire image to divide the radiance spectrum of each pixel in
the image, resulting in the relative reflectance spectrum for each pixel. This
method may introduce artefacts that appear as spurious spectral features (Van der Meer, 1999).
The IARR approach and the "flat field" approach are Scene-Based Empirical
Approaches and are independent of any field measurements of reflectance
spectra.
The Empirical Line Method (ELM) requires two or more calibration targets
(at least one bright and one dark target) with known reflectance values. A
linear regression equation (i.e., the empirical line) is derived for each spectral
band to derive the gain and offset curves (Karpouzli & Malthus, 2003). After
obtaining the gain and offset curves, it is applied to the entire image to derive
surface reflectance for the entire scene. The resulting reflectance spectra
obtained after applying this approach resemble laboratory-based reflectance
spectra.
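A minimal sketch of the empirical line method for a single band is given below: a straight line is fitted between image values and known target reflectances and then applied to the whole band. The two calibration targets and their values are hypothetical; in practice several targets and per-band fits are used.

```python
import numpy as np

def empirical_line_coefficients(at_sensor, field_reflectance):
    """Fit reflectance = gain * at-sensor value + offset for one band,
    using calibration targets with known field/lab reflectance."""
    gain, offset = np.polyfit(at_sensor, field_reflectance, deg=1)
    return gain, offset

# Two calibration targets (one dark, one bright) measured in a single band
at_sensor_radiance = np.array([12.0, 85.0])      # image-derived values
known_reflectance = np.array([0.04, 0.62])       # field spectrometer values

gain, offset = empirical_line_coefficients(at_sensor_radiance, known_reflectance)

# Apply the fitted line to the whole band to obtain surface reflectance
band = np.array([[15.0, 40.0], [70.0, 90.0]])
reflectance = gain * band + offset
print("gain =", round(float(gain), 4), "offset =", round(float(offset), 4))
print(reflectance)
```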
Quick Atmospheric Correction (QUAC) derives the atmospheric
compensation parameters directly using the pixel spectra of the scene (Figure
7.4). The approach is based on finding the mean spectrum of a diverse
collection of material spectra, such as the end-member spectra in a scene,
which remains essentially constant and unchanged from one image to another
(Bernstein et al., 2005). It allows the retrieval of reasonably accurate
reflectance spectra even without proper sensor radiometric or wavelength
calibration or when the solar illumination intensity is unknown. This method
is much faster than first-principles methods, making it potentially suitable
for real-time applications.
Figure 7.4: Example of Hyperion spectra showing effect of atmospheric
correction. Note the difference between at-sensor pixel spectra (left) and
atmospherically corrected surface reflectance spectra (right) using QUAC. At
the 1400 nm and 1900 nm regions, the effect of water vapour and strong
atmospheric attenuation cannot be fully removed.
Other important pre-processing steps in hyperspectral image analysis include
bad band removal, i.e., removing the bands that carry little or no information,
and destriping.
7.6 Bad band removal
Different ground objects are characterized by different spectral characteristics
forming the physical basis for target detection or mapping. Band combinations
of hyperspectral data play an important role while detecting or separating one
specific target. Choosing a specific band combination is quite tricky and
depends upon the interpreter's knowledge. One band may be effective for a
particular target, but when dealing with different targets, the informative band
combinations may vary. Even for the same target, the informative band
combination can alter by changing the background or environmental
conditions. The choice of band combinations should be tailored to suit the
specific characteristics of the target and its surroundings to ensure optimal
results in different scenarios. However, certain bands in the datasets that
provide little information to detect any target in the scene and have a low SNR
value are considered Bad Bands. The number and locations of bad bands will
change in different scenes and regions of study.
In the case of Hyperion, which has 242 bands, bands 1 to 7 and 225 to 242
have zero values and are not useful. Additionally, bands 58 to 76 fall in the
overlap region of the two spectrometers and have higher noise levels, making
them bad bands as well. Therefore, for the Hyperion dataset, only 196 bands
out of 242 are considered good bands. The water vapor absorption bands 120
to 132 (1346 nm to 1467 nm), bands 165-182 (1800 to 1971 nm), and bands
221 and higher (above 2356 nm) also need to be eliminated (Beck, 2003). An
example of bad bands is shown in Figure 7.5. The list of bad bands of
Hyperion data is listed in Table 7.2.
Table 7.2: List of bands which are eliminated from Hyperion data including
the water absorption bands.
Bands Description
1 to 7 Not Illuminated
58 to 78 Overlap Region
120 to 132 Water Vapour Absorption Band
165 to 182 Water Vapour Absorption Band
185 to 187 Identified by Hyperion Bad Band List
221 to 224 Water Vapour Absorption Band
Figure 7.5: Example of bad bands in Hyperion data.
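Using the intervals listed in Table 7.2, bad-band removal can be sketched as a simple boolean mask over the band axis of the data cube, as below. The cube here is a synthetic stand-in for a Hyperion scene, and the exact intervals should be adapted to the scene and sensor at hand.

```python
import numpy as np

# Bad-band intervals for Hyperion (1-based band numbers, following Table 7.2)
BAD_BAND_RANGES = [(1, 7), (58, 78), (120, 132), (165, 182), (185, 187), (221, 242)]

def good_band_mask(n_bands=242, bad_ranges=BAD_BAND_RANGES):
    """Boolean mask that is True for bands to keep."""
    mask = np.ones(n_bands, dtype=bool)
    for start, stop in bad_ranges:
        mask[start - 1:stop] = False     # convert 1-based ranges to 0-based slices
    return mask

# cube shape: (rows, cols, bands); synthetic stand-in for a Hyperion scene
cube = np.zeros((10, 10, 242), dtype=np.float32)
mask = good_band_mask()
clean_cube = cube[:, :, mask]
print("Bands kept:", int(mask.sum()), "of", mask.size)
```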
7.8 Destriping
Hyperspectral data sets are often affected by striping noise, i.e., intensity
variations in either the rows or the columns of an image. Striping results from
sensor or viewing conditions and can affect the image along either the scan direction
or the cross-track direction (Kruse, 1988). These stripes and the corrupted
pixels are referred to as abnormal pixels. Along-track striping is frequently
encountered and is primarily caused by either the drifting of the detector array
element's radiometric responses or issues arising from the readout electronics.
Vertical stripes occur when a detector in the VNIR or SWIR array deviates from
its normal response or from that of its neighbours (Figures 7.6 and 7.7). The
impact of striping can be minimized or eliminated through "fine-tuning" the
calibrations, and it varies for each detector array. While striping may be
present in most channels to some extent, it affects mostly the SWIR and lower
signal-to-noise ratio (SNR) channels.
Figure 7.6: Striping in a Hyperion image FCC RGB = (468, 447, 427) nm.
Scan direction is from left to right. Source: https://doi.org/10.1117/12.2014317.
Figure 7.7: Examples of Vertical Striping in a Hyperion bands of
Udaipur Scene.
Because of the high resolution of the images, individual abnormal pixels hold
little information compared with the whole image. During de-striping, the values
of abnormal pixels or bad columns are therefore replaced with estimates based on
the mean and standard deviation of the neighbouring pixels.
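A very simple destriping approach consistent with the description above is column-wise moment matching: each column is rescaled so that its mean and standard deviation match reference statistics. The sketch below applies this to a synthetic band with two miscalibrated columns; operational destriping of Hyperion data typically uses more careful local statistics.

```python
import numpy as np

def destripe_columns(band, ref_mean=None, ref_std=None):
    """Column-wise moment matching: adjust each column so its mean and
    standard deviation match the band-wide (or supplied reference) statistics."""
    band = band.astype(np.float64)
    col_mean = band.mean(axis=0)
    col_std = band.std(axis=0)
    ref_mean = band.mean() if ref_mean is None else ref_mean
    ref_std = band.std() if ref_std is None else ref_std
    col_std = np.where(col_std == 0, 1.0, col_std)
    return (band - col_mean) / col_std * ref_std + ref_mean

# Synthetic band with a few miscalibrated (striped) columns
rng = np.random.default_rng(8)
band = rng.normal(100.0, 10.0, size=(300, 300))
band[:, 50] += 40.0         # bright stripe
band[:, 120] -= 35.0        # dark stripe
corrected = destripe_columns(band)
print("Column-mean spread before:", band.mean(axis=0).std(),
      "after:", corrected.mean(axis=0).std())
```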
In order to perform segmentation and classification on hyperspectral data sets,
it is essential to pre-process the images to ensure accurate spectral profiles and
expected pixel values. This typically involves applying various techniques such
as spectral calibration, atmospheric correction, noise reduction, and radiometric
calibration to prepare the image data for analysis. The complete processing of
the hyperspectral data is summarized in Figure 7.8.
Figure 7.8: Processing chain of the hyperspectral data.
7.9 Dimensionality Reduction
Hyperspectral images typically comprise hundreds of bands that offer high
spatial and spectral information. However, the large size of these images can
present significant constraints regarding data handling and processing. For a
more concise and meaningful interpretation, it may be necessary to reduce the
dimensionality of the image without sacrificing essential information.
Dimensionality reduction can be achieved either by feature selection or by
feature extraction. Feature extraction methods create a new, smaller set of
features by transforming or combining the existing information within the
feature space, typically using algorithms such as Principal Component
Analysis (PCA) and Minimum Noise Fraction (MNF) (Green et al., 1988; Lee
et al., 1990). Feature selection, on the other hand, retains a subset of features
chosen from the original set.
PCA is one of the best methods for feature extraction for dimensionality
reduction. PCA transforms multidimensional image data into a new,
uncorrelated set of axes or vector spaces known as the principal axes
(Rodarmel et al., 2002; Wold et al., 1987). In the transformed dataset, the
maximum variance is observed along the first axis. The second axis, which is
mutually orthogonal to the first, exhibits the next highest variance, followed
by subsequent axes in descending order of variance; the PCA images are
ordered by their eigenvalues in decreasing order of variance. This means
that the image with the highest variance is assigned the first principal
component, followed by the second highest variance for the second
component, and so on until all components are determined. Useful PCA images
are then selected based on the eigenvalues or visual interpretation. However,
sometimes even lower-order PCs may contain valuable information. Figure 7.9
shows the PCA images derived from Hyperion data; the majority of the
variability is accounted for by the first few PCA bands, while the remaining
bands contain mostly noise. Comparing PC1 to PC9 in Figure 7.9 shows
that each PC is different from all the others (because all are de-correlated). The
use of PCA band combinations proves to be highly effective in distinguishing
between different surface materials highlighted with various colors, as in
Figure 7.9 (extreme right).
Figure 7.9: PCA components of the Hyperion image over Udaipur showing the
increase in the noise component from PCA 1 to PCA 6 and the discrimination
of different surface materials in a PCA RGB combination.
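The PCA transform described above can be sketched on a reshaped data cube using scikit-learn; the cube dimensions and the number of retained components below are assumptions for illustration, not Hyperion-specific values.

import numpy as np
from sklearn.decomposition import PCA

# Hypothetical hyperspectral cube (rows, cols, bands) after bad-band removal.
rows, cols, bands = 100, 100, 158
cube = np.random.rand(rows, cols, bands).astype(np.float32)

# Reshape so that each pixel spectrum is one observation.
X = cube.reshape(-1, bands)

pca = PCA(n_components=10)            # keep the first 10 principal components
pcs = pca.fit_transform(X)            # shape: (pixels, 10)

pc_images = pcs.reshape(rows, cols, -1)
print("Explained variance ratios:", np.round(pca.explained_variance_ratio_, 3))

The explained-variance ratios typically confirm the pattern seen in Figure 7.9: the first few components carry most of the variability.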
Minimum Noise Fraction (MNF) is a two-step component transformation used
to identify the number of critical informative bands in high-dimensionality
data, segregate the noise, and reduce computational bands for further
processing (Green et al., 1988). Minimum noise fraction (MNF) transform is a
linear transform performed in two steps. First, noise whitening is applied in
which the noise covariance matrix is generated to decorrelate and rescale the
noise in the data. Then Principal Component Analysis (PCA) transform is
performed on the noise-whitened data to obtain the MNF components. MNF
(Minimum Noise Fraction) relies on the eigenvectors derived from the
covariance structure of the noise present in the image dataset. Unlike PCA
(Principal Component Analysis), MNF is particularly advantageous in
generating images ordered based on image quality. Figure 7.10 shows the first
six MNF bands of the Hyperion data cube, in which the information content
decreases and the noise component increases towards the higher components.
The first ten bands are largely free from noise (Figure 7.10). As shown in
Figure 7.10, the eigenvalues approach a value of about 2 beyond the first 20
MNF bands, so the first 20 MNF bands are considered suitable and sufficiently
noise-free for classification. After MNF band 20, the eigenvalues are nearly
constant and parallel to the horizontal axis.
Figure 7.10 (left): Graphical representation of the eigenvalues versus
eigenvalue number for the Hyperion Udaipur image. Region 'A' represents
image data with high eigenvalues and region 'B' has low eigenvalues (high
noise). (Right) MNF transform output channels for the Hyperion data cube
showing a steady increase in the noise level.
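A simplified sketch of the two-step MNF transform is given below. It assumes the noise covariance can be estimated from differences between neighboring samples, which is a common but crude approximation; production implementations (such as the one in ENVI) use more careful noise estimation.

import numpy as np

def mnf(X):
    # X has shape (pixels, bands).
    # Step 1: noise whitening, with the noise covariance estimated from sample differences.
    noise = np.diff(X, axis=0)
    noise_cov = np.cov(noise, rowvar=False)
    w_vals, w_vecs = np.linalg.eigh(noise_cov)
    whiten = w_vecs @ np.diag(1.0 / np.sqrt(np.maximum(w_vals, 1e-12))) @ w_vecs.T
    Xw = (X - X.mean(axis=0)) @ whiten

    # Step 2: PCA on the noise-whitened data gives the MNF components.
    data_cov = np.cov(Xw, rowvar=False)
    d_vals, d_vecs = np.linalg.eigh(data_cov)
    order = np.argsort(d_vals)[::-1]                # descending eigenvalues
    return Xw @ d_vecs[:, order], d_vals[order]

X = np.random.rand(10000, 158)                      # hypothetical (pixels, bands)
mnf_components, eigenvalues = mnf(X)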
7.10 Endmember Extraction
End members are considered the purest pixels in an imaged scene (Keshava &
Mustard, 2002). Spectral unmixing is often performed to decompose the mixed
pixels in the hyperspectral image into their respective endmembers and
abundances. The abundance fractions represent the proportion of
each end member that contributes to the mixed pixel spectrum. The Pixel
Purity Index (PPI) algorithm is used to extract the purest spectral signature
from the data cube (Boardman et al., 1995). The endmember spectra are
further used to identify the different classes present in the whole image. The
accuracy of these spectral profiles depends entirely on the pre-processing
corrections applied to the image. Figure 7.11 (left) shows the PPI output with
the pure pixels obtained over 10,000 iterations, which are used as candidate
points. Figure 7.11 (middle) shows the n-D visualizer, which is used to
visualize data in n-dimensional space. It allows users to locate, identify, and
cluster the purest pixels and the most extreme spectral responses, also known
as endmembers, within a dataset (ENVI User's Guide, 2001). Different pixel
classes are marked in different colors, and the reflectance spectra of the
endmembers represent the vegetation, water, and soil classes. These
endmembers are further used for the classification of the image (Figure 7.11,
right).
Figure 7.11: PPI output showing number of pure pixels extracted (left), n-D
visualizer used to locate pure pixels (middle), and pure pixels endmember
extracted (right).
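The core of the PPI algorithm, repeatedly projecting the data onto random unit vectors and counting how often each pixel falls at an extreme, can be sketched as below. The iteration count, input dimensions, and candidate cut-off are placeholders, not the ENVI defaults.

import numpy as np

def pixel_purity_index(X, n_iterations=10000, seed=0):
    # X has shape (pixels, bands); higher counts indicate purer (more extreme) pixels.
    rng = np.random.default_rng(seed)
    counts = np.zeros(X.shape[0], dtype=np.int64)
    for _ in range(n_iterations):
        direction = rng.normal(size=X.shape[1])
        direction /= np.linalg.norm(direction)       # random unit vector (skewer)
        projection = X @ direction
        counts[np.argmin(projection)] += 1           # count both extremes of the projection
        counts[np.argmax(projection)] += 1
    return counts

X = np.random.rand(5000, 30)                         # hypothetical pixels in a reduced (e.g. MNF) space
ppi_counts = pixel_purity_index(X, n_iterations=2000)
candidates = np.argsort(ppi_counts)[-20:]            # most frequently extreme pixels as endmember candidates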
7.11 Classification
Classification is an information extraction technique that uses spectral analysis
algorithms to characterize the study scene based on the spectral reflectance of
different objects or features. This process allows for identifying and
categorizing different classes or land cover types present in the imagery based
on their unique spectral signatures (Pignatti et al., 2009). The classification
algorithm assigns a unique label or class to each pixel vector in an image based
on a given set of observations. These observations typically consist of spectral
signatures or other relevant features extracted from the pixels, and the
classification process aims to categorize each pixel into predefined classes or
categories based on its similarity to the observed patterns.
Two main approaches are used in hyperspectral classification, supervised and
unsupervised, distinguished by whether training datasets are used. The most
common approaches for identification or classification of hyperspectral data
are Spectral Angle Mapper, Spectral Feature Fitting, Maximum Likelihood
(ML) methods, Neural Network architectures (Zhong & Zhang, 2012), Support Vector
Machine (SVM) (Melgani & Bruzzone, 2004), Bayesian approach (Mohamed
& Farag, 2005) as well as Kernel methods (Camps-Valls et al., 2006).
7.11.1 Supervised Classification
This method relies on training samples that the user provides for the different
classes of interest. The image is classified into the desired categories based on the
training samples. The resultant accuracy is high since the user’s domain
knowledge is utilized in this classification. However, this collection of training
samples is a tedious and time-consuming task. The user first selects the classes
of interest which correspond to information classes. The algorithm compares
the similarity between the known and unknown pixels. The unknown pixels are
assigned to a particular class based on the highest likelihood of matching with
a member of that class.
Spectral Angle Mapper - SAM is a supervised classification algorithm that
employs spectral angular information to classify hyperspectral data (Kruse et
al., 1993). In this method, each pixel in a hyperspectral image is represented as
an n-dimensional vector, with n being the number of spectral bands. The
algorithm calculates the spectral angle between the target spectrum and a
reference. The variation in the angle between the image-derived spectrum and
the end member spectrum is the measure of discrimination in SAM
classification. A smaller angle represents a closer match to the reference
spectrum and vice versa. Pixels whose angle exceeds a specified maximum
angle threshold (in radians) remain unclassified (Figure 7.12).
Figure 7.12: (a) SAM classification image of the part of Hutti-Maski area,
Karnataka India using AVIRIS-NG data (b) Close view of the mineralised zone
(c). Reflectance spectra of endmembers selected using n-D visualizer.
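The spectral angle underlying SAM follows directly from its definition, cos θ = (x · r) / (‖x‖ ‖r‖), where x is the pixel spectrum and r the reference spectrum. A minimal sketch is shown below; the endmember spectra and angle threshold are placeholders.

import numpy as np

def spectral_angle(pixel, reference):
    # Spectral angle (radians) between a pixel spectrum and a reference spectrum.
    cos_theta = np.dot(pixel, reference) / (np.linalg.norm(pixel) * np.linalg.norm(reference))
    return np.arccos(np.clip(cos_theta, -1.0, 1.0))

def sam_classify(X, endmembers, max_angle=0.1):
    # Assign each pixel to the closest endmember; leave it unclassified (-1) beyond max_angle.
    labels = np.full(X.shape[0], -1, dtype=np.int32)
    for i, pixel in enumerate(X):
        angles = [spectral_angle(pixel, e) for e in endmembers]
        best = int(np.argmin(angles))
        if angles[best] <= max_angle:
            labels[i] = best
    return labels

X = np.random.rand(1000, 158)                        # hypothetical (pixels, bands)
endmembers = np.random.rand(3, 158)                  # e.g. vegetation, water, and soil spectra
labels = sam_classify(X, endmembers, max_angle=0.2)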
Spectral Feature Fitting (SFF) – This algorithm operates on the continuum-
removed image and a reference spectrum. It compares the continuum-removed
image spectra with the reference spectral library spectra and performs the least
square fitting. The correlation coefficient of the fits determines the best
matching between spectral features of the reference and image spectra
(Boardman & Kruse, 1994). Continuum-removed image spectra can be derived
by dividing the original spectrum of each pixel in the image by the continuum
curve:

Scr = S / C

where Scr = continuum-removed spectrum, S = original spectrum, and C =
continuum curve.
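A rough sketch of the continuum-removal step (Scr = S / C) is given below, approximating the continuum C by the upper convex hull of the spectrum; this covers only the continuum-removal part, not the subsequent least-squares fitting of SFF, and the wavelengths and spectrum are synthetic placeholders.

import numpy as np

def continuum_removed(wavelengths, spectrum):
    # Build the upper convex hull of the (wavelength, reflectance) points.
    hull_idx = [0]
    for i in range(1, len(spectrum)):
        hull_idx.append(i)
        while len(hull_idx) >= 3:
            i0, i1, i2 = hull_idx[-3], hull_idx[-2], hull_idx[-1]
            cross = ((wavelengths[i1] - wavelengths[i0]) * (spectrum[i2] - spectrum[i0])
                     - (spectrum[i1] - spectrum[i0]) * (wavelengths[i2] - wavelengths[i0]))
            if cross >= 0:           # middle point lies on or below the chord: drop it
                hull_idx.pop(-2)
            else:
                break
    continuum = np.interp(wavelengths, wavelengths[hull_idx], spectrum[hull_idx])
    return spectrum / continuum      # Scr = S / C

wavelengths = np.linspace(400, 2500, 158)                # hypothetical band centres (nm)
spectrum = 0.4 + 0.1 * np.sin(wavelengths / 300.0)       # placeholder reflectance spectrum
scr = continuum_removed(wavelengths, spectrum)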
Minimum distance classifier (MDC) – This classifier utilizes the distance
between pixels in the feature space for image classification. MDC uses a
similarity measure that categorizes two samples as similar if their feature
differences fall below a specified threshold. In the feature space, points with
similar features (belonging to the same class) are clustered together (Wacker et
al., 1972). The mean vector of these feature points is then computed and serves
as the center of the respective category, while the dispersion of the
surrounding points is described by the covariance matrix. Each category's
points are measured in the same way, and distance serves as the primary basis
for evaluating the similarity of samples. For the distance calculation, several
measures are widely utilized, including the Minkowski distance, Mahalanobis
distance, absolute value (city-block) distance, Euclidean distance, Chebyshev
distance, and Bhattacharyya distance.
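A minimal sketch of a minimum distance classifier using the Euclidean distance (one of the measures listed above) is shown below; the class mean spectra and data are placeholders.

import numpy as np

def minimum_distance_classify(X, class_means):
    # Assign each pixel to the class whose mean vector is closest in Euclidean distance.
    distances = np.linalg.norm(X[:, None, :] - class_means[None, :, :], axis=2)   # (pixels, classes)
    return np.argmin(distances, axis=1)

X = np.random.rand(1000, 158)                 # hypothetical (pixels, bands)
class_means = np.random.rand(4, 158)          # mean spectra of four training classes
labels = minimum_distance_classify(X, class_means)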
Maximum Likelihood Classifier (MLC) - The Maximum Likelihood
Classifier (MLC) is a nonlinear classification method based on the Bayesian
criterion. It calculates the statistical feature values for every training sample
during classification and establishes a classification discriminant function
(Bruzzone et al., 2001). This function determines the probabilities of each pixel
in the hyperspectral image containing various classes. These probabilities are
then utilized to classify the test sample into the category with the highest
probability, subject to a provided probability threshold. Pixels whose
probabilities fall below the threshold remain unclassified. Other supervised
algorithms commonly used are Neural Network Classification and Support
Vector Machine algorithms.
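A Gaussian maximum-likelihood sketch following the description above is given below. The class statistics, priors, and probability threshold are hypothetical, and the score used here is a simple prior-weighted Gaussian likelihood rather than a full Bayesian treatment.

import numpy as np
from scipy.stats import multivariate_normal

def ml_classify(X, means, covariances, priors, threshold=1e-6):
    # Score each pixel against every class; leave pixels below the threshold unclassified (-1).
    scores = np.stack([
        prior * multivariate_normal(mean=m, cov=c, allow_singular=True).pdf(X)
        for m, c, prior in zip(means, covariances, priors)
    ], axis=1)                                   # shape: (pixels, classes)
    labels = np.argmax(scores, axis=1)
    labels[np.max(scores, axis=1) < threshold] = -1
    return labels

X = np.random.rand(1000, 10)                     # hypothetical pixels in a reduced 10-band space
means = [np.random.rand(10) for _ in range(3)]   # class means estimated from training samples
covs = [np.eye(10) * 0.05 for _ in range(3)]     # class covariance matrices
priors = [1 / 3] * 3
labels = ml_classify(X, means, covs, priors)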
7.11.2 Unsupervised classification algorithms
Hyperspectral datasets come with the curse of high dimensionality.
Dimensionality reduction is applied to hyperspectral data for feature extraction
by selecting only the most prominent bands. In unsupervised methods, similar
pixels are automatically grouped into clusters using standard statistical criteria.
Unsupervised classification methods are independent of any prior knowledge
of the training dataset. Familiar unsupervised methods include principal
component analysis (PCA), independent component analysis (ICA) (Villa et
al., 2011), K-Means, ISODATA, and Hierarchical Clustering.
K-means: K-Means Clustering is a centroid-based algorithm that groups the
unlabelled dataset into diverse clusters by iteratively updating cluster centroids
and assigning pixels to the closest one (MacQueen, 1967). The clusters are
associated with a centroid, and the algorithm minimizes the sum of distances
between the data points and their corresponding cluster centroids. Here, K
defines the number of predetermined clusters to be created. K-means
clustering groups the data into distinct categories, serving as a convenient
means of identifying groups within an unlabelled dataset without requiring a
training dataset.
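Clustering a reshaped data cube with K-Means can be sketched with scikit-learn; the number of clusters and the cube dimensions below are assumptions for illustration.

import numpy as np
from sklearn.cluster import KMeans

rows, cols, bands = 100, 100, 10
cube = np.random.rand(rows, cols, bands).astype(np.float32)   # hypothetical reduced cube
X = cube.reshape(-1, bands)                                   # each pixel is one sample

kmeans = KMeans(n_clusters=6, n_init=10, random_state=0).fit(X)
cluster_map = kmeans.labels_.reshape(rows, cols)              # unsupervised cluster image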
ISODATA - The Iterative Self-Organizing Data Analysis Technique is an
extension of the K-Means algorithm and is one of the most commonly used
clustering algorithms. The
method is an iterative clustering technique that offers the flexibility of cluster
merging and splitting, guided by user-defined parameters. This aspect makes it
more adaptable and versatile compared to the K-means algorithm. In
ISODATA, the number of clusters is selected automatically, and each class is
assumed to obey a multivariate normal distribution (Ball & Hall, 1965). The
algorithm assigns arbitrary cluster centers and calculates cluster means and
covariance. The pixels are subsequently classified into the nearest cluster. New
cluster means and covariances are calculated based on all the pixels within that
cluster. This process is repeated for several iterations until the change between
iterations is considered sufficiently low. The change is quantified in two ways:
by measuring how far the cluster mean has moved from one iteration to the
next, or by calculating the percentage of pixels that have changed clusters
between iterations.
7.12 Spectral Unmixing
The spectral unmixing technique involves extracting and identifying pure
pixels/endmembers from the hyperspectral dataset. For each image pixel,
fractional abundances are determined, representing the proportion of each end
member present (Bioucas-Dias et al., 2012). This approach enables the
representation and recognition of various materials or components within the
hyperspectral image, making it valuable for applications such as remote
sensing, mineral exploration, agriculture, and environmental monitoring.
Unmixing algorithms operate based on the anticipated type of mixing, which
can be linear or nonlinear. In linear mixing, the observed reflectance spectrum
is a weighted combination of the individual material spectra. Each
material spectrum is multiplied by a corresponding weight, representing the
relative amount or abundance of that material in the pixel. As shown in Figure
7.13, the reflecting surface is similar to a checkerboard mixture, and there is no
multiple scattering between components. The spectra observed in the reflected
radiation of a hyperspectral image have a linear relationship with the fractional
abundance of the substances present in the imaged area (Keshava & Mustard,
2002). Linear unmixing models are either geometrically or statistically based.
Figure 7.13: Illustration showing linear (left) and nonlinear (right) mixing.
Solar radiation reflects from the surface through a single bounce in linear
mixing. In a nonlinear mixing scenario, solar radiation interacts with an
intimate mixture that induces multiple bounce interactions (right). (Source -
Keshava & Mustard, 2002).
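Under the linear model, a pixel spectrum x is approximately E·a, where the columns of E hold the endmember spectra and a holds the fractional abundances (non-negative and ideally summing to one). A minimal inversion sketch using non-negative least squares, with a simple normalization standing in for the sum-to-one constraint, is given below; the spectra are placeholders.

import numpy as np
from scipy.optimize import nnls

def unmix_linear(X, endmembers):
    # X: (pixels, bands); endmembers: (n_endmembers, bands).
    E = endmembers.T                                   # (bands, n_endmembers)
    abundances = np.zeros((X.shape[0], endmembers.shape[0]))
    for i, x in enumerate(X):
        a, _ = nnls(E, x)                              # non-negative least squares per pixel
        total = a.sum()
        abundances[i] = a / total if total > 0 else a  # approximate sum-to-one constraint
    return abundances

X = np.random.rand(1000, 158)                          # hypothetical mixed-pixel spectra
endmembers = np.random.rand(3, 158)                    # e.g. spectra selected via PPI / n-D visualizer
abundances = unmix_linear(X, endmembers)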
Conversely, nonlinear mixing is usually due to physical interactions between
multiple materials in the scene, either at a classical (multilayer) level or at a
microscopic (intimate) level (Bioucas-Dias et al., 2012) (Figure 7.14). Classical
spectral mixing occurs when light scatters from one or more objects and is
reflected from additional surfaces before being measured by the hyperspectral
imager.
Nonlinear mixing is generally explained by an intimate or multilayer model, as
given by Borel & Gerstl (1994). Figure 7.14 illustrates two nonlinear mixing
scenarios: an intimate mixture, in which the materials are close, and a multi-
layered scene, where there are multiple interactions among the scatterers at the
different layers.
Figure 7.14: Non-linear mixing models: intimate mixture (left); multi-layered
scene (right) (Source: http://dx.doi.org/10.1109/JSTARS.2012.2194696).
Unmixing processing steps typically involve atmospheric correction,
dimensionality reduction, and unmixing. It can be achieved through
endmember determination plus inversion or by utilizing sparse regression or
sparse coding approaches. The key to linear unmixing is to determine spectral
endmembers that capture the spectral variability present in a given scene.
Various algorithms can derive these end members, utilizing criteria such as
field knowledge, ratios, or PCA. The results of spectral unmixing, including
endmember spectra and abundance estimates, form the foundation of
hyperspectral image classification routines used to identify the material
composition of mixtures. Unmixing is another significant research topic in
hyperspectral processing, particularly in addressing the subpixel target
detection problem.
Summary
In conclusion, hyperspectral image analysis is a powerful tool for
understanding and interpreting complex data sets. Given the numerous
applications in agriculture, environmental monitoring, and mineral exploration,
it is crucial to ensure accurate and reliable hyperspectral data analysis. The
interpretation of hyperspectral data can provide valuable insights and inform
decision-making processes in a range of industries. However, the resulting
information may be incomplete, inaccurate, and even misleading without
proper data processing and analysis. Therefore, it is essential to employ
rigorous analytical techniques and expertise in interpreting hyperspectral data
to ensure its effective use in various applications. By using sophisticated
algorithms and mathematical models, hyperspectral image analysis can extract
valuable information from the spectral data collected by remote sensing
devices.
However, hyperspectral data analysis is challenging due to its high
dimensionality. Having a comprehensive understanding of the underlying
principles is crucial for accurately identifying and characterizing the materials
or components present in the hyperspectral imagery. This knowledge helps in
distinguishing real signal variations from noise, artifacts, or atmospheric
interference, thereby ensuring reliable and meaningful results from
hyperspectral data interpretation.
Bibliography
1. Analytical Imaging and Geophysics LLC (AIG). (2001). ACORN
User's Guide, Stand Alone Version: Analytical Imaging and
Geophysics LLC, 64.
2. Balha, A., & Singh, C. K. (2023). Comparison of maximum
likelihood, neural networks, and random forests algorithms in
classifying urban landscape. In Application of Remote Sensing and
GIS in Natural Resources and Built Infrastructure Management (pp.
29-38).
3. Bernstein, L. S., Adler-Golden, S. M., Sundberg, R. L., et al. (2005).
Validation of the Quick Atmospheric Correction (QUAC) algorithm
for VNIR-SWIR multi- and hyperspectral imagery. SPIE Proceedings,
Algorithms and Technologies for Multispectral, Hyperspectral, and
Ultraspectral Imagery XI, 5806, 668-678.
Beck, R. (2003). EO-1 User Guide, Version 2.3. Satellite Systems
Branch, USGS Earth Resources Observation Systems Data Center
(EDC).
4. Borel, C. C., & Gerstl, S. A. W. (1994). Nonlinear spectral mixing
model for vegetative and soil surfaces. Remote Sensing of
Environment, 47(3), 403-416.
5. Boardman, J. W., & Kruse, F. A. (1994). Automated spectral analysis:
A geological example using AVIRIS data, northern Grapevine
Mountains, Nevada. In Proceedings, Tenth Thematic Conference,
Geologic Remote Sensing, 9, I-407 - I-418.
6. Boardman, J. W., Kruse, F. A., & Green, R. O. (1995). Mapping target
signatures via partial unmixing of AVIRIS data. In Summaries of the
Fifth JPL Airborne Earth Science Workshop, JPL Publication 95-1 (1),
23-26.
7. Bioucas-Dias, J. M., et al. (2012). Hyperspectral Unmixing Overview:
Geometrical, Statistical, and Sparse Regression-Based Approaches.
IEEE Journal of Selected Topics in Applied Earth Observations and
Remote Sensing, 5(2).
8. Bruzzone, L., et al. (2001). Unsupervised retraining of a maximum
likelihood classifier for the analysis of multitemporal remote sensing
images. IEEE Transactions on Geoscience and Remote Sensing,
39(2), 456-460.
9. Camps-Valls, G., Gomez-Chova, L., Muñoz-Marí, J., Vila-Francés, J.,
& Calpe-Maravilla, J. (2006). Composite kernels for hyperspectral
image classification. IEEE Geoscience and Remote Sensing Letters,
3(1), 93-97.
10. Camps-Valls, G., Tuia, D., Gómez-Chova, L., Jiménez, S., & Malo, J.
(2011). Remote Sensing Image Processing. Morgan and Claypool, San
Rafael, CA.
11. Chan, Y., Wang, C., & Chen, Y. (2020). A Survey of Deep Learning-
Based Object Detection in Remote Sensing Imagery. Remote Sensing,
12(23), 3943.
12. ENVI Manual. (2005). Flash Module User's Guide, Research Systems
Inc.
13. ENVI User's Guide. (2001). Research Systems Inc., 948p.
14. Felde, G. W., Anderson, G. P., Cooley, T. W., Matthew, M. W., Berk,
A., & Lee, J. (2003, July). Analysis of Hyperion data with the
FLAASH atmospheric correction algorithm. In IGARSS 2003. 2003
IEEE International Geoscience and Remote Sensing Symposium.
Proceedings (IEEE Cat. No. 03CH37477) (Vol. 1, pp. 90-92). IEEE.
15. Goetz, A. F. H., Vane, G., Solomon, J. E., & Rock, B. N. (1985).
Imaging spectrometry for Earth remote sensing. Science, 228, 1147-
1153.
16. Goetz, A., Ferri, M., Kindel, B., & Qu, Z. (2002). Atmospheric
correction of Hyperion data and techniques for dynamic scene
correction. IEEE, 1408-1410.
17. Gonzalez, R. C., & Woods, R. E. (2008). Digital Image Processing.
3rd ed. Pearson Prentice Hall.
18. Green, A. A., Berman, M., Switzer, P., & Craig, M. D. (1988). A
transformation for ordering multispectral data in terms of image
quality with implications for noise removal. IEEE Transactions on
Geoscience and Remote Sensing, 26(1), 65-74.
19. Jain, A. K. (1989). Fundamentals of digital image processing. Prentice
Hall, Inc.
20. Jensen, J. R. (2016). Introductory Digital Image Processing: A Remote
Sensing Perspective. 4th ed. Pearson.
21. Karpouzli, E., & Malthus, T. (2003). The empirical line method for
the atmospheric correction of IKONOS imagery. International Journal
of Remote Sensing, 24(5), 1143-1150.
22. Keshava, N., & Mustard, J. F. (2002). Spectral unmixing. IEEE Signal
Processing Magazine, 19(1), 44-57.
23. Kenneth, R. C. (1996). Digital Image Processing. Prentice-Hall.
24. Khan, M. J., Rahman, M. A., & Basri, H. (2018). A Comprehensive
Review on Image Enhancement Techniques in Remote Sensing.
Geocarto International, 33(7), 695-725.
25. Kruse, F. A. (1988). Use of airborne imaging spectrometer data to map
minerals associated with hydrothermally altered rocks in the Northern
Grapevine Mountains, Nevada, and California. Remote Sensing of
Environment, 24(1), 31-51.
26. Kruse, F. A., Lefkoff, B., & Dietz, J. B. (1993). Expert System-Based
Mineral Mapping in Northern Death Valley, California/Nevada, Using
the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS).
Remote Sensing of Environment, 44(2), 309-336.
Lee, J. B., Woodyatt, A. S., & Berman, M. (1990). Enhancement of
high spectral resolution remote sensing data by a noise-adjusted
principal components transform. IEEE Transactions on Geoscience
and Remote Sensing, 28(3), 295-304.
27. Li, Z., Li, J., & Lu, H. (2019). Image Enhancement in Remote
Sensing: A Review. Remote Sensing, 11(19), 2274.
28. Lillesand, T. M., Kiefer, R. W., & Chipman, J. W. (2014). Remote
Sensing and Image Interpretation. 7th ed. Wiley.
29. Lu, D., & Weng, Q. (2007). A Survey of Image Classification
Methods and Techniques for Improving Classification Performance.
International Journal of Remote Sensing, 28, 823-870.
30. MacQueen, J. B. (1967). Some methods for classification and analysis
of multivariate observations. In Proceedings of the fifth Berkeley
symposium on mathematical statistics and probability, 1(14), 281-297.
31. Mehmood, M., Shahzad, A., Zafar, B., Shabbir, A., & Ali, N. (2022).
Remote Sensing Image Classification: A Comprehensive Review and
Applications. Mathematical Problems in Engineering, 1-24. doi:
10.1155/2022/3376091.
32. Melgani, F., & Bruzzone, L. (2004). Classification of hyperspectral
remote sensing images with support vector machines. IEEE
Transactions on Geoscience and Remote Sensing, 42(8), 1778-1790.
33. Mewada, H., Al-Asad, J. F., & Khan, A. H. (2020, November).
Landscape Change Detection Using Auto-optimized K-means
Algorithm. In 2020 International Symposium on Advanced Electrical
and Communication Technologies (ISAECT) (pp. 1-6). IEEE.
34. Mohamed, R. M., & Farag, A. A. (2005). Advanced algorithms for
Bayesian classification in high dimensional spaces with applications in
hyperspectral image segmentation. In IEEE International Conference
on Image Processing, Vol. 2, pp. II-646. IEEE.
35. Michael, S., Lawrence, O'G., & Michael, J. S. (2000). Practical
algorithms for image analysis: description, examples, and code.
Cambridge University Press.
36. Neetu, & Ray, S. S. (2019). Exploring machine learning classification
algorithms for crop classification using Sentinel 2 data. The
International Archives of the Photogrammetry, Remote Sensing and
Spatial Information Sciences, 42, 573-578.
37. Pignatti, S., Cavalli, M. R., Cuomo, V., Fusilli, L., Poscolieri, M., &
San. (2009). Evaluating Hyperion capability for land cover mapping
in a fragmented ecosystem: Pollino National Park, Italy. Remote
Sensing of Environment, 3(12), 622-634.
38. Prasad, N., Sameer S., Kushwaha, S. P. S., & Roy, P. S. (2001).
Evaluation of various image fusion techniques and imaging scales for
forest features interpretation. Current Science, 1218-1224.
39. Rafael C. G., & Richard E. W. (2001). Digital Image Processing.
Prentice-Hall.
40. Research Systems Inc. (2001). ENVI User's Guide. Boulder, CO.
41. Research Systems Inc. (2005). ENVI Manual, Flash Module User's
Guide.
42. Richards, J. A., & Jia, X. (2006). Remote Sensing Digital Image
Analysis: An Introduction. Springer-Verlag, New York; Berlin,
Germany; Heidelberg, Germany.
43. Rodarmel, C., & Shan, J. (2002). Principal Component analysis for
hyperspectral image classification. Surveying and Land Information
Science, 62(2), 115-122.
44. Stockman, A., & Sharpe, L. T. (2000). The spectral sensitivities of the
middle- and long-wavelength-sensitive cones derived from
measurements in observers of known genotype. Vision Research,
40(13), 1711-1737.
45. Van der Meer, F. D. (1998). Imaging spectrometry for geological
remote sensing. Geologie en Mijnbouw, 77, 137-151.
46. Van der Meer, F. D. (1999). Imaging spectrometry for land surface
characterization: Theory, algorithms and methods. In F. D. Van Der
Meer & S. M. De Jong (Eds.), Imaging Spectrometry—a Tool for
Environmental Observations (pp. 1-28).
47. Villa, A., Benediktsson, J. A, Chanussot, J., & Jutten, C. (2011).
Hyperspectral image classification with independent component
discriminant analysis. IEEE Transactions on Geoscience and Remote
Sensing, 49(12), 4865-4876.
48. Wayne, R. P. (1993). Photodissociation dynamics and atmospheric
chemistry. Journal of Geophysical Research: Planets, 98(E7), 13119-
13136.
49. Wacker, A. G., & Landgrebe, D. A. (1975). Minimum Distance
Classification in Remote Sensing. LARS Technical Reports. Paper 25.
http://docs.lib.purdue.edu/larstech/25.
50. Wei, X., Xu, W., Bao, K., Hou, W., Su, J., Li, H., & Miao, Z. (2020).
A water body extraction methods comparison based on FengYun
Satellite data: a case study of Poyang Lake Region, China. Remote
Sensing, 12(23), 3875.
51. Wold, S., Esbensen, K., & Geladi, P. (1987). Principal component
analysis. Chemometrics and Intelligent Laboratory Systems, 2(1-3),
37-52.
52. Zhong, Y., & Zhang, L. (2012). An adaptive artificial immune network
for supervised classification of multi-/hyperspectral remote sensing
imagery. IEEE Transactions on Geoscience and Remote Sensing,
50(3), 894-909.
53. Zhou, Q., Fellows, A., Flerchinger, G. N., & Flores, A. N. (2019).
Examining interactions between and among predictors of net
ecosystem exchange: A machine learning approach in a semi-arid
landscape. Scientific Reports, 9(1), 2222.
Satellite Data Plates - Images at a Glance
Figure 1: Cartosat-3 Satellite data of a part of Delhi. (a) Band 1, (b) Band 2,
(c) Band 3, (d) Band 4, (e) True color composite and (f) False color composite
Figure 2: Min-Max Linear Stretch using Cartosat-3 data (part of Delhi)
Figure 3: Non-Linear Histogram equalization using Cartosat-3 data of a
part of Delhi
Figure 4: Low pass filtered image of Cartosat-3 data
Figure 5: High pass filtered image of Cartosat-3 data
Figure 6: EOS-04 (Sigma 0) image of 28 March 2023 over Kanpur, UP,
after applying Lee Filter
Glossary
A
ALOS - Advanced Land Observing Satellite. P 55
AppEEARS - Application for Extracting and Exploring Analysis Ready Samples. P 13
B
BIL format - Band interleaved by line. P 5
BIP format - Band interleaved by pixel. P 5
BSQ format - Band Sequential. P 5
C
CRT - Cathode Ray Tube. P 9
CI - Chlorophyll Index. P 38
CNN - Convolutional Neural Network. P 96
D
DN - Digital Number. P 3
DEM - Digital Elevation Model. P 19
E
EPSG - European Petroleum Survey Group. P 6
ESA - European Space Agency. P 113
EOS - Earth Observing System. P 55
EVI - Enhanced Vegetation Index. P 36
G
GIS - Geographic Information System. P 22
GCP - Ground Control Point. P 22
GEDI - Global Ecosystem Dynamics Investigation. P 55
I
ISODATA - Iterative Self-Organizing Data Analysis Technique. P 138
K
K-means. P 80
KNN - K-Nearest Neighbor. P 66
L
LAI - Leaf Area Index. P 37
LiDAR - Light Detection and Ranging. P 55
M
MOSDAC - Meteorological & Oceanographic Satellite Data Archival Centre. P 18
N
NDBI - Normalized Difference Built-Up Index. P 51
NDVI - Normalized Difference Vegetation Index. P 35
NDSI - Normalized Difference Snow Index. P 52
NDWI - Normalized Difference Water Index. P 52
NIR - Near-Infrared. P 35
O
OBIA - Object-Based Image Analysis. P 94
P
POSC - Petrotechnical Open Software Corporation. P 6
PCA - Principal Component Analysis. P 141
R
Remote Sensing. P 1
S
SAR - Synthetic Aperture Radar. P 105
SVM - Support Vector Machine. P 72
SRTM - Shuttle Radar Topography Mission. P 20
SAVI - Soil Adjusted Vegetation Index. P 36
T
TIRS - Thermal Infrared Sensor. P 55
U
UTM - Universal Transverse Mercator. P 22
Deputy General Manager
Regional Remote Sensing Centre-North
National Remote Sensing Centre
Indian Space Research Organisation
New Delhi