Transform theory plays a fundamental role in image processing, as working with the
transform of an image instead of the image itself may give us more insight into the
properties of the image. Two dimensional transforms are applied to image enhancement,
restoration, encoding and description.
1. UNITARY TRANSFORMS
1.1 One dimensional signals
For a one dimensional sequence \{f(x),\ x = 0, 1, \dots, N-1\} represented as a vector \mathbf{f} = [f(0)\ f(1)\ \dots\ f(N-1)]^T of size N, a transformation may be written as

    g(u) = \sum_{x=0}^{N-1} T(u,x) f(x), \quad u = 0, 1, \dots, N-1, \quad \text{or} \quad \mathbf{g} = \mathbf{T}\mathbf{f}

where \mathbf{g} is the transform (or transformation) of \mathbf{f}, and T(u,x) is the so called forward
transformation kernel. Similarly, the inverse transform is the relation

    f(x) = \sum_{u=0}^{N-1} I(x,u) g(u), \quad x = 0, 1, \dots, N-1

or written in a matrix form

    \mathbf{f} = \mathbf{I}\mathbf{g}

where I(x,u) is the so called inverse transformation kernel.
If

    \mathbf{I} = \mathbf{T}^{-1} = \mathbf{T}^{*T}

the matrix T is called unitary, and the transformation is called unitary as well. It can be proven
(how?) that the columns (or rows) of an N \times N unitary matrix are orthonormal and, therefore, form a
complete set of basis vectors in the N-dimensional vector space.
In that case

    \mathbf{f} = \mathbf{T}^{*T} \mathbf{g} = \sum_{u=0}^{N-1} g(u)\, \mathbf{t}_u^{*}

The columns of \mathbf{T}^{*T}, that is, the vectors \mathbf{t}_u^{*} = [T^{*}(u,0)\ T^{*}(u,1)\ \dots\ T^{*}(u,N-1)]^T, are called the
basis vectors of \mathbf{T}.
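To make the orthonormality claim concrete, here is a minimal numerical sketch. Python/NumPy is used here (and in the later examples) as a stand-in for the MATLAB exercises suggested further below; the choice of the unitary DFT matrix and the size N = 8 are illustrative assumptions.

```python
import numpy as np

N = 8
# Unitary 1-D DFT matrix: T[u, x] = (1/sqrt(N)) exp(-j 2 pi u x / N)
u, x = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
T = np.exp(-2j * np.pi * u * x / N) / np.sqrt(N)

# Unitarity: the inverse equals the conjugate transpose
assert np.allclose(np.linalg.inv(T), T.conj().T)

# The rows (and columns) of T are orthonormal basis vectors
assert np.allclose(T @ T.conj().T, np.eye(N))

# Forward transform g = T f and recovery f = T*^T g
f = np.random.rand(N)
g = T @ f
f_rec = T.conj().T @ g
assert np.allclose(f_rec, f)
```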
1.2 Two dimensional signals (images)
As a one dimensional signal can be represented by an orthonormal set of basis vectors, an image can
also be expanded in terms of a discrete set of basis arrays called basis images through a two
dimensional (image) transform.
For an N \times N image f(x,y) the forward and inverse transforms are given below

    g(u,v) = \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} h(x,y,u,v) f(x,y)

    f(x,y) = \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} \hat{h}(x,y,u,v) g(u,v)

where, again, h(x,y,u,v) and \hat{h}(x,y,u,v) are called the forward and inverse transformation
kernels, respectively.
The forward kernel is said to be separable if

    h(x,y,u,v) = h_1(x,u)\, h_2(y,v)

It is said to be symmetric if h_1 is functionally equal to h_2, such that

    h(x,y,u,v) = h_1(x,u)\, h_1(y,v)

The same comments are valid for the inverse kernel.
If the kernel h(x,y,u,v) of an image transform is separable and symmetric, then the transform can be written in matrix form as follows

    \mathbf{g} = \mathbf{T} \mathbf{f} \mathbf{T}^{T}

where \mathbf{f} is the original image of size N \times N, and \mathbf{T} is an N \times N transformation matrix with elements
t_{ij} = h_1(i,j). If, in addition, \mathbf{T} is a unitary matrix, then the transform is called separable unitary
and the original image is recovered through the relationship

    \mathbf{f} = \mathbf{T}^{*T} \mathbf{g} \mathbf{T}^{*}
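A quick numerical check of the separable unitary form and its inversion, again with the unitary DFT matrix standing in for a generic T (an assumption made only for illustration):

```python
import numpy as np

N = 8
u, x = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
T = np.exp(-2j * np.pi * u * x / N) / np.sqrt(N)  # unitary 1-D DFT matrix

F_img = np.random.rand(N, N)          # a random "image"
G = T @ F_img @ T.T                   # forward: g = T f T^T
F_rec = T.conj().T @ G @ T.conj()     # inverse: f = T*^T g T*
assert np.allclose(F_rec, F_img)
```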
1.3 Fundamental properties of unitary transforms
1.3.1 The property of energy preservation
In the unitary transformation \mathbf{g} = \mathbf{T}\mathbf{f}, it is easily proven (try the proof by using the relation \mathbf{T}^{-1} = \mathbf{T}^{*T}) that

    \|\mathbf{g}\|^{2} = \|\mathbf{f}\|^{2}

Thus, a unitary transformation preserves the signal energy. This property is called the energy
preservation property.
This means that every unitary transformation is simply a rotation of the vector f in the N -
dimensional vector space.
For the 2-D case the energy preservation property is written as

    \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} |f(x,y)|^{2} = \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} |g(u,v)|^{2}
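Numerically, energy preservation can be checked with NumPy's unitary 2-D FFT (the norm="ortho" option makes the transform unitary; the random test image is an arbitrary choice):

```python
import numpy as np

f = np.random.rand(16, 16)
g = np.fft.fft2(f, norm="ortho")   # norm="ortho" makes the DFT unitary

energy_f = np.sum(np.abs(f) ** 2)
energy_g = np.sum(np.abs(g) ** 2)
assert np.isclose(energy_f, energy_g)   # signal energy is preserved
```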
1.3.2 The property of energy compaction
Most unitary transforms pack a large fraction of the energy of the image into relatively few of the
transform coefficients. This means that relatively few of the transform coefficients have significant
values and these are the coefficients that are close to the origin (small index coefficients).
This property is very useful for compression purposes. (Why?)
2. THE TWO DIMENSIONAL FOURIER TRANSFORM
2.1 Continuous space and continuous frequency
The Fourier transform is extended to a function f(x,y) of two variables. If f(x,y) is continuous
and integrable and F(u,v) is integrable, the following Fourier transform pair exists:

    F(u,v) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x,y)\, e^{-j2\pi(ux+vy)}\, dx\, dy

    f(x,y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} F(u,v)\, e^{j2\pi(ux+vy)}\, du\, dv
In general F(u,v) is a complex-valued function of two real frequency variables u, v and hence, it
can be written as:

    F(u,v) = R(u,v) + jI(u,v)

The amplitude spectrum, phase spectrum and power spectrum, respectively, are defined as follows.

    |F(u,v)| = \sqrt{R^{2}(u,v) + I^{2}(u,v)}

    \phi(u,v) = \tan^{-1}\left[ \frac{I(u,v)}{R(u,v)} \right]

    P(u,v) = |F(u,v)|^{2} = R^{2}(u,v) + I^{2}(u,v)
2.2 Discrete space and continuous frequency
For the case of a discrete sequence f(x,y) of infinite duration we can define the 2-D discrete space
Fourier transform pair as follows

    F(u,v) = \sum_{x=-\infty}^{\infty} \sum_{y=-\infty}^{\infty} f(x,y)\, e^{-j(ux+vy)}

    f(x,y) = \frac{1}{4\pi^{2}} \int_{-\pi}^{\pi} \int_{-\pi}^{\pi} F(u,v)\, e^{j(ux+vy)}\, du\, dv

F(u,v) is again a complex-valued function of two real frequency variables u, v and it is periodic
with a period 2\pi \times 2\pi, that is to say, F(u,v) = F(u+2\pi, v) = F(u, v+2\pi).
The Fourier transform of f(x,y) is said to converge uniformly when F(u,v) is finite and

    \lim_{N_1 \to \infty} \lim_{N_2 \to \infty} \sum_{x=-N_1}^{N_1} \sum_{y=-N_2}^{N_2} f(x,y)\, e^{-j(ux+vy)} = F(u,v)

for all u, v.
When the Fourier transform of f ( x , y ) converges uniformly, F ( u, v ) is an analytic function and is
infinitely differentiable with respect to u and v .
2.3 Discrete space and discrete frequency: The two dimensional Discrete Fourier
Transform (2-D DFT)
If f(x,y) is an M \times N array, such as that obtained by sampling a continuous function of two
dimensions on a rectangular grid, then its two dimensional Discrete Fourier
Transform (DFT) is the array given by

    F(u,v) = \frac{1}{MN} \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x,y)\, e^{-j2\pi(ux/M + vy/N)}

    u = 0, \dots, M-1, \quad v = 0, \dots, N-1
and the inverse DFT (IDFT) is

    f(x,y) = \sum_{u=0}^{M-1} \sum_{v=0}^{N-1} F(u,v)\, e^{j2\pi(ux/M + vy/N)}
When images are sampled in a square array, M = N and

    F(u,v) = \frac{1}{N} \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x,y)\, e^{-j2\pi(ux+vy)/N}

    f(x,y) = \frac{1}{N} \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} F(u,v)\, e^{j2\pi(ux+vy)/N}
It is straightforward to prove that the two dimensional Discrete Fourier Transform is separable,
symmetric and unitary.
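Separability means the 2-D DFT can be computed as 1-D DFTs along the rows followed by 1-D DFTs along the columns. The sketch below verifies this with NumPy (note that np.fft places no 1/N factor in the forward transform, unlike the symmetric convention used above, but separability holds regardless of normalisation):

```python
import numpy as np

f = np.random.rand(8, 8)

G_direct = np.fft.fft2(f)            # direct 2-D DFT

G_rows = np.fft.fft(f, axis=1)       # 1-D DFT of every row...
G_sep = np.fft.fft(G_rows, axis=0)   # ...then of every column
assert np.allclose(G_direct, G_sep)  # separability
```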
2.3.1 Properties of the 2-D DFT
Most of them are straightforward extensions of the properties of the 1-D Fourier Transform. Consult
any introductory book on Image Processing.
2.3.2 The importance of the phase in 2-D DFT. Image reconstruction from amplitude or
phase only.
The Fourier transform of a sequence is, in general, complex-valued, and the unique representation of
a sequence in the Fourier transform domain requires both the phase and the magnitude of the Fourier
transform. In various contexts it is often desirable to reconstruct a signal from only partial domain
information. Consider a 2-D sequence with Fourier transform so that
It has been observed that a straightforward signal synthesis from the Fourier transform phase
alone often captures most of the intelligibility of the original image (why?). A
straightforward synthesis from the Fourier transform magnitude alone, however, does not
generally capture the original signal’s intelligibility. The above observation is valid for a large
number of signals (or images). To illustrate this, we can synthesise the phase-only signal f_p(x,y)
and the magnitude-only signal f_m(x,y) by

    f_p(x,y) = \mathcal{F}^{-1}\{ e^{j\theta_f(u,v)} \}

    f_m(x,y) = \mathcal{F}^{-1}\{ |F(u,v)| \}
and observe the two results (Try this exercise in MATLAB).
An experiment which more dramatically illustrates the observation that phase-only signal synthesis
captures more of the signal intelligibility than magnitude-only synthesis, can be performed as
follows.
Consider two images f(x,y) and g(x,y). From these two images, we synthesise two other images f_1(x,y)
and g_1(x,y) by mixing the amplitudes and phases of the original images as follows:

    f_1(x,y) = \mathcal{F}^{-1}\{ |G(u,v)|\, e^{j\theta_f(u,v)} \}

    g_1(x,y) = \mathcal{F}^{-1}\{ |F(u,v)|\, e^{j\theta_g(u,v)} \}

In this experiment f_1(x,y) captures the intelligibility of f(x,y), while g_1(x,y) captures the
intelligibility of g(x,y) (Try this exercise in MATLAB).
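The mixing experiment can likewise be sketched in Python/NumPy; the two synthetic "images" below are assumptions made for illustration:

```python
import numpy as np

N = 64
x, y = np.meshgrid(np.arange(N), np.arange(N))
f = np.sin(2 * np.pi * x / 8) + 0.1 * np.cos(2 * np.pi * y / 16)  # image f
g = ((x // 8 + y // 8) % 2).astype(float)                         # image g

F, G = np.fft.fft2(f), np.fft.fft2(g)

# Swap: amplitude of one image with the phase of the other
f1 = np.real(np.fft.ifft2(np.abs(G) * np.exp(1j * np.angle(F))))
g1 = np.real(np.fft.ifft2(np.abs(F) * np.exp(1j * np.angle(G))))

# By Parseval, f1 carries the energy of g (it has g's magnitude spectrum)
assert np.isclose(np.sum(f1 ** 2), np.sum(g ** 2))
```

Visual inspection of f1 and g1 confirms the claim: each mixed image looks like the image whose phase it carries.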
3. THE DISCRETE COSINE TRANSFORM (DCT)
3.1 One dimensional signals
This is a transform that is similar to the Fourier transform in the sense that the new independent
variable again represents frequency. The DCT is defined below.

    C(u) = a(u) \sum_{x=0}^{N-1} f(x) \cos\left[ \frac{(2x+1)u\pi}{2N} \right], \quad u = 0, 1, \dots, N-1

with a(u) a parameter that is defined below.

    a(u) = \begin{cases} \sqrt{1/N} & u = 0 \\ \sqrt{2/N} & u = 1, 2, \dots, N-1 \end{cases}

The inverse DCT (IDCT) is defined below.

    f(x) = \sum_{u=0}^{N-1} a(u) C(u) \cos\left[ \frac{(2x+1)u\pi}{2N} \right], \quad x = 0, 1, \dots, N-1
3.2 Two dimensional signals (images)
For 2-D signals it is defined as

    C(u,v) = a(u) a(v) \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x,y) \cos\left[ \frac{(2x+1)u\pi}{2N} \right] \cos\left[ \frac{(2y+1)v\pi}{2N} \right]

    f(x,y) = \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} a(u) a(v) C(u,v) \cos\left[ \frac{(2x+1)u\pi}{2N} \right] \cos\left[ \frac{(2y+1)v\pi}{2N} \right]

where a(u) is defined as above.
3.3 Properties of the DCT transform
The DCT is a real transform. This property makes it attractive in comparison to the Fourier
transform.
The DCT has excellent energy compaction properties. For that reason it is widely used in image
compression standards (as, for example, the JPEG standard).
There are fast algorithms to compute the DCT, similar to the FFT for computing the DFT.
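The definitions above translate directly into a DCT matrix. The sketch below verifies that the matrix is real and orthogonal (unitary), and illustrates energy compaction on a smooth, highly correlated signal; the test signal itself is an arbitrary assumption.

```python
import numpy as np

N = 8
# Orthonormal DCT-II matrix: C[u, x] = a(u) cos((2x+1) u pi / (2N)),
# with a(0) = sqrt(1/N) and a(u) = sqrt(2/N) for u > 0
u, x = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
C = np.sqrt(2.0 / N) * np.cos((2 * x + 1) * u * np.pi / (2 * N))
C[0, :] = np.sqrt(1.0 / N)

# The DCT is a real, orthogonal (unitary) transform
assert np.allclose(C @ C.T, np.eye(N))

# Energy compaction: a smooth signal packs its energy into few coefficients
f = np.linspace(0.0, 1.0, N) ** 2
c = C @ f
fraction = np.cumsum(c ** 2) / np.sum(c ** 2)
assert fraction[1] > 0.9   # first two coefficients carry >90% of the energy
```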
4. WALSH TRANSFORM (WT)
4.1 One dimensional signals
This transform is slightly different from the transforms you have met so far. Suppose we have a
function f(x), x = 0, \dots, N-1, where N = 2^n, and its Walsh transform W(u).
If we use binary representation for the values of the independent variables x and u, we need n bits to
represent them. Hence, for the binary representation of x and u we can write:

    x = b_{n-1}(x)\, 2^{n-1} + b_{n-2}(x)\, 2^{n-2} + \dots + b_0(x)\, 2^{0}

    u = b_{n-1}(u)\, 2^{n-1} + b_{n-2}(u)\, 2^{n-2} + \dots + b_0(u)\, 2^{0}

with b_i(x) \in \{0, 1\} for i = 0, \dots, n-1.
Example
If N = 8 then n = 3 and, for example, for x = 6 we have 6 = 1 \cdot 2^{2} + 1 \cdot 2^{1} + 0 \cdot 2^{0}, so that b_2(6) = 1, b_1(6) = 1, b_0(6) = 0.
We define now the 1-D Walsh transform as

    W(u) = \frac{1}{N} \sum_{x=0}^{N-1} f(x) \prod_{i=0}^{n-1} (-1)^{b_i(x)\, b_{n-1-i}(u)}, \quad u = 0, \dots, N-1

or

    \mathbf{W} = \mathbf{T}_w \mathbf{f}
The array formed by the Walsh kernels is again a symmetric matrix having orthogonal rows and
columns. Therefore, the Walsh transform is orthogonal (up to a scale factor) and its elements are of the form

    T(u,x) = \frac{1}{N} \prod_{i=0}^{n-1} (-1)^{b_i(x)\, b_{n-1-i}(u)}

You can immediately observe that T(u,x) = 1/N or -1/N depending on
the values of b_i(x) and b_{n-1-i}(u). If the Walsh transform is written in a matrix form

    \mathbf{W} = \mathbf{T}_w \mathbf{f}

the rows of the matrix \mathbf{T}_w, which are the vectors [T(u,0)\ T(u,1)\ \dots\ T(u,N-1)], have the form of
square waves. As the variable u (which represents the index of the transform) increases, the
corresponding square wave’s “frequency” increases as well. For example, for u = 0 we see that
b_{n-1-i}(0) = 0 and hence, (-1)^{b_i(x)\, b_{n-1-i}(0)} = 1, for any x. Thus,

    W(0) = \frac{1}{N} \sum_{x=0}^{N-1} f(x)

and W(0) is the mean of the original function (the DC value), as is the case with the Fourier transform.
The inverse Walsh transform is defined as follows.

    f(x) = \sum_{u=0}^{N-1} W(u) \prod_{i=0}^{n-1} (-1)^{b_i(x)\, b_{n-1-i}(u)}, \quad x = 0, \dots, N-1

or

    \mathbf{f} = \mathbf{T}_w^{-1} \mathbf{W}
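A direct (slow, O(N^2)) implementation of the 1-D Walsh transform from the kernel definition above; the helper names `bit` and `walsh_matrix` are ours. It checks that W(0) is the mean of f and that the inverse kernel equals N times the forward one:

```python
import numpy as np

def bit(v, i):
    """i-th bit of the binary representation of v."""
    return (v >> i) & 1

def walsh_matrix(N):
    """Forward Walsh kernel T[u, x] = (1/N) prod_i (-1)^(b_i(x) b_{n-1-i}(u))."""
    n = int(np.log2(N))
    T = np.empty((N, N))
    for uu in range(N):
        for xx in range(N):
            s = sum(bit(xx, i) * bit(uu, n - 1 - i) for i in range(n))
            T[uu, xx] = (-1.0) ** s
    return T / N

N = 8
T = walsh_matrix(N)
f = np.random.rand(N)
W = T @ f

assert np.isclose(W[0], f.mean())    # W(0) is the mean (DC value) of f
assert np.allclose(N * (T @ W), f)   # inverse kernel = N * forward kernel
```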
4.2 Two dimensional signals
The Walsh transform is defined as follows for two dimensional signals.

    W(u,v) = \frac{1}{N} \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x,y) \prod_{i=0}^{n-1} (-1)^{b_i(x)\, b_{n-1-i}(u) + b_i(y)\, b_{n-1-i}(v)}

The inverse Walsh transform is defined as follows for two dimensional signals.

    f(x,y) = \frac{1}{N} \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} W(u,v) \prod_{i=0}^{n-1} (-1)^{b_i(x)\, b_{n-1-i}(u) + b_i(y)\, b_{n-1-i}(v)}
4.3 Properties of the Walsh Transform
Unlike the Fourier transform, which is based on trigonometric terms, the Walsh transform
consists of a series expansion of basis functions whose values are only +1 or -1 and which have
the form of square waves. These functions can be implemented more efficiently in a digital
environment than the exponential basis functions of the Fourier transform.
The forward and inverse Walsh kernels are identical except for a constant multiplicative factor of
1/N for 1-D signals.
The forward and inverse Walsh kernels are identical for 2-D signals. This is because the array
formed by the kernels is a symmetric matrix having orthogonal rows and columns, so its inverse
array is the same as the array itself.
The concept of frequency exists also in Walsh transform basis functions. We can think of
frequency as the number of zero crossings or the number of transitions in a basis vector and we
call this number sequency. The Walsh transform exhibits the property of energy compaction, as do
all the transforms that we are currently studying. (Why?)
For the fast computation of the Walsh transform there exists an algorithm called the Fast Walsh
Transform (FWT). This is a straightforward modification of the FFT. Consult any introductory
book for your own interest.
5. HADAMARD TRANSFORM (HT)
5.1 Definition
In a similar form as the Walsh transform, the 2-D Hadamard transform is defined as follows.
Forward

    H(u,v) = \frac{1}{N} \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x,y)\, (-1)^{\sum_{i=0}^{n-1} \left[ b_i(x)\, b_i(u) + b_i(y)\, b_i(v) \right]}

Inverse

    f(x,y) = \frac{1}{N} \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} H(u,v)\, (-1)^{\sum_{i=0}^{n-1} \left[ b_i(x)\, b_i(u) + b_i(y)\, b_i(v) \right]}
5.2 Properties of the Hadamard Transform
Most of the comments made for Walsh transform are valid here.
The Hadamard transform differs from the Walsh transform only in the order of basis functions.
The order of the basis functions of the Hadamard transform does not allow its fast computation
by a straightforward modification of the FFT. An extended version of the Hadamard
transform is the Ordered Hadamard Transform, for which a fast algorithm called the Fast
Hadamard Transform (FHT) can be applied.
An important property of the Hadamard transform is that, letting \mathbf{H}_N represent the matrix of order
N, the recursive relationship is given by the expression

    \mathbf{H}_{2N} = \begin{bmatrix} \mathbf{H}_N & \mathbf{H}_N \\ \mathbf{H}_N & -\mathbf{H}_N \end{bmatrix}
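The recursion translates directly into code. The sketch below (the helper name `hadamard` is ours) starts from H_1 = [1], a common normalisation choice, and checks that the rows of the resulting matrix are orthogonal and that all entries are +1 or -1:

```python
import numpy as np

def hadamard(N):
    """Build H_N (N a power of 2) via H_{2N} = [[H_N, H_N], [H_N, -H_N]]."""
    H = np.array([[1.0]])
    while H.shape[0] < N:
        H = np.block([[H, H], [H, -H]])
    return H

H8 = hadamard(8)
assert np.allclose(H8 @ H8.T, 8 * np.eye(8))   # orthogonal rows: H H^T = N I
assert set(np.unique(H8)) == {-1.0, 1.0}       # entries are only +1 or -1
```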
6. KARHUNEN-LOEVE (KLT) or HOTELLING TRANSFORM
The Karhunen-Loeve Transform or KLT was originally introduced as a series expansion for
continuous random processes by Karhunen and Loeve. For discrete signals Hotelling first studied
what was called a method of principal components, which is the discrete equivalent of the KL series
expansion. Consequently, the KL transform is also called the Hotelling transform or the method of
principal components. The term KLT is the most widely used.
6.1 The case of many realisations of a signal or image (Gonzalez/Woods)
The concepts of eigenvalue and eigenvector are necessary to understand the KL transform.
If \mathbf{C} is a matrix of dimension n \times n, then a scalar \lambda is called an eigenvalue of \mathbf{C} if there is a nonzero
vector \mathbf{e} in R^n such that

    \mathbf{C} \mathbf{e} = \lambda \mathbf{e}

The vector \mathbf{e} is called an eigenvector of the matrix \mathbf{C} corresponding to the eigenvalue \lambda.
(If you have difficulties with the above concepts consult any elementary linear algebra book.)
Consider a population of random vectors of the form

    \mathbf{x} = [x_1\ x_2\ \dots\ x_n]^T

The mean vector of the population is defined as

    \mathbf{m}_x = E\{\mathbf{x}\}

The operator E refers to the expected value of the population, calculated theoretically using the
probability density functions (pdf) of the elements x_i and the joint probability density functions
between the elements x_i and x_j.
The covariance matrix of the population is defined as

    \mathbf{C}_x = E\{ (\mathbf{x} - \mathbf{m}_x)(\mathbf{x} - \mathbf{m}_x)^T \}

Because \mathbf{x} is n-dimensional, \mathbf{C}_x and (\mathbf{x} - \mathbf{m}_x)(\mathbf{x} - \mathbf{m}_x)^T are matrices of order n \times n. The element c_{ii}
of \mathbf{C}_x is the variance of x_i, and the element c_{ij} of \mathbf{C}_x is the covariance between the elements x_i and
x_j. If the elements x_i and x_j are uncorrelated, their covariance is zero and, therefore, c_{ij} = c_{ji} = 0.
For M vectors from a random population, where M is large enough, the mean vector and covariance
matrix can be approximately calculated from the vectors by using the following relationships, where
all the expected values are approximated by summations

    \mathbf{m}_x = \frac{1}{M} \sum_{k=1}^{M} \mathbf{x}_k

    \mathbf{C}_x = \frac{1}{M} \sum_{k=1}^{M} \mathbf{x}_k \mathbf{x}_k^T - \mathbf{m}_x \mathbf{m}_x^T
Very easily it can be seen that \mathbf{C}_x is real and symmetric. In that case a set of n orthonormal (at this
point you are familiar with that term) eigenvectors always exists. Let \mathbf{e}_i and \lambda_i, i = 1, 2, \dots, n, be
this set of eigenvectors and corresponding eigenvalues of \mathbf{C}_x, arranged in descending order so that
\lambda_i \geq \lambda_{i+1} for i = 1, 2, \dots, n-1. Let \mathbf{A} be a matrix whose rows are formed from the eigenvectors of
\mathbf{C}_x, ordered so that the first row of \mathbf{A} is the eigenvector corresponding to the largest eigenvalue, and
the last row the eigenvector corresponding to the smallest eigenvalue.
Suppose that \mathbf{A} is a transformation matrix that maps the vectors \mathbf{x} into vectors \mathbf{y} by using the
following transformation

    \mathbf{y} = \mathbf{A} (\mathbf{x} - \mathbf{m}_x)
The above transform is called the Karhunen-Loeve or Hotelling transform. The mean of the
vectors \mathbf{y} resulting from the above transformation is zero (try to prove that)

    \mathbf{m}_y = \mathbf{0}

the covariance matrix is (try to prove that)

    \mathbf{C}_y = \mathbf{A} \mathbf{C}_x \mathbf{A}^T

and \mathbf{C}_y is a diagonal matrix whose elements along the main diagonal are the eigenvalues of \mathbf{C}_x (try
to prove that)

    \mathbf{C}_y = \begin{bmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{bmatrix}

The off-diagonal elements of the covariance matrix \mathbf{C}_y are 0, so the elements of the vectors \mathbf{y} are
uncorrelated.
Let’s try to reconstruct any of the original vectors \mathbf{x} from its corresponding \mathbf{y}. Because the rows of \mathbf{A}
are orthonormal vectors (why?), \mathbf{A}^{-1} = \mathbf{A}^T, and any vector \mathbf{x} can be recovered from its
corresponding vector \mathbf{y} by using the relation

    \mathbf{x} = \mathbf{A}^T \mathbf{y} + \mathbf{m}_x

Suppose that instead of using all the eigenvectors of \mathbf{C}_x we form a matrix \mathbf{A}_K from the K
eigenvectors corresponding to the K largest eigenvalues, yielding a transformation matrix of order
K \times n. The vectors \mathbf{y} would then be K-dimensional, and the reconstruction of any of the original
vectors would be approximated by the following relationship

    \hat{\mathbf{x}} = \mathbf{A}_K^T \mathbf{y} + \mathbf{m}_x
The mean square error between the perfect reconstruction \mathbf{x} and the approximate reconstruction \hat{\mathbf{x}} is
given by the expression

    e_{ms} = \sum_{j=1}^{n} \lambda_j - \sum_{j=1}^{K} \lambda_j = \sum_{j=K+1}^{n} \lambda_j

By using \mathbf{A}_K instead of \mathbf{A} for the KL transform we achieve compression of the available data.
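The whole derivation can be sketched numerically; the synthetic correlated population below is an assumption for illustration. The sketch checks that C_y is diagonal with the eigenvalues of C_x on its diagonal, and that the mean square error of a truncated reconstruction equals the sum of the discarded eigenvalues:

```python
import numpy as np

rng = np.random.default_rng(0)

# Population of M correlated n-dimensional random vectors (rows of X)
M, n = 10000, 3
base = rng.normal(size=(M, 1))
X = np.hstack([base,
               0.9 * base + 0.1 * rng.normal(size=(M, 1)),
               0.2 * rng.normal(size=(M, 1))])

mx = X.mean(axis=0)                       # mean vector m_x
Cx = np.cov(X, rowvar=False, bias=True)   # covariance matrix C_x

# Eigen-decomposition of C_x; sort eigenvalues in descending order
lam, E = np.linalg.eigh(Cx)
order = np.argsort(lam)[::-1]
lam, E = lam[order], E[:, order]
A = E.T                                   # rows of A are the eigenvectors

Y = (X - mx) @ A.T                        # KLT: y = A (x - m_x)
Cy = np.cov(Y, rowvar=False, bias=True)
assert np.allclose(Cy, np.diag(lam), atol=1e-8)   # C_y is diagonal

# Keep only the K largest eigenvalues: approximate reconstruction
K = 1
A_K = A[:K, :]
X_hat = (X - mx) @ A_K.T @ A_K + mx
mse = np.mean(np.sum((X - X_hat) ** 2, axis=1))
assert np.isclose(mse, lam[K:].sum(), rtol=1e-6)  # e_ms = sum of discarded eigenvalues
```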
6.2 The case of one realisation of a signal or image
The derivation of the KLT for the case of one image realisation assumes that the two dimensional
signal (image) is ergodic. This assumption allows us to calculate the statistics of the image using
only one realisation. Usually we divide the image into blocks and apply the KLT in each block.
This is reasonable because the 2-D field is likely to be ergodic within a small block, whereas the nature
of the signal may change across the whole image. Let’s suppose that \mathbf{x} is a vector obtained by
lexicographic ordering of the pixels f(x,y) within a block of size M \times M (placing the rows of the
block sequentially).
The mean of the random field inside the block is a scalar that is estimated by the approximate
relationship

    m = \frac{1}{M^{2}} \sum_{x=0}^{M-1} \sum_{y=0}^{M-1} f(x,y)

and the covariance matrix of the 2-D random field inside the block is

    \mathbf{C}_x = E\{ (\mathbf{x} - m)(\mathbf{x} - m)^T \}

where the expected values are again approximated by averages over the available block samples.
After knowing how to calculate the matrix , the KLT for the case of a single realisation is the same
as described above.
6.3 Properties of the Karhunen-Loeve transform
Despite its favourable theoretical properties, the KLT is not used in practice for the following
reasons.
Its basis functions depend on the covariance matrix of the image, and hence they have to be
recomputed and transmitted for every image.
Perfect decorrelation is not possible, since images can rarely be modelled as realisations of
ergodic fields.
There are no fast computational algorithms for its implementation.
REFERENCES
[1] Digital Image Processing by R. C. Gonzalez and R. E. Woods, Addison-Wesley Publishing
Company, 1992.
[2] Two-Dimensional Signal and Image Processing by J. S. Lim, Prentice Hall, 1990.