Efficient convolution algorithms.
Mr. Sivadasan E T
                                              Associate Professor
               Vidya Academy of Science and Technology, Thrissur
             Naive convolution
Naive convolution refers to the straightforward, brute-
force implementation of the convolution operation
without optimizations.
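When the kernel is separable, that is, when it can be
written as the outer product of d vectors (one per
dimension), naive convolution is inefficient.
The same result is obtained by composing d one-dimensional
convolutions, one with each of these vectors.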
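As a point of reference, the brute-force operation can be sketched as follows. This is a minimal illustration, assuming a 2D input, "valid" output size (no padding) and NumPy; the function name naive_conv2d is hypothetical and not taken from any library.

```python
import numpy as np

def naive_conv2d(image, kernel):
    """Direct 'valid' 2D convolution with no optimizations.

    Each output element costs one multiply-add per kernel element.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    # True convolution flips the kernel along both axes.
    flipped = kernel[::-1, ::-1]
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            window = image[i:i + kh, j:j + kw]
            out[i, j] = np.sum(window * flipped)
    return out

image = np.arange(16.0).reshape(4, 4)
kernel = np.array([[1.0, 0.0],
                   [0.0, -1.0]])
result = naive_conv2d(image, kernel)
```

The two nested loops over output positions, combined with the sum over the kernel window, are what make this version expensive for large kernels.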
 Efficient Convolution Algorithms
Modern convolutional network applications often
involve networks containing more than one million
units.
It is also possible to speed up convolution by selecting
an appropriate convolution
algorithm.
Convolution is equivalent to converting both the input
and the kernel to the frequency domain using a Fourier
transform, performing point-wise multiplication
of the two signals, and converting back to the time
domain using an inverse Fourier transform.
For some problem sizes, this can be faster than the
naïve implementation of discrete convolution.
Fourier Transform
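The frequency-domain route described above can be checked numerically. The sketch below assumes a 1D signal and NumPy's real FFT routines; fft_convolve is a hypothetical helper name. Both inputs are zero-padded to the full output length so that the circular convolution computed by the FFT matches ordinary linear convolution.

```python
import numpy as np

def fft_convolve(signal, kernel):
    """Linear convolution via point-wise multiplication in the frequency domain."""
    n = len(signal) + len(kernel) - 1  # length of the full linear convolution
    # rfft(x, n) zero-pads x to length n, which avoids circular wrap-around.
    spectrum = np.fft.rfft(signal, n) * np.fft.rfft(kernel, n)
    # Inverse transform back to the time domain.
    return np.fft.irfft(spectrum, n)

signal = np.array([1.0, 2.0, 3.0, 4.0])
kernel = np.array([0.5, 0.5])

direct = np.convolve(signal, kernel)    # direct discrete convolution
via_fft = fft_convolve(signal, kernel)  # Fourier-transform route
same = np.allclose(direct, via_fft)
```

For inputs this small the direct method is cheaper; the Fourier route tends to pay off only as the signal and kernel grow.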
             d-dimensional Kernel
A kernel in convolution is a matrix (or tensor for
higher dimensions) used for feature extraction,
filtering, or pattern matching in image processing or
neural networks.
In a d-dimensional case, the kernel operates across all
d dimensions simultaneously.
              Separable Kernel
A kernel is called separable if it can be decomposed
into the outer product of d one-dimensional vectors
(one for each dimension).
This decomposition allows the kernel to be represented
as:
        K(x_1, x_2, ..., x_d) = k_1(x_1) · k_2(x_2) · ... · k_d(x_d)
where each k_i is a 1D vector applied along one specific
dimension.
For example, in 2D:
                    K_2D = k_1 ⊗ k_2 = k_1 k_2^T
This means the 2D kernel can be expressed as the
outer product of two 1D kernels.
Example:
  2D Gaussian blur kernel:
  This can be decomposed into two 1D vectors:
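The kernel values are not reproduced in this text, so the sketch below assumes the commonly used 3x3 Gaussian blur kernel (1/16)·[[1, 2, 1], [2, 4, 2], [1, 2, 1]] purely for illustration. It checks with NumPy that this kernel is the outer product of the 1D vector (1/4)·[1, 2, 1] with itself.

```python
import numpy as np

# Assumed example kernel: a common 3x3 Gaussian blur approximation.
K = np.array([[1.0, 2.0, 1.0],
              [2.0, 4.0, 2.0],
              [1.0, 2.0, 1.0]]) / 16.0

# Candidate 1D factor, used for both dimensions.
k = np.array([1.0, 2.0, 1.0]) / 4.0

# Outer product of the two 1D vectors rebuilds the 2D kernel.
rebuilt = np.outer(k, k)
is_separable = np.allclose(K, rebuilt)

# A 2D kernel is separable exactly when its matrix rank is 1.
rank = np.linalg.matrix_rank(K)
```

The rank test gives a quick way to decide whether a given 2D kernel admits such a decomposition at all.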
Composing the d one-dimensional convolutions is
significantly faster than performing one d-dimensional
convolution with the outer product of the vectors.
The kernel also takes fewer parameters to store when it
is represented as vectors.
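The equivalence between the composed and the full approach can be verified on a small case. The sketch below assumes a 2D input, "valid" output size and NumPy; the helper names conv2d_valid and separable_conv2d_valid, and the example vectors, are hypothetical.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Direct 'valid' 2D convolution (kernel flipped along both axes)."""
    kh, kw = kernel.shape
    flipped = kernel[::-1, ::-1]
    out = np.zeros((image.shape[0] - kh + 1, image.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * flipped)
    return out

def separable_conv2d_valid(image, k_col, k_row):
    """Two 1D passes: along each row with k_row, then along each column with k_col."""
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, k_row, mode="valid"), 1, image)
    return np.apply_along_axis(
        lambda c: np.convolve(c, k_col, mode="valid"), 0, rows)

rng = np.random.default_rng(0)
image = rng.standard_normal((6, 7))
k_col = np.array([1.0, 2.0, 1.0]) / 4.0  # applied along the vertical axis
k_row = np.array([1.0, 0.0, -1.0])       # applied along the horizontal axis

full = conv2d_valid(image, np.outer(k_col, k_row))
composed = separable_conv2d_valid(image, k_col, k_row)
match = np.allclose(full, composed)
```

The 2D kernel here stores 9 values while the two vectors store 6; the gap widens quickly as the kernel width or the number of dimensions grows.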
            Naïve and Decomposed
• If the kernel is w elements wide in each dimension,
• then naive multidimensional convolution requires O(w^d)
  runtime and parameter storage space,
• while separable convolution requires O(w × d) runtime
  and parameter storage space.
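To make the gap between the two bounds concrete, the sketch below counts kernel elements, which is also the number of multiply-adds per output element, for a few kernel sizes. The function names and the chosen (w, d) pairs are illustrative assumptions.

```python
def naive_cost(w, d):
    """Elements in a full d-dimensional kernel of width w: w^d."""
    return w ** d

def separable_cost(w, d):
    """Elements in d one-dimensional vectors of width w: w * d."""
    return w * d

for w, d in [(3, 2), (5, 2), (5, 3), (11, 3)]:
    print(f"w={w}, d={d}: naive={naive_cost(w, d)}, "
          f"separable={separable_cost(w, d)}")
```

For example, a kernel of width 5 in 3 dimensions has 125 elements in full form but only 15 when stored as three vectors.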
• Of course, not every convolution can be represented
  in this decomposed way.
• Devising faster ways of performing convolution or
  approximate convolution without harming the
  accuracy of the model is an active area of research.
Thank You!