Chapter 5                            Image Compression
Introduction
   •   Image compression is a type of data compression applied to digital images, to reduce
       their cost for storage or transmission. It is concerned with minimizing the number of bits
       required to represent an image. The basis of the reduction process is the removal of
       redundant data.
   •   Application: Transmission and Storage of information
   •   Data redundancy is a control issue in digital image Compression
Why Compression is necessary?
   •  To reduce the volume of data to be transmitted (text, fax, images)
   •  To reduce the bandwidth required for transmission and to reduce storage requirements
      (speech, audio, video)
How is it possible?
   •   How is compression possible?
         – Redundancy in digital audio, image, and video data
         – Properties of human perception
  • Digital audio is a series of sample values; image is a rectangular array of pixel values;
     video is a sequence of images played out at a certain rate
Human Perception Factors
   •   Neighboring sample values are correlated. Compressed version of digital audio, image,
       video need not represent the original information exactly .Perception sensitivities are
       different for different signal patterns. Human eye is less sensitive to the higher spatial
       frequency components than the lower frequencies (transform coding)
Redundancy
   • Adjacent audio samples are similar (predictive encoding); samples corresponding to
     silence (silence removal)
   •   In digital image, neighboring samples on a scanning line are normally similar (spatial
       redundancy)
   •   In digital video, in addition to spatial redundancy, neighboring images in a video
       sequence may be similar (temporal redundancy)
Compression algorithms remove redundancy
If more data are used than is strictly necessary, then we say that there is redundancy in the
dataset. Data redundancy is not abstract concept but a mathematically quantifiable entity . If n1
and nc denote the number of information carrying units in two data sets that represent the same
information, the relative data redundancy RD of the first data set ( n1 ) can be defined as
RD= 1 – 1/CR
Where CR is compression ratio, defined as CR = n1/nc
Where n1 is the number of information carrying units used in the uncompressed dataset and nc is
the number of units in the compressed dataset. The same units should be used for n1 and nc; bits
or bytes are typically used. When nc<<n1, CR large value and RD 1. Larger values of C
indicate better compression. When nc>>n1 , CR 0 and RD large value.Undesirable Case.
When lossy compression techniques are employed, the decompressed image will not be identical
to the original image
       Classification
   •   Lossless compression
           •   lossless compression for legal and medical documents, computer programs
           •   exploit only data redundancy
   •    Lossy compression
                 •   digital audio, image, video where some errors or loss can be tolerated
                 •   exploit both data redundancy and human perception properties
   •    Constant bit rate versus variable bit rate coding
Three basic types of redundancy can be identified in a single image:
   1) Coding redundancy
   2) Inter-pixel redundancy
   3) Psycho-visual redundancy
5.1 Coding Redundancy
   -    Our quantized data is represented using code words.
The code words are ordered in the same way as the intensities that they represent; thus the bit
pattern 00000000, corresponding to the value 0, represents the darkest points in an image and the
bit pattern 11111111, corresponding to the value 255, represents the brightest points.
   -     if the size of the code word is larger than is necessary to represent all quantization levels,
        then we have coding redundancy
An 8-bit coding scheme has the capacity to represent 256 distinct levels of intensity in an image .
But if there are only 16 different grey levels in a image, the image exhibits coding redundancy
because it could be represented using a 4-bit coding scheme. Coding redundancy can also arise
due to the use of fixed-length code words.
Grey level histogram of an image also can provide a great deal of insight into the construction of
codes to reduce the amount of data used to represent it .
Let us assume, that a discrete random variable rk in the interval (0,1) represents the grey levels
of an image and that each rk occurs with probability Pr(rk). Probability can be estimated from the
histogram of an image using
Pr(rk) = hk/n for k = 0,1……L-1
Where L is the number of grey levels and hk is the frequency of occurrence of grey level k (the
number of times that the kth grey level appears in the image) and n is the total number of the
pixels in the image. If the number of the bits used to represent each value of rk is l(rk), the
average number of bits required to represent each pixel is :
          L 1
  Lavg   l (rk )Pr (rk )
          k 0
5.1.1 Huffman Coding
   1. Ranking pixel values in decreasing order of their probability
   2. Pair the two values with the lowest probabilities, labeling one of them with 0 and other
      with 1.
   3. Link two symbols with lowest probabilities.
   4. Go to step 2 until you generate a single symbol which probability is 1.
   5. Trace the coding tree from a root.
      Assigns fewer bits to symbols that appear more often and more bits to the symbols that
       appear less often
      Efficient when occurrence probabilities vary widely
      Huffman codebook from the set of symbols and their occurring probabilities
      Two properties:
            generate compact codes
            prefix property
Consider the symbols with different Probabilities
       Symbol      Probability
          a1       0.1
          a2       0.4
          a3       0.06
          a4       0.1
          a5       0.04
          a6       0.3
   •   The First step in Huffman’s approach => Create a series of source reductions by ordering
       the probabilities. [ Descending order ]
   •   The Second step => Code each reduced source starting with the smallest source and
       working back to the original source.
5.2 Inter-pixel Redundancy
The intensity at a pixel may correlate strongly with the intensity value of its neighbors. Because
the value of any given pixel can be reasonably predicted from the value of its neighbors much of
the visual contribution of a single pixel to an image is redundant; it could have been guessed on
the bases of its neighbor’s values.
We can remove redundancy by representing changes in intensity rather than absolute intensity
values .For example the differences between adjacent pixels can be used to represent an image.
Transformation of this type is referred to as mappings. They are called reversible if the original
image elements can be reconstructed from the transformed data set.
Run length encoding is one of the example which also take advantage of inter pixel redundancy.
A “run: of consecutive pixels whose gray levels are identical is replaced with two values: the
length of the run and the gray level of all pixels in the run. Example (50, 50,50,50) becomes
(4,50).Especially suited for synthetic images containing large homogeneous regions . The
encoding process is effective only if there are sequences of 4 or more repeating characters.
Applications – compression of binary images to be faxed.
5.2.1 Run Length Coding (Losseless)
    • Repeated occurrence of the same character is called a run
   •   Number of repetition is called the length of the run
   •   Run of any length is represented by three characters
           –   eeeeeee7tnnnnnnnn
           –   @e7t@n8
   •   It is supported by most bitmap file formats ( BMP, TIFF etc).
   •   It performs lossless data compression.
   •   Run length coding works by reducing the physical size of repeating string of characters.
   •   Generally represented => 2 bytes ( First byte : No of character and Second byte : value )
   •   AAAAAAAAAAAAAAA => 15A ( 15 bytes replaced by 2 bytes)
   •   Images with repeating gray value along rows (columns) can be compressed by storing
       runs of identical grey values.
   •   Compression is measured with Compression Ratio (CR)
   •   Compression Ratio (CR) = Original Size / Compressed Size : 1
   •   Consider a 16 character string => 000PPPPPPXXXXAAA => 306P4X3A
   •   CR = 16/8 : 1 => CR = 2 : 1
   •   The various Run Length Coding approaches are represented below.
    When no repeating bits need more bits to represent original image
Bit-Plane Coding
  An m-bit gray scale image can be converted into m binary images by bit-plane slicing.
  The intensities of an m-bit monochrome image can be represented in the form of the base-2
                     m-1
polynomial am-12         + am-22m-2 + … + a121+a020
Example:
Let I be the following 2x2 image where the pixels are 3 bits long
             101 110
             111 011
The corresponding 3 bitplanes are:
1      1       0    1       1    0
1      0       1    1       1    1
    An m-bit gray scale image can be converted into m binary images by bit-plane slicing.
    These individual images are then encoded using run-length coding.
  ― Code the bit-planes separately, using RLE (flatten each plane row-wise into a 1D array),
Golomb coding, or any other lossless compression technique. However, a small difference in the
gray level of adjacent pixels can cause a disruption of the run of zeroes or ones.
Eg: Let us say one pixel has a gray level of 127 and the next pixel has a gray level of 128.
In binary:
127 = 01111111 & 128 = 10000000
Therefore a small change in gray level has decreased the run-lengths in all the bit-planes!
Gray Code
    1. Gray coded images are free of this problem which affects images which are in binary
       format.
    2.    In gray code the representation of adjacent gray levels will differ only in one bit (unlike
         binary format where all the bits can change).
    Gray Code
    Let gm-1…….g1g0 represent the gray code representation of a binary number.
       g i  ai  ai 1    0i  m2
       g m1  a m1
   In gray code:
    127 = 01000000
    128 = 11000000
  To convert a binary number b1b2b3..bn-1bn to its corresponding binary reflected Gray code.
 Start at the right with the digit bn. If the bn-1 is 1, replace bn by 1-bn ; otherwise, leave it
unchanged. Then proceed to bn-1 .
  Continue up to the first digit b1, which is kept the same since it is assumed to be a b0 =0.
  The resulting number is the reflected binary Gray code.
  Dec           Gray            Binary
    0              000          000
   1             001            001
   2             011            010
   3             010            011
   4             110            100
   5             111            101
   6             101            110
   7             100            111
Decoding a gray coded image The MSB is retained as such, i.e.,
  ai  g i  ai 1   0i  m2
  am1  g m1
5.3 Psycho visual redundancy
Human eye does not respond with equal sensitivity to all visual information. Certain information
simply has less relative importance than other information in visual processing. This information
is said to be Phychovisual Redundant.It is associated with real or quantifiable visual information.
Elimination of psychovisually redundant data results in a loss of quantitative information.It is
referred as quantization.
Quantization => Mapping of broad range of input values to a limited number of output values.
It is an irreversible operation (Visual Information lost) => Lossy data compression.Consider an 8
bit image with 256 possible gray levels quantized to 16 possible gray levels .
Example First we have a image with 256 possible gray levels . We can apply uniform
quantization to four bits or 16 possible levels The resulting compression ratio is 2:1. Note , that
false contouring is present in the previously smooth regions of the original image.
The significant improvements possible with quantization that takes advantage of the peculiarities
of the human visual system . The method used to produce this result is known as improved gray-
scale ( IGS) quantization. It recognizes the eye’s inherent sensitivity to edges and breaks them up
by adding to each pixel a pseudo-random number, which is generated from the order bits of
neighboring pixels, before quantizing the result.
5.4 Image Compression Models
A typical image compression system consists of Encoder, Channel and Decoder.
   •   An input image f(x,y) is fed into the encoder which creates a set of symbols from input
       data. The decoder should be responsible to decode the encoded representation.
   •   Encoder => Source Encoder and Channel Encoder.
   •   Decoder => Source Decoder and Channel Decoder.
   •   Channel Encoder (Increases the noise immunity of the source encoder’s output).Channel
       encoder and decoder plays a vital role in the overall encoding and decoding process.
   •   Useful channel encoding technique => Hamming Code (Error detection and correction).
   •   Source Encode and Decoder
First Stage: the mapper transforms the input data into a format designed to reduce interpixel
redundancies in the input image. This operation is reversible and may or may not reduce directly
the amount of data required to represent the image. Eg. RLE
   •   Second Stage: Quantizer block reduces the accuracy of the mapper’s output in
       accordance with some pre-established fidelity criterion. This stage reduces the
       psychovisual redundancies of the input image.
   •   Final Stage: the symbol coder creates a fixed or variable length code to represent the
       quantizer output and maps the output in accordance with the code. In most cases, VLE
       code is used. It assigns the shortest code words to the most frequently occurring output
       values and thus reduces coding redundancies.
   •   In source encoding process, all 3 operations are not necessarily included in every
       compression system. No quantizer if there is error free compression. But Source decoder
       contains only two components, a symbol decoder and an inverse mapper. Because
       quantizer results in irreversible information loss, an inverse quantizer block is not
       included in the general source decoder model.
Predictive Coding
   •   Statistical estimation procedure where future random variables are predicted from past
       and present observable variables.
   •   Image Compression
                     Lossless Predictive Coding
                     Lossy Predictive Coding.
       Lossless Predictive Coding
   •   It is based on elimination of interpixel redundancy of closely spaced pixels by extracting
       and coding only the new information.
   •   The new information => (Actual pixel Value – Predicted Pixel Value).
   •   The basic components of lossless predictive coding system are shown below.
       Fig.: Lossless Predictive Coding Model a) encoder b) decoder.
   •   The encoder and decoder each contain identical predictor.
   •   At each successive pixel of input denoted by (fn) is introduced to the encoder.
   •   The predictor generates the anticipated value of that pixel based on some number of past
       inputs.
   •   The output of pixel => Rounded to the nearest integer denoted by (f’n).
•   Prediction error is given by en=fn - f’n. It is coded using variable-length code
•   The decoder reconstructs en from the received variable length code and performs inverse
    operation. fn = en + f’n
•   Prediction is formed by a linear combination of m previous pixels.
•   f’n = round
•   Where m is the order of the linear predictor.
•   for i = 1,2………….m are prediction coefficients.
•   1-D linear predictive coding can be written as
•   f’(x,y) = round [                  ]
    Lossy Predictive Coding
•   It is based on the concept of compromising the accuracy of the reconstructed image in
    exchange for increased compression. The basic components of lossy predictive coding is
    shown below.
    Fig.: A Lossy Predictive Coding Model a) encoder b) decoder.
•   The Quantizer which absorbs the nearest integer function of the error free encoder is
    inserted between the symbol encoder and the point at which the prediction error is
    formed. It maps the prediction error into a limited range of outputs denoted ėn which
    establish the amount of compression and distortion associated with lossy prediction
    coding.
•   The error free encoder donot have quantizer so the predictions generated by the encoder
    and decoder are equivalent. The above fig. shows this accomplished by placing the lossy
    encoder’s predictor within a feedback loop, where its input denoted by f’n is generated as
    a function of past predictions and the corresponding quantized error. The closed loop
    configuration prevents error build up at the decoder’s output.
                          ie      f’n = ėn + f’’n
    Example of lossy predictive coding : Delta Modulation