Chrominance Subsampling in Digital Images

Douglas A. Kerr
Issue 3
January 19, 2012

ABSTRACT
The JPEG and TIFF digital still image formats, along with various digital video
formats, have provision for recording the chrominance information (which conveys
in a special way what the lay person would describe as the “color” of the pixels) in
a resolution lower than that of the image being encoded. This concept, followed for
over half a century in television broadcasting, takes advantage of the properties of
the human perceptual system to reduce the amount of data required to convey an
acceptable full-color image of certain pixel dimensions. There are various standard
“patterns” for performing this “chrominance subsampling”, and several curious and
confusing systems of notation for indicating them. In this article we discuss the
concept of chrominance subsampling and describe various systems of notation
used in this area.

BACKGROUND
The color space
A digital image that is to be encoded using the JPEG image data coding and
compression system, one form of the TIFF image coding system, and various digital
video formats is first put into what is called a luma-chrominance color space. In
this form, the color of a pixel is described by two values, one (luma) essentially
(but not exactly) describing its luminance (brightness), and one (chrominance)
describing what a lay person would think of as its “color”. The latter is a slightly
different concept from the basic color science concept of chromaticity, but we
need not concern ourselves here with the distinction. The metric for chrominance
is, as we might expect, two-dimensional in the mathematical sense: two numerical
values are actually required to express it (a total of three values for the color).

As hinted at just above, the first value in this scheme does not actually describe
the luminance of the pixel’s color. As a result, it is often called “luma”, a term
borrowed from the analog system used for television signals. This term is a tip that
the value does not quite describe luminance, because of its nonlinear form. And in
fact, paralleling this, the value pair giving the chrominance is also sometimes called
“chroma”, again primarily to tip us off to its nonlinear form. But here we will use
the term chrominance, as it best matches normal editorial practice for the topic
area we are considering.

Thus, for each pixel, there are three numerical values that collectively describe its
color. They are identified as Y, Cb, and Cr. Y is the luma value, and Cb and Cr
collectively form the chrominance value. These are derived from an RGB color
space, where R, G, and B are nonlinear representations of the relative contributions
of three primary chromaticities (also called R, G, and B).

Copyright 2005, 2012 Douglas A. Kerr. May be reproduced and/or distributed but only intact, including this
notice. Brief excerpts may be reproduced with credit.

Chrominance subsampling
During the early work on color television systems (analog, of course), note was
taken of the fact that the human eye is able to discern finer detail conveyed by
differences in luminance than for detail conveyed by differences in chromaticity.
The encoding scheme adopted there separately conveys the luminance-related
value luma and the chromaticity-related value chroma (chrominance) over
“subchannels” having different bandwidth (and thus supporting different levels of
resolution)—the chrominance subchannel having reduced resolution capabilities.
The result was a system that well matched human perceptual response, allowing
the conveyance of quality images with less overall bandwidth requirement than if
equal bandwidth were allocated to luma and chrominance information.

Not surprisingly, the developers of systems for the encoding of digital still images
decided to exploit this same consideration to get the “biggest bang for the bit” in
digital images being prepared for transmission or storage. There, the process is
called chrominance subsampling.

Simply stated, here is the principle. We include in the digital data stream to be
encoded by the JPEG system the luma value (Y) for each pixel in the image. But we
only include a single Cb+Cr pair (a “chrominance value”, often described as a
chrominance sample) for a group of pixels—which in the schemes generally
recognized can comprise 2, 4, or even 8 image pixels. Thus the data load for the
chrominance information—which otherwise would be twice that for the luma
information (Y, Cb, and Cr are all recorded in the same number of bits, usually 8)—
is now reduced by a factor of 2, 4, or even 8.
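
As a rough illustration of the arithmetic above, the following Python sketch computes the average number of bytes per image pixel when one Cb+Cr pair is shared by a group of pixels. The function name is hypothetical, and one byte (8 bits) per value is assumed, as in the text.

```python
def bytes_per_pixel(pixels_per_chroma_sample, bytes_per_value=1):
    """Average bytes per image pixel: one Y per pixel plus a shared Cb+Cr pair."""
    luma = bytes_per_value  # every image pixel carries its own Y value
    # Cb and Cr (two values) are shared by the whole group of pixels.
    chroma = 2 * bytes_per_value / pixels_per_chroma_sample
    return luma + chroma

for group_size in (1, 2, 4, 8):
    print(group_size, bytes_per_pixel(group_size))
```

With no subsampling the chrominance load is twice the luma load (3 bytes per pixel in total); with groups of 2, 4, or 8 pixels the chrominance share falls to 1, 0.5, or 0.25 bytes per pixel.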

In fact, it is often useful to think of this in terms of the chrominance being given
for “chrominance pixels” which are 2, 4, or even 8 times the size of the image
pixels.

This process is sometimes spoken of as “chrominance decimation”, where
decimation (in this context) essentially “means thinning out a data set by discarding
all but a certain fraction of the values”.¹ However, the way chrominance
subsampling is usually done does not exactly fit that definition.

“Siting” of the chrominance samples


This now leads to another issue. Suppose we are using a pattern in which the
“chrominance pixel” is twice as wide and twice as high as an image pixel. Should
its “centroid” be at the center of an image pixel, or should it be at the center of the
group of four image pixels? In fact, there can be advantages to each, and both
possibilities are potentially available for each subsampling pattern. We’ll hear more
about that later.

¹ Decimation originally referred to the practice in Roman times of killing one-tenth of the citizens of
a rebellious town. It later came to be misunderstood to mean keeping only one-tenth of a population
of items (perhaps data points), and was then broadened to the more general meaning used today.

[Figure 1. Chrominance subsampling patterns (centered alignment). Part a shows the image pixels and chrominance pixels for each pattern, with the centroid of each chrominance pixel marked. Part b shows the subsampling pattern notation: for each pattern identifier, a reference "block" in which dots mark where a chrominance sample is present and where there is none, with the corner of the pixel block of part a indicated.]

Relative chrominance resolution (H: horizontal, V: vertical, T: total) and the number of chrominance samples in the top and bottom rows of the reference block:

Pattern   H     V     T     Top row   Bottom row
4:4:4     1/1   1/1   1/1   4         4
4:4:0     1/1   1/2   1/2   4         0
4:2:2     1/2   1/1   1/2   2         2
4:2:0*    1/2   1/2   1/4   2         0
4:1:1     1/4   1/1   1/4   1         1
4:1:0     1/4   1/2   1/8   1         0

* This is the most common "centered" form for 4:2:0 for still images; others are used in video.

SUBSAMPLING PATTERNS
Figure 1 shows, in part a, six chrominance subsampling patterns (actually, the first
one is no subsampling at all), including all the ones widely used in common image
encoding schemes. These patterns are identified by a notation system we will
describe shortly.

Each example shows a portion of the original image 8 pixels wide and 4 pixels
high, and indicates (with heavy lines) the boundaries of the “chrominance pixels”.
The chrominance of all the image pixels covered by each chrominance pixel is
averaged and included (as a pair of Cb and Cr values) in the image data for the
chrominance pixel. The dots show the centroids of these chrominance pixels, and
also help us do a visual “head count” of the chrominance values. Note that all
these examples show the “centered” alignment: the centroids of the chrominance
pixels are located in the center of the set of the centroids of the associated
luminance pixels. The chrominance pixels each embrace a set of integral image
pixels.
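
The averaging described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a plain arithmetic mean is used, the centered alignment is assumed, and the function name and sample data are hypothetical.

```python
def subsample_plane(plane, h, v):
    """Average one chrominance plane (a list of rows) over blocks h wide and v high."""
    rows, cols = len(plane), len(plane[0])
    out = []
    for top in range(0, rows, v):
        out_row = []
        for left in range(0, cols, h):
            # Collect every image pixel covered by this chrominance pixel.
            block = [plane[r][c]
                     for r in range(top, min(top + v, rows))
                     for c in range(left, min(left + h, cols))]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out

# A Cb plane for a region 8 pixels wide and 4 pixels high, as in the figure.
cb = [[10, 20, 30, 40, 50, 60, 70, 80],
      [10, 20, 30, 40, 50, 60, 70, 80],
      [90, 90, 90, 90, 10, 10, 10, 10],
      [90, 90, 90, 90, 10, 10, 10, 10]]

print(subsample_plane(cb, 2, 2))  # chrominance pixels 2 wide and 2 high (4:2:0)
print(subsample_plane(cb, 4, 1))  # chrominance pixels 4 wide and 1 high (4:1:1)
```

The same routine would be applied separately to the Cb and Cr planes; the luma plane is left untouched.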

Just below the indicator for each pattern (e.g., 4:4:4; don’t worry for the moment
about what that means or why) we show how the resolution of the chrominance
pixels compares to the resolution of the image itself. The H value is the relative
resolution in the horizontal direction, the V value is the relative resolution in the
vertical direction, and the T (“total”) value is the relative resolution in terms of pixel
count (sometimes called the “areal” resolution), all as fractions.

Note that each image pixel gets a luma value (luma sample). In most writings about
this matter, resolution comparisons are made between the “chrominance samples”
and “luma samples”, rather than between the “chrominance pixels” and “image
pixels”, as we do here. And often the “ratio” is described other-side up as a
sampling factor—a sampling factor of “4” in the horizontal or vertical direction
means a resolution of 1/4 the image (or luma) resolution.

The first pattern shown (4:4:4) is in fact the case where there is really no
chrominance subsampling at all—every image pixel has its chrominance value
included.

There are two patterns (4:4:0 and 4:2:2) which have chrominance pixels twice the
size of image pixels (T:1/2). In the first of these the (rectangular) chrominance
pixels are vertically-oriented, and in the other, horizontally-oriented. There are two
patterns (4:2:0 and 4:1:1) which have chrominance pixels four times the size of
image pixels (T:1/4). In the first of these the chrominance pixels are square, and in
the other, rectangular and horizontally-oriented.

In the last pattern (4:1:0), the chrominance pixels are eight times the size of the
image pixels (T:1/8), and are rectangular and horizontally-oriented.

Note that in the specification for the kind of JPEG image file used today by most
digital still cameras (the JPEG Exif file), only two of these patterns are allowed:
4:2:2 and 4:2:0.²

² The 4:2:0 scheme is often incorrectly identified as “4:1:1”. The origin of this widespread error is
not known to me.

[Figure 2. Image and chrominance pixels (co-sited alignment). For each pattern the figure shows the image pixels and chrominance pixels, with the centroid of each chrominance pixel marked.]

Relative chrominance resolution (H: horizontal, V: vertical, T: total):

Pattern   H     V     T
4:4:4     1/1   1/1   1/1
4:4:0     1/1   1/2   1/2
4:2:2     1/2   1/1   1/2
4:2:0     1/2   1/2   1/4
4:1:1     1/4   1/1   1/4
4:1:0     1/4   1/2   1/8

Chrominance pixel alignment


The examples in Figure 1 all show the arrangement when the implied chrominance
pixel actually embraces a number of full image pixels (known as the “centered”
alignment). There, each implied chrominance pixel is centered on the center of the
related pixel block.

In figure 2, we see the other alternative (the “co-sited” alignment) in one form.
There, each implied “chrominance pixel” is centered on the upper-left image pixel
of the related pixel block.

Some implications of this will be discussed in a later section.



THE BOTTOM LINE


The intricacies of the charts above (and of the common notation system for
subsampling patterns, already glimpsed above, and to be explained shortly) hide the
fact that, for the cases of common interest to us, the subsampling pattern can
really be described by two numbers of simple meaning: the horizontal and vertical
subsampling factors:

• The horizontal subsampling factor tells us for how many image pixels, in the
horizontal direction, there is one chrominance “sample” (Cb+Cr). If that factor is
4, then there is one chrominance sample for every 4 image pixels in the
horizontal direction.

• The vertical subsampling factor tells us for how many image pixels, in the
vertical direction, there is one chrominance “sample” (Cb+Cr). If that factor is
1, then there is one chrominance sample for every image pixel (that is, for every
row of image pixels) in the vertical direction.

Often, these two defining factors are called “h” and “v”, respectively, and are
often written in the form (for the examples above): “4x1” or “4/1”. Note that the
latter does not in any way have the significance of a fraction.
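
To make the meaning of the two factors concrete, here is a small Python sketch that works out how many chrominance samples an image needs for given h and v. The function name is hypothetical, and rounding up for partial blocks at the right and bottom edges is an assumption of this sketch, not something stated in this article.

```python
import math

def chroma_plane_size(width, height, h, v):
    """Number of chrominance samples across and down for subsampling factors h and v."""
    # Assumption: a partial block at the right or bottom edge still gets a sample.
    return math.ceil(width / h), math.ceil(height / v)

# The 8 x 4 pixel region used in the figures, with the "4x1" and "2x2" patterns.
print(chroma_plane_size(8, 4, 4, 1))
print(chroma_plane_size(8, 4, 2, 2))
```

For the 8 x 4 region, “4x1” gives 2 samples across and 4 down, and “2x2” gives 4 across and 2 down; in both cases there are 8 chrominance samples for 32 image pixels.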

SUBSAMPLING PATTERN NOTATION


Unfortunately, the subsampling patterns we encounter are not ordinarily described
by the straightforward “h/v” notation, but rather by something far more arcane. We
saw it in the figures above, and now we are ready to tackle it. We can follow the
action on part b of figure 1.

The scheme indicator is of the form J:a:b. The notation revolves around the
concept of a “reference block”—a conceptual region J image pixel spacings wide
and 2 image pixel spacings high. (For all schemes we encounter, J, by convention,
is 4.) This block is not necessarily exactly aligned with the grid of image pixels (and
luminance values). The small chevron at the upper left of each reference block
shows the relative location of the upper left corner of the block of image pixels as
shown to the left.

The dots in the figure (white and black) represent the chrominance samples (each
recorded as a Cb value plus a Cr value) that would exist if there were no
subsampling. The black dots show the chrominance values that actually exist for
this scheme.

Note that, if we consider our reference block, the indicator value a shows the
number of chrominance samples actually present in the top row of the block; the
indicator value b shows the number of chrominance samples actually present in
the bottom row of the block. We see that emphasized by the little figures to the
left of the reference block in the figure.

Note that there is a one-to-one correspondence between the black dots in part b of
the figure and the little black dots indicating the centroids of the chrominance
pixels in part a of the figure.

Note that the 4:2:2 pattern could as well have been designated “2:1:1”, as the
purpose of the notation is to convey relative sampling “frequencies”. However, for
patterns where the ratios involve only the numbers 1, 2, and/or 4, it is customary
to always make J=4. There are patterns, used in some specialized video systems,
in which J is 3, thus accommodating these patterns’ chrominance subsampling
factor of 3 in the horizontal direction.

Relationship with “h/v” notation


The correspondence between the J:a:b notation and the “h/v” notation is shown
here for all the possible variations (including some rarely-encountered ones):

J:a:b h/v
4:4:4 1/1
4:4:0 1/2
4:2:2 2/1
4:2:0 2/2
4:1:1 4/1
4:1:0 4/2
Irregular notation
Recall that the vertical subsampling factor is expressed in the J:a:b notation in
terms of a pattern of two consecutive rows of pixels. The scheme only allows for
values of “v” of 1 and 2, as follows:
v=2: b=0
v=1: b=a

In some situations, we encounter a pattern in which the vertical subsampling
factor v is 4. This cannot be represented by the J:a:b notation as defined above.

A special convention has apparently been adopted to cater to this. It works like
this:

J:a:b h/v
4:4:1 1/4
4:2:1 2/4

Basically, if J is 4 and b is 1, but a is not 1, then the vertical subsampling factor is 4.

(One can construct all sorts of clever rationalizations for this; I leave that exercise
to the reader.)
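
The rules above for turning a J:a:b indicator into “h/v” factors, including the irregular convention, can be summarized in a short Python sketch. The function name is hypothetical, and the sketch covers only the cases discussed in this article.

```python
def jab_to_hv(notation):
    """Convert a 'J:a:b' indicator to the (h, v) subsampling factors."""
    j, a, b = (int(part) for part in notation.split(":"))
    if a == 0 or j % a != 0:
        raise ValueError("unsupported pattern: " + notation)
    h = j // a  # a samples across a block J pixels wide
    if b == 0:
        v = 2   # no samples in the bottom row of the reference block
    elif b == a:
        v = 1   # both rows carry the same number of samples
    elif j == 4 and b == 1 and a != 1:
        v = 4   # the irregular convention described above
    else:
        raise ValueError("unsupported pattern: " + notation)
    return h, v

for pattern in ("4:4:4", "4:4:0", "4:2:2", "4:2:0", "4:1:1", "4:1:0", "4:4:1", "4:2:1"):
    print(pattern, jab_to_hv(pattern))
```

Note that the regular test b == a is made before the irregular one, so that 4:1:1 keeps its ordinary meaning of 4/1.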

Misunderstandings
Not surprisingly, this peculiar system of notation has been subject to some
misunderstandings, unfortunately widespread. We will mention three of them here.

The meaning of a and b in the “J:a:b” notation


Often, especially in the area of digital video work, we hear the subsampling pattern
notation system described this way:

“The first number gives the number of luma samples that we consider. The
second number gives the number of Cb values over that span, and the third
number gives the number of Cr values over that span.”

This is generally followed by something like this:

“Notations such as 4:2:0 do not follow the rule.” (No kidding!)

Note that the erroneous definition does in fact appear to be true when a=b.

We will see later that this in fact describes a different notation system that has
been used in the past; it does not apply to the system mostly encountered today
(which is why it seems anomalous).

4:2:0 vs. 4:1:1


Very commonly, the 4:2:0 pattern is erroneously described as “4:1:1”. The author
has not been able to track down the origin of this error.

This error is found in many image editing packages offering the opportunity to
select different subsampling patterns when an image is saved in JPEG form.

U and V vs. Cb and Cr


This is not really an error, but a matter of editorial practice. It can however be
confusing in following the literature.

Often we will hear the Cb and Cr values described as U and V.

U and V are the coordinates of the YUV color space, which underlies
the YCbCr color space. Cb and Cr are the quantized digital representations of the U
and V values of a color in the YUV color space. Thus it may be reasonable to
speak, conceptually, of the chrominance of a pixel itself in terms of U and V, or of
a chrominance sample as comprising U and V values. However, in a digital image
context, it is more useful to make reference to Cb and Cr (which is how the values
are designated in the actual digital image data).

REPRESENTATION IN Exif FILES


Two different ways of representing the chrominance subsampling are used in Exif
files. We would not ordinarily be interested here in such “internal” representations,
but in fact two systems used to present a subsampling pattern, or even to set it in
an image-generating program, flow directly from these. Those “human” notation
schemes are best understood by first looking at the “file” context.

Uncompressed JPEG Exif files


In an uncompressed JPEG Exif file (rarely encountered), the subsampling pattern is
represented in the most straightforward way we will encounter.

The metadata tag YCbCrSubSampling comprises two 16-bit numbers, the
horizontal and vertical “subsampling factors”. These are just the horizontal and
vertical subsampling factors, h and v, discussed above.

In compressed JPEG Exif files


In a compressed JPEG Exif file (the type we almost always encounter in digital
photography), a different scheme of representing the subsampling pattern is used.

Here, in marker SOF0, there are six 4-bit values (each pair packed into a single
byte), designated H1, V1, H2, V2, H3, and V3. Each pair (e.g., H1 and V1) is listed
in the portion of the marker pertaining to one of the three “components” of the
image, Y, Cb, and Cr. They are said to be the chrominance subsampling factors, in
the horizontal and vertical directions, of those three components.

But that is misleading as to H1 and V1, since there is no subsampling of the Y
(luma) component. Actually, those two values are reference values. They can be
thought of as describing the horizontal and vertical dimensions (in pixels) of a block
of pixels defined only for purposes of stating the subsampling arrangement. (They
are rather like the value “J” in the J:a:b scheme of notation.)
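
For readers who want to look at these values in an actual file, the following Python sketch pulls the H and V values out of an SOF0 segment. It rests on my understanding of the baseline JPEG frame header layout (marker, two-byte length, precision, height, width, component count, then three bytes per component with H in the upper four bits and V in the lower four bits of the middle byte); the function name and the example bytes are hypothetical.

```python
import struct

def sof0_sampling_factors(segment):
    """Return [(H1, V1), (H2, V2), ...] from bytes that begin at the SOF0 marker."""
    if segment[0:2] != b"\xff\xc0":
        raise ValueError("not an SOF0 segment")
    # Assumed layout: length, sample precision, height, width, number of components.
    _length, _precision, _height, _width, n_components = struct.unpack(
        ">HBHHB", segment[2:10])
    factors = []
    for i in range(n_components):
        _component_id, hv, _table = segment[10 + 3 * i: 13 + 3 * i]
        factors.append((hv >> 4, hv & 0x0F))  # upper nibble is H, lower nibble is V
    return factors

# A hand-built example segment for a 640 x 480 image with H1,V1 = 2,2 and 1,1 elsewhere.
example = (b"\xff\xc0" + struct.pack(">HBHHB", 17, 8, 480, 640, 3)
           + bytes([1, 0x22, 0, 2, 0x11, 1, 3, 0x11, 1]))
print(sof0_sampling_factors(example))
```

Locating the SOF0 segment within a complete file, and handling other frame types, is outside the scope of this sketch.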

The subsampling factors (in the same sense as mentioned earlier) for Cb and Cr are
these:

For Cb: horizontal (h) = H1/H2; vertical (v) = V1/V2

For Cr: horizontal (h) = H1/H3; vertical (v) = V1/V3

Of course, in most cases of interest, the subsampling factors are the same for Cb
and Cr, and among other things, this means that H3=H2 and V3=V2.

In figure 3 we show the implications of 12 patterns of the H- and V- values, both in
J:a:b notation and h/v notation. The reason for the choice of this particular
repertoire will be seen shortly.

Compressed JPEG Exif file


H1 V1 H2 V2 H3 V3 J:a:b h/v
1 1 1 1 1 1 4:4:4 1/1
1 2 1 1 1 1 4:4:0 1/2
1 4 1 1 1 1 4:4:1* 1/4
1 4 1 2 1 2 4:4:0 1/2
2 1 1 1 1 1 4:2:2 2/1
2 2 1 1 1 1 4:2:0 2/2
2 2 2 1 2 1 4:4:0 1/2
2 4 1 1 1 1 4:2:1* 2/4
4 1 1 1 1 1 4:1:1 4/1
4 1 2 1 2 1 4:2:2 2/1
4 2 1 1 1 1 4:1:0 4/2
4 4 2 2 2 2 4:2:0 2/2

* Irregular notation

Figure 3. Compressed JPEG Exif file subsampling encoding

It would seem that these three H/V combinations would produce the same
subsampling pattern (shown in J:a:b and h/v notation):

1,2,1,1,1,1 4:4:0 1/2

1,4,1,2,1,2 4:4:0 1/2

2,2,2,1,2,1 4:4:0 1/2

As you can see from the table, there are other seemingly-redundant sets of values.
This may just be an artifact of this peculiar notation, although there may in fact be
some subtlety of the notation unknown to me that would give these combinations
different implications.

IN IMAGE EDITING PROGRAMS


Image editing programs generally allow the user to choose which subsampling
pattern will be used when writing JPEG files, generally one factor in establishing a
“degree of compression” or, conversely, an “image quality”. Rarely is the degree of
compression expressed in a way that is easily grasped by the user (such as the
“h/v” notation).

Further, in the three programs described here, only one offers the widely-accepted
(if still confusing) J:a:b notation, and it gets it wrong in one choice out of three.

In Photoshop
In Photoshop CS2 (the latest version I have!) one can change the compression
settings for saved JPEG files, but there is no explicit setting for the chrominance
subsampling aspect. Rather, one of two patterns is preordained for any given
numerical “quality” level.

In the regular Save As operation, where the “quality” can be set over the range 0
through 10, for all values up through 6 the chrominance subsampling is “2x2”
(4:2:0); for 7 and above it is “1x1” (4:4:4).

In the Save for Web operation, where the quality can be set from 0-100 (go
figure!), for all values up through 50 the chrominance subsampling is “2x2”
(4:2:0); for 51 and above it is “1x1” (4:4:4).

In Paint Shop Pro


The popular image editing program Paint Shop Pro 9 allows the user to set one of
12 different subsampling patterns to be used for the writing of JPEG Exif files.
These are described in the H1,V1,H2,V2,H3,V3 notation actually used inside the
file (completely incomprehensible to the user), which was described above.

The presentation in the Save Options dialog, Chroma Subsampling dropdown box,
looks like this:

YCbCr 2x1 1x1 1x1

where the six numerical values are H1, V1; H2, V2; and H3, V3.

The repertoire of combinations is in fact that seen in the table of Figure 3 (that’s
why it was chosen there: to get ready for this section).

In fact, although we might expect the 1x2, 1x1, 1x1 and 2x2, 2x1, 2x1 choices to
produce the same subsampling pattern, the resulting file sizes are slightly different,
so there is certainly some subtlety there I do not pretend to understand.

In Picture Publisher
In Picture Publisher 10, when you invoke File>Save As, if you select the JPEG file
type, the Save As dialog includes an Options button, which brings up the JPEG
Options dialog. It includes a dropdown selector for Subsampling, which offers
these choices:
YUV 4:4:4 (High Resolution): That produces 4:4:4, or 1x1.
YUV 4:2:2 (Medium Resolution): That produces 4:2:2, or 2x1.
YUV 4:1:1 (Low Resolution): That produces 4:2:0, or 2x2.

The misidentification of the 4:2:0 pattern as “4:1:1” is widespread. The
misidentification of the encoding system as YUV (rather than YCbCr) was
discussed earlier.

DATA PACKING
Although it is not part of the real topic of this article, an interesting related matter
is the way in which the Y, Cb, and Cr values for an image are arranged as a data
stream, perhaps for presentation to the software routines that encode the ensemble
of data into JPEG or TIFF form (a matter often called data packing). For each
subsampling pattern, there may be several standardized data packing
arrangements. Just to give some insight into this, we show in figure 4 a common
data packing arrangement for the 4:2:0 subsampling pattern (centered alignment).

[Figure 4. Data packing for 4:2:0 subsampling. The sampling pattern shows a block of image pixels 8 wide and 2 high, with luma samples Y1,1 through Y1,8 in the first row and Y2,1 through Y2,8 in the second row, and chrominance samples C1,1, C1,3, C1,5, and C1,7, one for each chrominance pixel.]

Byte stream:

Cb1,1 Y1,1 Cr1,1 Y1,2 Y2,1 Y2,2 Cb1,3 Y1,3 Cr1,3 Y1,4 Y2,3 Y2,4 Cb1,5


The figure shows a block of image pixels 8 pixels wide and two pixels high, divided
into chrominance pixels 2 x 2 image pixels in size, in the way intimated by the
“centered” form of the 4:2:0 subsampling pattern. The yellow dots show the
centroids for the luma samples, the green dots the centroids for the chrominance
samples. The indexes for the chrominance samples (and their Cb and Cr values) are
those of the nearest luminance sample above and to the left.

The data packing arrangement operates on an entire chrominance pixel at a time
and then moves to the next chrominance pixel; it does not operate on the basis of
image pixels. The four Y values (one for each image pixel) and the Cb and Cr
values (for the chrominance pixel) are placed in the byte stream as shown.
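
The packing order just described can be sketched in Python as follows. This illustrates only the one arrangement shown in figure 4; the function name is hypothetical, and text labels stand in for the actual byte values so that the order is easy to read.

```python
def pack_420(y, cb, cr):
    """Interleave luma rows and 2 x 2 chrominance samples, one chrominance pixel at a time."""
    stream = []
    for i, (cb_row, cr_row) in enumerate(zip(cb, cr)):
        top, bottom = y[2 * i], y[2 * i + 1]  # the two luma rows under this chroma row
        for k, (cb_val, cr_val) in enumerate(zip(cb_row, cr_row)):
            c = 2 * k  # leftmost image column of this chrominance pixel
            stream += [cb_val, top[c], cr_val, top[c + 1], bottom[c], bottom[c + 1]]
    return stream

# Labels follow the figure: chrominance samples are indexed by their upper-left luma sample.
y = [["Y%d,%d" % (r, c) for c in range(1, 9)] for r in (1, 2)]
cb = [["Cb1,%d" % c for c in (1, 3, 5, 7)]]
cr = [["Cr1,%d" % c for c in (1, 3, 5, 7)]]
print(" ".join(pack_420(y, cb, cr)))
```

Each chrominance pixel contributes six values to the stream, which works out to 1.5 values per image pixel.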

The calculation of the analog quantities U and V underlying Cb and Cr involves B
and R, respectively, thus the notation Cb and Cr. The color space is
called YCbCr (rather than YCrCb) because of the natural order of U and V.

A word of caution: especially for other subsampling patterns, there are data
packing arrangements which seem to follow a similar principle regarding the
placement of Cb and Cr but in which their order is opposite that shown here, the
idea being to more closely match the familiar sequence R, (G), B.

A UNIQUE VARIANT
The “DV” digital video standard, in its “European” (PAL-compatible) version, uses a
unique form of the 4:2:0 subsampling pattern. It is shown in figure 5.

[Figure 5. DV-PAL subsampling pattern. The figure shows the image (luma) pixels, the separate chrominance pixels for Cb and for Cr, and which luma sample locations also carry a Cb sample, a Cr sample, or no chrominance sample, alongside the pattern identifier reference "block" for 4:2:0 (marked "2 x 1/2"*). Relative chrominance resolution: H: 1/2, V: "1/2", T: 1/4.]

* Attributed to "first line" with regard to pattern identifier (4:2:0).

The unique feature of this pattern is that the Cb and Cr values are not associated
with the same location on the image; that is, to use our notation, with the same
chrominance pixel.

If in fact the chrominance values are derived from true chrominance pixels (that is,
as an average of the chrominance over several image pixels), it probably has to be
done as a weighted average over nine image pixels (all of which fall, at least in
part, within the chrominance pixel). The figure shows the chrominance pixels based
on that concept.

However, evidently the standard for this subsampling pattern does not prescribe
just how that is to be done.

Of course, associating a J:a:b identifier with this subsampling pattern requires a
little creativity; the notation system doesn’t really apply cleanly there. Officially, it
is given the identifier 4:2:0. The right hand part of the figure offers a fanciful
rationale for that.

AN EARLIER FORM
Early in the development of digital imaging, another form of subsampling notation
was used, one that unfortunately was presented in just the same form as the J:a:b
notation used today. We still find it used today in articles about subsampling, often
mixed with J:a:b notation without the difference being mentioned.

As we mentioned at the outset, in the NTSC television signal format (the standard
for North American analog television broadcast, among other things), a
luma-chrominance scheme is used (called YIQ). The two axes of the chrominance
plane were designated I and Q—a back-formation from the way they are conveyed,
by quadrature amplitude modulation of a subcarrier (I relates to the in-phase
component, while Q relates to the quadrature component).

As we mentioned before, the resolution of the chrominance component is lower
than that of the luma component (exploiting the greater acuity of the human eye
for luminance changes than for chromaticity changes). But beyond that (not
mentioned earlier), the resolution of the Q coordinate of chrominance is less than
that for the I coordinate. This is to exploit the fact that the acuity of the human
visual system to chromaticity difference is less along the Q axis than along the I
axis. The benefit is that even less total bandwidth is thus required to transport the
entire signal. The way this is done is very clever and a bit tricky, but we need not
go into it for our purposes here.

When digital representation of images was coming into play, some workers wanted
to follow the YIQ concept, including using a lower “resolution” for the Q chroma
axis. To express this, a forerunner of the J:a:b notation system was used, which I
will call “K:c:d”. Here, as in the modern scheme, K represented (arbitrarily) the
resolution of the luma (Y) coordinate; c represented the horizontal resolution of the
i coordinate (the digital equivalent of I), and d the horizontal resolution of the q
coordinate (the digital equivalent of Q). There was no concept of vertical
subsampling: each row had the same pattern of Y and i+q values.

A common format, expressed in K:c:d form, was “4:2:1”. This meant that for
every four pixels (and thus every four luma values), there were two i values but
only 1 q value.

When the YCbCr coordinate system came into play, there was an early attempt to
follow the same concepts of asymmetrical resolution in the chrominance plane:
different subsampling for Cb and Cr. Again, the hope was to reduce the overall
required “bandwidth” (of course, we were now actually speaking of bit rate, but by
parallel to the analog situation, this was often called “bandwidth”, as unfortunately
it is today) without degradation of perceptual quality.

This never really caught on, for a couple of reasons, one of which was that the Cb
and Cr axes did not correspond to the highest- and lowest-chromatic acuity axes of
the human eye—they were not chosen for that (as were the I and Q axes), but just
flowed from the R and B coordinates of the RGB color space, which were dictated
by the R and B primaries.

Unfortunately, when the J:a:b notation for (symmetrical) subsampling came into
play, the presentation looked just like K:c:d.

Interestingly enough, the arrangement we today call “4:2:2” would also be called,
in “K:c:d” notation, “4:2:2” (even though the meaning of the third number differs
between the two conventions). The arrangement we call today (in J:a:b form)
“4:2:0” cannot be represented in K:c:d form (since that does not accommodate
any vertical subsampling: different subsampling on even and odd rows).

Similarly, the arrangement called, in K:c:d form, “4:2:1” (not often encountered
today) cannot be represented in J:a:b form (since that does not accommodate
different subsampling for Cb and Cr values).

There is some possibility that the confusion between K:c:d notation and J:a:b
notation is responsible for some of the errors we find in this area, although I cannot
construct a scenario for that.

A DOSE OF REALITY
In order to most clearly illustrate the concepts and principles involved, I have
spoken in terms of “chrominance pixels” and have intimated that the chrominance
values are in fact determined over these (by some appropriate type of averaging of
the chrominance values of the image pixels they cover).

But that is not always done. In some cases, a more primitive means of determining
what chrominance to “send” is used. In the worst case, the chrominance of one
image pixel is snagged and transmitted on behalf of the chrominance pixel.

In any event, what happens at the “receiving” end? There, decoding the YCbCr
data stream (which does not contain Cb and Cr values for every pixel) is expected
to produce a Y, Cb, and Cr value for every image pixel. From those values, we
derive an RGB representation of every pixel for further handling.

Ideally, this would be done by interpolation between the transmitted chrominance
samples. But that’s not always done. For example, in many video systems
(especially those using a co-sited arrangement of chrominance pixel centroids), the
value of a received chrominance sample (one Cb,Cr pair) is used for the
reconstruction of several image pixels (four pixels if we imagine a 4:1:1
subsampling pattern).

This typically results in the following:

• The chromaticity of the resulting image will seem to be applied in “blobs”,
rather than changing smoothly as we move across an object.

• The chromaticity will seem to be shifted to the right compared to the
luminance (by two image pixels in the example of 4:1:1).
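
The difference between the two reconstruction approaches can be seen in a small one-dimensional Python sketch. It assumes a co-sited arrangement in which sample k sits on image pixel k times h, uses simple linear interpolation as one possible form of "interpolation", and holds the last sample at the right edge; the function names are hypothetical.

```python
def upsample_replicate(samples, h):
    """Reuse each received chrominance value for h consecutive image pixels."""
    return [s for s in samples for _ in range(h)]

def upsample_linear(samples, h):
    """Linearly interpolate between neighbouring co-sited chrominance samples."""
    out = []
    for k, s in enumerate(samples):
        # Assumption: past the last sample, its value is simply held.
        nxt = samples[k + 1] if k + 1 < len(samples) else s
        for i in range(h):
            out.append(s + (nxt - s) * i / h)
    return out

cb_samples = [0, 100]  # two received Cb values, horizontal factor 4 (as in 4:1:1)
print(upsample_replicate(cb_samples, 4))
print(upsample_linear(cb_samples, 4))
```

With replication, the first four pixels all take the value 0 and the next four all take 100, which is the "blob" behaviour described above; with interpolation the values ramp from one sample to the next.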
