
Chapter 4

Images and Graphics

An image is a spatial representation of an object, a two-dimensional or three-dimensional scene or another image. It can be real or virtual. An image may be abstractly thought of as a continuous function defining usually a rectangular region of a plane. For example, for optic or photographic sensors, an image is typically proportional to the radiant energy received in the electromagnetic band to which the sensor or detector is sensitive. In this case, the image is called an intensity image. For range finder sensors, an image is a function of the line-of-sight distance from the sensor position to an object in the three-dimensional world; such an image is called a range image. For tactile sensors, an image is proportional to the sensor deformation caused by the surface of or around an object.

A recorded image may be in a photographic, analog video signal or digital format.

In computer vision, an image is usually a recorded image such as a video image, digital image or picture. In computer graphics, an image is always a digital image. In multimedia applications, all formats can be presented.

In this chapter, the digital image will be discussed; this is the computer representation most important for processing in multimedia systems. We present basic concepts of image representation. Further, computer processing of images is described with an overview of topics on image generation, recognition and transmission.


4.1 Basic Concepts

An image might be thought of as a function with resulting values of the light intensity at each point over a planar region. For digital computer operations, this function needs to be sampled at discrete intervals. The sampling quantizes the intensity values into discrete levels.

4.1.1 Digital Image Representation

A digital image is represented by a matrix of numeric values, each representing a quantized intensity value. When I is a two-dimensional matrix, then I(r,c) is the intensity value at the position corresponding to row r and column c of the matrix.
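As a small illustrative sketch (Python with NumPy; the chapter itself presents no code), such a matrix can be stored and indexed by row and column:

    import numpy as np

    # A 4 x 5 digital image: one 8-bit quantized intensity value per pixel.
    I = np.array([
        [  0,   0, 255, 255,   0],
        [  0, 255, 255, 255,   0],
        [  0, 255, 255, 255,   0],
        [  0,   0, 255,   0,   0],
    ], dtype=np.uint8)

    r, c = 1, 2
    print(I[r, c])    # intensity value I(r,c) at row r, column c -> 255
    print(I.shape)    # (number of rows, number of columns) -> (4, 5)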


The points at which an image is sampled are known as picture elements, commonly abbreviated as pixels. The pixel values of intensity images are called gray scale levels (we encode here the "color" of the image). The intensity at each pixel is represented by an integer and is determined from the continuous image by averaging over a small neighborhood around the pixel location. If there are just two intensity values, for example, black and white, they are represented by the numbers 0 and 1; such images are called binary-valued images. When 8-bit integers are used to store each pixel value, the gray levels range from 0 (black) to 255 (white). An example of such an image is shown in Figure 4.1.

It is common to use a square sampling grid with pixels equally spaced along the two sides of the grid. The distance between grid points obviously affects the accuracy with which the original image is represented, and it determines how much detail can be resolved. The resolution depends on the imaging system as well.

Digital pictures are often very large. For example, suppose we want to sample and quantize an ordinary (525-line) television picture (NTSC) with a VGA (Video Graphics Array) video controller, so that it can be redisplayed without noticeable degradation. We must use a matrix of 640 x 480 pixels, where each pixel is represented by an 8-bit integer. This pixel representation allows 256 discrete gray levels. Hence, this image specification gives an array of 307,200 8-bit numbers, and a total of 2,457,600 bits. In many cases, even finer sampling is necessary.
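The arithmetic can be checked directly; a minimal sketch:

    # Raw storage for a 640 x 480 image at 8 bits per pixel.
    width, height, bits_per_pixel = 640, 480, 8

    pixels = width * height                # 307,200 pixel values
    total_bits = pixels * bits_per_pixel   # 2,457,600 bits

    print(pixels, total_bits)              # 307200 2457600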

Figure 4.1: An example of an image with 256 gray levels.

4.1.2 Image Format

There are different kinds of image formats in the literature. We shall consider the image format that comes out of an image frame grabber, i.e., the captured image format, and the format when images are stored, i.e., the stored image format.

Captured Image Format

The image format is specified by two main parameters: spatial resolution, which is specified as pixels x pixels, and color encoding, which is specified by bits per pixel. Both parameter values depend on hardware and software for input/output of images. As an example, we will present image formats supported on SPARC and IRIS computers.

For image capturing on a SPARCstation, the VideoPix™ card and its software [Sun90] can be used. The spatial resolution is 320 x 240 pixels. The color can be encoded with 1 bit (a binary image format), 8 bits (color or grayscale) or 24 bits (color, RGB). Another video frame grabber is the Parallax XVideo, which includes a 24-bit frame buffer and 640 x 480 pixel resolution. The new multimedia kit in the new SPARCstations includes the SunVideo™ card (Sun Microsystems, Inc.), a color video camera and the CDware for CD-ROM discs. The SPARCstation 10M offers 24-bit image manipulation. The new SunVideo card is a capture and compression card; its technology captures and compresses 30 frames/second in real time under the Solaris 2.3 operating system. Further, SunVideo offers capture and compression of video at a resolution of 320 x 240 pixels in several formats [Moo94]:

CellB: 30 fps (frames per second)
JPEG: 30 fps
MPEG1 I frames: 30 fps
MPEG1 IP frames: 17 fps
CaptureYUV: 30 fps
CaptureRGB-8: 30 fps
CaptureRGB-24: 12 fps

IRIS™ stations provide high-quality images through, for example, add-on IndigoVideo™ or IndyVideo™ video digitizers and corresponding software [Roy94]. The IRIS video board VINO™ is supported only on Indy™ systems. It does not include image compression. VINO offers an image resolution of 640 x 480 pixels at about four frames per second. Speed can be increased at the cost of resolution. The resulting formats are RGB and YUV formats (described in Section 5.1.1).

Stored Image Format

When we store an image, we are storing a two-dimensional array of values, in which each value represents the data associated with a pixel in the image. For a bitmap, this value is a binary digit. For a color image, the value may be a collection of:

o Three numbers representing the intensities of the red, green and blue components of the color at that pixel.

o Three numbers that are indices to tables of red, green and blue intensities.
o A single number that is an index to a table of color triples.

o An index to any number of other data structures that can represent a color, including the XYZ color system.

o Four or five spectral samples for each color.

In addition, each pixel may have other information associated with it; for example, three numbers indicating the normal to the surface drawn at that pixel. Thus, we consider an image as consisting of a collection of Red, Green and Blue channels (RGB channels), each of which gives some single piece of information about the pixels in the image.
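To make the first two options concrete, a small sketch (pure Python; the 4-entry color table is a made-up example, real palettes commonly hold 256 entries) contrasting direct RGB triples with table indices:

    # Option 1: store an RGB triple directly for each pixel.
    direct = [
        [(255, 0, 0), (0, 255, 0)],
        [(0, 0, 255), (255, 255, 255)],
    ]

    # Option 2: store a single index per pixel into a table of color triples.
    color_table = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 255)]
    indexed = [
        [0, 1],
        [2, 3],
    ]

    # Looking up a pixel recovers the same color either way.
    r, c = 1, 0
    assert direct[r][c] == color_table[indexed[r][c]]   # (0, 0, 255)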

If there is enough storage space, it is convenient to store an image in the form of RGB triples. Otherwise, it may be worth trying to compress the channels in some way. When we store an image in the conventional manner, as a collection of channels, information about each pixel, i.e., the value of each channel at each pixel, must be stored. Other information may be associated with the image as a whole, such as width and height, as well as the depth of the image, the name of the creator, etc. The need to store such properties has prompted the creation of flexible formats such as RIFF (Resource Interchange File Format) and BRIM (derived from RIFF) [Mei83], which are used to generate attribute-value database systems. RIFF includes formats for bitmaps, vector drawings, animation, audio and video. In BRIM, an image always has a width, height, creator and history field, which describes the creation of the image and modifications to it.

Some current image file formats for storing images include GIF (Graphical Interchange Format), X11 Bitmap, Sun Rasterfile, PostScript, IRIS, JPEG, TIFF (Tagged Image File Format) and others.

4.1.3 Graphics Format

Graphics image formats are specified through graphics primitives and their attributes. To the category of graphics primitives belong lines, rectangles, circles and ellipses, text strings specifying two-dimensional objects (2D) in a graphical image or, e.g., polyhedra, specifying three-dimensional objects (3D). A graphics package determines which primitives are supported. Attributes such as line style, line width and color affect the outcome of the graphical image.

Graphics primitives and their attributes represent a higher level of image representation, i.e., the graphical images are not represented by a pixel matrix. This higher level of representation needs to be converted at some point of the image processing into the lower level of the image representation; for example, when an image is to be displayed. The advantage of the higher-level primitives is the reduction of data to be stored per graphical image and easier manipulation of the graphical image. The disadvantage is the additional conversion step from the graphics primitives and their attributes to the pixel representation. Some graphics packages like SRGP (Simple Raster Graphics Package) provide such a conversion, i.e., they take the graphics primitives and attributes and generate either a bitmap or pixmap. A bitmap is an array of pixel values that map one by one to pixels on the screen; the pixel information is stored in 1 bit, so we get a binary image. Pixmap is a more general term describing a multiple-bit-per-pixel image. Low-end color systems have eight bits per pixel, allowing 256 colors simultaneously. More expensive systems have 24 bits per pixel, allowing a choice of any of 16 million colors. Refresh buffers with 32 bits per pixel and a screen resolution of 1280 x 1024 pixels are available on personal computers. Of the 32 bits per pixel, 24 bits are devoted to representing color and 8 bits to control purposes. Beyond that, buffers with 96 bits (or more) per pixel are available at 1280 x 1024 resolution on high-end systems [FDFH92]. SRGP does not convert the graphical image back into primitives and attributes after generating a bitmap/pixmap. In this case, after the conversion phase, the graphics format is presented as a digital image format.

Packages such as PHIGS (Programmer's Hierarchical Interactive Graphics System) and GKS (Graphical Kernel System) [FDFH92] take graphical images specified through primitives and attributes, generate a graphical image in the form of a pixmap and, after image presentation, continue to work based on the object primitive/attribute representation. In this case, the graphical image format is presented after the generation phase as a structure, which is a logical grouping of primitives, attributes and other information.

4.2 Computer Image Processing

Computer graphics concern the pictorial synthesis of real or imaginary objects from their computer-based models, whereas the related field of image processing treats the converse process: the analysis of scenes, or the reconstruction of models from pictures of 2D or 3D objects. In the following sections, we describe basic principles of image synthesis (generation) and image analysis (recognition). The literature on computer graphics and image processing presents further and detailed information [FDFH92, KR82, Nev82, HS92].

4.2.1 Image Synthesis

Image synthesis is an integral part of all computer user interfaces and is indispensable for visualizing 2D, 3D and higher-dimensional objects. Areas as diverse as education, science, engineering, medicine, advertising and entertainment all rely on graphics. Let us look at some representative samples:

o User Interfaces

Applications running on personal computers and workstations have user interfaces that rely on desktop window systems to manage multiple simultaneous activities, and on point-and-click facilities to allow users to select menu items, icons and objects on the screen.

o Office Automation and Electronic Publishing

The use of graphics for the creation and dissemination of information has increased enormously since the advent of desktop publishing on personal computers. Office automation and electronic publishing can produce both traditional printed documents and electronic documents that contain text, tables, graphs and other forms of drawn or scanned-in graphics. Hypermedia systems that allow browsing networks of interlinked multimedia documents are proliferating.

o Simulation and Animation for Scientific Visualization and Entertainment



Computer-produced animated movies and displays of time-varying behavior of real and simulated objects are becoming increasingly popular for scientific and engineering visualization. We can use them to study abstract mathematical entities and models of such phenomena as fluid flow, relativity and nuclear and chemical reactions. Cartoon characters will increasingly be modeled in computers as 3D shape descriptions whose movements are controlled by computers rather than by the figures drawn manually by cartoonists. Television commercials featuring flying logos and more exotic visual trickery have become common, as have elegant special effects in movies.

Interactive computer graphics is the most important means of producing images (pictures) since the invention of photography and television; it has the added advantage that we can make pictures not only of concrete, "real world" objects, but also of abstract, synthetic objects such as mathematical surfaces in 4D.

Dynamics in Graphics

Graphics are not confined to static pictures. Pictures can be dynamically varied; for example, a user can control animation by adjusting the speed, the portion of the total scene in view, the amount of detail shown, etc. Hence, dynamics is an integral part of graphics (dynamic graphics). Much of interactive graphics technology contains hardware and software for user-controlled motion dynamics and update dynamics:

o Motion Dynamics

With motion dynamics, objects can be moved and tumbled with respect to a stationary observer. The objects can also remain stationary and the view around them can move. In many cases, both the objects and the camera are moving. A typical example is a flight simulator, which contains a mechanical platform that supports a mock cockpit and a display screen. The computer controls platform motion, gauges and the simulated world of both stationary and moving objects through which the pilot navigates.

o Update Dynamics

Update dynamics is the actual change of the shape, color, or other properties of the objects being viewed. For instance, a system can display the deformation of an in-flight airplane structure in response to the operator's manipulation of the many control mechanisms. The smoother the change, the more realistic and meaningful the result. Dynamic interactive graphics offer a large number of user-controllable modes with which to encode and communicate information, e.g., the 2D or 3D shape of objects in a picture, their gray scale or color and the time variations of these properties.

The Framework of Interactive Graphics Systems

Images can be generated by video digitizer cards that capture NTSC (PAL) analog signals and create a digital image. These kinds of digital images are used, for example, in image processing for image recognition and in communication for video conferencing. In this section we concentrate on image generation via graphics systems. We discuss in more detail image and video generation via video digitizers in Chapter 5.

Graphical images are generated using interactive graphics systems. An example of such a graphics system is SRGP, which borrows features from Apple's QuickDraw integer raster graphics package [RHA+85] and MIT's X Window System™ [SGN88] for output, and from GKS and PHIGS for input. The high-level conceptual framework of almost any interactive graphics system consists of three software components: an application model, an application program and a graphics system, and a hardware component: graphics hardware.

The application model represents the data or objects to be pictured on the screen; it is stored in an application database. The model typically stores descriptions of primitives that define the shape of components of the object, object attributes and connectivity relationships that describe how the components fit together. The model is application-specific and is created independently of any particular display system. Therefore, the application program must convert a description of the portion of the model to whatever procedure calls or commands the graphics system uses to create an image. This conversion process has two phases. First, the application program traverses the application database that stores the model to extract the portions to

be viewed, using some selection or query system. Second, the extracted geometry is put in a format that can be sent to the graphics system.

The application program handles user input. It produces views by sending to the third component, the graphics system, a series of graphics output commands that contain both a detailed geometric description of what is to be viewed and the attributes describing how the objects should appear.

The graphics system is responsible for actually producing the picture from the detailed descriptions and for passing the user's input to the application program for processing. The graphics system is thus an intermediary component between the application program and the display hardware. It effects an output transformation from objects in the application model to a view of the model. Symmetrically, it effects an input transformation from user actions to application program inputs that cause the application to make changes in the model and/or picture. The graphics system typically consists of a set of output subroutines corresponding to various primitives, attributes and other elements. These are collected in a graphics subroutine library or package. The application program specifies geometric primitives and attributes to these subroutines, and the subroutines then drive the specific display device and cause it to display the image.
At the hardware level, a computer receives input from interaction devices and outputs images to display devices.
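A toy sketch of the three software components described above (all names are invented for illustration; a real package such as SRGP is a C subroutine library with a fixed procedural interface):

    # Application model: the application database of primitives and attributes.
    application_model = [
        {"type": "line", "p0": (10, 10), "p1": (100, 10), "width": 1},
        {"type": "rect", "corner": (20, 20), "size": (50, 30), "style": "solid"},
        {"type": "note", "text": "metadata, not drawable"},
    ]

    def graphics_system(command, attributes):
        # Stand-in for the graphics subroutine library; it would drive the
        # display hardware, here it only echoes the output command.
        print("draw", command, attributes)

    def application_program(model):
        # Phase 1: traverse the application database and select the
        # portions to be viewed.
        visible = [p for p in model if p["type"] in ("line", "rect")]
        # Phase 2: convert each extracted primitive into a graphics
        # output command carrying its geometry and attributes.
        for prim in visible:
            attributes = {k: v for k, v in prim.items() if k != "type"}
            graphics_system(prim["type"], attributes)

    application_program(application_model)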

To connect this high-level conceptual framework to our model shown in Figure 1.1, the application model and application program may represent applications, as well as the user interface part in Figure 1.1. The graphics system represents programming abstractions with support from the operating system to connect to the graphics hardware. The graphics hardware belongs to the device area in Figure 1.1; therefore, this is the main focus of the following discussion.

Graphics Hardware - Input

Current input technology provides us with the ubiquitous mouse, the data tablet and the transparent, touch-sensitive panel mounted on the screen. Even fancier input devices that supply, in addition to (x, y) screen location, 3D and higher-dimensional input values (degrees of freedom), are becoming common, such as track-balls, space-balls or the data glove.

Track-balls can be made to sense rotation about the vertical axis in addition to that about the two horizontal axes. However, there is no direct relationship between hand movements with the device and the corresponding movement in 3D space.

A space-ball is a rigid sphere containing strain gauges. The user pushes or pulls the sphere in any direction, providing 3D translation and orientation. In this case, the directions of movement correspond to the user's attempts to move the rigid sphere, although the hand does not actually move.

The data glove records hand position and orientation as well as finger movements. It is a glove covered with small, lightweight sensors. Each sensor consists of a short fiber-optic cable with a Light-Emitting Diode (LED) at one end and a phototransistor at the other. In addition, a Polhemus 3SPACE three-dimensional position and orientation sensor records hand movements. Wearing the data glove, a user can grasp objects, move and rotate them and then release them, thus providing very natural interaction in 3D [ZLB+87].

Audio communication also has exciting potential since it allows hands-free input and natural output of simple instructions, feedback, and so on.

Graphics Hardware - Output

Current output technology uses raster displays, which store display primitives in a refresh buffer in terms of their component pixels. The architecture of a raster display is shown in Figure 4.2. In some raster displays, there is a hardware display controller that receives and interprets sequences of output commands. In simpler, more common systems (Figure 4.2), such as those in personal computers, the display controller exists only as a software component of the graphics library package, and the refresh buffer is no more than a piece of the CPU's memory that can be read by the image display subsystem (often called the video controller) that produces the actual image on the screen.

[Figure: display commands and interaction data enter the Display Controller (DC), which writes pixel values into the refresh buffer; the Video Controller scans the buffer out to form the image from the raster.]

Figure 4.2: Architecture of a raster display.

The complete image on a raster display is formed from the raster, which is a set of horizontal raster lines, each a row of individual pixels; the raster is thus stored as a matrix of pixels representing the entire screen area. The entire image is scanned out sequentially by the video controller. The raster scan is shown in Figure 4.3. At each pixel, the beam's intensity is set to reflect the pixel's intensity; in color systems, three beams are controlled, one for each primary color (red, green, blue), as specified by the three color components of each pixel's value.

Raster graphics systems have other characteristics. To avoid flickering of the image, a 60 Hz or higher refresh rate is used today; an entire image of 1024 lines of 1024 pixels each must be stored explicitly and a bitmap or pixmap is generated.

Raster graphics can display areas filled with solid colors or patterns, i.e., realistic

[Figure: the raster scan proceeds along each raster line, with horizontal retrace between lines and vertical retrace back to the start of the raster scan.]

Figure 4.3: Raster scan.

images of 3D objects. Furthermore, the refresh process is independent of the image complexity (number of polygons, etc.) since the hardware is fast enough to read out each pixel in the buffer on each refresh cycle.

Dithering

The growth of raster graphics has made color and grayscale an integral part of contemporary computer graphics. The color of an object depends not only on the object itself, but also on the light source illuminating it, on the color of the surrounding area and on the human visual system. What we see on a black-and-white television set or display monitor is achromatic light. Achromatic light is determined by the attribute quality of light. Quality of light is determined by the intensity and luminance parameters. For example, if we have hardcopy devices or displays which are only bi-leveled, which means they produce just two intensity levels, then we would like to expand the range of available intensity.

The solution lies in our eye's capability for spatial integration. If we view a very small area from a sufficiently large viewing distance, our eyes average fine detail within the small area and record only the overall intensity of the area. This phenomenon is exploited in the technique called halftoning, or clustered-dot ordered dithering (halftoning approximation). Each small resolution unit is imprinted with a circle of

black ink whose area is proportional to the blackness 1 - I (I = intensity) of the area in the original photograph. Graphics output devices can approximate the variable-area circles of halftone reproduction. For example, a 2 x 2 pixel area of a bi-level display can be used to produce five different intensity levels at the cost of halving the spatial resolution along each axis. The patterns, shown in Figure 4.4, can be filled by 2 x 2 areas, with the number of 'on' pixels proportional to the desired intensity. The patterns can be represented by the dither matrix. This technique is used on

Figure 4.4: Five intensity levels approximated with four 2 x 2 dither patterns.

devices which are not able to display individual dots (e.g., laser printers). This means that these devices are poor at reproducing isolated 'on' pixels (the black dots in Figure 4.4). All pixels that are 'on' for a particular intensity must be adjacent to other 'on' pixels.

A CRT display is able to display individual dots; hence, the clustering requirement can be relaxed and a dispersed-dot ordered dither can be used. Monochrome dithering techniques can also be used to extend the number of available colors, at the expense of resolution. Consider a color display with three bits per pixel, one each for red, green and blue. We can use a 2 x 2 pattern area to obtain 125 colors as follows: each pattern can display five intensities for each color, by using the halftone patterns in Figure 4.4, resulting in 5 x 5 x 5 = 125 color combinations.
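As a sketch of the idea (the 2 x 2 dither matrix below is a commonly used one, assumed here since the text names the concept without printing a matrix):

    import numpy as np

    # 2 x 2 dither matrix; its entries become position-dependent thresholds.
    D = np.array([[0, 2],
                  [3, 1]])

    def ordered_dither(gray):
        """Bi-level approximation of an 8-bit grayscale image. A uniform
        input fills each 2 x 2 cell with 0..4 'on' pixels, giving the
        five apparent intensity levels of Figure 4.4."""
        h, w = gray.shape
        # Tile the matrix over the image and scale thresholds into 0..255.
        rows = np.arange(h)[:, None] % 2
        cols = np.arange(w)[None, :] % 2
        thresholds = (D[rows, cols] + 0.5) * (255.0 / 4.0)
        return (gray > thresholds).astype(np.uint8)

    # A uniform mid-gray dithers to a pattern with half the pixels 'on'.
    flat = np.full((4, 4), 128, dtype=np.uint8)
    print(ordered_dither(flat))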

4.2.2 Image Analysis

Image analysis is concerned with techniques for extracting descriptions from images that are necessary for higher-level scene analysis methods. By itself, knowledge of the position and value of any particular pixel almost conveys no information related to the recognition of an object, the description of an object's shape, its position or orientation, the measurement of any distance on the object or whether the object is defective. Hence, image analysis techniques include computation of perceived brightness and color, partial or complete recovery of three-dimensional

data in the scene, location of discontinuities corresponding to objects in the scene and characterization of the properties of uniform regions in the image.

Image analysis is important in many arenas: aerial surveillance photographs, slow-scan television images of the moon or of planets gathered from space probes, television images taken from an industrial robot's visual sensor, X-ray images and computerized axial tomography (CAT) scans. Subareas of image processing include image enhancement, pattern detection and recognition and scene analysis and computer vision.

Image enhancement deals with improving image quality by eliminating noise (extraneous or missing pixels) or by enhancing contrast.

Pattern detection and recognition deal with detecting and clarifying standard patterns and finding distortions from these patterns. A particularly important example is Optical Character Recognition (OCR) technology, which allows for the economical bulk input of pages of typeset, typewritten or even hand-printed characters. The degree of accuracy of handwriting recognition depends on the input device. One possibility is that the user prints characters with a continuous-positioning device, usually a tablet stylus (a pen-based environment), and the computer recognizes them (online recognition). This is easier than recognizing scanned-in characters because the tablet records the sequence, direction and sometimes speed and pressure of strokes, and a pattern-recognition algorithm can match these factors to stored templates for each character. The recognizer may evaluate patterns without considering how the pattern has been created (static character recognition) or it may focus on strokes, edges in strokes or drawing speed (a dynamic recognizer). A recognizer can be trained to identify different styles of block printing. The parameters of each character are calculated from samples drawn by the users. An architecture for an object-oriented character recognition engine (AQUIRE), which supports online recognition with combined static and dynamic capabilities, is described in [KW93b]. A commercial character recognizer is described in [WB85, BW86].

Scene analysis and computer vision deal with recognizing and reconstructing 3D models of a scene from several 2D images. An example is an industrial robot sensing the relative sizes, shapes, positions and colors of objects.

Image Recognition

To fully recognize an object in an image means knowing that there is an agreement between the sensory projection and the observed image. How the object appears in the image has to do with the spatial configuration of the pixel values. Agreement between the observed spatial configuration and the expected sensory projection requires the following capabilities:

o Infer explicitly or implicitly an object's position and orientation from the spatial configuration.

o Confirm that the inference is correct.

To infer an object's (e.g., a cup) position, orientation and category or class from the spatial configuration of gray levels requires the capability to infer which pixels are part of the object. Further, from among those pixels that are part of the object, it requires the capability to distinguish observed object features, such as special markings, lines, curves, surfaces or boundaries (e.g., edges of the cup). These features themselves are organized in a spatial relationship on the image and the object.

Analytic inference of object shape, position and orientation depends on matching the distinguishing image features (in 2D, a point, line segment or region) with corresponding object features (in 3D, a point, line segment, arc segment, or a curved or planar surface).

The kind of object, background, imaging sensor and viewpoint of the sensor all determine whether the recognition problem is easy or difficult. For example, suppose that the object is a white planar square on a uniform black background, as shown in the digital image (Table 4.1). A simple corner feature extractor could identify the distinguishing corner points, as shown in the symbolic image (Table 4.2). The match between the image corner features and the object corner features is direct. Just relate the corners of the image square to the corners of the object square in clockwise order, starting from any arbitrary correspondence. Then, use the corresponding points to establish the sensor orientation relative to the plane of the square. If we know the size of the square, we can completely and analytically determine the position and orientation of the square relative to the position and orientation of the camera. In

0   0   0   0   0   0   0   0   0   0   0   0   0
0   0   0   0   0   0   0   0   0   0   0   0   0
0   0   0   0 255 255 255 255 255   0   0   0   0
0   0   0   0 255 255 255 255 255   0   0   0   0
0   0   0   0 255 255 255 255 255   0   0   0   0
0   0   0   0 255 255 255 255 255   0   0   0   0
0   0   0   0 255 255 255 255 255   0   0   0   0
0   0   0   0   0   0   0   0   0   0   0   0   0
0   0   0   0   0   0   0   0   0   0   0   0   0

Table 4.1: Numeric digital intensity image of a white square (gray tone 255) on a black (gray tone 0) background.

N N N N N N N N N N N N N
N N N N N N N N N N N N N
N N N N C N N N C N N N N
N N N N N N N N N N N N N
N N N N N N N N N N N N N
N N N N N N N N N N N N N
N N N N C N N N C N N N N
N N N N N N N N N N N N N
N N N N N N N N N N N N N

Table 4.2: Symbolic image of the corners of the image shown in Table 4.1 (N = noncorner; C = corner).

this simple instance, the unit of pixel is transformed to the unit of match between
image corners and object corners. The unit of match is then transformed to the
unit of object position and orientation relative to the natural coordinate system of
the sensor.
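A sketch of the corner step for exactly this synthetic image (the rule below, a white pixel with exactly two white 4-neighbors, works only for an axis-aligned filled square; real corner extractors are more elaborate):

    import numpy as np

    # The white square on a black background from Table 4.1.
    img = np.zeros((9, 13), dtype=np.uint8)
    img[2:7, 4:9] = 255

    def corner_image(image):
        """Produce the symbolic image: 'C' where a white pixel has exactly
        two white 4-neighbors (the square's corners), 'N' elsewhere."""
        h, w = image.shape
        symbolic = np.full((h, w), "N")
        for r in range(h):
            for c in range(w):
                if image[r, c] != 255:
                    continue
                white = sum(image[rr, cc] == 255
                            for rr, cc in ((r-1, c), (r+1, c), (r, c-1), (r, c+1))
                            if 0 <= rr < h and 0 <= cc < w)
                if white == 2:
                    symbolic[r, c] = "C"
        return symbolic

    print(np.argwhere(corner_image(img) == "C"))   # the four corner positions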

On the other hand, the transformation process may be difficult. There may be a variety of complex objects that need to be recognized. For example, some objects may include parts of other objects, shadows may occur or the object reflectances may be varied, and the background may be busy.

Which kind of unit transformation must be employed depends on the specific nature of the vision task, the complexity of the image and the kind of prior information available.

Computer recognition and inspection of objects is, in general, a complex procedure, requiring a variety of steps that successively transform the iconic data into recognition information. A recognition methodology must pay substantial attention to each of the following six steps: image formatting, conditioning, labeling, grouping, extracting and matching (Figure 4.5).

Figure 4.5: Image recognition steps.



Image Recognition Steps

We will give a brief overview of the recognition steps, but a deeper analysis of these
steps can be found in computer vision literature, such as [Nev82, HS92], etc.

Image formatting means capturing an image from a camera and bringing it into a digital form. It means that we will have a digital representation of an image in the form of pixels. (Pixels and image formats were described in Sections 4.1.1 and 4.1.2.) An example of an observed image is shown in Figure 4.6.

Figure 4.6: Observed image (Courtesy of Jana Košecká, GRASP Laboratory, University of Pennsylvania, 1991).

Conditioning, labeling, grouping, extracting and matching constitute a canonical decomposition of the image recognition problem, each step preparing and transforming the data to facilitate the next step. Depending on the application, we may have to apply this sequence of steps at more than one level of the recognition and description processes. As these steps work on any level in the unit transformation process, they prepare the data for the unit transformation, identify the next higher-level unit and interpret it. The five transformation steps, in more detail, are:

1. Conditioning

Conditioning is based on a model that suggests the observed image is composed of an informative pattern modified by uninteresting variations that typically add to or multiply the informative pattern. Conditioning estimates the informative pattern on the basis of the observed image. Thus conditioning suppresses noise, which can be thought of as random unpatterned variations affecting all measurements. Conditioning can also perform background normalization by suppressing uninteresting systematic or patterned variations. Conditioning is typically applied uniformly and is context-independent.

2. Labeling
Labeling is based on a model that suggests the informative pattern has structure as a spatial arrangement of events, each spatial event being a set of connected pixels. Labeling determines in what kinds of spatial events each pixel participates.
An example of a labeling operation is edge detection. Edge detection is an important part of the recognition process. Edge detection techniques find local discontinuities in some image attribute, such as intensity or color (e.g., detection of cup edges). These discontinuities are of interest because they are likely to occur at the boundaries of objects. An edge is said to occur at a point in the image if some image attribute changes in value discontinuously at that point. Examples are intensity edges. An ideal edge, in one dimension, may be viewed as a step change in intensity; for example, a step between high-valued and low-valued pixels (Figure 4.7). If the step is detected, the neighboring

Figure 4.7: One-dimensional edge.

high-valued and low-valued pixels are labeled as part of an edge. An example of an image (Figure 4.6) after edge detection is shown in Figure 4.8.

Figure 4.8: Edge detection of the image from Figure 4.6 (Courtesy of Jana Košecká, GRASP Laboratory, University of Pennsylvania, 1991).

Edge detection recognizes many edges, but not all of them are significant. Therefore, another labeling operation must occur after edge detection, namely thresholding. Thresholding specifies which edges should be accepted and which should not; the thresholding operation filters only the significant edges from the image and labels them. Other edges are removed. Thresholding the image from Figure 4.8 is presented in Figure 4.9.
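A minimal sketch of these two labeling operations together (a simple forward-difference gradient stands in for a real edge detector, which the text does not prescribe):

    import numpy as np

    def label_edges(gray, threshold=64):
        """Label pixels as edge / non-edge: estimate local intensity
        discontinuities with forward differences, then threshold so
        only the significant edges keep their labels."""
        g = gray.astype(np.int32)
        dr = np.abs(np.diff(g, axis=0, append=g[-1:, :]))   # vertical change
        dc = np.abs(np.diff(g, axis=1, append=g[:, -1:]))   # horizontal change
        return np.maximum(dr, dc) >= threshold

    # The one-dimensional step edge of Figure 4.7, extended to 2D.
    img = np.zeros((5, 8), dtype=np.uint8)
    img[:, 4:] = 255
    print(label_edges(img).astype(int))   # edge labels along the step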

Other kinds of labeling operations include corner finding and identification of pixels that participate in various shape primitives.

3. Grouping

The labeling operation labels the kinds of primitive spatial events in which the pixel participates. The grouping operation identifies the events by collecting together or identifying maximal connected sets of pixels participating in the same kind of event. When the reader recalls the intensity edge detection viewed as a step change in intensity (Figure 4.7), the edges are labeled as step edges, and the grouping operation constitutes the step edge linking.

A grouping operation, where edges are grouped into lines, is called line-fitting. A grouped image with respect to lines is shown in Figure 4.10. Again, the
76 CHAPTER 4. IMAGES AND GRAPHICS

Figure 4.9: Thresholding the image from Figure 4.8 (Courtesy of Jana Košecká, GRASP Laboratory, University of Pennsylvania, 1991).

grouping operation line-fitting is performed on the image shown in Figure 4.8.

Figure 4.10: Line-fitting of the image from Figure 4.8 (Courtesy of Jana Košecká, GRASP Laboratory, University of Pennsylvania).

The grouping operation involves a change of logical data structure. The observed image, the conditioned image and the labeled image are all digital image data structures. Depending on the implementation, the grouping operation can produce either an image data structure in which each pixel is given an index associated with the spatial event to which it belongs, or a data structure that is a collection of sets. Each set corresponds to a spatial event and contains the pairs of positions (row, column) that participate in the event. In either case, a change occurs in the logical data structure. The entities of interest prior to grouping are pixels; the entities of interest after grouping are sets of pixels.

4. Extracting

The grouping operation determines the new set of entities, but they are left naked in the sense that the only thing they possess is their identity. The extracting operation computes for each group of pixels a list of properties. Example properties might include its centroid, area, orientation, spatial moments, gray tone moments, spatial-gray tone moments, circumscribing circle, inscribing circle, and so on. Other properties might depend on whether the group is considered a region or an arc. If the group is a region, the number of holes might be a useful property. If the group is an arc, average curvature might be a useful property.

Extraction can also measure topological or spatial relationships between two or more groupings. For example, an extracting operation may make explicit that two groupings touch, or are spatially close, or that one grouping is above another.
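A sketch of property extraction for one group, represented (as described above) as a set of (row, column) positions:

    def extract_properties(group):
        """Compute a few example properties for a group of pixels given
        as a collection of (row, column) positions."""
        n = len(group)
        rows = [r for r, _ in group]
        cols = [c for _, c in group]
        return {
            "area": n,                                   # pixel count
            "centroid": (sum(rows) / n, sum(cols) / n),  # mean position
            "bounding_box": (min(rows), min(cols), max(rows), max(cols)),
        }

    # The white square of Table 4.1 as one grouped spatial event.
    square = {(r, c) for r in range(2, 7) for c in range(4, 9)}
    print(extract_properties(square))
    # {'area': 25, 'centroid': (4.0, 6.0), 'bounding_box': (2, 4, 6, 8)}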

5. Matching

After the completion of the extracting operation, the events occurring on the image have been identified and measured, but the events in and of themselves have no meaning. The meaning of the observed spatial events emerges when a perceptual organization has occurred such that a specific set of spatial events in the observed spatial organization clearly constitutes an imaged instance of some previously known object, such as a chair or the letter A. Once an object or set of object parts has been recognized, measurements (such as the distance between two parts, the angle between two lines or the area of an object part) can be made and related to the allowed tolerance, as may be the case in an inspection scenario. It is the matching operation that determines the interpretation of some related set of image events, associating these events with some given three-dimensional object or two-dimensional shape.

There are a wide variety of matching operations. The classic example is template matching, which compares the examined pattern with stored models (templates) of known patterns and chooses the best match.
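A minimal template-matching sketch (a sum-of-absolute-differences score is assumed here; it is one common choice among many):

    import numpy as np

    def best_match(pattern, templates):
        """Compare the examined pattern with stored templates and return
        the label whose template differs least from the pattern."""
        scores = {label: int(np.abs(pattern.astype(int) - t.astype(int)).sum())
                  for label, t in templates.items()}
        return min(scores, key=scores.get), scores

    # Tiny 3 x 3 binary templates for two known patterns.
    templates = {
        "cross": np.array([[0, 1, 0], [1, 1, 1], [0, 1, 0]], dtype=np.uint8),
        "box":   np.array([[1, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=np.uint8),
    }
    observed = np.array([[0, 1, 0], [1, 1, 1], [0, 1, 1]], dtype=np.uint8)
    print(best_match(observed, templates))   # 'cross' differs in only one pixel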

4.2.3 Image Transmission

Image transmission takes into account the transmission of digital images through computer networks. There are several requirements on the networks when images are transmitted: (1) The network must accommodate bursty data transport because image transmission is bursty (the burst is caused by the large size of the image); (2) Image transmission requires reliable transport; (3) Time-dependence is not a dominant characteristic of the image, in contrast to audio/video transmission.

Image size depends on the image representation format used for transmission. There are several possibilities:

o Raw image data transmission

In this case, the image is generated through a video digitizer and transmitted in its digital format. The size can be computed in the following manner:

size = spatial resolution x pixel quantization

For example, the transmission of an image with a resolution of 640 x 480 pixels and pixel quantization of 8 bits per pixel requires transmission of 307,200 bytes through the network.

o Compressed image data transmission

In this case, the image is generated through a video digitizer and compressed before transmission. Methods such as JPEG or MPEG, described in Chapter 6, are used to downsize the image. The reduction of image size depends on the compression method and compression rate.

o Symbolic image data transmission

In this case, the image is represented through symbolic data representation as image primitives (e.g., 2D or 3D geometric representation), attributes and other control information. This image representation method is used in computer graphics. Image size is equal to the structure size, which carries the transmitted symbolic information of the image.

4.3 Comments

We have described in this section some characteristics of images and graphical objects. The quality of these media depends on the quality of the hardware, e.g., frame grabbers, displays and other input/output devices. The development of input and output devices continues at a rapid pace. A few examples should give a flavor of this development:

o New multimedia devices

New scanners of photographic objects already provide high-quality digital images and become part of multimedia systems. An introduction of a new multimedia device (e.g., a scanner) implies a new multimedia format because the new medium (e.g., photographic images) can be combined with other images and other media. An example of such a new multimedia format is the Photo Image Pac File Format introduced by Kodak. This format is a new disc format that combines high-resolution images with text, graphics and sound. Hence, it enables users to design interactive Photo-CD-based presentations [Ann94b].

o Improvements of existing multimedia devices

New 3D digitizers are coming to market which enable the user to copy 3D objects of all shapes and sizes into a computer [Ann94a].
