HUMAN-COMPUTER INTERACTION
HUMAN FACTORS AS
HCI THEORIES
CHAPTER 3
PREPARED BY: Ryan Jison de la Gente
HIGHLIGHTS
Learning
Objectives
Human Information Processing
Task Modeling and Human Problem-Solving Model
Human Reaction and Prediction of Cognitive Performance
Sensation and Perception of Information
Visual
Aural
Tactile and Haptic
Multimodal Interaction
Human Body Ergonomics (Motor Capabilities)
Fitts’s Law
Motor Control
Human
Information
Processing
Any effort to design an effective interface for human–
computer interaction (HCI) requires two basic elements:
AN UNDERSTANDING OF COMPUTER FACTORS (SOFTWARE/HARDWARE)
AN UNDERSTANDING OF HUMAN BEHAVIOR
In this chapter, we take a brief look at some of the basic
human factors that constrain the extent of this interaction.
As the main underlying theory for HCI, human factors can largely be
divided into:
(a) cognitive science, which explains the human’s capability and
model of conscious processing of high-level information
(b) ergonomics, which elucidates how raw external stimulation
signals are accepted by our five senses, are processed up to the
preattentive level, and are later acted upon in the outer world
through the motor organs.
HUMAN-FACTORS
KNOWLEDGE
TASK/INTERACTION MODELING:
Formulate the steps for how humans might interact to solve and carry out a
given task/problem and derive the interaction model. A careful HCI designer
would not neglect to obtain this model by direct observation of the users
themselves, but the designer’s knowledge in cognitive science will help greatly
in developing the model.
PREDICTION, ASSESSMENT, AND EVALUATION OF INTERACTIVE BEHAVIOR:
Understand and predict how humans might react mentally to various
information-presentation and input-solicitation methods as a basis for
interface selection. Also, evaluate interaction models and interface
implementations and explain or predict their performance and usability.
TASK MODELING AND
HUMAN PROBLEM-SOLVING
MODEL
Human problem-solving or information-processing efforts consist of the
following important parts:
SENSATION
which senses external information (e.g., visual, aural, haptic).
PERCEPTION
which interprets and extracts basic meanings of the external
information.
MEMORY
which stores momentary and short-term information or long-term
knowledge. This knowledge includes information about the external
world, procedures, rules, relations, schemas, candidates of actions to
apply, the current objective (e.g., accomplishing the interactive task
successfully), the plan of action, etc.
DECISION MAKER/EXECUTOR
which formulates and revises a “plan,” then decides what to do based
on the various knowledge in the memory, and finally acts it out by
commanding the motor system (e.g., to click the mouse left button).
HUMAN FACTORS AS HCI THEORIES
[Figure: The overall human problem-solving model and process]
[Figure: More detailed view of the “decision maker/executor”]
HUMAN REACTION AND
PREDICTION OF
COGNITIVE
PERFORMANCE
We can also, to some degree, predict how humans will
react and perform in response to a particular human
interface design. We can consider two aspects of
human performance: one that is cognitive and
the other ergonomic.
Norman and Draper spoke of the “gulf of execution/evaluation,”
which explains how users can be left bewildered (and not
perform very well) when an interactive system does not offer
certain actions or does not result in a state as expected by the
user.
Such a phenomenon would be a result of an interface based
on an ill-modeled interaction.
The mismatch between the user’s mental model and the task
model employed by the interactive system creates the “gulf.”
On the other hand, when the task model and interface structure
of the interactive system map well to the expected mental
model of the user, the task performance will be very fluid.
PREDICTIVE PERFORMANCE ASSESSMENT:
GOMS (GOALS, OPERATORS, METHODS, AND SELECTION RULES)
Many important cognitive activities have been
analyzed in terms of their typical approximate
process time, e.g., for single-chunk retrieval from
the short-term memory, encoding (memorizing)
of information into the long-term
memory, responding to a visual stimulus and
interpreting its content, etc.
Estimates of Time Taken for Typical Desktop
Computer Operations from GOMS
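As an illustration of how such time estimates are used, the sketch below adds up operator times for one task in the style of the Keystroke-Level Model (KLM), a simplified member of the GOMS family. The operator times are commonly cited textbook approximations and are assumptions here, because the table values are not reproduced in these slides. The task sequence and the function name are hypothetical.

```python
# Commonly cited KLM operator times in seconds (assumed values,
# not taken from the chapter's table).
KLM_SECONDS = {
    "K": 0.2,   # press a key or mouse button (average skilled user)
    "P": 1.1,   # point with the mouse to a target on the display
    "H": 0.4,   # move ("home") a hand between keyboard and mouse
    "M": 1.35,  # mental preparation before an action
}


def estimate_task_time(operators: str) -> float:
    """Sum the operator times for a sequence such as "MHPK"."""
    return sum(KLM_SECONDS[op] for op in operators)


# Hypothetical task: think, reach for the mouse, point at a file, click,
# point at a "delete" menu item, click.
delete_file = "MHPKPK"
print(f"Estimated time: {estimate_task_time(delete_file):.2f} s")
```

Comparing the totals for two alternative operator sequences gives a rough prediction of which interface design is faster for a practiced user.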
SENSATION AND PERCEPTION OF INFORMATION
Humans are known to have at least five senses. Among them, those
that would be relevant to HCI are the modalities of visual, aural, haptic
(force feedback), and tactile sensation.
Taking external stimulation or raw sensory information (sometimes
computer generated) and then processing it for perception is the first
part in any human–computer interaction.
Naturally, the information must be supplied in a fashion that is
amenable to human consumption, that is, within the bounds of a
human’s perceptual capabilities.
Modalities covered: visual, aural, tactile and haptic, and multimodal interaction.
VISUAL
Visual modality is by far the most important information medium. Over
40% of the human brain is said to be involved with the processing of
visual information.
VISUAL AND DISPLAY PARAMETERS
Field of view (FOV): This is the angle subtended
by the area visible to the human user in the
horizontal or vertical direction.
The shaded area in Figure 3.6 illustrates the
horizontal field of view. The human FOV is nearly
180° in the horizontal direction and somewhat narrower
(roughly 130°) in the vertical direction.
Viewing distance: This is the perpendicular distance
to the surface of the display. Viewing distance
(dotted line in Figure 3.6) may change with user
movements. However, one might be able to define
a nominal and typical viewing distance for a given
task or operating environment.
Display field of view: This is the angle subtended by
the display area from a particular viewing distance.
Note that for the same fixed display area, the
display FOV will be different at different viewing
distances.
In Figure 3.6, the display FOV is denoted with the
dashed line. The display offers different fields of
view, depending on the viewing distance (dotted
line in the middle).
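The dependence of display FOV on viewing distance can be sketched with basic trigonometry. The function below assumes a flat display viewed head-on from a point on its center line; the function name and the example sizes are illustrative, not from the chapter.

```python
import math


def display_fov_deg(display_extent: float, viewing_distance: float) -> float:
    """Angle in degrees subtended by a flat display.

    display_extent is the display width (or height), and viewing_distance
    is the perpendicular distance to the display center, in the same unit.
    """
    half_angle = math.atan(display_extent / (2.0 * viewing_distance))
    return math.degrees(2.0 * half_angle)


# The same 50 cm wide display seen from 50 cm and from 100 cm.
for distance_cm in (50.0, 100.0):
    fov = display_fov_deg(50.0, distance_cm)
    print(f"{distance_cm:.0f} cm away: {fov:.1f} degrees")
```

Moving closer increases the display FOV even though the display area itself is unchanged.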
Pixel: A display system is typically composed of
an array of small rectangular areas called pixels.
Display resolution: This is the number of pixels in the
horizontal and vertical directions for a fixed area.
Visual acuity: In effect, this is the resolution
perceivable by the human eye from a fixed
distance.
This is also synonymous with the power of sight,
which is different for different people and age
groups.
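To relate visual acuity to display resolution, the sketch below estimates the smallest detail resolvable at a given viewing distance. It assumes an acuity of 1 arcminute, the conventional figure for normal (20/20) vision. That value is not stated in the chapter, and individual acuity varies, as noted above.

```python
import math


def smallest_resolvable_size(viewing_distance: float,
                             acuity_arcmin: float = 1.0) -> float:
    """Size of the smallest resolvable detail, in the unit of viewing_distance.

    acuity_arcmin is the assumed angular resolution of the eye in arcminutes.
    """
    angle_rad = math.radians(acuity_arcmin / 60.0)
    return viewing_distance * math.tan(angle_rad)


def pixel_is_distinguishable(pixel_pitch: float, viewing_distance: float) -> bool:
    """True if individual pixels of this pitch could be told apart."""
    return pixel_pitch >= smallest_resolvable_size(viewing_distance)


# Hypothetical desktop setup: 600 mm viewing distance, 0.25 mm pixel pitch.
print(f"{smallest_resolvable_size(600.0):.3f} mm")
print(pixel_is_distinguishable(0.25, 600.0))
```

Under this assumption, pixels smaller than the computed size add resolution that the viewer cannot perceive from that distance.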
DETAIL AND PERIPHERAL VISION
The human eye contains two types of cells that react to light
intensities in different ways.
The cones, which are responsible for color and detail recognition,
are distributed heavily in the center of the retina (back of the
eyeball), which subtends about 5° in the human FOV and roughly
establishes the area of focus.
The oval region in Figure 3.6 shows the corresponding region in
the display for which details can be perceived through these cells.
On the other hand, the rods are distributed mainly in the periphery
of the retina and are responsible for motion detection and less
detailed peripheral vision. While details may not be sensed, the
rods contribute to our awareness of the surrounding environment.
COLOR, BRIGHTNESS, AND CONTRAST
Brightness: The amount of light energy emitted by
the object (or as perceived by the human).
Color: Human response to different wavelengths of
light, namely for those corresponding to red, green,
blue, and their mixtures.
A color can be specified by the composition of the
amounts contributed by the three fundamental
colors and also by hue (particular wavelength),
saturation (relative difference in the major
wavelength and the rest in the light), and brightness
value (total amount of the light energy)
(Figure 3.10).
Contrast: Relative difference in brightness or color
between two visual objects. Contrast in
brightness is measured in terms of the difference
or ratio of the amounts of light energies between
two or more objects.
The recommended ratio of the foreground to
background brightness contrast is at least
3:1.
Color contrast is defined in terms of differences
or ratios in the dimensions of hue and saturation.
It is said that the brightness contrast is more
effective for detail perception than
the color contrast (Figure 3.11).
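One way to check the 3:1 brightness guideline in software is sketched below. The chapter defines contrast only as a ratio of light energies, so the specific formula used here (relative luminance and contrast ratio as defined in WCAG 2.x for sRGB colors) is an assumption, chosen because it is widely used. The function names are illustrative.

```python
def _linearize(channel_8bit: int) -> float:
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x formula)."""
    c = channel_8bit / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb: tuple[int, int, int]) -> float:
    """Approximate perceived brightness of an sRGB color, from 0 to 1."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(color_a: tuple[int, int, int],
                   color_b: tuple[int, int, int]) -> float:
    """Brightness contrast ratio, from 1 (identical) up to about 21."""
    la, lb = relative_luminance(color_a), relative_luminance(color_b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)


def meets_minimum(foreground, background, minimum: float = 3.0) -> bool:
    """Check a foreground/background pair against a minimum ratio."""
    return contrast_ratio(foreground, background) >= minimum


print(meets_minimum((0, 0, 0), (255, 255, 255)))        # black on white
print(meets_minimum((200, 200, 200), (255, 255, 255)))  # light gray on white
```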
PRE-ATTENTIVE FEATURES AND HIGH-LEVEL
DIAGRAMMATIC SEMANTICS
Detail, color, brightness, and contrast are all very-low-level raw visual
properties. Before all these low-level-part features are finally
consolidated for conscious recognition (of a larger object) through the
visual information processing pipeline, pre-attentive features might be
used to attract our attention.
Pre-attentive features are composite, primitive, and intermediate visual
elements that are automatically recognized before entering our
consciousness, typically within 10 ms after entering the sensory system.
These features may rely on the relative differences in color, size, shape,
orientation, depth, texture, motion, etc.
Figure 3.12 shows several examples and how they can be used
collectively to form and design effective graphic icons.
AURAL
Next to the visual, the aural modality (sound) is perhaps the
most prevalent mode for information feedback.
The actual form of sound feedback can be roughly divided
into three types:
(a) simple beep-like sounds,
(b) short symbolic sound bites known as earcons (e.g.,
the paper-crunching sound when a file is inserted into
the trashcan for deletion), and
(c) relatively longer “as is” sound feedback that is
replayed from recordings or synthesis.
AURAL DISPLAY PARAMETERS
INTENSITY (AMPLITUDE)
refers to the amount of sound energy and is synonymous
with the more familiar term, volume.
Intensity is often measured in the units of decibels (dB), a
logarithmic scale of sound energy, where 0 dB corresponds
to the lowest level of audible sound and about 130 dB is the
highest. It is instructive to know the decibel levels of different
sounds as a guideline in setting the nominal volume for the
sound feedback (Table 3.3).
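The logarithmic nature of the decibel scale can be made concrete with a short sketch. It uses the standard definition dB = 10 log10(I / I0). The reference intensity I0 = 1e-12 W/m^2 (the conventional threshold of hearing) is not given in the chapter and is stated here as an assumption.

```python
import math

# Conventional reference intensity for 0 dB, in W/m^2 (assumed, not from the text).
REFERENCE_INTENSITY = 1e-12


def intensity_to_db(intensity_w_m2: float) -> float:
    """Convert sound intensity (W/m^2) to decibels."""
    return 10.0 * math.log10(intensity_w_m2 / REFERENCE_INTENSITY)


def db_to_intensity(level_db: float) -> float:
    """Convert decibels back to sound intensity (W/m^2)."""
    return REFERENCE_INTENSITY * 10.0 ** (level_db / 10.0)


# Each additional 10 dB corresponds to ten times more sound energy.
for level in (0, 10, 60, 130):
    print(f"{level:>3} dB -> {db_to_intensity(level):.1e} W/m^2")
```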
FREQUENCY
Sound can be viewed as containing or being composed of
a number of sinusoidal waves with different frequencies and
corresponding amplitudes.
The dominant frequency components determine various
characteristics of sounds such as the pitch (e.g., low or high
key), timbre (e.g., which instrument), and even directionality
(where is the sound coming from?).
Humans can hear sound waves with frequency values
between about 20 and 20,000 Hz.
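The idea of sound as a sum of sinusoids can be sketched as follows: build a signal from two components, then recover the dominant frequency with a naive discrete Fourier transform. The component values, sample rate, and function names are illustrative assumptions. A real application would normally use an FFT library instead of this slow direct sum.

```python
import cmath
import math


def synthesize(components, sample_rate: int, n_samples: int) -> list[float]:
    """Sum sinusoids given as (frequency_hz, amplitude) pairs."""
    return [
        sum(amp * math.sin(2.0 * math.pi * freq * n / sample_rate)
            for freq, amp in components)
        for n in range(n_samples)
    ]


def dominant_frequency(samples: list[float], sample_rate: int) -> float:
    """Return the frequency (Hz) of the strongest DFT component."""
    n = len(samples)
    best_bin, best_magnitude = 0, 0.0
    for k in range(1, n // 2):  # skip DC, stay below the Nyquist frequency
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        if abs(coeff) > best_magnitude:
            best_bin, best_magnitude = k, abs(coeff)
    return best_bin * sample_rate / n


# A strong 50 Hz component plus a weaker 120 Hz component, one second long.
signal = synthesize([(50, 1.0), (120, 0.3)], sample_rate=400, n_samples=400)
print(dominant_frequency(signal, sample_rate=400))
```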
PHASE
refers to the time differences among sound waves that
emanate from the same source. Phase differences
occur, for example, because our left and right ears
may have slightly different distances to the sound
source and, as such, phase differences are also known
to contribute to the perception of spatialized sound
such as stereo.
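The arrival-time difference between the two ears can be approximated with a simple path-difference model, sketched below. The ear spacing (0.18 m), the speed of sound (343 m/s), and the straight-line geometry that ignores the head itself are all simplifying assumptions, not values from the chapter.

```python
import math


def interaural_time_difference(azimuth_deg: float,
                               ear_spacing_m: float = 0.18,
                               speed_of_sound_m_s: float = 343.0) -> float:
    """Approximate arrival-time difference (seconds) between the two ears.

    azimuth_deg is the source direction: 0 is straight ahead, 90 is
    directly to one side. Uses path difference = spacing * sin(azimuth).
    """
    path_difference = ear_spacing_m * math.sin(math.radians(azimuth_deg))
    return path_difference / speed_of_sound_m_s


def phase_difference_deg(frequency_hz: float, time_difference_s: float) -> float:
    """Phase shift (degrees) that a time difference causes at one frequency."""
    return 360.0 * frequency_hz * time_difference_s


itd = interaural_time_difference(90.0)
print(f"Time difference: {itd * 1e6:.0f} microseconds")
print(f"Phase difference at 500 Hz: {phase_difference_deg(500.0, itd):.1f} degrees")
```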
OTHER CHARACTERISTICS OF SOUND
AS INTERACTION FEEDBACK
We further point out a few differences of aural feedback from the
visual.
First, sound is effectively omnidirectional. For this reason, sound
is most often used to attract and direct a user’s attention. However,
it can also be a nuisance as a task interrupter (e.g., causing a
momentary loss of context) through the startle effect. Making use
of contrast is possible with sound as well. For instance, auditory
feedback would require a 15–30-dB difference from the ambient
noise to be heard effectively. Differentiated frequency components
can be used to convey certain information.
AURAL MODALITY AS INPUT METHOD
So far, the aural modality has been explained only in
the context of passive feedback.
As for using it actively as a means for input to
interactive systems, two major methods are:
(a) keyword recognition and
(b) natural language understanding
TACTILE AND HAPTIC
HAPTIC
is defined to be the modality that takes advantage of touch
by applying forces, vibrations, or motions to the user. Thus
haptic refers to both the sensation of force feedback as well
as touch (tactile).
For convenience, we will use the term haptic to refer to the
modality for sensing force and kinesthetic feedback through
our joints and muscles (even though any force feedback
practically requires contact through the skin) and the term
tactile for sensing different types of touch (e.g., texture, light
pressure/contact, pain, vibration, and even temperature)
through our skin.
TACTILE DISPLAY PARAMETERS
TACTILE RESOLUTION:
The skin sensitivity to physical objects is different
over the human body. The fingertip is one of the
most sensitive areas and is frequently used for HCI
purposes.
The fingertip can sense objects as small as 40 μm
(micrometres) in size.
VIBRATION FREQUENCY
Rapid movement such as vibration is mostly
sensed by the Pacinian corpuscle, which is known
to have a signal-response range of 100–300 Hz.
A vibration frequency of about 250 Hz is said to be
optimal for comfortable perception.
PRESSURE THRESHOLD:
The lightest amount of pressure humans can sense
is said to be about 1000 N/m². For the contact area of
a fingertip, this amounts to about 0.02 N.
The maximum threshold is difficult to measure,
because when the force/torque gets large enough,
the kinesthetic senses start to operate, and this
threshold will greatly depend on the physical
condition of the user (e.g., strong vs. weak user).
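The two figures quoted for the minimum pressure threshold are linked by force = pressure × area. The sketch below works through that arithmetic. The implied contact area (about 0.2 cm²) is derived from the quoted numbers and is not a value stated in the chapter.

```python
PRESSURE_THRESHOLD_N_M2 = 1000.0  # lightest sensible pressure, from the text


def threshold_force(contact_area_m2: float,
                    pressure_n_m2: float = PRESSURE_THRESHOLD_N_M2) -> float:
    """Smallest sensible force (N) for a given skin contact area."""
    return pressure_n_m2 * contact_area_m2


def implied_contact_area(force_n: float,
                         pressure_n_m2: float = PRESSURE_THRESHOLD_N_M2) -> float:
    """Contact area (m^2) implied by a force at the threshold pressure."""
    return force_n / pressure_n_m2


area_m2 = implied_contact_area(0.02)
print(f"Implied contact area: {area_m2 * 1e4:.2f} cm^2")
print(f"Force at that area: {threshold_force(area_m2):.3f} N")
```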
As mentioned, there are many types of tactile stimulation,
such as texture, pressure, vibration, and even temperature.
For the purposes of HCI, the following parameters are
deemed important, and the same goes for the display
system providing the tactile-based feedback.
Physical tactile sensation is felt by a combination of skin
cells and nerves tuned for particular types of stimulation,
e.g., the Meissner’s corpuscle for slight pressure or slow
pushing (stimulation signal frequency of 3–40 Hz), Merkel
cells for flutter and textured/protrusion surfaces (0.3–3 Hz),
the Pacinian corpuscle for more rapid vibratory stimulation
(10–500 Hz), and Ruffini endings for skin stretch.
HAPTIC AND HAPTIC DISPLAY PARAMETERS
Along with tactile feedback, haptic feedback adds
a more apparent physical dimension to interaction.
Force feedback and movement are felt by the cells
and nerves in our muscles and joints.
For instance, the muscle spindle/tendon takes the
inertial load, and Pacinian/Ruffini/Golgi receptors
sense the joint movements, pressure, and torque.
The simplest form of a haptic device is a simple
electromagnetic latch that is often used in game
controllers. It generates a sudden inertial
movement and slowly repositions itself for
repeated usage.
Normally, the user holds on to the device, and
inertial forces are delivered in the direction relative
to the game controller. Such a device is not
appropriate for fast-occurring interaction (e.g.,
successive gun shots) or for displaying a
continuously sustained force (e.g., leaning
against a wall).
More-complicated haptic devices are in the
form of a robotic kinematic chain, either
fixed on the ground or worn on the body.
As a kinematic chain, such devices offer
higher degrees of freedom and finer force
control (Figure 3.17).
For the grounded device, the user interacts
with the tip of the robotic chain through
which a force feedback is delivered.
The sensors in the joints of the device make
it possible to track the tip (interaction point)
within the three-dimensional (3-D)
operating space.
IMPORTANT HAPTIC DISPLAY PARAMETERS
(a) the degrees of freedom (the number of directions in which force or torque
can be displayed),
(b) the force range (should be at least greater than 0.5 mN),
(c) operating/interaction range (how much movement is allowed through the
device), and
(d) stability (how stable the supplied force is felt to be).
Stability is in fact a by-product of the proper sampling period, which refers to the
time taken to sense the current amount of force at the interaction point and then
determine whether the target value has been reached and reinforce it (a process
that repeats until a target equilibrium force is reached at the interaction point).
The ideal sampling rate is about 1000 Hz, and when the sampling rate falls
under a certain value, the robotic mechanism exhibits instability (e.g., in
the form of vibration) and thus lower usability. The dilemma is that providing a
high sampling rate requires a heavy computation load, not only in updating the
output force, but also in physical simulation (e.g., to check if the 3-D cursor has hit
any virtual object). Therefore, a careful “satisficing” solution is needed to balance
the level of the haptic device performance and the user experience (Figure 3.18).
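The computation dilemma can be illustrated as a time budget. At 1000 Hz, every pass of the haptic loop (sense, simulate, update force) has to finish within 1 ms. The sketch below checks whether hypothetical per-step costs fit that budget. The cost figures and function names are made up for illustration.

```python
def sampling_period_ms(sampling_rate_hz: float) -> float:
    """Time available for one pass of the haptic loop, in milliseconds."""
    return 1000.0 / sampling_rate_hz


def fits_in_budget(sampling_rate_hz: float,
                   force_update_ms: float,
                   collision_check_ms: float) -> bool:
    """True if the per-pass computation fits within one sampling period."""
    total_ms = force_update_ms + collision_check_ms
    return total_ms <= sampling_period_ms(sampling_rate_hz)


# Hypothetical costs: 0.2 ms to update the force, 0.5 or 0.9 ms for simulation.
print(fits_in_budget(1000.0, force_update_ms=0.2, collision_check_ms=0.5))
print(fits_in_budget(1000.0, force_update_ms=0.2, collision_check_ms=0.9))
print(fits_in_budget(500.0, force_update_ms=0.2, collision_check_ms=0.9))
```

When the work does not fit, the designer can simplify the physical simulation or lower the rate, accepting some loss of stability. This is the balance the text refers to.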
MULTIMODAL INTERACTION
Conventional interfaces have been mostly visually oriented. However,
for various reasons, multimodal interfaces are gaining popularity with
the ubiquity of multimedia devices. By employing more than one
modality, interfaces can become more effective in a number of ways,
depending on how they are configured.
COMPLEMENTARY:
Different modalities can assume different roles and act in a
complementary fashion to achieve specific interaction objectives. For
example, an aural feedback can signify the arrival of a phone call while the
visual displays the caller’s name.
REDUNDANT:
Different modality input methods or feedback can be used to ensure a
reliable achievement of the interaction objective. For instance, the ring of a
phone call can be simultaneously aural and tactile to strengthen the pickup
probability.
ALTERNATIVE:
Providing users with alternative ways to interact gives people more
choices. For instance, a phone call can be made either by touching a
button or by speaking the callee’s name, thereby promoting convenience
and usability.
For multimodal interfaces to be effective, each feedback must
be properly synchronized and consistent in its representation.
For instance, to signify a button touch, the visual highlighting and beep
sound effect must occur within a short time (e.g., less than 200 ms)
to be recognized as one consistent event. The representation must be
coordinated between the two: In the previous example, if there is
one highlighting, then there should also be one corresponding beep.
When inconsistent, the interpretation of the feedback can be confusing, or
only the dominant modality will be recognized.
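The two requirements above, synchronization and consistent representation, can be expressed as simple checks on event timestamps. The sketch below takes the 200 ms figure from the example in the text as the default window. The function names and sample timestamps are hypothetical.

```python
def perceived_as_one_event(visual_ms: float, audio_ms: float,
                           window_ms: float = 200.0) -> bool:
    """True if the two feedback events are close enough in time to fuse."""
    return abs(visual_ms - audio_ms) < window_ms


def feedback_is_consistent(visual_times_ms: list[float],
                           audio_times_ms: list[float],
                           window_ms: float = 200.0) -> bool:
    """Check one-to-one, synchronized visual and audio feedback.

    Requires the same number of highlights and beeps, and each pair
    (matched in time order) to fall within the window.
    """
    if len(visual_times_ms) != len(audio_times_ms):
        return False
    pairs = zip(sorted(visual_times_ms), sorted(audio_times_ms))
    return all(perceived_as_one_event(v, a, window_ms) for v, a in pairs)


print(feedback_is_consistent([1000.0], [1050.0]))           # one highlight, one beep
print(feedback_is_consistent([1000.0], [1050.0, 1400.0]))   # extra beep
print(feedback_is_consistent([1000.0], [1350.0]))           # beep too late
```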
Human Body Ergonomics
(Motor Capabilities)
So far, we have mostly talked about human cognitive and
perceptual capabilities and how display or input systems must
be configured to match them.
To be precise, ergonomics is a discipline focused on making
products and interfaces comfortable and efficient.
Thus, broadly speaking, it encompasses mental and
perceptual issues, although in this book, we restrict the term
to mean ways to design interfaces or interaction devices for
comfort and high performance according to the physical
mechanics of the human body.
For HCI, we focus on the human motor capabilities that are
used to make input interaction. We start with Fitts’s law and
human motor control.
FITTS’S LAW
Fitts’s law is a model of human movement that predicts
the time required to rapidly move to a target area as a
function of the distance to and the size of the target.
The movement task’s Index of Difficulty (ID) can be
quantified in terms of the required information amount,
i.e., in the number of bits.
From the main equation in Figure 3.19, the actual time to
complete the movement task is predicted using a simple
linear equation, where movement time, MT, is a linear
function of ID.
MT = a + b * ID and ID = log2(A/W + 1)
where A is the distance to the target, W is the size (width) of the
target, and a and b are coefficients specific to a given task.
Thus, to reiterate, ID represents an abstract notion of
difficulty of the task, while MT is an actual prediction
value for a particular task.
The values for coefficients a and b are obtained by
taking samples of the performance and
mathematically deriving them by regression
(Figure 3.20).
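The prediction and the regression step can be sketched together. The code below computes ID and MT as defined above, then recovers a and b from sample trials by ordinary least squares. The trial data are synthetic (generated from assumed coefficients a = 0.1 and b = 0.15), and the function names are illustrative.

```python
import math


def index_of_difficulty(distance: float, width: float) -> float:
    """ID in bits: log2(A/W + 1)."""
    return math.log2(distance / width + 1.0)


def movement_time(a: float, b: float, distance: float, width: float) -> float:
    """Predicted movement time MT = a + b * ID."""
    return a + b * index_of_difficulty(distance, width)


def fit_coefficients(trials: list[tuple[float, float, float]]) -> tuple[float, float]:
    """Estimate (a, b) from (distance, width, measured_time) samples
    by ordinary least squares on MT versus ID."""
    ids = [index_of_difficulty(d, w) for d, w, _ in trials]
    times = [t for _, _, t in trials]
    n = len(trials)
    mean_id, mean_time = sum(ids) / n, sum(times) / n
    sxx = sum((x - mean_id) ** 2 for x in ids)
    sxy = sum((x - mean_id) * (t - mean_time) for x, t in zip(ids, times))
    b = sxy / sxx
    a = mean_time - b * mean_id
    return a, b


# Synthetic trials generated with a = 0.1 s and b = 0.15 s/bit (assumed values).
trials = [(d, 1.0, movement_time(0.1, 0.15, d, 1.0)) for d in (1.0, 3.0, 7.0, 15.0)]
a, b = fit_coefficients(trials)
print(f"a = {a:.3f} s, b = {b:.3f} s/bit")
print(f"Predicted MT for A=31, W=1: {movement_time(a, b, 31.0, 1.0):.3f} s")
```

With real measurements the points do not fall exactly on a line, and the fitted a and b describe the best linear approximation for that device and task.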
Note that the original Fitts’s law was created for interaction
with everyday objects (in the context of operation in factory
assembly lines) rather than for computer interfaces.
Researchers have applied the concept of Fitts’s law to
computer interfaces and have found that the same principle
applies. For instance, as shown in Figure 3.21, the task of
“dragging an icon into a trashcan icon” using a mouse can be
assessed using Fitts’s law. Many other computer interactive
tasks can be modeled similarly, and several revised Fitts’s
laws (e.g., for desktop computer interface, mobile interface)
have been derived as well.
MOTOR CONTROL
Perhaps the most prevalent form of input is made by the
movements of our arms, hands, and fingers for keyboard and
mouse input. Berard et al. have reported that there was a
significant drop in human motor control performance below a
certain spatial-resolution threshold.
For instance, while the actual performance is dependent on
the form factor of the device used and the mode of operation,
the mouse is operable with a spatial resolution on the order of
thousands of dpi (dots per inch) or ≈0.020 mm, while the
resolution for a 3-D stylus is in the hundreds.
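The relation between a dpi figure and a physical step size is a unit conversion (1 inch = 25.4 mm), sketched below. The specific dpi values are illustrative and are not measurements from the cited study.

```python
MM_PER_INCH = 25.4


def step_size_mm(dots_per_inch: float) -> float:
    """Smallest distinguishable movement (mm) at a given resolution."""
    return MM_PER_INCH / dots_per_inch


def resolution_dpi(step_mm: float) -> float:
    """Resolution (dpi) corresponding to a given step size in mm."""
    return MM_PER_INCH / step_mm


print(f"0.020 mm corresponds to about {resolution_dpi(0.020):.0f} dpi")
print(f"300 dpi corresponds to about {step_size_mm(300.0):.3f} mm")
```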
In addition to discrete-event input methods (e.g., buttons),
modern user interfaces make heavy use of continuous input
methods in the two-dimensional (2-D) space (e.g., mouse,
touch screen) and increasingly in the 3-D space (e.g., haptic,
Wii-mote). While the human capabilities will determine the
achievable accuracy in such input methods, the control-
display (C/D) ratio is often adjusted. C/D ratio refers to the
ratio of the movement in the control device (e.g., mouse)
to that in the display (e.g., cursor). If the C/D ratio is low, the
sensitivity of the control is high and, therefore, travel time
across the display will be fast. If the C/D ratio is high,
sensitivity is low and, therefore, the fine-adjust time will be
relatively fast.
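The trade-off can be seen by computing how far the control device must move for a given on-screen distance. The sketch follows the definition above (C/D ratio = control movement / display movement). The ratios and distances used are illustrative assumptions.

```python
def cursor_travel(control_movement: float, cd_ratio: float) -> float:
    """Display (cursor) movement produced by a control movement."""
    return control_movement / cd_ratio


def control_needed(display_distance: float, cd_ratio: float) -> float:
    """Control (e.g., mouse) movement needed to cover a display distance."""
    return display_distance * cd_ratio


# Crossing 300 mm of display and making a 1 mm fine adjustment.
for cd_ratio in (0.25, 2.0):
    crossing = control_needed(300.0, cd_ratio)
    fine = control_needed(1.0, cd_ratio)
    print(f"C/D {cd_ratio}: {crossing:.0f} mm to cross, {fine:.2f} mm to adjust")
```

A low ratio covers the display with little hand movement but demands very small, precise movements for fine adjustment. A high ratio has the opposite effect.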
OTHERS
There are many cognitive, perceptual, and ergonomic
issues that have been left out:
Learning and adaptation
Modalities other than the “big three” (visual/aural/haptic-tactile),
such as gestures, facial expression, brain waves, physiological signals
(electromyogram, heart rate, skin conductance), gaze, etc.
Aesthetics and emotion
Multitasking
SUMMARY
In this chapter, we have reviewed the essence of human factors, including sensation,
perception, information processing, and Fitts’s law, as the foremost underlying theory for
the design of interfaces for human–computer interaction. By the very principle of “Know
thy user,” it is clear that the HCI designer must have a basic understanding of these areas
so that any interface will suit the user’s most basic mental, perceptual, and ergonomic
capabilities. We can also readily see that many of the HCI principles discussed previously
in this book naturally derive from these underlying theories.