
Skinput Technology

Skinput is a technology that uses bio-acoustic sensing to allow touch input on the skin. It involves embedding an array of piezoelectric sensors tuned to different frequencies into an armband to capture the acoustic signals generated when the skin is tapped. These signals include transverse and longitudinal waves traveling through soft tissue and bone that create unique acoustic fingerprints for different tap locations. The system aims to provide an always-available input that does not require the user to hold or wear a separate device.

Uploaded by

Pranay Kumar

SKINPUT TECHNOLOGY

HISTORY

Researchers have explored many related approaches to finding an appropriate input
surface. They are as follows:

2.1 Always-Available Input

The primary goal of Skinput is to provide an always-available mobile input
system – that is, an input system that does not require a user to carry or pick up a
device. A number of alternative approaches have been proposed that operate in this
space. Techniques based on computer vision are popular. These, however, are
computationally expensive and error prone in mobile scenarios (where, e.g., non-input
optical flow is prevalent). Speech input is a logical choice for always-available input,
but is limited in its precision in unpredictable acoustic environments, and suffers from
privacy and scalability issues in shared environments.

Other approaches have taken the form of wearable computing. This typically
involves a physical input device built in a form considered to be part of one’s
clothing. For example, glove-based input systems allow users to retain most of their
natural hand movements, but are cumbersome, uncomfortable, and disruptive to
tactile sensation. A “smart fabric” system that embeds sensors and conductors into
fabric has also been presented earlier, but taking this approach to always-available
input necessitates embedding technology in all clothing, which would be prohibitively
complex and expensive.

The Sixth Sense project proposes a mobile, always-available input/output
capability by combining projected information with a color-marker-based vision
tracking system. This approach is feasible, but suffers from serious occlusion and
accuracy limitations. For example, determining whether a finger has tapped a
button, or is merely hovering above it, is extraordinarily difficult. The present work
briefly explores the combination of on-body sensing with on-body projection.

2.2 Bio-Sensing

Skinput leverages the natural acoustic conduction properties of the human
body to provide an input system, and is thus related to previous work in the use of
biological signals for computer input. Signals traditionally used for diagnostic
medicine, such as heart rate and skin resistance, have been appropriated for assessing
a user’s emotional state. These features are generally subconsciously driven and
cannot be controlled with sufficient precision for direct input. Similarly, brain sensing
technologies such as electroencephalography (EEG) and functional near-infrared
spectroscopy (fNIR) have been used by HCI researchers to assess cognitive and
emotional state; this work also primarily looked at involuntary signals. In contrast,
brain signals have been harnessed as a direct input for use by paralyzed patients, but
direct brain-computer interfaces (BCIs) still lack the bandwidth required for everyday
computing tasks, and require levels of focus, training, and concentration that are
incompatible with typical computer interaction.

There has been less work relating to the intersection of finger input and
biological signals. Researchers have harnessed the electrical signals generated by
muscle activation during normal hand movement through electromyography
(EMG). At present, however, this approach typically requires expensive amplification
systems and the application of conductive gel for effective signal acquisition, which
would limit the acceptability of this approach for most users. The input technology
most related to Skinput is that which placed contact microphones on a user’s wrist to
assess finger movement. Bone conduction microphones and headphones – now
common consumer technologies – represent an additional bio-sensing technology that
is relevant to the present work. These leverage the fact that sound frequencies relevant
to human speech propagate well through bone. Bone conduction microphones are
typically worn near the ear, where they can sense vibrations propagating from the
mouth and larynx during speech. Bone conduction headphones send sound through
the bones of the skull and jaw directly to the inner ear, bypassing transmission of
sound through the air and outer ear, leaving an unobstructed path for environmental
sounds.

2.3 Acoustic Input

The Skinput approach is also inspired by systems that leverage acoustic
transmission through (non-body) input surfaces. The arrival time of a sound at
multiple sensors has previously been measured to locate hand taps on a glass window. A
similar approach has been used to localize a ball hitting a table, for computer
augmentation of a real-world game. Both of these systems use acoustic time-of-flight for
localization, which, when explored for Skinput, was found to be insufficiently robust
on the human body, leading to the fingerprinting approach described in this paper.

CHAPTER-3
HARDWARE ARCHITECTURE

Figure 3.1: A wearable, bio-acoustic sensing array built into an armband

3.1 Bio-Acoustics

When a finger taps the skin, several distinct forms of acoustic energy are
produced. Some energy is radiated into the air as sound waves; this energy is not
captured by the Skinput system. Among the acoustic energy transmitted through the
arm, the most readily visible are transverse waves, created by the displacement of the
skin from a finger impact (Figure 3.2). When shot with a high-speed camera, these
appear as ripples, which propagate outward from the point of contact. The amplitude
of these ripples is correlated to both the tapping force and to the volume and
compliance of soft tissues under the impact area. In general, tapping on soft regions of
the arm creates higher amplitude transverse waves than tapping on bony areas (e.g.,
wrist, palm, fingers), which have negligible compliance.

Figure 3.2: Transverse wave propagation


Figure 3.3: Longitudinal wave propagation

In addition to the energy that propagates on the surface of the arm, some
energy is transmitted inward, toward the skeleton (Figure 3.3). These longitudinal
(compressive) waves travel through the soft tissues of the arm, exciting the bone,
which is much less deformable than the soft tissue but can respond to mechanical
excitation by rotating and translating as a rigid body. This excitation vibrates soft
tissues surrounding the entire length of the bone, resulting in new longitudinal waves
that propagate outward to the skin.

These two separate forms of conduction – transverse waves moving directly along
the arm surface, and longitudinal waves moving into and out of the bone through soft
tissues – are highlighted because these mechanisms carry energy at different
frequencies and over different distances.

Roughly speaking, higher frequencies propagate more readily through bone
than through soft tissue, and bone conduction carries energy over larger distances than
soft tissue conduction. While we do not explicitly model the specific mechanisms of
conduction, or depend on these mechanisms for our analysis, we do believe the
success of our technique depends on the complex acoustic patterns that result from
mixtures of these modalities.

Similarly, joints play an important role in making tapped locations acoustically
distinct. Bones are held together by ligaments, and joints often include additional
biological structures such as fluid cavities. This makes joints behave as acoustic
filters. In some cases, these may simply dampen acoustics; in other cases, these will
selectively attenuate specific frequencies, creating location-specific acoustic
signatures.

3.2 Sensing

To capture the rich variety of acoustic information described in the previous
section, many sensing technologies were evaluated, including bone conduction
microphones, conventional microphones coupled with stethoscopes, piezo contact
microphones, and accelerometers. However, these transducers were engineered for
very different applications than measuring acoustics transmitted through the human
body. As such, they were lacking in several significant ways. Foremost, most
mechanical sensors are engineered to provide relatively flat response curves over the
range of frequencies that is relevant to the signal of interest.

This is a desirable property for most applications where a faithful
representation of an input signal – uncolored by the properties of the transducer – is
desired. However, because only a specific set of frequencies is conducted through the
arm in response to tap input, a flat response curve leads to the capture of irrelevant
frequencies and thus to a low signal-to-noise ratio. While bone conduction
microphones might seem a suitable choice for Skinput, these devices are typically
engineered for capturing human voice, and filter out energy below the range of human
speech (whose lowest frequency is around 85Hz). Thus most sensors in this category
were not especially sensitive to lower-frequency signals (e.g., 25Hz), which were
found in the empirical pilot studies to be vital in characterizing finger taps.

To overcome these challenges, instead of a single sensing element with a flat
response curve, an array of highly tuned vibration sensors was used. Specifically,
small, cantilevered piezo films (MiniSense100, Measurement Specialties, Inc.) were
employed. By adding small weights to the end of the cantilever, the resonant
frequency is altered, allowing the sensing element to be responsive to a unique,
narrow, low-frequency band of the acoustic spectrum. Adding more mass lowers the
range of excitation to which a sensor responds.
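
The effect of tip mass on resonance can be illustrated with the textbook
spring-mass approximation f = sqrt(k/m) / (2*pi). The sketch below is not taken
from the Skinput work: the stiffness and mass values are hypothetical and were
chosen only to show the trend that a heavier tip weight gives a lower resonant
frequency.

```python
import math


def resonant_frequency_hz(stiffness_n_per_m: float, effective_mass_kg: float) -> float:
    """Resonant frequency of an ideal spring-mass system: f = sqrt(k/m) / (2*pi)."""
    return math.sqrt(stiffness_n_per_m / effective_mass_kg) / (2.0 * math.pi)


# Hypothetical cantilever stiffness in N/m (not a MiniSense100 datasheet value).
STIFFNESS = 25.0

# Hypothetical effective masses in kg, from lightest to heaviest tip weight.
masses = [0.0001, 0.0003, 0.001]
frequencies = [resonant_frequency_hz(STIFFNESS, m) for m in masses]

for m, f in zip(masses, frequencies):
    print(f"mass = {m * 1000:.1f} g -> resonance ~ {f:.1f} Hz")
```

A real cantilever has distributed mass and damping, so this model only indicates
the direction of the change, not the actual tuning values.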

Each element was weighted such that it aligned with particular frequencies
that pilot studies showed to be useful in characterizing bio-acoustic input. Figure 3.4
shows the response curve for one of the sensors, tuned to a resonant frequency of
78Hz. The curve shows a ~14dB drop-off ±20Hz away from the resonant frequency.

Figure 3.4: Response curve of the sensing element

Additionally, the cantilevered sensors were naturally insensitive to forces
parallel to the skin (e.g., shearing motions caused by stretching). Thus, the skin stretch
induced by many routine movements (e.g., reaching for a doorknob) tends to be
attenuated. However, the sensors are highly responsive to motion perpendicular to the
skin plane, which makes them well suited for capturing transverse surface waves and
longitudinal waves emanating from interior structures.

Finally, the sensor design is relatively inexpensive and can be manufactured in
a very small form factor (e.g., MEMS), rendering it suitable for inclusion in future
mobile devices (e.g., an arm-mounted audio player).

3.3 Armband Prototype

The final prototype, shown in Figure 3.1, features two arrays of five sensing
elements, incorporated into an armband form factor. The decision to have two sensor
packages was motivated by the focus on the arm for input. In particular, when the
armband was placed on the upper arm (above the elbow), the hope was to collect
acoustic information from the fleshy bicep area in addition to the firmer area on the
underside of the arm, which has better acoustic coupling to the humerus, the main
bone that runs from shoulder to elbow.

When the sensor was placed below the elbow, on the forearm, one package
was located near the Radius, the bone that runs from the lateral side of the elbow to
the thumb side of the wrist, and the other near the Ulna, which runs parallel to this on
the medial side of the arm closest to the body. Each location thus provided slightly
different acoustic coverage and information, helpful in disambiguating input location.

Based on pilot data collection, a different set of resonant frequencies was
selected for each sensor package (Table 3.1). The upper sensor package was tuned to
be more sensitive to lower frequency signals, as these were more prevalent in fleshier
areas. Conversely, the lower sensor array was tuned to be sensitive to higher
frequencies, in order to better capture signals transmitted through (denser) bones.

Upper array    25 Hz    27 Hz    30 Hz    38 Hz    78 Hz
Lower array    25 Hz    27 Hz    40 Hz    44 Hz    64 Hz

Table 3.1: Resonant frequencies of elements in the two sensor packages

3.4 Processing

In the prototype system, a Mackie Onyx 1200F audio interface was employed
to digitally capture data from the ten sensors (http://mackie.com). This was
connected via FireWire to a conventional desktop computer, where a thin client
written in C interfaced with the device using the Audio Stream Input/Output (ASIO)
protocol. Each channel was sampled at 5.5 kHz, a sampling rate that would be
considered too low for speech or environmental audio, but was able to represent the
relevant spectrum of frequencies transmitted through the arm. This reduced sample
rate (and consequently low processing bandwidth) makes the technique readily
portable to embedded processors.

For example, the ATmega168 processor employed by the Arduino platform
can sample analog readings at 77 kHz with no loss of precision, and could therefore
provide the full sampling power required for Skinput (55 kHz total). Data was then
sent from the thin client over a local socket to the primary application, written in Java.
This program performed three key functions.
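
The sampling budget quoted above can be checked with a few lines of arithmetic.
This is only a worked example using the figures stated in the text; the variable
names are illustrative.

```python
CHANNELS = 10                 # two arrays of five sensing elements
PER_CHANNEL_RATE_HZ = 5_500   # 5.5 kHz per channel, as stated in the text
ADC_RATE_HZ = 77_000          # 77 kHz ATmega168 figure quoted in the text

total_rate_hz = CHANNELS * PER_CHANNEL_RATE_HZ   # aggregate samples per second
headroom_hz = ADC_RATE_HZ - total_rate_hz        # unused sampling capacity
nyquist_hz = PER_CHANNEL_RATE_HZ / 2             # highest representable frequency

print(total_rate_hz, headroom_hz, nyquist_hz)
```

The per-channel Nyquist limit of 2.75 kHz lies far above the 25 Hz to 78 Hz
resonances of the sensing elements, which is consistent with the claim that the
low sampling rate still represents the relevant spectrum.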

Figure 3.5: Ten channels of acoustic data generated by finger taps


First, it provided a live visualization of the data from the ten sensors, which
was useful in identifying acoustic features (Figure 3.5). Second, it segmented inputs
from the data stream into independent instances (taps). Third, it classified these input
instances. The audio stream was segmented into individual taps using an absolute
exponential average of all ten channels (Figure 3.5, red waveform). When an intensity
threshold was exceeded (Figure 3.5, upper blue line), the program recorded the
timestamp as a potential start of a tap.

If the intensity did not fall below a second, independent “closing” threshold
(Figure 3.5, lower purple line) between 100ms and 700ms after the onset crossing
(a duration found to be common for finger impacts), the event was discarded. If start
and end crossings were detected that satisfied these criteria, the acoustic data in that
period (plus a 60ms buffer on either end) was considered an input event (Figure 3.5,
vertical green regions). Although simple, this heuristic proved to be highly robust,
mainly due to the extreme noise suppression provided by the sensing approach.
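
The segmentation heuristic can be sketched as follows. This is a minimal
illustration, not the original Java implementation: the smoothing factor `alpha`,
the threshold values, the re-arming rule after a discarded event, and the
synthetic `burst` test signal are all assumptions introduced here. Only the
100 ms to 700 ms duration window, the 60 ms buffer, and the 5.5 kHz sampling rate
come from the text.

```python
from typing import List, Sequence, Tuple

SAMPLE_RATE_HZ = 5500  # per-channel sampling rate quoted in the text


def segment_taps(
    frames: Sequence[Sequence[float]],
    onset_threshold: float,
    closing_threshold: float,
    alpha: float = 0.05,  # hypothetical smoothing factor, not from the source
    sample_rate_hz: int = SAMPLE_RATE_HZ,
) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of accepted taps.

    frames[i] holds one reading per channel at sample i.
    Assumes closing_threshold < onset_threshold.
    """
    min_len = round(0.100 * sample_rate_hz)  # 100 ms
    max_len = round(0.700 * sample_rate_hz)  # 700 ms
    pad = round(0.060 * sample_rate_hz)      # 60 ms buffer on either end

    events: List[Tuple[int, int]] = []
    envelope = 0.0
    start = None   # onset index of the candidate tap, if any
    armed = True   # accept a new onset only after the signal has gone quiet

    for i, frame in enumerate(frames):
        # Exponential moving average of the rectified, channel-averaged signal.
        level = sum(abs(v) for v in frame) / len(frame)
        envelope = alpha * level + (1.0 - alpha) * envelope

        if start is None:
            if envelope < closing_threshold:
                armed = True
            elif armed and envelope > onset_threshold:
                start, armed = i, False
        elif envelope < closing_threshold:
            if min_len <= i - start <= max_len:
                events.append((max(0, start - pad), min(len(frames), i + pad)))
            start, armed = None, True
        elif i - start > max_len:
            start = None  # stayed loud too long: discard and wait for quiet
    return events


def burst(n_quiet: int, n_loud: int, channels: int = 10) -> List[List[float]]:
    """Synthetic signal: silence, a constant-amplitude burst, then silence."""
    return ([[0.0] * channels] * n_quiet
            + [[1.0] * channels] * n_loud
            + [[0.0] * channels] * n_quiet)


# A 200 ms burst is accepted; a 9 ms click is rejected as too short.
tap = segment_taps(burst(2000, 1100), onset_threshold=0.5, closing_threshold=0.1)
click = segment_taps(burst(2000, 50), onset_threshold=0.5, closing_threshold=0.1)
print(tap, click)
```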

After an input has been segmented, the waveforms are analyzed. The highly
discrete nature of taps (i.e. point impacts) meant acoustic signals were not particularly
expressive over time (unlike gestures, e.g., clenching of the hand). Signals simply
diminished in intensity over time. Thus, features are computed over the entire input
window and do not capture any temporal dynamics. A brute force machine learning
approach is employed, computing 186 features in total, many of which are derived
combinatorially. For gross information, the average amplitude, standard deviation,
and total (absolute) energy of the waveforms in each channel are included (30 features).

From these, all average amplitude ratios between channel pairs (45 features)
are calculated. An average of these ratios (1 feature) is also included. A 256-point FFT
for all ten channels is calculated, although only the lower ten values are used
(representing the acoustic power from 0Hz to 193Hz), yielding 100 features. These
are normalized by the highest-amplitude FFT value found on any channel. The center
of mass of the power spectrum within the same 0Hz to 193Hz range for each channel
is included as well, a rough estimation of the fundamental frequency of the signal
displacing each sensor (10 features). Subsequent feature selection established the
all-pairs amplitude ratios and certain bands of the FFT to be the most predictive features.
These 186 features are passed to a Support Vector Machine (SVM) classifier.
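
The feature counts above (30 + 45 + 1 + 100 + 10 = 186) can be reproduced with
a short sketch. Several details are assumptions made for illustration, since the
text does not specify them: "total (absolute) energy" is taken as the sum of
absolute sample values, the spectrum is computed over the first 256 samples of
the segment (zero-padded if shorter), and a direct DFT of the ten lowest bins
stands in for a full FFT so that only the standard library is needed. The
function names are hypothetical.

```python
import cmath
import math
from itertools import combinations
from statistics import mean, pstdev

SAMPLE_RATE_HZ = 5500
FFT_SIZE = 256
LOW_BINS = 10


def low_dft_magnitudes(samples, n_bins=LOW_BINS, size=FFT_SIZE):
    """Magnitudes of the lowest n_bins of a size-point DFT (zero-padded or truncated)."""
    window = list(samples[:size]) + [0.0] * max(0, size - len(samples))
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * n / size) for n, x in enumerate(window)))
        for k in range(n_bins)
    ]


def extract_features(channels):
    """channels: ten equal-length sample lists covering one segmented tap."""
    features = []
    avg_amp = []
    for ch in channels:
        rectified = [abs(x) for x in ch]
        avg_amp.append(mean(rectified))
        # Average amplitude, standard deviation, total absolute energy: 3 x 10 = 30.
        features += [avg_amp[-1], pstdev(ch), sum(rectified)]

    # Average-amplitude ratio for every channel pair: C(10, 2) = 45.
    ratios = [avg_amp[i] / avg_amp[j] if avg_amp[j] else 0.0
              for i, j in combinations(range(len(channels)), 2)]
    features += ratios
    features.append(mean(ratios))  # 1 feature

    # Ten lowest spectral magnitudes per channel, normalized by the global peak: 100.
    spectra = [low_dft_magnitudes(ch) for ch in channels]
    peak = max(max(s) for s in spectra) or 1.0
    for s in spectra:
        features += [m / peak for m in s]

    # Center of mass of the power spectrum per channel, in Hz: 10.
    bin_hz = SAMPLE_RATE_HZ / FFT_SIZE
    for s in spectra:
        power = [m * m for m in s]
        total = sum(power)
        features.append(
            sum(k * bin_hz * p for k, p in enumerate(power)) / total if total else 0.0
        )
    return features
```

With a 5.5 kHz sampling rate and a 256-point transform, the bin spacing is about
21.5 Hz, so the ten lowest bins span 0 Hz to roughly 193 Hz, which matches the
range given in the text.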

The devised software uses the implementation provided in the Weka machine
learning toolkit. It should be noted, however, that other, more sophisticated
classification techniques and features could be employed. Thus, the results presented
in this paper should be considered a baseline. Before the SVM can classify input
instances, it must first be trained to the user and the sensor position. This stage
requires the collection of several examples for each input location of interest.

When using Skinput to recognize live input, the same 186 acoustic features are
computed on the fly for each segmented input. These are fed into the trained SVM for
classification. An event model is used in the software – once an input is classified, an
event associated with that location is instantiated. Any interactive features bound to
that event are fired. As can be seen in the video, interactive speeds were readily
achieved.

CHAPTER-4
EXPERIMENTAL STUDY

To evaluate the performance of the system, 13 participants were recruited (7
female). These participants represented a diverse cross-section of potential ages and
body types. Ages ranged from 20 to 56 (mean 38.3), and computed body mass indexes
(BMIs) ranged from 20.5 (normal) to 31.9 (obese).

4.1 Experimental Conditions

Three input groupings were selected from the multitude of possible location
combinations to test. These groupings, illustrated in Figure 4.1, are of particular
interest with respect to interface design, and at the same time, push the limits of our
sensing capability. From these three groupings, five different experimental conditions
were derived as described below.

Figure 4.1: The three input location sets evaluated

4.1.1 Fingers (Five Locations)

One set of gestures tested had participants tapping on the tips of each of their
five fingers (Figure 4.1, “Fingers”). The fingers offer interesting affordances that
make them compelling to appropriate for input. Foremost, they provide clearly
discrete interaction points, which are even already well-named (e.g., ring finger). In
addition to five finger tips, there are 14 knuckles (five major, nine minor), which,
taken together, could offer 19 readily identifiable input locations on the fingers alone.

Second, people have exceptional finger-to-finger dexterity, as demonstrated when
counting by tapping on the fingers. Finally, the fingers are linearly ordered,
which is potentially useful for interfaces like number entry, magnitude control (e.g.,
volume), and menu selection.

At the same time, fingers are among the most uniform appendages on the body,
with all but the thumb sharing a similar skeletal and muscular structure. This
drastically reduces acoustic variation and makes differentiating among them difficult.
Additionally, acoustic information must cross as many as five (finger and wrist) joints
to reach the forearm, which further dampens signals. For this experimental condition,
the sensor arrays were placed on the forearm, just below the elbow.

Despite these difficulties, pilot experiments showed measurable acoustic
differences among fingers, which are primarily related to finger length and thickness,
interactions with the complex structure of the wrist bones, and variations in the
acoustic transmission properties of the muscles extending from the fingers to the
forearm.

4.1.2 Whole Arm (Five Locations)

Another gesture set investigated the use of five input locations on the forearm
and hand: arm, wrist, palm, thumb and middle finger (Figure 4.1, “Whole Arm”).
These locations were selected for two important reasons. First, they are distinct and
named parts of the body (e.g., “wrist”). This allowed participants to accurately tap
these locations without training or markings. Additionally, these locations proved to
be acoustically distinct during piloting, with the large spatial spread of input points
offering further variation.

These locations were used in three different conditions. One condition placed
the sensor above the elbow, while another placed it below. This was incorporated into
the experiment to measure the accuracy loss across this significant articulation point
(the elbow). Additionally, participants repeated the lower placement condition in an
eyes-free context: participants were told to close their eyes and face forward, both for
training and testing. This condition was included to gauge how well users could target
on-body input locations in an eyes-free context (e.g., driving).

4.1.3 Forearm (Ten Locations)

In an effort to assess the upper bound of the approach’s sensing resolution, the
fifth and final experimental condition used ten locations on just the forearm (Figure
4.1, “Forearm”). Not only was this a very high density of input locations (unlike the
whole-arm condition), but it also relied on an input surface (the forearm) with a high
degree of physical uniformity (unlike, e.g., the hand). These factors were expected to
make acoustic sensing difficult. Moreover, this location was compelling due to its
large and flat surface area, as well as its immediate accessibility, both visually and for
finger input. Simultaneously, this makes for an ideal projection surface for dynamic
interfaces.

To maximize the surface area for input, the sensor was placed above the elbow,
leaving the entire forearm free. Rather than naming the input locations, as was done in
the previously described conditions, small, colored stickers were employed to mark
input targets. This was both to reduce confusion (since locations on the forearm do not
have common names) and to increase input consistency. As mentioned previously, the
forearm is ideal for projected interface elements; the stickers served as low-tech
placeholders for projected buttons.

4.2 Design and Setup

A within-subjects design was employed in the experiment, with each
participant performing tasks in each of the five conditions in randomized order: five
fingers with sensors below elbow; five points on the whole arm with the sensors
above the elbow; the same points with sensors below the elbow, both sighted and
blind; and ten marked points on the forearm with the sensors above the elbow.

Participants were seated in a conventional office chair, in front of a desktop
computer that presented stimuli. For conditions with sensors below the elbow, the
armband was placed ~3cm away from the elbow, with one sensor package near the
radius and the other near the ulna. For conditions with the sensors above the elbow,
the armband was placed ~7cm above the elbow, such that one sensor package rested
on the biceps. Right-handed participants had the armband placed on the left arm,
which allowed them to use their dominant hand for finger input. For the one
left-handed participant, we flipped the setup, which had no apparent effect on the
operation of the system. Tightness of the armband was adjusted to be firm, but
comfortable. While performing tasks, participants could place their elbow on the desk,
tucked against their body, or on the chair’s adjustable armrest; most chose the latter.

4.3 Procedure

For each condition, the experimenter walked through the input locations to be
tested and demonstrated finger taps on each. Participants practiced duplicating these
motions for approximately one minute with each gesture set. This allowed participants
to familiarize themselves with our naming conventions (e.g. “pinky”, “wrist”), and to
practice tapping their arm and hands with a finger on the opposite hand.

It also allowed the experimenter to convey the appropriate tap force to participants,
who often initially tapped unnecessarily hard. To train the system, participants were
instructed to comfortably tap each location ten times, with a finger of their choosing.
This constituted one training round. In total, three rounds of training data were collected
per input location set (30 examples per location, 150 data points total). An exception
to this procedure was in the case of the ten forearm locations, where only two rounds
were collected to save time (20 examples per location, 200 data points total). Total
training time for each experimental condition was approximately three minutes.

We used the training data to build an SVM classifier. During the subsequent
testing phase, participants were presented with simple text stimuli (e.g. “tap your
wrist”), which instructed them where to tap. The order of stimuli was randomized,
with each location appearing ten times in total.

The system performed real-time segmentation and classification, and provided
immediate feedback to the participant (e.g. “you tapped your wrist”). Feedback was
provided so that participants could see where the system was making errors (as they
would if using a real application). If an input was not segmented (i.e. the tap was too
quiet), participants could see this and would simply tap again. Overall, segmentation
error rates were negligible in all conditions, and not included in further analysis.

CHAPTER-5
RESULTS

The classification accuracies for the test phases in the five different conditions
are reported in detail. Overall, classification rates were high, with an average accuracy
across conditions of 87.6%. Additionally, preliminary results exploring the correlation
between classification accuracy and factors such as BMI, age, and sex are also stated.

5.1 Five Fingers

Despite multiple joint crossings and ~40cm of separation between the input
targets and sensors, classification accuracy remained high for the five-finger
condition, averaging 87.7% (SD=10.0%, chance=20%) across participants.
Segmentation, as in other conditions, was essentially perfect.

Inspection of the confusion matrices showed no systematic errors in the
classification, with errors tending to be evenly distributed over the other digits. When
classification was incorrect, the system believed the input to be an adjacent finger
60.5% of the time, only marginally above prior probability (40%). This suggests there
are only limited acoustic continuities between the fingers. The only potential
exception to this was in the case of the pinky, where the ring finger constituted 63.3%
of the misclassifications.
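
One plausible reading of the 40% prior probability is sketched below. It assumes
that each finger is tapped equally often and that a wrong answer is equally likely
to be any of the other four fingers; the source does not state how the figure was
derived, so this is an interpretation rather than the original calculation.

```python
from fractions import Fraction

FINGERS = ["thumb", "index", "middle", "ring", "pinky"]


def adjacent_error_prior(n_targets: int) -> Fraction:
    """Chance that a uniformly random wrong label is adjacent to the true target,
    averaged over equally likely targets arranged in a line."""
    total = Fraction(0)
    for true_idx in range(n_targets):
        neighbours = sum(
            1 for j in (true_idx - 1, true_idx + 1) if 0 <= j < n_targets
        )
        total += Fraction(neighbours, n_targets - 1)
    return total / n_targets


prior = adjacent_error_prior(len(FINGERS))
print(f"{float(prior):.0%}")  # 40%
```

The thumb and pinky each have one neighbour out of four alternatives, and the
three middle fingers have two, giving (1/4 + 2/4 + 2/4 + 2/4 + 1/4) / 5 = 2/5.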

5.2 Whole Arm

Participants performed three conditions with the whole-arm location
configuration. The below-elbow placement performed the best, posting a 95.5%
(SD=5.1%, chance=20%) average accuracy. This is not surprising, as this condition
placed the sensors closer to the input targets than the other conditions. Moving the
sensor above the elbow reduced accuracy to 88.3% (SD=7.8%, chance=20%), a drop
of 7.2%. This is almost certainly related to the acoustic loss at the elbow joint and the
additional 10cm of distance between the sensor and input targets. Figure 8 shows
these results.

The eyes-free input condition yielded lower accuracies than other conditions,
averaging 85.0% (SD=9.4%, chance=20%). This represents a 10.5% drop from its
vision-assisted, but otherwise identical, counterpart condition. It was apparent from
watching participants complete this condition that targeting precision was reduced. In
sighted conditions, participants appeared to be able to tap locations with perhaps a
2cm radius of error. Although not formally captured, this margin of error appeared to
double or triple when the eyes were closed. Additional training data, which better
covers the increased input variability, would likely remove much of this deficit.

Figure 5.1: Ten input locations made into groups

5.3 Forearm

Classification accuracy for the ten-location forearm condition stood at 81.5%
(SD=10.5%, chance=10%), a surprisingly strong result for an input set devised to
push the system’s sensing limit (K=0.72, considered very strong).

Following the experiment, different ways to improve accuracy were
considered by collapsing the ten locations into larger input groupings. The goal of this
exercise was to explore the tradeoff between classification accuracy and number of
input locations on the forearm, which represents a particularly valuable input surface
for application designers. Targets were grouped into sets based on what were believed
to be logical spatial groupings (Figure 5.1, A-E and G). In addition to exploring
classification accuracies for layouts considered to be intuitive, an exhaustive search
(programmatically) was performed over all possible groupings. For most location
counts, this search confirmed that the intuitive groupings were optimal; however, this
search revealed one plausible, although irregular, layout with high accuracy at six
input locations.
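
An exhaustive search of this kind can be sketched as below. The sketch makes
two assumptions that the text does not confirm: the accuracy of a grouping is
estimated by collapsing an existing confusion matrix (rather than retraining a
classifier), and every partition of the locations is considered regardless of
spatial contiguity. The 4 x 4 confusion matrix is invented purely for illustration.

```python
from typing import Iterator, List


def partitions(items: List[int], k: int) -> Iterator[List[List[int]]]:
    """Yield every way to split items into exactly k non-empty groups."""
    if k < 0 or k > len(items):
        return
    if not items:
        yield []  # only reachable when k == 0
        return
    first, rest = items[0], items[1:]
    # Either `first` forms a group of its own ...
    for p in partitions(rest, k - 1):
        yield [[first]] + p
    # ... or it joins one of the k groups formed by the remaining items.
    for p in partitions(rest, k):
        for i in range(len(p)):
            yield p[:i] + [[first] + p[i]] + p[i + 1:]


def grouped_accuracy(confusion: List[List[int]], groups: List[List[int]]) -> float:
    """Accuracy when any prediction inside the true target's group counts as correct."""
    group_of = {loc: g for g, members in enumerate(groups) for loc in members}
    total = sum(sum(row) for row in confusion)
    correct = sum(
        count
        for true_loc, row in enumerate(confusion)
        for pred_loc, count in enumerate(row)
        if group_of[true_loc] == group_of[pred_loc]
    )
    return correct / total


def best_grouping(confusion: List[List[int]], k: int) -> List[List[int]]:
    """Grouping into k sets with the highest collapsed accuracy."""
    locations = list(range(len(confusion)))
    return max(partitions(locations, k), key=lambda g: grouped_accuracy(confusion, g))


# Hypothetical confusion matrix: rows are true locations, columns are predictions.
CONFUSION = [
    [8, 2, 0, 0],
    [3, 7, 0, 0],
    [0, 0, 9, 1],
    [0, 1, 2, 7],
]

best = best_grouping(CONFUSION, 2)
print(sorted(sorted(g) for g in best), grouped_accuracy(CONFUSION, best))
```

For ten locations the number of partitions is 115,975 in total across all group
counts, so a brute-force enumeration of this kind remains cheap.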

Unlike in the five-fingers condition, there appeared to be shared acoustic traits
that led to a higher likelihood of confusion with adjacent targets than distant ones.
This effect was more prominent laterally than longitudinally. Figure 5.1 illustrates this
with lateral groupings consistently outperforming similarly arranged, longitudinal
groupings (B and C vs. D and E). This is unsurprising given the morphology of the
arm, with a high degree of bilateral symmetry along the long axis.

5.4 BMI Effects

Early on, it was suspected that the acoustic approach would be susceptible to
variations in body composition. This included, most notably, the prevalence of fatty
tissues and the density/mass of bones. These, respectively, tend to dampen or facilitate
the transmission of acoustic energy in the body. To assess how these variations affected
the sensing accuracy, each participant’s body mass index (BMI) was calculated from
self-reported weight and height. Data and observations from the experiment suggest
that high BMI is correlated with decreased accuracies. The participants with the three
highest BMIs (29.2, 29.6, and 31.9 – representing borderline obese to obese)
produced the three lowest average accuracies.
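
For reference, BMI is weight in kilograms divided by the square of height in
metres. The sketch below applies the standard WHO adult cut-offs; the function
names and the example weight and height are illustrative and are not participant
data from the study.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """BMI = weight (kg) / height (m) squared."""
    return weight_kg / height_m ** 2


def bmi_category(bmi: float) -> str:
    """Standard WHO adult categories."""
    if bmi < 18.5:
        return "underweight"
    if bmi < 25.0:
        return "normal"
    if bmi < 30.0:
        return "overweight"
    return "obese"


# Illustrative values only (not taken from the study).
example_bmi = body_mass_index(weight_kg=70.0, height_m=1.75)
print(round(example_bmi, 1), bmi_category(example_bmi))

# BMI values reported in the text.
for reported in (20.5, 29.2, 29.6, 31.9):
    print(reported, bmi_category(reported))
```

Under these cut-offs, 29.2 and 29.6 fall at the top of the overweight band, which
matches the description of those participants as borderline obese.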

Figure 5.2: Accuracy against BMI

Figure 5.2 illustrates this significant disparity. Here participants are separated
into two groups, those with BMI greater and less than the US national median, age
and sex adjusted (F(1,12)=8.65, p=.013). Other factors such as age and sex, which may
be correlated to BMI in specific populations, might also exhibit a correlation with
classification accuracy. For example, in the participant pool, males yielded higher
classification accuracies than females, but this is an expected artifact of BMI
correlation in the sample, and probably not an effect of sex directly.

5.5 Example interfaces and interactions

Several prototype interfaces were conceived and built that demonstrated our
ability to appropriate the human body, in this case the arm, and use it as an interactive
surface. These interfaces can be seen in Figure 5.3. While the bio-acoustic input
modality is not strictly tethered to a particular output modality, the sensor form factors
explored could be readily coupled with visual output provided by an integrated pico-
projector.

There are two nice properties of wearing such a projection device on the arm
that permit one to sidestep many calibration issues. First, the arm is a relatively rigid
structure - the projector, when attached appropriately, will naturally track with the
arm. Second, since one has fine-grained control of the arm, making minute
adjustments to align the projected image with the arm is trivial (e.g., projected
horizontal stripes for alignment with the wrist and elbow).

Figure 5.3: The armband augmented with a pico-projector

To illustrate the utility of coupling projection and finger input on the body (as
researchers have proposed to do with projection and computer vision-based
techniques), three proof-of-concept projected interfaces built on top of the system’s
live input classification were developed.

In the presented interface, a numeric keypad is projected on a user’s palm and
allows them to tap on the palm to, e.g., dial a phone number (right). To emphasize the
output flexibility of the approach, bio-acoustic input was also coupled to audio output.
In this case, the user taps on preset locations on their forearm and hand to navigate and
interact with an audio interface.
