PAPERS

Virtual Sound Source Positioning Using Vector Base Amplitude Panning*

VILLE PULKKI, AES Member

Laboratory of Acoustics and Audio Signal Processing, Helsinki University of Technology, FIN-02015 HUT, Finland

A vector-based reformulation of amplitude panning is derived, which leads to simple and computationally efficient equations for virtual sound source positioning. Using the method, vector base amplitude panning (VBAP), it is possible to create two- or three-dimensional sound fields where any number of loudspeakers can be placed arbitrarily. The method produces virtual sound sources that are as sharp as is possible with the current loudspeaker configuration and amplitude panning methods. A digital tool that implements two- and three-dimensional VBAP with eight inputs and outputs has been realized.

* Manuscript received 1996 November 14; revised 1997 April 5.

0 INTRODUCTION

The acoustical sound field around us is very complex. Direct sounds, reflections, and refractions arrive at the listener's ears, and the listener then analyzes the incoming sounds and connects them mentally to sound sources. Spatial hearing is an important part of the cognition of the surrounding world.

The perception of the direction of the sound source relies heavily on the two main localization cues: interaural level difference (ILD) and interaural time difference (ITD) [1]. These frequency-dependent differences occur when the sound arrives at the listener's ears after having traveled paths of different lengths or being shadowed differently by the listener's head. In addition to ILD and ITD some other cues, such as spectral coloring, are used by humans in sound source localization.

Bringing a virtual three-dimensional sound field to a listening situation is one goal of the research in the audio reproduction field. The first recordings were monophonic; they created pointlike sound fields. A big step was two-channel stereophonic reproduction, with which the sound field was enlarged to a line between two loudspeakers. Two-channel stereophony is still the most used reproduction method in domestic and professional equipment.

Various attempts to enlarge the sound field have been proposed. Horizontal-only (pantophonic) sound fields have been created with various numbers of loudspeakers and with various systems of encoding and decoding and matrixing. A review of such systems is presented in [2].

In most systems the loudspeakers are situated in a two-dimensional (horizontal) plane. Some attempts to produce periphonic (full-sphere) sound fields with three-dimensional loudspeaker placement exist, such as holophony [3] or three-dimensional Ambisonics [4].

Periphonic sound fields can be produced in two-channel loudspeaker or headphone listening by filtering the sound material with digital models of the free-field transfer functions between the listener's ear canal and the desired place of the sound source [head-related transfer functions (HRTFs)] [5]. The spectral information of the direction of the sound source is thus added to the signal emanating from the loudspeakers. The system, however, has quite strict boundary conditions, which limit its use.

In most systems the positions of the loudspeakers are fixed. In the Ambisonics systems, the number and placement of the loudspeakers may be variable. However, the best possible localization accuracy is achieved with orthogonal loudspeaker placement. If the number of loudspeakers is greater, the accuracy is not improved appreciably.

A natural improvement would be a virtual sound source positioning system that would be independent of the loudspeaker arrangement and could produce virtual sound sources with maximum accuracy using the current loudspeaker configuration.

The vector base amplitude panning (VBAP) described in this paper, first introduced in [6], is a new approach to the problem. The approach enables the use of an unlimited number of loudspeakers in an arbitrary two- or three-dimensional placement around the listener. The loudspeakers are required to be nearly equidistant from

456 J. Audio Eng. Soc., Vol. 45, No. 6, 1997 June

© Audio Engineering Society, Inc. 1997



the listener, and the listening room is assumed to be not very reverberant. Multiple moving or stationary sounds can be positioned in any direction in the sound field spanned by the loudspeakers.

In VBAP the amplitude panning method is reformulated with vectors and vector bases. The reformulation leads to simple equations for amplitude panning, and the use of vectors makes the panning methods computationally efficient. Two- and three-dimensional VBAP methods are presented. Applications of the methods are discussed and some comparisons are made between the three-dimensional methods proposed in this paper and those presented elsewhere. A panning tool that is capable of producing sound fields with multiple moving virtual sources in two or three dimensions is also discussed.

1 TWO-DIMENSIONAL AMPLITUDE PANNING

Two-dimensional amplitude panning, also known as intensity panning, is the most popular panning method. The applications range from small domestic stereophonic amplifiers to professional mixers. The method is, however, an approximation of real-source localization.

In the simple amplitude panning method two loudspeakers radiate coherent signals, which may have different amplitudes. The listener perceives an illusion of a single auditory event (virtual sound source, phantom sound source), which can be placed on a two-dimensional sector defined by the locations of the loudspeakers and the listener by controlling the signal amplitudes of the loudspeakers. A typical loudspeaker configuration is illustrated in Fig. 1. Two loudspeakers are positioned symmetrically with respect to the median plane. Amplitudes of the signals are controlled with gain factors $g_1$ and $g_2$, respectively. The loudspeakers are typically positioned at $\varphi_0 = 30^\circ$ angles.

Fig. 1. Two-channel stereophonic configuration.

The direction of the virtual source is dependent on the relation of the amplitudes of the emanating signals. If the virtual source is moving and its loudness should be constant, the gain factors that control the channel levels have to be normalized. The sound power can be set to a constant value $C$, whereby the following approximation can be stated:

$$g_1^2 + g_2^2 = C. \tag{1}$$

Some other ways to normalize the gain factors are presented in [7]. The parameter $C > 0$ can be considered a volume control of the virtual source. The perception of the distance of the virtual source depends within some limits on $C$: the louder the sound, the closer it is located. To control the distance accurately, some psychoacoustical phenomena should be taken into account, and some other sound elements should be added, such as reflections and reverberation [1].

When the distance of the virtual source is left unattended, the virtual source can be placed on an arc between the loudspeakers, the radius of which is defined by the distance between the listener and the loudspeakers. The arc is called the active arc, as seen in Fig. 1. In the ideal panning process only the direction where the virtual source should appear is defined and the panning tool performs the gain factor calculation. In the next two subsections some different ways of calculating the factors will be presented.

1.1 Trigonometric Formulation

The directional perception of a virtual sound source produced by amplitude panning follows approximately the stereophonic law of sines originally proposed by Blumlein [8] and reformulated in phasor form by Bauer [9],

$$\frac{\sin\varphi}{\sin\varphi_0} = \frac{g_1 - g_2}{g_1 + g_2} \tag{2}$$

where $0^\circ < \varphi_0 < 90^\circ$, $-\varphi_0 \le \varphi \le \varphi_0$, and $g_1, g_2 \in [0, 1]$. In Eq. (2) $\varphi$ represents the angle between the x axis and the direction of the virtual source; $\pm\varphi_0$ is the angle between the x axis and the loudspeakers. This equation is valid if the listener's head is pointing directly forward. If the listener turns his or her head following the virtual source, the tangent law is more correct [10],

$$\frac{\tan\varphi}{\tan\varphi_0} = \frac{g_1 - g_2}{g_1 + g_2} \tag{3}$$

where $0^\circ < \varphi_0 < 90^\circ$, $-\varphi_0 \le \varphi \le \varphi_0$, and $g_1, g_2 \in [0, 1]$. Eqs. (2) and (3) have been calculated with the assumption that the incoming sound is different only in magnitude, which is valid for frequencies below 500-600 Hz. When keeping the sound power level constant, the gain factors can be solved using Eqs. (2) and (1) or using Eqs. (3) and (1). The slight difference between Eqs. (2) and (3) means that the rotation of the head causes small movements of the virtual sources. However, in subjective tests [11] it was shown that this effect is negligible.

Some kind of amplitude panning method is used in the Ambisonics encoding system [12]. In pantophonic Ambisonics the entire sound field is encoded to three channels using a modified amplitude panning method. Two of the channels, X and Y, contain the components of the sound on the x axis and the y axis, respectively. The third, W, contains a monophonic mix of the sound material. The signal to be stored on the channels is calculated by multiplying the input signal samples by the channel-specific gain factor. The gain factors $g_x$, $g_y$, and $g_w$ are formulated as

$$g_x = \cos\theta \tag{4}$$

$$g_y = \sin\theta \tag{5}$$

$$g_w = 0.707 \tag{6}$$

where $\theta$ is the azimuth angle of the virtual sound source, as illustrated in Fig. 2.

Fig. 2. Coordinate system of two-dimensional Ambisonics system. $\theta$: azimuth angle; $r$: distance of virtual source.

This method differs from the standard amplitude panning method in that the gain factors $g_x$ and $g_y$ may have negative values. The negative values imply that the signal is stored on the recorder in antiphase when compared with the monophonic mix in the W channel. When the encoded sound field is decoded, the antiphase signals on a channel are applied to the loudspeakers in a negative direction of the respective axis. The decoding stage is performed with matrixing equations [12], which are not discussed in this paper. In the equations some additions or subtractions are performed between the signal samples on the W channel and on the X and Y channels. Equations for various loudspeaker configurations can be formulated.

The absolute values of the gain factors used in two-dimensional Ambisonics satisfy the tangent law [Eq. (3)], which the reader may verify, for example, for values of $0^\circ < \theta < 90^\circ$, by setting $\theta = \varphi_0 + \varphi$, $\varphi_0 = 45^\circ$, $g_2 = g_x$, and $g_1 = g_y$, and by substituting Eqs. (4) and (5) into the relation $(g_y - g_x)/(g_y + g_x)$.

1.2 Vector Base Formulation

In the two-dimensional VBAP method presented in this section, the two-channel stereophonic loudspeaker configuration is reformulated as a two-dimensional vector base. The base is defined by unit-length vectors $\mathbf{l}_1 = [l_{11}\ l_{12}]^T$ and $\mathbf{l}_2 = [l_{21}\ l_{22}]^T$, which are pointing toward loudspeakers 1 and 2, respectively, as seen in Fig. 3. The superscript $T$ denotes the matrix transposition. The unit-length vector $\mathbf{p} = [p_1\ p_2]^T$, which points toward the virtual source, can be treated as a linear combination of loudspeaker vectors,

$$\mathbf{p} = g_1 \mathbf{l}_1 + g_2 \mathbf{l}_2. \tag{7}$$

Fig. 3. Stereophonic configuration formulated with vectors.

In Eq. (7) $g_1$ and $g_2$ are gain factors, which can be treated as nonnegative scalar variables. We may write the equation in matrix form,

$$\mathbf{p}^T = \mathbf{g}\mathbf{L}_{12} \tag{8}$$

where $\mathbf{g} = [g_1\ g_2]$ and $\mathbf{L}_{12} = [\mathbf{l}_1\ \mathbf{l}_2]^T$. This equation can be solved if $\mathbf{L}_{12}^{-1}$ exists,

$$\mathbf{g} = \mathbf{p}^T \mathbf{L}_{12}^{-1} = [p_1\ p_2] \begin{bmatrix} l_{11} & l_{12} \\ l_{21} & l_{22} \end{bmatrix}^{-1}. \tag{9}$$

The inverse matrix $\mathbf{L}_{12}^{-1}$ satisfies $\mathbf{L}_{12}\mathbf{L}_{12}^{-1} = \mathbf{I}$, where $\mathbf{I}$ is the identity matrix. $\mathbf{L}_{12}^{-1}$ exists when $\varphi_0 \ne 0^\circ$ and $\varphi_0 \ne 90^\circ$, both problem cases corresponding to quite uninteresting stereophonic loudspeaker placements. For such cases the one-dimensional VBAP can be formulated, which is not discussed here because of its triviality.

Gain factors $g_1$ and $g_2$ calculated using Eq. (9) satisfy the tangent law of Eq. (3), which is proved in the Appendix. When the loudspeaker base is orthogonal, $\varphi_0 = 45^\circ$, the gain factors are also equivalent to those calculated for the Ambisonics encoding system, with the exception that the gain factors in Ambisonics may have negative values. In such cases, however, the absolute values of the factors are equal.

When $\varphi_0 \ne 45^\circ$, the gain factors have to be normalized using the equation

$$\mathbf{g}^{\mathrm{scaled}} = \frac{\sqrt{C}\,\mathbf{g}}{\sqrt{g_1^2 + g_2^2}}. \tag{10}$$

Now gain factors $\mathbf{g}^{\mathrm{scaled}}$ satisfy Eq. (1).

1.3 Two-Dimensional VBAP for More Than Two Loudspeakers

In many existing audio systems there are more than two loudspeakers in the horizontal plane, such as in Dolby Surround systems. Such systems can also be reformulated with vector bases. A set of loudspeaker pairs is selected from the system, and the signal is applied at
any one time to only one pair. Thus the loudspeaker system consists of many vector bases competing among themselves. Each loudspeaker may belong to two pairs. In Fig. 4 a loudspeaker system in which the two-dimensional VBAP can be applied is illustrated. A system for virtual source positioning, which similarly uses only two loudspeakers at any one time, has been implemented in an existing theater [13].

Fig. 4. Two-dimensional VBAP with five loudspeakers. Vectors $\mathbf{l}_n$ point to loudspeakers; loudspeaker vector bases $\mathbf{L}_{nm}$ are formed from adjacent loudspeakers.

The virtual source can be produced by the loudspeaker base on the active arc of which the virtual source is located. Thus the sound field that can be produced with VBAP is a union of the active arcs of the available loudspeaker bases. In two-dimensional cases the best way of choosing the loudspeaker bases is to let the adjacent loudspeakers form them. In the loudspeaker system illustrated in Fig. 4 the selected bases would be $\mathbf{L}_{12}$, $\mathbf{L}_{23}$, $\mathbf{L}_{34}$, $\mathbf{L}_{45}$, and $\mathbf{L}_{51}$. The active arcs of the bases are thus nonoverlapping.

The use of nonoverlapping active arcs provides continuously changing gain factors when moving virtual sources are applied. When the sound moves from one pair to another, the gain factor of the loudspeaker that is not used after the change gradually becomes zero before the change-over point. See Section 6.1 for a demonstration of this effect in a three-dimensional case.

The fact that all other loudspeakers except the selected pair are idle may seem a waste of resources. In this way, however, good localization accuracies can be achieved for the principal sound, whereas the other loudspeakers may produce reflections and reverberation as well as other sound elements.

1.4 Implementing Two-Dimensional VBAP for More Than Two Loudspeakers

A digital panning tool that performs the panning process is now considered. Sufficient hardware consists of a signal processor that can perform input and output with multiple analog-to-digital (A/D) and digital-to-analog (D/A) converters and has enough processing power for the computation needed. The tool also has to include a user interface.

When the tool is initialized, the directions of the loudspeakers are measured relative to the best listening position and loudspeaker pairs are formed from adjacent loudspeakers. $\mathbf{L}_{nm}^{-1}$ matrices are calculated for each pair and stored in the memory of the panning system.

During run time the system performs the following steps in an infinite loop:

· New direction vectors $\mathbf{p}_{(1,\ldots,n)}$ are defined.
· The right pairs are selected.
· The new gain factors are calculated.
· The old gain factors are cross faded to new ones and the loudspeaker bases are changed if necessary.

The pair can be selected by calculating unscaled gain factors with Eq. (9) using all selected vector bases, and by selecting the base that does not produce any negative factors. In practice it is recommended to choose the pair with the highest smallest factor, because a lack of numerical accuracy during calculation may produce slightly negative gain factors in some cases. The negative factor must be set to zero before normalization.

A digital panning tool that can be used to position sounds on the horizontal plane with a variable number of loudspeakers has been constructed and will be discussed in Section 4.

2 THREE-DIMENSIONAL AMPLITUDE PANNING

The typical two-channel stereophonic listening configuration is extended with a third loudspeaker placed in an arbitrary position at the same distance from the listener as the other loudspeakers. However, the loudspeaker should not be placed on the two-dimensional plane defined by the listener and the two other loudspeakers. The virtual source can now appear within a triangle formed by the loudspeakers when viewed from the listener's position, as illustrated in Fig. 5. The term three-dimensional amplitude panning denotes a method for positioning a virtual sound source into a triangle formed by three sound sources, which are driven by coherent electrical signals with different amplitudes.

Now the relation of three gain factors defines the virtual source direction perceived by the listener. Eq. (1) can be generalized into a three-dimensional form as

$$g_1^2 + g_2^2 + g_3^2 = C. \tag{11}$$

The virtual source can thus be placed on the surface of the three-dimensional sphere, the radius of which is defined by the distance between the listener and the loudspeakers. The region on the surface of the sphere onto which the virtual source can be positioned is called the active triangle.
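Before the three-dimensional formulations are examined, the two-dimensional procedure of Sections 1.2 to 1.4 can be summarized as a short program. The following is only a minimal sketch under stated assumptions, not the implementation of the tool described in Section 4: the function names (`unit_vector`, `pair_gains`, `vbap_2d`) and the loudspeaker azimuths in the usage lines are hypothetical, and Eq. (9) is solved here by Cramer's rule for each pair instead of using inverse matrices precomputed and stored at initialization.

```python
import math


def unit_vector(azimuth_deg):
    """Unit vector (x, y) pointing at the given azimuth, measured from the x axis."""
    a = math.radians(azimuth_deg)
    return (math.cos(a), math.sin(a))


def pair_gains(p, l_n, l_m):
    """Unscaled gains of Eq. (9): solve p = g_n * l_n + g_m * l_m by Cramer's rule."""
    det = l_n[0] * l_m[1] - l_n[1] * l_m[0]
    if abs(det) < 1e-9:
        raise ValueError("loudspeaker vectors do not form a base")
    g_n = (p[0] * l_m[1] - p[1] * l_m[0]) / det
    g_m = (l_n[0] * p[1] - l_n[1] * p[0]) / det
    return g_n, g_m


def vbap_2d(source_deg, speaker_degs, C=1.0):
    """Return one gain per loudspeaker; only the selected pair is nonzero."""
    p = unit_vector(source_deg)
    vectors = [unit_vector(a) for a in speaker_degs]
    # Form the bases from adjacent loudspeakers (Section 1.3).
    order = sorted(range(len(vectors)), key=lambda i: speaker_degs[i] % 360.0)
    best = None
    for k, n in enumerate(order):
        m = order[(k + 1) % len(order)]
        try:
            g = pair_gains(p, vectors[n], vectors[m])
        except ValueError:
            continue  # degenerate pair (0 or 180 degrees apart)
        # Keep the pair whose smallest factor is highest (Section 1.4).
        if best is None or min(g) > min(best[2]):
            best = (n, m, g)
    if best is None:
        raise ValueError("no usable loudspeaker pair")
    n, m, g = best
    g = [max(x, 0.0) for x in g]  # set negative factors to zero before scaling
    norm = math.sqrt(g[0] ** 2 + g[1] ** 2)
    if norm == 0.0:
        raise ValueError("source lies outside every active arc")
    gains = [0.0] * len(vectors)
    gains[n] = math.sqrt(C) * g[0] / norm  # Eq. (10)
    gains[m] = math.sqrt(C) * g[1] / norm
    return gains


# Hypothetical five-loudspeaker layout (azimuths in degrees, assumed for illustration).
layout = [30.0, -30.0, 110.0, -110.0, 180.0]
for direction_deg in (0.0, 10.0, 110.0):
    print(direction_deg, [round(x, 3) for x in vbap_2d(direction_deg, layout)])
```

For a source between the loudspeakers at +30° and -30° the two nonzero gains satisfy the tangent law of Eq. (3), and their squares sum to C as required by Eq. (1). Choosing the pair by its smallest factor, rather than testing for strictly nonnegative factors, follows the recommendation given in Section 1.4.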


2.1 On the Trigonometric Formulation

The general trigonometric formulation of three-dimensional amplitude panning for arbitrary loudspeaker placement remains unexplored. Spherical trigonometry is complicated and computationally inefficient.

The periphonic Ambisonics system [4], [12] encodes the three-dimensional sound field to four channels. Channels X, Y, and Z correspond to the axes of the Cartesian coordinate system, and channel W contains a monophonic mix of the input material.

The formulation is analogous to the two-dimensional formulation mentioned in Section 1. The channel-specific gain factors $g_x$, $g_y$, $g_z$, and $g_w$ are calculated as

$$g_x = \cos\theta \cos\gamma \tag{12}$$

$$g_y = \sin\theta \cos\gamma \tag{13}$$

$$g_z = \sin\gamma \tag{14}$$

$$g_w = 0.707 \tag{15}$$

where $\gamma$ is the elevation angle and $\theta$ the azimuth angle, as presented in Fig. 6. The gain factor calculation is similar to a coordinate transformation between spherical coordinates and Cartesian coordinates. The gain factors are thus Cartesian coordinates of the unit-length vector pointing in the direction $(\theta, \gamma)$ in the defined spherical coordinate system. The decoding stage is not discussed in this paper.

Fig. 6. Coordinate system used in three-dimensional Ambisonics encoding and decoding. $\theta$: azimuth angle; $\gamma$: elevation angle; $r$: distance of virtual source.

2.2 Vector Base Formulation

The two-dimensional VBAP method may now be generalized to the three-dimensional VBAP method. Let the loudspeakers be positioned on the surface of a three-dimensional unit sphere, equidistant from the listener. The three-dimensional unit vector $\mathbf{l}_1 = [l_{11}\ l_{12}\ l_{13}]^T$, the origin of which is the center of the sphere, points to the direction of loudspeaker 1. The unit vectors $\mathbf{l}_1$, $\mathbf{l}_2$, and $\mathbf{l}_3$ then define the directions of loudspeakers 1, 2, and 3, respectively. The direction of the virtual sound source is defined as a three-dimensional unit vector $\mathbf{p} = [p_1\ p_2\ p_3]^T$. A sample configuration is presented in Fig. 5.

Fig. 5. Sample configuration for three-dimensional amplitude panning. Loudspeakers form a triangle into which the virtual source can be placed.

We express the virtual source vector $\mathbf{p}$ as a linear combination of three loudspeaker vectors $\mathbf{l}_1$, $\mathbf{l}_2$, and $\mathbf{l}_3$, analogously to the two-dimensional case, and express it in matrix form,

$$\mathbf{p} = g_1 \mathbf{l}_1 + g_2 \mathbf{l}_2 + g_3 \mathbf{l}_3 \tag{16}$$

$$\mathbf{p}^T = \mathbf{g}\mathbf{L}_{123}. \tag{17}$$

Here $g_1$, $g_2$, and $g_3$ are gain factors, $\mathbf{g} = [g_1\ g_2\ g_3]$, and $\mathbf{L}_{123} = [\mathbf{l}_1\ \mathbf{l}_2\ \mathbf{l}_3]^T$. Vector $\mathbf{g}$ can be solved,

$$\mathbf{g} = \mathbf{p}^T \mathbf{L}_{123}^{-1} = [p_1\ p_2\ p_3] \begin{bmatrix} l_{11} & l_{12} & l_{13} \\ l_{21} & l_{22} & l_{23} \\ l_{31} & l_{32} & l_{33} \end{bmatrix}^{-1} \tag{18}$$

if $\mathbf{L}_{123}^{-1}$ exists, which is true if the vector base defined by $\mathbf{L}_{123}$ spans a three-dimensional space. Eq. (18) makes a projection of vector $\mathbf{p}$ to a vector base defined by $\mathbf{L}_{123}$ in a similar way as in the two-dimensional case. The components of vector $\mathbf{g}$ can be used as gain factors after scaling, which is given by

$$\mathbf{g}^{\mathrm{scaled}} = \frac{\sqrt{C}\,\mathbf{g}}{\sqrt{g_1^2 + g_2^2 + g_3^2}}. \tag{19}$$

When the three loudspeakers are placed in an orthogonal grid, the gain factors calculated with the three-dimensional VBAP are equivalent to the absolute values of gain factors calculated in the three-dimensional Ambisonics system. This is easily proved. From the orthogonality of the loudspeaker vector base we see that $\mathbf{L}_{123} = \mathbf{I} = \mathbf{L}_{123}^{-1}$. Using Eq. (18) we see directly that $\mathbf{g} = \mathbf{p}^T$. The gain factors are thus the Cartesian coordinates of the head of the virtual source direction vector $\mathbf{p}$, similarly as in the three-dimensional Ambisonics system.

2.3 Three-Dimensional VBAP for More Than Three Loudspeakers

The three-dimensional VBAP can be applied to systems that consist of more than three loudspeakers in an arbitrary three-dimensional placement. The formulation of such a system is very much the same as in the two-dimensional case presented in Section 1.3. Some differences exist, however. The number of loudspeakers in a base is obviously three, and each loudspeaker can belong to several bases. The active triangles of bases should not be intersecting, and they should be selected so that maximum localization accuracy in each direction is provided. A sample configuration with five loudspeakers is illustrated in Fig. 7. In this case the selected loudspeaker bases are $\mathbf{L}_{145}$, $\mathbf{L}_{345}$, and $\mathbf{L}_{235}$.

Fig. 7. Three-dimensional VBAP with five loudspeakers. Vectors $\mathbf{l}_n$ point to loudspeakers; selected loudspeaker bases $\mathbf{L}_{ijk}$ are shown by dashed lines.

A digital panning tool that is able to select the loudspeaker triplet and to calculate the gain factors can be constructed as in the two-dimensional case. The tool demands a little more computing power than the two-dimensional panning tool, but it can still be implemented easily with a modern floating-point signal processor. The tool is initialized in a similar way and runs similarly to the two-dimensional case. Selection of the triplet is performed as in the two-dimensional case. See Section 1.4 for details. A digital panning tool with eight input and output channels has been constructed [6] and will be discussed in Section 4.

3 SOME FEATURES OF VBAP

In VBAP, as in all amplitude panning methods, the virtual source cannot be positioned outside the active arc or region. This holds even if the listener is in an arbitrary position. Thus the maximum error in the virtual source localization is proportional to the dimensions of the active region. Therefore when good localization accuracies on a large listening area are desired, the dimensions of the active regions must be decreased. This is done by applying more loudspeakers on the desired region of the sound field, such as around and behind the screen in movie theaters.

The decreasing sizes of the triangles also permit differing distances between a loudspeaker and the listener (loudspeaker distances). Since the coherent signal is applied to only three loudspeakers at one time, only the difference between the loudspeaker distances of the three particular loudspeakers adds error to the perceived direction of the virtual sound source. The loudspeaker distances can be much greater in one end of the listening room if there are enough loudspeakers in between to ensure smooth enough changes in distances. The different loudspeaker distances can be compensated by time shifting and gaining the loudspeaker signals, which enables even freer loudspeaker placement.

VBAP has three important properties:

1) If the virtual source is located in the same direction as any of the loudspeakers, the signal emanates only from that particular loudspeaker, which provides maximum sharpness of the virtual source.
2) If the virtual source is located on a line connecting two loudspeakers, the sound is applied only to that pair, following the tangent law. The gain factor of the third loudspeaker is zero.
3) If the virtual source is located at the center of the active triangle, the gain factors of the loudspeakers are equal.

These properties imply that VBAP produces virtual sound sources that are as sharp as is possible with present loudspeaker configurations.

4 DSP TOOL FOR TWO- AND THREE-DIMENSIONAL VBAP

4.1 System Overview

A tool for two- and three-dimensional VBAP has been constructed [6], [14]. The tool consists of eight A/D and eight D/A converters and two Loughborough Sound Images MDC40S modules which adhere to the Texas Instruments TIM-40 specification [15], each module having a TMS320C40 (C40) processor. The system has a Macintosh computer as a host. A single C40 is used in the VBAP implementation; the other one is reserved for future extensions. An overview of the system is presented in Fig. 8.

The sample rate of the digital signal processing (DSP) tool is currently 32 kHz, but higher rates (44.1 and 48 kHz) are also supported. The number of loudspeakers can range from two or three to eight, and they can be located anywhere on the edge of a two-dimensional circle or on the surface of a three-dimensional sphere. The maximum number of input channels is eight. Each of them has a virtual source direction vector of its own.

The software implementation was carried out using the QuickSig and QuickC30 DSP programming environments [16]. QuickSig and QuickC30 are based on Common Lisp and its object-oriented extension CLOS, and they support low-level TMS320C3x assembly language with high-level object-based programming.
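The per-source computation that such a tool has to carry out, namely Eqs. (18) and (19) together with the triplet selection of Section 2.3, can be sketched in a few lines. This is an illustrative sketch only and not the QuickSig/QuickC30 code of the tool: the function names, the four loudspeaker directions, and the two triplets are hypothetical, Eq. (18) is solved by Cramer's rule rather than with stored inverse matrices, and the "center" of the active triangle is taken here to be the normalized sum of the three loudspeaker vectors.

```python
import math


def direction(azimuth_deg, elevation_deg):
    """Unit vector for (azimuth, elevation), in the form of Eqs. (12) to (14)."""
    az, el = math.radians(azimuth_deg), math.radians(elevation_deg)
    return (math.cos(az) * math.cos(el), math.sin(az) * math.cos(el), math.sin(el))


def normalized_sum(*vectors):
    """Unit vector pointing along the sum of the given vectors."""
    s = [sum(v[i] for v in vectors) for i in range(3)]
    n = math.sqrt(sum(x * x for x in s))
    return tuple(x / n for x in s)


def det3(a, b, c):
    """Determinant of the 3x3 matrix with rows a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))


def triplet_gains(p, l1, l2, l3):
    """Unscaled gains of Eq. (18): solve p = g1*l1 + g2*l2 + g3*l3."""
    d = det3(l1, l2, l3)
    if abs(d) < 1e-9:
        raise ValueError("loudspeakers do not span a three-dimensional space")
    return (det3(p, l2, l3) / d, det3(l1, p, l3) / d, det3(l1, l2, p) / d)


def vbap_3d(p, speakers, triplets, C=1.0):
    """Return one gain per loudspeaker; only the selected triplet is nonzero."""
    best = None
    for triplet in triplets:
        g = triplet_gains(p, *(speakers[i] for i in triplet))
        # Keep the triplet whose smallest factor is highest (as in Section 1.4).
        if best is None or min(g) > min(best[1]):
            best = (triplet, g)
    triplet, g = best
    g = [max(x, 0.0) for x in g]  # set negative factors to zero before scaling
    norm = math.sqrt(sum(x * x for x in g))
    if norm == 0.0:
        raise ValueError("source lies outside every active triangle")
    gains = [0.0] * len(speakers)
    for i, x in zip(triplet, g):
        gains[i] = math.sqrt(C) * x / norm  # Eq. (19)
    return gains


# Hypothetical layout: (azimuth, elevation) in degrees, two adjacent triangles.
speakers = [direction(-30, 0), direction(30, 0), direction(0, 40), direction(60, 40)]
triplets = [(0, 1, 2), (1, 2, 3)]
for p in (speakers[3],                                        # property 1
          normalized_sum(speakers[1], speakers[2]),           # property 2
          normalized_sum(speakers[0], speakers[1], speakers[2])):  # property 3
    print([round(x, 3) for x in vbap_3d(p, speakers, triplets)])
```

The three printed gain sets correspond to the three properties listed in Section 3: a source in the direction of a loudspeaker drives that loudspeaker alone, a source midway between two loudspeakers drives only that pair with equal gains, and a source at the center of a triangle drives its three loudspeakers equally.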


Fig. 8. Overview of hardware running two- or three-dimensional VBAP. [Block diagram: eight A/D converters, TMS320C40 processor, eight D/A converters, Macintosh host.]

4.2 VBAP Implementation

Each input channel is panned into two or three output channels. Panning is performed additively: when multiple input channels share the same output channels, the signal values are added together.

The tool has two levels of interpolation for virtual source direction movement. The user may update the direction vectors $\mathbf{p}_{1,\ldots,8}$ approximately once per second. The panning tool calculates, for example, 50 vectors $\mathbf{p}_{1,\ldots,8}(1,\ldots,50)$ between new and old direction vectors. With each interpolating direction vector set $\mathbf{p}_{1,\ldots,8}(n)$ new loudspeaker triplets are selected and new gain factors are calculated using the VBAP method. The gain factor calculation is carried out at all eight input channels during approximately 32 sample intervals.

The previous gain factors $\mathbf{g}_{1,\ldots,8}(n - 1)$ are cross faded linearly to the newly calculated factors $\mathbf{g}_{1,\ldots,8}(n)$. One interpolation is completed with equal steps during approximately 100 sample intervals. All eight gain factor triplets are cross faded simultaneously. The gain factors do not exactly satisfy Eq. (11) during fading. However, when the angle between starting-point and end-point direction vectors is small ($\approx 1^\circ$), no disturbing effects can be heard.

If the movements of virtual sound sources are designed beforehand, the virtual source direction vectors $\mathbf{p}_{1,\ldots,8}$ are then calculated for each desired arrangement of virtual source directions and stored in the signal processor's memory. Movements can also be controlled in real time from the host computer. In such cases the direction vectors $\mathbf{p}_{1,\ldots,8}$ are written in the signal processor's memory during run time. This enables three-dimensional live sound positioning. Fig. 9 illustrates a possible system configuration to be used with the three-dimensional panning tool.

Fig. 9. Possible use of three-dimensional VBAP panning tool. Number of sound sources can vary up to eight; loudspeaker placement is arbitrary; virtual sources may be moving or stationary.

5 USING VBAP

5.1 Off-Line Panning

An off-line panned sound field is produced for a fixed loudspeaker configuration and stored in a multichannel audio recording device. Off-line panning can be used if the loudspeaker placement is fixed, such as any surround system or a hypothetical three-dimensional loudspeaker placement standard for cinemas and theaters. In off-line panning each channel contains the signal to be radiated from the respective loudspeaker. No special hardware is needed in the listening phase. The three-dimensional sound field thus produced could have a large number of virtual sources, which could all be moving or stationary.

Off-line panning can also be performed asynchronously. The output channels can be stored in a multichannel audio file in a computer. The file can be stored afterward in a multichannel recorder. The VBAP method could be a part of existing software, such as CSound [17] or ProTools [18].

5.2 On-Line Panning

If the loudspeaker configuration is known to be variable, or if the sound signal is to be placed in the sound field during the reproduction phase, the material can be stored in a multichannel recorder without panning. During the reproduction phase a digital panning tool initialized with the current loudspeaker placement should perform the panning process following the user's commands or the information stored in the recorder.

Hybrid systems can be constructed as well. Some part of the material can be panned off line to a few channels forming a sound field, which can be panned on line to the position desired in the sound field. In movies, for example, the sound track could be panned off line to two or three channels to be placed in a stationary position in the sound field, whereas lines and effects could be in separate tracks, which could be positioned on line anywhere in the sound field.

In theaters the on-line panning with VBAP would be very useful. The system would position the virtual sound source at the spot where the instrument or the singer is located on the stage. The places of the microphones
could be tracked automatically, which could enable the positioning of multiple virtual sound sources at the same time.

VBAP could also be applied in virtual reality, multimedia, computer games, on-stage monitoring, and artificial reverberation, to name a few.

5.3 Three-Dimensionally Panned Piece of Computer Music

A piece of computer music panned with the three-dimensional panning tool has been performed in a public concert in an existing concert hall, the Chamber Music Hall at the Sibelius Academy in Helsinki, Finland. In the hall there are 96 loudspeakers, grouped as 32 channels [19]. The 32 channels were further grouped into eight channels. A virtual source was created with each of the six channels of the piece. The virtual sources were moving in the sound field as the composer desired. The panning was completed with the three-dimensional panning tool discussed in Section 4. The processed material was stored on an eight-channel Alesis ADAT digital audio tape.

6 SOME EXPERIMENTS WITH VBAP

6.1 Gain Factor Monitoring of a Moving Virtual Source

A test run for monitoring the values of gain factors during the virtual source movement was completed with the three-dimensional panning tool described in Section 4. The loudspeaker configuration is illustrated in Fig. 10. Fig. 11 shows the changes in gain factors during the virtual source movement. The virtual source was moved along the solid line from loudspeaker 3 through points A, B, and C to loudspeaker 4. At each point marked with a letter the gain factors were monitored twice.

In the beginning of the test, the user commanded the computer to position the virtual source at the point (0, 50) (azimuth and elevation angles), which was the location where loudspeaker 3 (L3) was situated. The gain factor g3 was adjusted to the value 1.0 by the tool, while the others were set to 0.0, as expected. Next the user prompted 10 points on the arc between loudspeakers 2 and 3 with equal steps, ending at point A. From Fig. 11 we can see that g3 is decreased while g2 is increased, and g1 remains zero.

From point A the source was moved similarly to point B, which was situated in the center of the triangle. The tool changes gain factor values smoothly to nearly equal values, g1 = g2 = g3 = 0.578, which is as stated by the theory. From point B the virtual source was moved to point C, which was in the middle of the arc between loudspeakers 2 and 3. Gain factors obtained values of g2 = g3 = 0.707 and g1 = 0.0. From point C the virtual source was moved to the point defined by loudspeaker 4 (60, 40) in 20 equal-sized steps. Immediately after point C the tool changed the loudspeaker triplet to which the signal was panned from triplet 123 to triplet 234. In Fig. 11 it can be seen that g4 increases smoothly to the value 1.0, whereas g3 and g2 decrease to the value 0.0. The test thus showed that the properties of VBAP stated in Section 3 can be achieved with a digital tool.

[Figure labels: L1 (0,0), L2 (60,0), L4 (60,40), point B]
Fig. 10. Loudspeaker placement and virtual source movement in gain factor monitoring test. Dashed lines are boundaries of active triangles of loudspeaker bases. Virtual source movement is marked by solid line.

6.2 Tests with Extreme Loudspeaker Triangles

Some tests were run with different sized loudspeaker triangles using the three-dimensional panning tool described in Section 4. In the tests the virtual source was positioned at a few locations on the triangle, the geometry of which was varied. The gain factors were monitored. The first tests were conducted to see how small triangles could be used. A tested triangle had 5° sides;

[Plot annotations: g3 = 1.0; g1 = g2 = g3 = 0.578; g3 = g2 = 0.707]
Fig. 11. Gain factor magnitudes during virtual source movement. Gain factors are numbered according to loudspeaker numbering in Fig. 10. Note that g1 changes to g4 at point C.
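
The gain values annotated in Fig. 11 can be checked with a short calculation. The following Python sketch is not the panning tool of Section 4; it is a minimal illustration under stated assumptions. It assumes loudspeaker directions L1 = (0, 0), L2 = (60, 0), and L3 = (0, 50) in (azimuth, elevation) degrees, solves p = g1 l1 + g2 l2 + g3 l3 by Cramer's rule, and scales the gains so that g1^2 + g2^2 + g3^2 = 1. The function names, the angle-to-vector convention, and the use of the normalized vertex sum as the "center" of the triangle are all hypothetical choices made for this example.

```python
import math


def unit_vector(azimuth_deg, elevation_deg):
    """Unit direction vector for given angles (assumed convention: x points to azimuth 0)."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    return (math.cos(el) * math.cos(az), math.cos(el) * math.sin(az), math.sin(el))


def det3(a, b, c):
    """Determinant of the 3x3 matrix whose rows are a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))


def vbap_gains(p, l1, l2, l3):
    """Solve p = g1*l1 + g2*l2 + g3*l3 (Cramer's rule), then scale to unit power."""
    d = det3(l1, l2, l3)
    g = (det3(p, l2, l3) / d, det3(l1, p, l3) / d, det3(l1, l2, p) / d)
    norm = math.sqrt(sum(x * x for x in g))
    return tuple(x / norm for x in g)


def normalized_sum(*vectors):
    """Unit vector pointing along the sum of the given vectors."""
    s = [sum(components) for components in zip(*vectors)]
    n = math.sqrt(sum(x * x for x in s))
    return tuple(x / n for x in s)


# Assumed loudspeaker directions of triplet 123 (azimuth, elevation in degrees).
L1 = unit_vector(0, 0)
L2 = unit_vector(60, 0)
L3 = unit_vector(0, 50)

gains_at_l3 = vbap_gains(L3, L1, L2, L3)                          # source at loudspeaker 3
gains_at_c = vbap_gains(normalized_sum(L2, L3), L1, L2, L3)       # middle of arc 2-3
gains_at_b = vbap_gains(normalized_sum(L1, L2, L3), L1, L2, L3)   # assumed triangle center

for name, g in (("L3", gains_at_l3), ("C", gains_at_c), ("B", gains_at_b)):
    print(name, [round(x, 3) for x in g])
```

Under these assumptions a source in the direction of L3 yields gains close to (0, 0, 1), the midpoint of the arc between L2 and L3 yields g2 = g3 of about 0.707 with g1 near zero, and the assumed triangle center yields three equal gains of 1/sqrt(3), about 0.577, which is close to the 0.578 reported for point B.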


the error of the gain factors calculated was found to be negligible. In the second test a triangle that also hardly spanned a three-dimensional space was configured. The sides of the triangle were 5°, 175°, and 175°. Still the gain factors had the expected values. Thus it can be stated that if the three-dimensional VBAP tool has 32-bit floating-point calculation accuracy, it does not set limitations on the placement of the loudspeakers.

7 COMPARING THREE-DIMENSIONAL VBAP WITH EXISTING THREE-DIMENSIONAL PANNING METHODS

In this section the three-dimensional VBAP is compared with the Ambisonics system, with very large arrays of loudspeakers, and with HRTF-based systems. The audio control equipment and the software of the Level Control Systems (LCS) [20] cannot be compared with the VBAP method because there is not enough information available. However, there seem to exist some similarities between VBAP and LCS.

7.1 Ambisonics

The Ambisonics surround sound system is a large system for storing and reproducing two- and three-dimensional sound fields [4], [12]. The main drawback of the Ambisonics system is that the sound-storing format best supports loudspeakers placed on the axes of the Cartesian coordinate system, as seen in Fig. 6. The number of loudspeakers can be increased. However, in such cases the number of loudspeakers radiating a coherent signal increases, which does not improve the quality of the virtual sources.

In VBAP each selected loudspeaker triplet is a coordinate system of its own. The gain factor calculation in the VBAP method equals that of the Ambisonics in an orthogonal loudspeaker placement. VBAP thus generalizes the gain factor calculation of Ambisonics to nonorthogonal situations, which provides great flexibility in loudspeaker placement with maximum accuracy. A drawback of the VBAP method is that in many cases it needs more than four channels for sound storage.

7.2 Very Large Arrays of Loudspeakers

In very large arrays of loudspeakers (VLALs) the sound is always emanating from one of many loudspeakers. These types of systems are used mainly for scientific purposes, as in [21]. VBAP can be considered an assistant system to a VLAL, since it does not weaken the localization quality achieved with the array system. If the direction vector of the virtual source is equal to the direction vector of any of the loudspeakers, the sound is emanating only from it, as stated in Section 3. Thus the system gives the same localization accuracy as the VLAL, but the sound can also be placed and moved between loudspeakers.

7.3 HRTF-Based Systems

The use of head-related transfer functions (HRTFs) in producing three-dimensional sound fields for loudspeaker reproduction has been explored [5]. When filtering signals using digital models of HRTFs it is possible to create illusions of directions of virtual sound sources for the listener in stereophonic headphone or loudspeaker listening.

Using HRTF-based systems a three-dimensional sound field can be produced with only two loudspeakers, which permits reproduction in common domestic stereophonic equipment. VBAP is a method for panning virtual sources to any number of loudspeakers in an arbitrary placement. The comparison of the qualities of the virtual sources produced with HRTFs and VBAP is an interesting topic, which can be studied after listening tests on VBAP have been conducted.

The filters that model HRTFs are computationally quite expensive. In some cases the filtering of the sound material requires approximately 50-150 multiplications and additions per sample of a virtual source. There exist some simplified models [22], and by using them the computing requirement can be decreased to as few as 12 multiplications and additions per virtual source sample. In three-dimensional VBAP only three multiplications are required per virtual source sample.

8 CONCLUSIONS

New methods for virtual source positioning in two- and three-dimensional sound fields have been introduced. The two-dimensional amplitude panning has been reformulated with vectors and vector bases to a two-dimensional VBAP method. The two-dimensional VBAP method follows a traditional panning method. It has been generalized in this paper to a three-dimensional VBAP method. The three-dimensional VBAP is a general method for virtual source positioning in a three-dimensional sound field formed by loudspeakers in an arbitrary three-dimensional placement.

VBAP is computationally efficient and accurate. The loudspeakers may be in arbitrary two- or three-dimensional positioning. VBAP gives the maximum localization sharpness that can be achieved with amplitude panning since it uses at any one time the minimum number of loudspeakers needed: one, two, or three. The number of virtual sound sources or loudspeakers is not limited by the method.

By using VBAP it is possible to make recordings for any loudspeaker configuration or create recordings that are independent of loudspeaker placement. Multiple virtual sources can be positioned in two- or three-dimensional sound fields, even with very complex loudspeaker configurations.

VBAP is more flexible than the Ambisonics sound reproduction system because of the free loudspeaker placement. VBAP is a computationally more efficient three-dimensional sound positioning method than are HRTF-based methods.

A digital tool has been constructed which can position eight virtual sources in a two- or three-dimensional sound field formed by eight arbitrarily placed loudspeakers. The tool is also able to move virtual sources in the sound field independently of each other. A three-dimensionally panned piece of computer music has been performed successfully in an existing concert hall using the proposed technique.

9 ACKNOWLEDGMENT

The author would like to thank Professor Matti Karjalainen, Vesa Välimäki, Toomas Altosaar, Jyri Huopaniemi, Samuli Siltanen, and the personnel of the Acoustics Laboratory at Helsinki University of Technology for support in the project. Pauli Laine and Andrew Bentley at the Sibelius Academy Computer Music Studio and Juhani Borenius at the Finnish Broadcasting Company Ltd. have also given valuable support. The author would also like to thank Professor Teuvo Kohonen at Helsinki University of Technology. The VBAP theory presented in this paper uses partly the same way of thinking as his theory of the self-organizing map [23]. The author's wife, Sari Löytymäki, also deserves warm thanks. This project has been partially financed by the Sibelius Academy, Helsinki, Finland.

10 REFERENCES

[1] J. Blauert, Spatial Hearing (MIT Press, Cambridge, MA, 1983).
[2] G. Steinke, "Surround Sound--The New Phase: An Overview," presented at the 100th Convention of the Audio Engineering Society, J. Audio Eng. Soc. (Abstracts), vol. 44, p. 651 (1996 July/Aug.), preprint 4286.
[3] R. Condamines, "Le relief sonore tétraphonique," Rev. d'Acoustique, pp. 277-283 (1972).
[4] M. A. Gerzon, "Periphony: With-Height Sound Reproduction," J. Audio Eng. Soc., vol. 21, pp. 2-10 (1973 Jan./Feb.).
[5] H. Møller, "Fundamentals of Binaural Technology," Appl. Acoust., vol. 36, pp. 171-218 (1992).
[6] V. Pulkki, J. Huopaniemi, and T. Huotilainen, "DSP Tool for 8-Channel Audio Mixing," in Proc. Nordic Acoustical Meeting 96 (1996 June), pp. 307-314.
[7] F. R. Moore, Elements of Computer Music (Prentice-Hall, Englewood Cliffs, NJ, 1990).
[8] A. D. Blumlein, U.K. patent 394,325, 1931. Reprinted in Stereophonic Techniques (Audio Engineering Society, New York, 1986).
[9] B. B. Bauer, "Phasor Analysis of Some Stereophonic Phenomena," J. Acoust. Soc. Am., vol. 33, pp. 1536-1539 (1961 Nov.).
[10] B. Bernfeld, "Attempts for Better Understanding of the Directional Stereophonic Listening Mechanism," presented at the 44th Convention of the Audio Engineering Society, J. Audio Eng. Soc. (Abstracts), vol. 21, p. 308 (1973 May).
[11] D. M. Leakey, "Some Measurements on the Effect of Interchannel Intensity and Time Difference in Two Channel Sound Systems," J. Acoust. Soc. Am., vol. 31, pp. 977-986 (1959).
[12] D. G. Malham and A. Myatt, "3-D Sound Spatialization Using Ambisonic Techniques," Computer Music J., vol. 19, no. 4, pp. 58-70 (1995).
[13] J. Borenius, "Moving Sound Image in the Theaters," J. Audio Eng. Soc., vol. 25, pp. 200-203 (1977 Apr.).
[14] V. Pulkki, J. Huopaniemi, T. Huotilainen, and M. Karjalainen, "DSP Approach to Multichannel Audio Mixing," in Proc. Int. Computer Music Conf. ICMC-96 (1996 Aug.), pp. 93-96.
[15] R. Weir, "TIM-40-TMS320C4X Module Specification, 1992," Texas Instruments Inc. (1992).
[16] M. Karjalainen, "Object-Oriented Programming of DSP Processors: A Case Study of QuickC30," in Proc. Int. Conf. on Acoustics, Speech, and Signal Processing (IEEE Signal Processing Society, 1992), pp. 601-604.
[17] URL: http://www-leeds.ac.uk/music/Man/c-front.html.
[18] URL: http://www.digidesign.com.
[19] P. Laine, "Sibelius Academy Computer Music Studio," in Proc. Int. Computer Music Conf. ICMC-93 (1993 Sept.), pp. 302-305.
[20] URL: http://www.lcsaudio.com/lcs.html.
[21] J. Sandvad and D. Hammershøi, "Binaural Synthesis for Auditory Virtual Environments," in Proc. Nordic Acoustical Meeting 96 (1996 June), pp. 343-350.
[22] J. Huopaniemi and M. Karjalainen, "Review of Digital Filter Design and Implementation Methods for 3-D Sound," presented at the 102nd Convention of the Audio Engineering Society, J. Audio Eng. Soc. (Abstracts), vol. 45, p. 413 (1997 May), preprint 4461.
[23] Teuvo Kohonen, Self-Organizing Maps, vol. 30 of Springer Ser. in Information Sciences (Springer, Berlin, Heidelberg, New York, 1995).

APPENDIX

The statement that gain factors calculated using Eq. (9) will satisfy the tangent law [Eq. (3)] will now be proved. At first, some known expressions that will be useful in the proof are written down:

$$\mathbf{L}^{-1} = \begin{bmatrix} l_{11} & l_{12} \\ l_{21} & l_{22} \end{bmatrix}^{-1} = \frac{1}{l_{11} l_{22} - l_{21} l_{12}} \begin{bmatrix} l_{22} & -l_{12} \\ -l_{21} & l_{11} \end{bmatrix} \tag{20}$$

$$l_{11} = l_{21} = \cos\varphi_0 \tag{21}$$

$$l_{12} = -l_{22} = \sin\varphi_0 \tag{22}$$

$$p_1 = \cos\varphi \tag{23}$$

$$p_2 = \sin\varphi. \tag{24}$$

Eq. (9) can be written in the form

$$\mathbf{g} = \frac{1}{l_{11} l_{22} - l_{21} l_{12}} \begin{bmatrix} p_1 l_{22} - p_2 l_{21} & p_2 l_{11} - p_1 l_{12} \end{bmatrix}. \tag{25}$$


Then, using Eqs. (21)-(24), it can be seen that

$$g_1 = \frac{\cos\varphi \sin\varphi_0 + \sin\varphi \cos\varphi_0}{2 \cos\varphi_0 \sin\varphi_0} \tag{26}$$

$$g_2 = \frac{\cos\varphi \sin\varphi_0 - \sin\varphi \cos\varphi_0}{2 \cos\varphi_0 \sin\varphi_0}. \tag{27}$$

The relation (g1 - g2)/(g1 + g2) may now be calculated using Eqs. (26) and (27),

$$\frac{g_1 - g_2}{g_1 + g_2} = \frac{2 \sin\varphi \cos\varphi_0}{2 \cos\varphi \sin\varphi_0} = \frac{\tan\varphi}{\tan\varphi_0}. \tag{28}$$

This is the tangent law [Eq. (3)], which completes the proof.
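
The result of the proof can also be checked numerically. The following Python sketch is an illustration only, not part of the original derivation: it assumes a loudspeaker pair at angles +phi0 and -phi0 with base vectors as in Eqs. (21) and (22), computes unnormalized two-dimensional gains by inverting the 2x2 base matrix, and compares both sides of the tangent law. The function names and the chosen test angles are hypothetical.

```python
import math


def vbap_2d_gains(phi_deg, phi0_deg):
    """Unnormalized 2-D gains g = p^T L^-1 for loudspeakers at +phi0 and -phi0."""
    phi = math.radians(phi_deg)
    phi0 = math.radians(phi0_deg)
    # Loudspeaker base: l11 = l21 = cos(phi0), l12 = -l22 = sin(phi0).
    l11, l12 = math.cos(phi0), math.sin(phi0)
    l21, l22 = math.cos(phi0), -math.sin(phi0)
    # Virtual source direction vector.
    p1, p2 = math.cos(phi), math.sin(phi)
    det = l11 * l22 - l21 * l12
    g1 = (p1 * l22 - p2 * l21) / det
    g2 = (p2 * l11 - p1 * l12) / det
    return g1, g2


def tangent_law_sides(phi_deg, phi0_deg):
    """Return (tan(phi)/tan(phi0), (g1 - g2)/(g1 + g2)) for comparison."""
    g1, g2 = vbap_2d_gains(phi_deg, phi0_deg)
    lhs = math.tan(math.radians(phi_deg)) / math.tan(math.radians(phi0_deg))
    rhs = (g1 - g2) / (g1 + g2)
    return lhs, rhs


# Compare both sides for a few source angles inside a +/-30 degree pair.
for angle in (-20.0, 0.0, 5.0, 25.0):
    lhs, rhs = tangent_law_sides(angle, 30.0)
    print(f"phi = {angle:6.1f}  tan ratio = {lhs:+.6f}  gain ratio = {rhs:+.6f}")
```

For every source angle the two printed ratios should agree to within floating-point rounding, and a source placed exactly at +phi0 should give g1 = 1 and g2 = 0.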

THE AUTHOR

Ville Pulkki was born in Jyväskylä, Finland, in 1969. He received the M.Sc. degree in engineering from the Helsinki University of Technology (HUT), Espoo, Finland, in 1994. Since 1994 he has been majoring in studies at the Musical Education Department at the Sibelius Academy, Helsinki, Finland, and has been a Ph.D. student at HUT.

Mr. Pulkki was with the Laboratory of Computer and Information Sciences and the Neural Networks Research Centre at HUT from 1991 to 1995. He worked as a research assistant and later as a research scientist. Since 1995 he has been working as a research scientist in the Laboratory of Acoustics and Audio Signal Processing at HUT. His research interests include three-dimensional sound, virtual acoustics, and music. Mr. Pulkki is a member of the AES and the Polar Bear Club of Finland.
