Audiosculpt Cross Synthesis Handbook
AudioSculpt
Cross-Synthesis Handbook
by Mario Mary
This manual cannot be copied, in whole or in part, without the written consent of Ircam.
This manual was written by Mario Marcelo Mary under the supervision of Marie-Hélène Serra,
and was produced under the editorial responsibility of Marc Battier, Département de la
Valorisation, Ircam. English translation by Richard Dudas.
AudioSculpt
Design: Philippe Depalle, Chris Rogers
Program: Chris Rogers
Project supervision: Gerhard Eckel
SVP
Design: Philippe Depalle
Program: Philippe Depalle, Gilles Poirot, Chris Rogers, Jean Carrive
Analysis modules
Algorithms: Xavier Rodet, Philippe Depalle, Guillermo Garcia, Ernst Terhardt, Boris Doval
Documentation
Coordination: Marie-Hélène Serra
Manuals: Marc Battier, Jean Carrive, Peter Hanappe, Mario Mary, Brice Pauset, Marie-Hélène Serra
This documentation corresponds to version 1.0 or higher of AudioSculpt and to version 1.3A
of SVP.
Ircam
1 place Igor-Stravinsky
F-75004 Paris
Tel. 01 44 78 12 33
Fax 01 44 78 15 40
E-mail : ircam-doc@ircam.fr
IRCAM Users group
Département de la Valorisation
Ircam
Place Stravinsky, F-75004 Paris
Tel. 01 44 78 49 62
Fax 01 44 78 15 40
E-mail: bousac@ircam.fr
http://www.ircam.fr/musinfo
Contents
List of examples
Résumé
General Information
Introduction
Source-Filter Cross-Synthesis
Adjusting the Analysis Parameters
Example 1 index 1, 2 and 3
Example 2.1 index 4, 5 and 6
Example 2.2 index 7
Example 3.1 index 8, 9
Example 3.2 index 10
Example 4.1 index 11, 12 and 13
Example 4.2 index 14
Creating a Text File for Dynamic Sound Processing
Example 4.3 index 15
Example 4.4 index 16
Generalized Cross-Synthesis
Example 5 index 17, 18 and 19
Example 5.1 index 20, 21 and 22
Example 5.2 index 23
Example 5.3 index 24
Example 5.4 index 25
Example 5.5 index 26 and 27
Example 5.6 index 28
Example 5.7 index 29, 30 and 31
Example 6.1 index 32, 33 and 34
Example 6.2 index 35
Example 7 index 36, 37 and 38
Example 8.1 index 39, 40 and 41
Example 8.2 index 42
Example 8.3 index 43
Example 9.1 index 44, 45 and 46
Example 9.2 index 47
Example 9.3 index 48
Example 9.4 index 49
Example 10 index 50, 51 and 52
Cross-Synthesis by Mixing
Example 11 index 53, 54 and 55
Example 12 index 56, 57 and 58
Appendix
Example 13.1 index 59, 60 and 61
Example 13.2 index 62, 63 and 64
Example 14.1 index 65 and 66
Example 14.2 index 67 and 68
Example 14.3 index 69
Conclusion
Annex: List of options
Index
List of examples
This manual illustrates one of the most interesting possibilities offered by the AudioSculpt program:
cross-synthesis. Through a series of examples, it explains different aspects of this process, which
intimately links two sounds.
The manual is accompanied by a set of sound files used in the examples. They are available on your CD-ROM.
Since this manual is intended as a teaching aid, the reader is advised to study the examples one after the
other.
AudioSculpt is an Ircam program which permits the visualization and processing of sounds; it runs on the
Macintosh (either 68k or Power PC). Sounds can be visualized in the form of either an amplitude/time envelope or a
sonogram. The processing options currently available (time-expansion/compression, transposition,
cross-synthesis, etc.) are those of the program SVP, which is based on the principles of a two-channel phase vocoder.
SVP was originally developed on the Unix platforms Dec (Mips and Alpha), SGI and NeXT. To run the SVP
program in a Unix environment it is necessary to type a command line such as:
svp -v -t -T -A -Z -Scymbale -D2 -M1024 -N1024 cymbale-D2
The various options in the command line specify which treatment to apply to a sound and with which parameters.
In contrast, AudioSculpt was conceived from the beginning as a graphic interface to SVP in order to facilitate the
use of the program.
In place of a command line, the user is able to select a processing command from a menu, and fill in the
parameters of the various options within a standard Macintosh dialog box. It is always possible, however, to revert to
the older method of controlling SVP by selecting Use Command Line from the menu. For each type of
processing presented in this manual, both forms (dialog box and command line) will be given.
Cross-synthesis requires the use of two sound files. If the two files are of different lengths, the resulting sound
file will have the length of the longer file (before version 1.3 of SVP the shorter file’s length was used). Be careful
not to confuse the length of the sound file with the duration of the sound actually heard, which may be shorter
than the actual size of the file due to silence which could be either at the beginning or at the end of the file.
There are three types of cross-synthesis available within SVP:
• cross-synthesis by mixing (add mode)
• source-filter cross-synthesis (mul mode)
• generalized cross-synthesis (cross mode)
Currently, AudioSculpt only gives access to source-filter and generalized cross-synthesis from within the
Processing menu. Cross-synthesis by mixing must be requested by using a command line.
Explained in the simplest and most general manner, cross-synthesis by mixing (add mode) is based on the mixing
of two sounds, source-filter cross-synthesis (mul mode) on the transformation of one sound by another, and
generalized cross-synthesis (cross mode) on the exchange of characteristics of and interpolation between two sounds.
However, the user may go much deeper into the use of each of these three cross-synthesis modes in order to
realize original and creative cross-synthesis by calculating different types of hybrid sounds whose sources may no
longer be identifiable.
In order to realize a desired cross-synthesis, there are two basic considerations to take into account:
1. the choice of sounds to cross, and
2. the cross-synthesis mode, and the manner in which to use it.
In this tutorial, the user will find different examples geared to provide a guide for his or her personal work.
Note from the editor: because this handbook is really a tutorial which takes you gradually through all the steps
involved in doing cross-synthesis, you are advised to read the sections in the order they appear. For instance, the
explanation of how to adjust analysis parameters is introduced in the third chapter, devoted to source-filter
cross-synthesis. We hope that reading the handbook in that manner will help you fully understand the processes
involved.
1. The Sound examples are available on a DAT tape or on the Forum CD-ROM.
This type of cross-synthesis involves multiplying the short-term spectrum of one sound by the short-term spectral
envelope of another sound, which causes a filtering of one sound by the other’s spectral envelope. A sound’s
spectral envelope is calculated by using a linear prediction algorithm (LPC).
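The principle described above can be sketched for a single analysis frame as follows. This is only an illustrative Python sketch, not SVP's implementation: the function names, the default number of poles, the Hann window, and the plain autocorrelation method used for the linear prediction are all assumptions, and no gain normalization is applied.

```python
import numpy as np

def lpc_coefficients(frame, order):
    """Autocorrelation-method LPC; returns [1, a1, ..., a_order] (assumed method)."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation lags 0..order of the frame.
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R a = -r[1:], lightly regularized for stability.
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R = R + (1e-9 * r[0] + 1e-12) * np.eye(order)
    a = np.linalg.solve(R, -r[1:])
    return np.concatenate(([1.0], a))

def spectral_envelope(frame, order, n_fft):
    """Magnitude of the all-pole filter 1/A(z), sampled on the FFT grid."""
    a = lpc_coefficients(frame, order)
    return 1.0 / np.maximum(np.abs(np.fft.rfft(a, n_fft)), 1e-12)

def source_filter_frame(source, filt, order=20, n_fft=1024):
    """Multiply the short-term spectrum of `source` by the envelope of `filt`."""
    window = np.hanning(len(source))
    spectrum = np.fft.rfft(np.asarray(source, dtype=float) * window, n_fft)
    envelope = spectral_envelope(np.asarray(filt, dtype=float) * window, order, n_fft)
    return np.fft.irfft(spectrum * envelope, n_fft)
```

A full treatment would repeat this on overlapping windows and overlap-add the results; the sketch assumes both frames have the same length.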
To begin processing, first open the sound files which you would like to cross by selecting Open from the File
menu (or by typing command-O). Once the sound files are open, you may select the type of cross-synthesis in the
Processing menu:
In order to select the filtering sound, you may choose between any of the currently open sound files.
[Figure: Analysis Parameters dialog box; its fields are expressed in samples (sizes and step), in hertz, and as a number of poles.]
Once the desired values have been chosen and set within the Analysis Parameters dialog box, click on OK, and
then on Save in the Cross Synthesis Source Filter dialog box. You will then be presented with yet another dialog
box in which you can name the resulting sound file and select the folder in which it will be saved. You can click
on the Normalize checkbox to normalize the output sound file, i.e. scale the amplitude of the sound to its
maximum.
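As an illustration of what normalization means here, the following Python sketch scales a signal so that its largest absolute sample reaches a chosen peak value. The function name and the default peak of 1.0 (full scale for floating-point samples) are assumptions; the manual does not state the exact target level AudioSculpt uses.

```python
import numpy as np

def normalize_peak(samples, peak=1.0):
    """Scale `samples` so that the largest absolute value equals `peak`."""
    samples = np.asarray(samples, dtype=float)
    max_abs = np.max(np.abs(samples))
    if max_abs == 0.0:
        # A silent signal cannot be scaled; return it unchanged.
        return samples.copy()
    return samples * (peak / max_abs)
```

For example, `normalize_peak([0.1, -0.5, 0.25])` doubles every sample so that the loudest one sits at -1.0.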
[Figure: save dialog box, showing the name of the resulting sound and the checkbox for normalization of the output sound's amplitude.]
The same cross-synthesis may also be requested by using a command line:
This is the command line equivalent to a source-filter cross-synthesis which is performed on the files caixixi and
vl-jeté:
Once you have indicated the folder where the input sound files are to be found, AudioSculpt will always look
there for them by default; you may, however, change the default directory at any time. To set the default folders
where AudioSculpt will search for parameter files and save console files, select Set Default Folder>Parameters
Folder and Set Default Folder>Console Folder (both in the File menu), respectively.
Example 3.1 index 8, 9
These two examples complement the preceding ones. Using identical cross synthesis parameters, the trumpet will
be crossed with a small sequence played on the berimbao. The resulting sound contains a counterpoint between
the timbral fluctuations and rhythmic action contained in the two sounds.
The command line to filter the berimbao by the trumpet with wah-wah mute is as follows:
Example 3.2 index 10
This is the command line to filter the trumpet with wah-wah mute by the berimbao:
[Figure: two amplitude envelope plots (amplitude versus time).]
Example 4.2 index 14
This is the inverse of the preceding example. Here, the cymbal is filtered by the clarinet; all of the cross-synthesis
parameter values remain the same.
The resulting amplitude envelope resembles that of the preceding example, but the predominant timbre is that of
the cymbal:
The resulting sound will begin with a time-compression which becomes stronger until 0.3 seconds, at which point
the compression slowly becomes weaker until 0.5 seconds where the compression coefficient is equal to 1.0 (no
compression or expansion). Between 0.5 and 1.0 seconds, the coefficient becomes greater, thereby
time-stretching the sound, and continuing until 2.0 seconds, before returning to 1.0 over the remaining part of the sound. All
in all, the sound will be time-compressed in a variable manner from 0.0 to 0.5 seconds and time-stretched from
0.5 to 3.5 seconds.
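The behaviour described above can be pictured as a breakpoint curve. The Python sketch below is an assumption-laden illustration: the breakpoint values are hypothetical (they follow the shape described in the text but are not the contents of the actual parameter file), and it assumes linear interpolation between breakpoints with times measured on the input sound.

```python
import numpy as np

# Hypothetical breakpoints: input time in seconds and stretch coefficient.
# A coefficient below 1.0 compresses, 1.0 leaves time unchanged, above 1.0 stretches.
TIMES = np.array([0.0, 0.3, 0.5, 1.0, 2.0, 3.5])
COEFFS = np.array([0.8, 0.4, 1.0, 2.0, 3.0, 1.0])

def coefficient_at(t, times=TIMES, coeffs=COEFFS):
    """Linearly interpolated coefficient at input time t."""
    return float(np.interp(t, times, coeffs))

def output_duration(times=TIMES, coeffs=COEFFS):
    """Area under the piecewise-linear curve, i.e. the stretched duration."""
    return float(np.sum(0.5 * (coeffs[1:] + coeffs[:-1]) * np.diff(times)))
```

With these made-up values the 3.5-second input would last roughly 6.6 seconds after processing; a different set of coefficients gives a different total.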
The time scale modification gives the file a total duration of about 7.5 seconds. The attack stays very short, but
the resonance has been stretched, in an irregular fashion, to more than twice its original length.
Example 4.4 index 16
The following example further modifies the previously created sound by applying a transposition to it. In order
to make a dynamic transposition, you should use a text file such as:
or, when using a command line, indicate the name of the transposition file after the option -trans:
Transposition without time correction will modify the duration of the sound; if you would like to preserve the
sound’s original duration, click on the time correction checkbox in the Constant Transposition dialog box (or
add the option -D in the command line). If you wish to make a constant transposition, type the transposition
interval in cents directly into the space provided in the dialog box.
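For readers unused to cents, the Python sketch below converts an interval in cents to a frequency ratio (100 cents is one equal-tempered semitone, 1200 cents one octave). The second function assumes that transposition without time correction behaves like a change of playback speed, so the duration is divided by the ratio; this is an assumption about the mechanism, and the function names are purely illustrative.

```python
def cents_to_ratio(cents):
    """Frequency ratio for an interval in cents (1200 cents = one octave)."""
    return 2.0 ** (cents / 1200.0)

def duration_without_time_correction(duration, cents):
    """Assumed duration after transposition when no time correction is applied."""
    return duration / cents_to_ratio(cents)
```

Under this assumption, a 4-second sound transposed down by one octave (-1200 cents) would last 8 seconds unless time correction is requested.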
In our example, the transposition has given the sound a deep drone with slight fluctuations in pitch; it is evident
that the sound has evolved into something far from the cymbal and clarinet of its origins.
This type of cross-synthesis allows the separate combination of the amplitude and instantaneous frequency
spectra of two input sounds. For the amplitude spectra of the primary and secondary sounds, A1 and A2, and their
instantaneous phase spectra, φ1 and φ2 (related to instantaneous frequency), on a given window, the spectrum of
the output sound for that window is defined as:
A = X A1 + x A2 + q A1 A2
φ = Y φ1 + y φ2
The Amplitude Scale Factors correspond to the coefficients X and x of SVP, the Phase Scale Factors to
coefficients Y and y, and the Amplitude Cofactor to coefficient q.
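The two formulas can be tried out on a single pair of spectra with the Python sketch below. It is a simplification and not SVP's code: it combines the wrapped phase of each frame directly, whereas the text refers to instantaneous phase tracked from window to window, and the function name is an assumption.

```python
import numpy as np

def generalized_cross_frame(spec1, spec2, X, x, Y, y, q):
    """Combine two complex spectra: A = X*A1 + x*A2 + q*A1*A2, phi = Y*phi1 + y*phi2."""
    A1, A2 = np.abs(spec1), np.abs(spec2)
    phi1, phi2 = np.angle(spec1), np.angle(spec2)
    A = X * A1 + x * A2 + q * A1 * A2
    phi = Y * phi1 + y * phi2
    # Rebuild a complex spectrum from the combined amplitude and phase.
    return A * np.exp(1j * phi)
```

Setting X = Y = 1 and x = y = q = 0 returns the primary sound's spectrum unchanged; setting X = 0, x = 1, Y = 1, y = 0 keeps the phases of the primary sound but takes the amplitudes of the secondary one.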
To begin processing, open the files that you would like to cross in AudioSculpt by selecting
Open from the File menu (or command-O). As you now know, you have the option to use either fixed or dynamic
(time-varying) parameter values. To use fixed parameter values, just choose the type of cross-synthesis desired
from the Processing menu:
Using dynamic parameters with a cross-synthesis is much like using them with time-stretching or transposition.
Begin by creating a text file with the crossing parameters. Next, make the window containing the text the active
window and select the type of cross-synthesis from the Processing menu. The following dialog box will open:
Select the sound files you wish to use, click on Analysis Parameters to modify the analysis parameter values, as
before, then begin the processing by clicking on Process and Save.
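[Figure: crossing parameter file (columns: time, X, x, Y, y, q) for sound 1 and sound 2, with a graph of the parameters x, y and q; time axis marked at 0.0, 0.5, 0.8, 1.6, 2.7 and 4.8 sec.]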
1. One should always remember to initiate such a cross at a very subtle rate, since our hearing
perception is very sensitive to changes made to a stable signal.
This timbral interpolation was greatly facilitated because both the English horn and the cello have a large number
of similar timbral characteristics and a similar acoustic ambiance at many pitch levels. As we can see, the choice
of sounds to interpolate is a very important decision in this kind of work. When two sounds are similar, an
interpolation between them may be made solely by the interpolation of their timbres.
By using a parameter file for cross-synthesis you can obtain a very smooth and homogeneous interpolation; any
potential imperfections in this gradual timbral transformation are masked by the other parameters of the sound.
However, in a cross between two sounds which have different acoustic qualities, each change in the sound,
however slight, will bring these differences immediately to our attention.
[Figure: interpolation in stages, from the tam-tam (stage 1) to the voice (stage 6).]
Click on Analysis Parameters to open the dialog box to adjust the analysis parameters.
We will use the following analysis parameters:
The interpolation has been made discreetly at the attack of each vocalization.
We will now create a series of examples using the same two sound files with the same analysis parameters as
above. Our variations will be made only to the dynamic parameter file controlling the cross-synthesis. In this
manner you can begin to better understand the many diverse possibilities inherent in this type of cross synthesis.
In order to process this cross-synthesis by using a command line, it will suffice to change only the name of the
crossing file and the name of the output file.
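[Figure: crossing file (columns: time, X, x, Y, y, q) and graph of the parameters for sound 1 and sound 2; time axis marked at 0.0, 0.35, 1.82 and 2.3 sec.]
[Figure: graph of the parameters X, x, Y, y and q; time axis marked at 0.0, 0.36, 0.8, 1.1, 1.46, 1.9 and 2.3 sec.]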
As we can see, the crossing between the two sound files will not be made in a linear fashion, in order to make a
more convincing interpolation.
The file cross-voix-tongue-28 is the resulting interpolation between the vocal attacks and the “tongue-rams” on
the flute.
The values entered in the Analysis Parameters boxes will create a distortion in the time scale of the secondary
sound tam-tam2, due to the use of different hop sizes for the two sounds’ analysis windows. The resulting sound
file will have a duration of 50 seconds, in spite of the fact that the two source sounds tam-tam2 and voix-Eh-4
have durations of 1.474 and 6.402 seconds, respectively. If the two hop sizes are set to equal values, the duration
of the resulting sound file will be equal to that of the longer input file (tam-tam2), whose duration is 6.402
seconds. Although AudioSculpt will synthesize sound in the output sound file as long as it finds a frequency
spectrum in one of the input sounds, it will, nonetheless, continue the cross-synthesis using silence if it does not.
This file controls a very simple cross. The output file begins with the primary sound alone, followed by a rapid
transition to the secondary sound beginning at 0.5 seconds, and, from 1.6 seconds, the secondary sound is heard
alone. With the crossing file’s window active, select Generalized Cross Synthesis from the Processing menu
and select the sound files in the dialog box.
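A crossing file of this kind can be pictured as a small table with one row per breakpoint and the columns time, X, x, Y, y and q. The Python sketch below parses such a table and reads the parameters at an arbitrary time. The row values are hypothetical (chosen only to match the description: primary sound alone until 0.5 seconds, secondary sound alone from 1.6 seconds), and both the whitespace-separated layout and the linear interpolation between rows are assumptions rather than documented SVP behaviour.

```python
import numpy as np

COLUMNS = ("time", "X", "x", "Y", "y", "q")

# Hypothetical crossing file: time X x Y y q
CROSSING_FILE = """\
0.0  1.0 0.0  1.0 0.0  0.0
0.5  1.0 0.0  1.0 0.0  0.0
1.6  0.0 1.0  0.0 1.0  0.0
"""

def parse_crossing_file(text):
    """Turn the text into a 2-D array with one row per breakpoint."""
    rows = [[float(v) for v in line.split()]
            for line in text.splitlines() if line.strip()]
    table = np.array(rows)
    if table.ndim != 2 or table.shape[1] != len(COLUMNS):
        raise ValueError("expected 6 columns: time X x Y y q")
    return table

def parameters_at(table, t):
    """Interpolate every parameter linearly at time t (last row held afterwards)."""
    return {name: float(np.interp(t, table[:, 0], table[:, i]))
            for i, name in enumerate(COLUMNS) if i > 0}
```

Halfway through the transition (1.05 seconds) this table gives X = x = 0.5, an equal blend of the two sounds.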
The modifications made to the dynamic crossing file are few, but they nonetheless produce noticeable differences.
Here is the modified crossing file and its graphic representation:
[Figure: modified crossing file (columns: time, X, x, Y, y, q) and graph of the parameters X, x, Y, y and q over time.]
The following example describes one possible method for a better realization of a cross-synthesis.
[Figure: crossing file (columns: time, X, x, Y, y, q) in three stages: violin attack, transition, and resonance of the cymbal; graph of the parameters X, x, Y, y and q, with the time axis marked at 0.0, 0.2, 0.4, 1.0, 1.5, 2.5 and 3.2 sec.]
By carefully observing the sonogram, one notices that the glissando heard in the sound does not correspond to its
spectral representation; in fact, it seems to move in contrary motion to it. This is a typical example of a
paradoxical sound.
[Figures: crossing files (columns: time, X, x, Y, y, q) and graphs of the parameters X, x, Y, y and q over time; the first graph's time axis is marked at 0.0, 0.88, 2.64 and 4.1 sec.]
The parameters Y and y, as they are defined in the file, impose rather large variations in pitch. The most evident
to the ear is probably the glissando produced by parameter Y at the beginning of the sound.
This type of cross-synthesis is analogous to the mixing of two sounds. The coefficients X and x play the role of
faders and allow control of the weight of each sound in the mix. This type of cross-synthesis is currently only
available by using a command line, using the option -Gadd.
To begin processing, open the command line dialog box by selecting Use Command Line from the Processing
menu.
If you are using constant parameter values, you must type these values just after the options -X and -x
(respectively, the two options are relative to the primary and secondary input sounds).
[Figure: command line for cross-synthesis by mixing, annotated with the mixing option, the constant amplitude values, sound1 and sound2.]
If you use dynamic crossing parameters (those which vary over time), you should create and save a text file
containing these parameter values.
time X x
The name of this file must be given just after the option -Gadd in the command line.
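The fader analogy can be sketched in Python as a weighted sum whose weights follow the breakpoints of the parameter file. This sketch works directly on the sample values, whereas SVP operates on short-term spectra; the function name, the linear interpolation of X and x, and the zero-padding of the shorter sound are assumptions made for illustration.

```python
import numpy as np

def mix_with_faders(sound1, sound2, sample_rate, times, X, x):
    """Return X(t)*sound1 + x(t)*sound2, with X and x interpolated from breakpoints."""
    n = max(len(sound1), len(sound2))
    # Pad the shorter sound with silence so the result has the longer length.
    s1 = np.zeros(n)
    s1[:len(sound1)] = sound1
    s2 = np.zeros(n)
    s2[:len(sound2)] = sound2
    t = np.arange(n) / sample_rate
    return np.interp(t, times, X) * s1 + np.interp(t, times, x) * s2
```

For instance, breakpoints `times=[0, 1, 2]`, `X=[1, 0, 0]`, `x=[0, 1, 1]` describe a one-second crossfade from the primary to the secondary sound.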
Do not forget to correctly specify the folder where AudioSculpt will look for both the parameter files and sound
files you will be using (by selecting Set Default Folders from the File menu) before initiating the processing.
Here, our resulting sound has the attack of a cello with the resonance of a drum. The crossing file creates a very
rapid transition.
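[Figure: graph of the parameters X and x over time; time axis marked at 0.0, 0.5, 1.0 and 3.0 sec.]
[Figure: crossing file (columns: time, X, x) and graph of the parameters X and x; time axis marked at 0.0, 0.3, 1.0, 2.0, 3.0 and 3.8 sec.]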
Cross-synthesis by mixing (add mode) is a useful tool not only for the mixing of two sounds, but, more
importantly, for the creation of micro-mixes. The two preceding examples illustrate the precise control of amplitude
one has over the mixing of two sounds when using this type of cross-synthesis.
This appendix includes those examples which “cross the boundaries” of the diverse types of cross-synthesis so
far explained in this tutorial. These examples are presented to provide a springboard for ideas for those who are
searching for new fields of audio processing to explore.
The examples were realized with the help of the program PatchWork (see note 1; also developed at Ircam), which served to
create complex dynamic parameter files for unconventional control of cross-synthesis, transposition, and time
expansion/compression.
1. In PatchWork, the selected time resolution (shown in the time column of the parameter file)
is not necessarily in accordance with the step size used in AudioSculpt. When it is smaller,
AudioSculpt reproduces only the variations which match the step size; the finer ones are
skipped.
To create this long file, a patch was constructed in PatchWork, using a stochastic algorithm to generate the
oscillations while controlling the percentage of predominance of one sound or the other, as well as the interval of
oscillations.
[Figure: oscillation between Sound 1 and Sound 2 over time; time axis marked at 0.0, 0.37, 2.25 and 2.6 sec.]
The oscillation between the two sounds produces different types of “grains” during the transition. This type of
granulation has a masking effect during the transition to the flute sound. The idea to introduce elements which
capture the auditory attention just at the moment of transition is just one of the many possible means leading to
inexhaustible creative possibilities.
[Figure: dynamic transposition curve showing a vibrato between +7 and -7 cents, with a delay before it begins and the end of the vibrato marked; time axis marked at 0.0, 0.8, 2.56, 4.33 and 7.0 sec.]
One can further extend the idea of using the option -trans to impose a vibrato upon a sound by generating a
frequency modulation to be imposed upon a sound file. It is sufficient to create a vibrato faster than 20 Hz for the
first lateral frequency bands to appear.
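A vibrato file of this kind is tedious to type by hand, which is why the author generated such files in PatchWork. As a rough stand-in, the Python sketch below produces a sinusoidal transposition curve and formats it as text. The two-column "time value" layout, the use of cents for the depth, and the function names are assumptions; check the AudioSculpt or SVP manual for the exact file format before using anything like this.

```python
import math

def vibrato_breakpoints(duration, rate_hz, depth_cents, step):
    """Sample a sinusoidal vibrato every `step` seconds; returns (time, cents) pairs."""
    n = int(round(duration / step)) + 1
    return [(i * step, depth_cents * math.sin(2.0 * math.pi * rate_hz * i * step))
            for i in range(n)]

def format_parameter_file(rows):
    """Format the pairs as one 'time value' line each (assumed layout)."""
    return "\n".join(f"{t:.6f} {v:.3f}" for t, v in rows) + "\n"
```

The step has to be well under half the vibrato period, otherwise the curve cannot follow the oscillation: a 200 Hz modulation needs breakpoints spaced much closer than 2.5 ms, which is consistent with using a very small analysis hop size.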
In this example, the sound of a flute is colored with frequency modulation induced from a dynamic transposition
file.
The beginning of the file fmfile-9 is shown in the figure. This file defines a vibrato whose frequency is 200 Hz,
and whose amplitude varies between +100 and -100 Hz.
To achieve good spectral and temporal definition, the analysis is made with a large FFT (flag -N4096) and a very
small hop size (flag -I20).
The following command line is used:
This example of frequency modulation imposed on a sound file is given, once again, to provide a point of
departure for the advanced AudioSculpt user. Once this principle is understood, it can be applied to other types of
modulation, such as amplitude modulation and wave shaping.
The transformation from flutter-tongue to ordinario is obtained by using a version of the above transposition file
whose vibrato begins at the maximum amplitude and gradually decreases to nothing.
The use of text files containing parameter values, which allow the realization of sound processing that
varies over time, opens inexhaustible possibilities when one takes into consideration that these files may be
created in a program other than AudioSculpt itself (for example, PatchWork).
At the beginning of this work, I constructed different systems to create text files based on parameter
envelopes, as well as for the graphic representation of existing files. I named these systems “editors”, and,
with their help, I extended the possibilities of cross-synthesis within AudioSculpt. Some of these
experiments, described in the appendix of the AudioSculpt manual, show that, with a little imagination, one
can discover new creative applications of the program.
In the appendix to this tutorial, the reader will also find several examples of unconventional sound
processing which, I hope, can be musically very powerful.
Annex: List of options
For a detailed explanation of SVP options, refer to your AudioSculpt or your SVP manual.