Building a C++ library into a Pure Data 'class' (download at page bottom)
The previous page illustrated some of the persistent problems that
come with time stretching and pitch shifting. For the purpose of
illustration, I wrote elementary time domain and frequency domain pitch
shifters for Pure Data. These objects showed the pernicious artifacts
of elementary pitch shifting in their full glory. They are not suitable
for serious signal processing.
While learning about pitch shifting, and searching for code, I came
across Olli Parviainen's SoundTouch C++ library at http://www.surina.net/soundtouch/.
Though the technique in this library is not the cutting edge in
sound transformation, it is adequate for simple periodic signals, and efficient.
Further, it is open source, well documented and user-friendly. SoundTouch is incorporated
in many dsp packages, with Ardour and Audacity being notable examples.
The real time dsp package Pure Data has no pitch shift class on
board, and SoundTouch is an evident candidate to fill this void.
Writing a Pure Data 'class' [soundtouch~] was ultimately a piece of
cake, as the SoundTouch library handles everything conveniently.
However, compiling and linking a C++ library with a Pure Data 'class' is not
completely straightforward, since Pure Data is written in C.
On this page is a detailed description of building SoundTouch into
[soundtouch~] for Pd. I used Xcode on OSX, and came across several
issues specific to Xcode or to the C/C++ mix. Therefore, a fair portion
of this page would apply to the (re)use of C++ libraries in Pure Data
in general. Elementary topics of Pure Data class development are not
covered here. A tutorial is at http://iem.at/pd/externals-HOWTO/.
Hints for building Pd externals using Xcode are at http://puredata.info/docs/developer/PdExternalsInXcode.
The critical aspect of (re)using a C++ library within Pure Data is the table of symbols in the final product. In a mixed C / C++ project, a C++ compiler is used, which will compile C code as well. In the intermediate object files (.o files) and the built product (soundtouch~.pd_darwin in the case described here), all function symbols appear in what is called mangled form, which is C++ specific. This is no problem, except for the case of the setup function in the Pure Data class. The setup function is the entry point for Pure Data to find and set up the class, and its symbol must follow Pd's conventional spelling, otherwise it will not be found. That is what goes wrong with mangled symbols. To get the setup function spelled as a regular C symbol, it must be declared (and/or defined) as extern "C".
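A minimal sketch of such a declaration follows. It leaves out m_pd.h and the actual class registration so that it compiles on its own; the counter setup_calls is a hypothetical stand-in for the real setup work and is not part of the [soundtouch~] source.

```cpp
// Hypothetical stand-in for the real setup work, so the sketch builds without Pd.
static int setup_calls = 0;

// C linkage: the symbol is emitted unmangled (as _soundtouch_tilde_setup on OSX),
// which is the spelling Pd looks for when it loads soundtouch~.pd_darwin.
extern "C" void soundtouch_tilde_setup(void);

extern "C" void soundtouch_tilde_setup(void)
{
    // A real external would register the class here, e.g. with class_new().
    ++setup_calls;
}
```

Only the setup function needs this treatment; the static functions of the class can keep their mangled names.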
When building with Xcode, you can select the menu item 'build >
show assembly code' and check how the symbols appear in the object
files. A regular C symbol
appears like you have written it in the code, with only an underscore
prepended. Mangled symbols have additions, denoting aspects like
argument type of a function. This is because C++ functions can be
defined more than once for different argument types.
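To make the reason for mangling concrete, here is a small sketch with hypothetical function names (describe and plain_c_function are not taken from SoundTouch or Pd). The symbol spellings in the comments assume the Itanium C++ ABI name mangling used by GCC, with the extra leading underscore that OSX adds.

```cpp
#include <string>

// Two overloads share one name, so the linker needs two distinct symbols.
// Expected spellings on OSX: __Z8describei and __Z8describef.
std::string describe(int)   { return "int"; }
std::string describe(float) { return "float"; }

// With C linkage no overloading is possible, and the symbol is just the
// function name with an underscore prepended: _plain_c_function.
extern "C" int plain_c_function(int x) { return x + 1; }
```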
Below, the difference between C symbol and mangled symbol is shown.
Functions like soundtouch_dsp are all declared static,
and for them it does not matter whether they have mangled symbols or C
symbols.
[Figure: C symbol for soundtouch_tilde_setup()]
[Figure: mangled symbol for soundtouch_dsp()]
Pure Data's API
functions must be found by their C symbols as well. If you look in the
API header file m_pd.h, you will see that
the complete file content is conditionally declared extern "C". In this
way, m_pd.h is well prepared to function either with a C compiler or with a C++
compiler.
Is this extern "C" declaration of functions sufficient for the whole thing to work? No, the 'visibility' of symbols must also be organised, and that is a different thing.
In Pure Data classes, most functions can be declared static, meaning they can only be called from within the file. Symbols of functions which must be called from other executable files should be visible beyond the file, and also beyond the executable. This holds for the setup function. Functions not specified as static are by default visible, from other files for static linking, and from other executables for dynamic linking. Other names for visible in this sense are: exported, global, external. The opposite is hidden, invisible, local.
For multiple-file dynamic libraries, symbol visibility cannot be
organised at the language level (C or C++). Compiler/linker options and
attributes must help here. Starting from version 4.0, GCC has a
visibility flag. -fvisibility=default will make
all symbols in the target visible, if not defined otherwise.
-fvisibility=hidden, in contrast, will hide all
symbols within the target, if not specified otherwise. Exceptions to
the general setting can be
made by using attributes
in the code. Syntax of flag and attribute is unfortunately different.
Hidden symbols, where
possible, are recommended for several reasons. With fewer exported
symbols to resolve, the dynamic linker can load a program faster.
Further, the chance of symbol collision is reduced. Apple
has pages on this topic, see [this
link] and [this
link]. GNU has info on http://gcc.gnu.org/wiki/Visibility.
A thorough discussion of linking dynamic libraries of the ELF format is
from Ulrich Drepper: http://people.redhat.com/drepper/dsohowto.pdf.
In Xcode, every new project starts from a template. Since Pure Data external classes reside in dynamic libraries, BSD Dynamic Library is normally the correct template. For a C++ project, there is a choice between C++ Dynamic Library and C++ Standard Dynamic Library. What is the difference? With the C++ Standard Dynamic Library template, the template comment states that it will build a product with:
'...symbols hidden by default'...
It took me a long time to find out and verify that this 'help' text
does not reflect the actual template settings, as I will show later.
Notice that this is in Xcode 3.0, and things may be different
in other versions.
Compiler and linker flags of Xcode templates are normally reflected
in the project- and target settings. For the C++ Standard Dynamic
Library template, the 'Symbols Hidden by Default' option is unchecked,
in contrast with the template help text:
In addition, there is a separate Xcode configuration file, showing the definition GCC_SYMBOLS_PRIVATE_EXTERN=NO. This config file can be accessed from the configuration files folder in the .xcodeproj window of the project, and here is what it looks like:
To see the actual GCC flags, have a look at GCC's build transcript
after building the target (click the indicated icon to see the
transcript). In this case, there
is no general -fvisibility flag at all, only a visibility flag applying
to inline functions:
Apparently, the 'help' comments for the C++ library
templates were swapped relative to the actual template settings. For
the C++ Dynamic Library
template, settings are like this: 'Symbols Hidden by Default' is
checked, the config file has GCC_SYMBOLS_PRIVATE_EXTERN=YES, and the
GCC flag -fvisibility=hidden will appear in the build transcript.
Such confusion can be very time-consuming. On the other hand,
without it, I might never have bothered about hidden symbols so much.
Which symbols are in fact exported when the -fvisibility flag
is not set at all? The GNU command nm (run in Terminal.app) lets you
inspect the symbol
table of an object file (.o file) or
executable. nm can give you a complete symbol table, with the symbols
preceded by characters indicating their local/global status and other
characteristics. Upper case characters are reserved for global symbols,
while lower case characters indicate local symbols.
It is also possible to run nm with the -g option and list global
symbols exclusively. Here is a fragment of the global symbols listed in
soundtouch~.pd_darwin, as built without -fvisibility flag:
katja-vetters-macbook:~ katja$ nm -g
00001ea0 T __Z17disableExtensionsj
00001eb4 T __Z19detectCPUextensionsv
00002fda T __ZN10soundtouch10PeakFinder10detectPeakEPKfii
00002cfa T __ZN10soundtouch10PeakFinderC1Ev
00002ce4 T __ZN10soundtouch10PeakFinderC2Ev
00004aee T __ZN10soundtouch10SoundTouch10putSamplesEPKfj
0000445c T __ZN10soundtouch10SoundTouch10setSettingEii
000049e8 T __ZN10soundtouch10SoundTouch11setChannelsEj
00004384 T __ZN10soundtouch10SoundTouch12getVersionIdEv
0000490a T __ZN10soundtouch10SoundTouch13setRateChangeEf
0000438e T __ZN10soundtouch10SoundTouch13setSampleRateEj
000048c4 T __ZN10soundtouch10SoundTouch14setTempoChangeEf
This is only a minor part of the complete list. All
together, hundreds of symbols are exported. Only the C functions
which I declared static do not appear in the global symbols list. For a
Pure Data object, this seems disproportionate. Most of it is redundant:
exported symbols which are not intended to be global at all.
For larger libraries, the list can get huge. For example, the Gem class
for Pure Data in OSX binary distributions has over 40,000 symbols.
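The mangled names in such a listing can be turned back into readable C++ signatures. The sketch below uses abi::__cxa_demangle from <cxxabi.h>, which GCC and Clang provide; the helper name demangleDarwinSymbol is my own, and the stripping of one leading underscore assumes symbols copied from nm output on OSX.

```cpp
#include <cxxabi.h>
#include <cstdlib>
#include <string>

// Turn a symbol as printed by nm on OSX into a readable C++ signature.
// Symbols that are not mangled C++ names are returned unchanged.
std::string demangleDarwinSymbol(const std::string& symbol)
{
    // Drop the platform underscore: "__Z17..." becomes "_Z17...".
    const std::string mangled =
        (!symbol.empty() && symbol[0] == '_') ? symbol.substr(1) : symbol;

    // Mangled C++ names start with "_Z"; anything else is a plain C symbol.
    if (mangled.compare(0, 2, "_Z") != 0)
        return symbol;

    int status = 0;
    char* demangled = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
    if (status != 0 || demangled == nullptr)
        return symbol;

    std::string result(demangled);
    std::free(demangled); // the buffer is allocated with malloc
    return result;
}
```

With this helper, __Z17disableExtensionsj reads as disableExtensions(unsigned int), which shows how the trailing characters of a mangled name encode the argument types.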
Having seen such irrational global symbols lists, I am now
determined to find out how it should be done properly. Especially for
Pure Data, where it is not unusual to load dozens of external classes
within the main process, it seems relevant to keep their footprint and
demands as low as possible. Checking the 'Symbols Hidden by Default' option
in the project target
settings will set the -fvisibility=hidden flag for the compiler. This
will make all internal symbols invisible. I need to make an exception
for the setup function, and this is done
with an attribute.
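A sketch of what such an attribute can look like, using GCC's visibility attribute. The function bodies are hypothetical placeholders (no m_pd.h, no real class registration), and internal_helper only serves to show a function without the attribute.

```cpp
// Hypothetical stand-in for the real setup work.
static int setup_calls = 0;

// Exported explicitly, even when the target is compiled with -fvisibility=hidden.
extern "C" __attribute__((visibility("default"))) void soundtouch_tilde_setup(void)
{
    ++setup_calls;
}

// No attribute: this function follows the -fvisibility setting of the target,
// so it stays hidden when 'Symbols Hidden by Default' is checked.
int internal_helper(int x)
{
    return 2 * x;
}
```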
Running nm -g again on soundtouch~.pd_darwin, which is now compiled
with hidden symbols, gives a smaller list indeed. Below, I have copied
the whole of it. Apart from _soundtouch_tilde_setup, they are all
(undefined) references to functions outside soundtouch~.pd_darwin.
These include symbols from the Pure Data framework like _pd_new and
_post, and symbols from the standard C/C++ libraries. They are all
truly global symbols.
katja-vetters-macbook:~ katja$ nm -g
00000fe6 T _soundtouch_tilde_setup
The visibility flags and attributes are GCC-specific, and with the
attribute included the code is
no longer compiler-independent.
Fortunately, there is GCC for any platform where Pure Data could run,
including Windows. Alternatively, flag and attribute could be defined
conditionally for GCC and for other compilers having a similar
mechanism. This will introduce a lot of lines in the preprocessor
section, even though it is only used for one setup function. Here is
how you could conditionally define and use a macro EXPORT for GCC
version 4.0 or higher:
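One way such a conditional macro could be written is sketched below. The empty fallback for other compilers is an assumption on my part, and the setup body is again a placeholder without any Pd API calls.

```cpp
// Define EXPORT as the visibility attribute for GCC 4.0 or higher.
#if defined(__GNUC__) && __GNUC__ >= 4
#define EXPORT __attribute__((visibility("default")))
#else
// Assumption: other compilers get no attribute here and would need
// their own export syntax added in a further branch.
#define EXPORT
#endif

// Hypothetical flag standing in for the real class registration.
static bool setup_done = false;

extern "C" EXPORT void soundtouch_tilde_setup(void)
{
    setup_done = true;
}
```

The macro keeps the compiler-specific syntax in one place, so the function definition itself stays readable.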
For a Windows build (even when done with GNU tools) the attribute name is different. That is not shown here because I have not tested it yet, but the syntax can be found on the earlier mentioned page http://gcc.gnu.org/wiki/Visibility.
SoundTouch and the test utility SoundStretch come with scripts to build SoundStretch from the command line on platforms with GNU tools. On OSX, this can be done from Terminal.app, running ./configure and make. However, to build [soundtouch~] for Pd I use the Xcode environment, because that is where I wrote the code for the external. The Pure Data API header m_pd.h and the SoundTouch source files and includes must be added to the Xcode project, with the exception of some Windows-specific files. The includes directory should contain the header file soundtouch_config.h. Originally, there is only soundtouch_config.h.in. To generate the header file, run ./configure in the soundtouch directory as it is distributed.
In addition to compiler flags which are automatically set by Xcode,
the -fcheck-new flag should be set:
Other modifications to the target settings are the usual ones for Pd externals: the linker flag -undefined dynamic_lookup for the missing Pd API definitions, the executable extension .pd_darwin instead of .dylib, and the dismissal of the executable prefix 'lib'.
The SoundTouch library can perform time stretching, samplerate conversion and (by combining these two) pitch shifting. When building the library into a Pd class [soundtouch~], I had to choose between processing an input stream or processing sound files. Well actually I did not make a choice at all, opting for realtime processing right from the start. It means that my [soundtouch~] for Pd is a realtime pitch shifter, operating on signal vectors received at its inlet.
API functions and explanation of the SoundTouch library are in the file SoundTouch.cpp. When an instance of type SoundTouch is created (with C++ new operator), samplerate and number of channels must be specified (obligatory), next to stretch/rate/pitch factor, and some technical parameters. Then you can start feeding blocks of samples to it and reading processed blocks of samples. When you are done with the SoundTouch instance, destroy it with the delete operator.
With stretch and rate factors kept at 1, and only the pitch factor
modified, the SoundTouch API functions receive and return blocks of
samples at equal rate. Pd's regular signal vector blocks can be
conveniently used for this. With the explicit choice of transposing an
input stream, [soundtouch~] can not directly time-stretch a soundfile
or buffer, even though the library's stretch functions are used under
the skin. However, with some extra Pure Data objects a time stretch
routine is easily built, as I will illustrate further down on this page.
The pitch factor is the only user definable parameter which I
provided with its own inlet, since this is a musical parameter which
must be accessible during performance. Rapid changes in pitch can
sometimes result in audible clicks. It is possible to compile the
SoundTouch library with PREVENT_CLICK_AT_RATE_CROSSOVER defined. Time
stretching and samplerate conversion are then performed in different
order, and the clicks are gone. Unfortunately, this organisation
induces considerable (extra) latency, unacceptable for realtime
processing. No way of sailing between Scylla and Charybdis, you have
to deal with one of the two. I opted for the clicks rather than the
latency.
With messages to the left inlet of [soundtouch~], some technical
parameters can be set. 'Sequence' is what could also be called segment,
fragment or grain: the size of a signal portion that should be kept
together in the transformation. This size should be in the order of
fundamental frequencies to be found in the input signal, like 40 to 80
milliseconds. 'Seekwindow' is the size of the margin where SoundTouch
will seek for the best overlap position, using a cross-correlation
procedure. Overlap is the overlap region of two successive sequences
expressed in milliseconds. More details of these parameters are on
Olli's page http://www.surina.net/soundtouch/README.html#SoundStretch.
I found that for many sound sources and pitch settings, relatively
short sequence sizes (like 40 ms) give best results. Seekwindow and
overlap can be set in accordance, at 20 and 10 milliseconds
respectively. However, when trying to process sounds with high
polyphonic complexity, larger sequence and seekwindow sizes are
required to find suitable overlap spots. Although results will not be
great for such difficult sound material, expanding the working margins
for the SoundTouch routines does help. The downside: increased
processing latency and a more prominent echo effect.
Since [soundtouch~] for Pd works on a stream, it can not time-stretch an audio file all by itself. Additional Pure Data objects are needed to load (part of) a soundfile into a named buffer and play it at fractional speed, with interpolation. [soundtouch~] is used to do pitch correction, and ends up with the original pitch if you want, while the signal is played faster or slower. Below is a patch illustrating the idea. The loaded soundfile is played as a loop.
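The arithmetic behind this pitch correction is simple and can be sketched as follows. Playing a buffer at a given speed scales all frequencies by that speed, so the pitch factor that restores the original pitch is its reciprocal. The function names are hypothetical and do not come from the patch or from the SoundTouch API.

```cpp
#include <cmath>

// Pitch factor that brings a signal played at `speed` back to its original pitch.
// Example: half speed sounds an octave lower, so the correction factor is 2.
double pitchCorrectionFactor(double speed)
{
    return 1.0 / speed;
}

// The same factor expressed in semitones (12 semitones per octave).
double factorToSemitones(double factor)
{
    return 12.0 * std::log2(factor);
}
```

For a loop played at half speed, the correction is a factor 2, or 12 semitones up.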
To provide phase-coherent stereo processing of soundfiles,
[soundtouch~] can be initialized as an object with two signal inlets
and outlets. Left and right channels are interleaved, and processed by
the SoundTouch routines as a regular stereo signal.
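Interleaving two signal vectors into one stereo buffer, and splitting the result again, could look like the sketch below. The helper names and the use of std::vector are my own choices for illustration, and the left-before-right sample order is an assumption; the actual [soundtouch~] code works on Pd's signal vectors.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Merge two channels into one buffer: L0 R0 L1 R1 ...
std::vector<float> interleaveStereo(const std::vector<float>& left,
                                    const std::vector<float>& right)
{
    const std::size_t frames = std::min(left.size(), right.size());
    std::vector<float> out(2 * frames);
    for (std::size_t i = 0; i < frames; ++i) {
        out[2 * i]     = left[i];
        out[2 * i + 1] = right[i];
    }
    return out;
}

// Split an interleaved buffer back into separate channels.
void deinterleaveStereo(const std::vector<float>& in,
                        std::vector<float>& left,
                        std::vector<float>& right)
{
    const std::size_t frames = in.size() / 2;
    left.resize(frames);
    right.resize(frames);
    for (std::size_t i = 0; i < frames; ++i) {
        left[i]  = in[2 * i];
        right[i] = in[2 * i + 1];
    }
}
```

Because both channels travel through the same processing as one stereo stream, they receive identical overlap positions, which is what keeps them phase-coherent.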
There is some operational and mathematical redundancy when using Pd
objects to read a (stereo) signal, and then feeding this stream into
the SoundTouch routines. However, in practice this redundancy does not
translate to substantial waste of CPU time. The above patch is
responsible for about 1% CPU load on my 2 GHz MacBook. With modest
stretching and not too complex sound material, the output has fairly
good quality. A more serious bottleneck in Pd is the limited precision
of buffer index numbers which can be transferred among objects. The 32 bit
floating point format offers over 8 million unique indexes, but if some
resolution is to be reserved for interpolation, a buffer with about 20
seconds of audio is the maximum that can be handled. A specialized
class, i.e. [soundstretch~] for Pure Data, could be designed to operate
a large buffer directly and solve the imprecision problem. But is it
worth the effort?
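The imprecision can be made tangible by asking how far apart two neighbouring 32 bit float values are at a given buffer index. The sketch below does that with std::nextafter; the function name indexResolution is hypothetical, and the sample rate of 44.1 kHz used in the note after the code is an assumption.

```cpp
#include <cmath>
#include <limits>

// Distance from `index` to the next larger representable 32 bit float.
// This is the finest fractional step available for interpolation at that index.
float indexResolution(float index)
{
    return std::nextafter(index, std::numeric_limits<float>::infinity()) - index;
}
```

At index 882000 (20 seconds at 44.1 kHz) the step is 1/16 of a sample, which leaves only coarse positions for interpolation. At index 8388608 the step has grown to a full sample, so no fractional positions remain at all.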
For me, Pure Data's appeal is in its unlimited options for
processing input signals. Therefore, I am not going to worry about
soundfiles and index imprecision. I am more than happy with SoundTouch
built into [soundtouch~] as a realtime pitch shifter. My main purpose
for it is the transformation of speech and solo instruments in live
performance. Combined with the parametric
Fourier filter described on an earlier page, and enhancements like
filtering, compression and distortion, the pitch shifter should be able
to do a huge variety of
voice transformations. A single artist could impersonate a
multitude of characters with help of such a 'voicetrafo' module. More
on that on the page Instant
If you want to try [soundtouch~] for Pure
Data, download the .zip
file below. It contains binaries for Linux, OSX and Windows, a help
patch, and all the source code. This version of [soundtouch~] is
developed with SoundTouch version 1.6 and Pure Data version 0.42.