

[soundtouch~] for Pure Data



building a C++ library into a Pure Data 'class'                           download at page bottom


The previous page illustrated some of the persistent problems that come with time stretching and pitch shifting. For the purpose of illustration, I wrote elementary time domain and frequency domain pitch shifters for Pure Data. These objects showed the pernicious artifacts of elementary pitch shifting in their full glory. They are not suitable for serious signal processing.

While learning about pitch shifting and searching for code, I came across Olli Parviainen's SoundTouch C++ library at http://www.surina.net/soundtouch/. Though the technique in this library is not the cutting edge in time/pitch transformation, it is adequate for simple periodic signals, and efficient. Further, it is open source, well documented and user-friendly. SoundTouch is incorporated in many dsp packages, with Ardour and Audacity being notable examples.

The real time dsp package Pure Data has no pitch shift class on board, and SoundTouch is an evident candidate to fill this void. Writing a Pure Data 'class' [soundtouch~] was ultimately a piece of cake, as the SoundTouch library handles everything conveniently. However, compiling and linking a C++ library with a Pure Data 'class' is not completely straightforward, since Pure Data is written in C.

This page gives a detailed description of building SoundTouch into [soundtouch~] for Pd. I used Xcode on OSX, and came across several issues specific to Xcode or to the C/C++ mix. Therefore, a fair portion of this page applies to the (re)use of C++ libraries in Pure Data in general. Elementary topics of Pure Data class development are not covered here. A tutorial is on http://iem.at/pd/externals-HOWTO/. Hints for building Pd externals using Xcode are on http://puredata.info/docs/developer/PdExternalsInXcode.



symbols and extern "C"


The critical aspect of (re)using a C++ library within Pure Data is the table of symbols in the final product. In a mixed C / C++ project, a C++ compiler is used, which will compile C code as well. In the intermediate object files (.o files) and the built product (soundtouch~.pd_darwin in the case described here), all function symbols appear in what is called mangled form, which is C++ specific. This is no problem, except for the setup function of the Pure Data class. The setup function is the entry point for Pure Data to find and set up the class, and its symbol must follow Pd's conventional spelling, otherwise it will not be found. That is what goes wrong with mangled symbols. To get the setup function spelled as a regular C symbol, it must be declared (and/or defined) as extern "C", like this:




setupetc
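A minimal sketch of such a declaration could look as follows. It leaves out m_pd.h so that it compiles on its own; the flag class_registered is hypothetical and merely stands in for the class_new() call that a real setup function would make.

```cpp
// Sketch only. A real external would include m_pd.h and register the class
// with class_new() inside the setup function.
static bool class_registered = false;  // hypothetical flag, for illustration

// Declare the setup function with C linkage, so its symbol is not mangled.
extern "C" void soundtouch_tilde_setup(void);

// The definition inherits C linkage from the declaration above.
void soundtouch_tilde_setup(void)
{
    class_registered = true;  // stands in for the class_new() call
}
```

Only the setup function needs this treatment. Functions that are declared static can keep their mangled names.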



When building with Xcode, you can select the menu item 'build > show assembly code' and check how the symbols appear in the object files. A regular C symbol appears like you have written it in the code, with only an underscore prepended. Mangled symbols have additions, denoting aspects like argument type of a function. This is because C++ functions can be defined more than once for different argument types. Below, the difference between C symbol and mangled symbol is shown. Functions like soundtouch_dsp are all declared static and for them it does not matter whether to have mangled symbols or C symbols.





asm1

C symbol for soundtouch_tilde_setup()
asm2

mangled symbol for soundtouch_dsp()



Pure Data's API functions must be found by their C symbols as well. If you look in the API header file m_pd.h, you will see that the complete file content is conditionally declared extern "C". In this fashion, m_pd.h is well prepared to function either with C compiler or with C++ compiler.



cppwrapper

cppwrapper2
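The same conditional wrapper can be used in any header that must serve both C and C++ compilers. Below is a minimal sketch of the pattern; the function my_api_function is hypothetical and not part of Pd's API.

```cpp
// Header part: the extern "C" block is only seen by a C++ compiler,
// because a C compiler does not define __cplusplus.
#if defined(__cplusplus)
extern "C" {
#endif

int my_api_function(int value);  /* hypothetical API function */

#if defined(__cplusplus)
}
#endif

// Definition part; in a real project this would live in a separate source file.
// It gets C linkage because of the declaration above.
int my_api_function(int value)
{
    return value + 1;
}
```

A C compiler sees only the plain declaration, while a C++ compiler sees it wrapped in extern "C" and therefore emits an unmangled symbol.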



Is this extern "C" declaration of functions sufficient for the whole thing to work? No, the 'visibility' of symbols must also be organised, and that is a different thing.



hidden symbols and the -fvisibility flag


In Pure Data classes, most functions can be declared static, meaning they can only be called from within the file. Symbols of functions which must be called from other executable files should be visible beyond the file, and also beyond the executable. This holds for the setup function. Functions not specified as static are by default visible, from other files for static linking, and from other executables for dynamic linking. Other names for visible in this sense are: exported, global, external. The opposite is hidden, invisible, local.

For multiple-file dynamic libraries, symbol visibility cannot be organised at the language level (C or C++). Compiler/linker options and attributes must help here. Starting from version 4.0, GCC has a -fvisibility flag. -fvisibility=default will make all symbols in the target visible, unless specified otherwise. -fvisibility=hidden, in contrast, will hide all symbols within the target, unless specified otherwise. Exceptions to the general setting can be made by using attributes in the code. The syntax of flag and attribute is unfortunately compiler/platform-specific.

Hidden symbols, where possible, are recommended for several reasons. With fewer exported symbols to resolve, the dynamic linker can load a program faster. Further, the chance of symbol collision is reduced. Apple has pages on this topic, see [this link] and [this link]. GNU has info on http://gcc.gnu.org/wiki/Visibility. A thorough discussion of linking dynamic libraries of the ELF format is from Ulrich Drepper: http://people.redhat.com/drepper/dsohowto.pdf.



Xcode project templates


In Xcode, every new project starts from a template. Since Pure Data external classes reside in dynamic libraries, BSD Dynamic Library is normally the correct template. For a C++ project, there is choice between C++ Dynamic Library and C++ Standard Dynamic Library. What is the difference? With the C++ Standard Dynamic Library template, the template comment states that it will build a product with:

...'all symbols hidden by default'...

It took me a long time to find out and verify that this 'help' text does not reflect the actual template settings, as I will show later. Notice that this is in Xcode 3.0, and things may be different in other versions.




xcodetemplate


 


Compiler and linker flags of Xcode templates are normally reflected in the project- and target settings. For the C++ Standard Dynamic Library template, the 'Symbols Hidden by Default' option is unchecked, in contrast with the template help text:




targetsettings




In addition, there is a separate Xcode configuration file, showing the definition GCC_SYMBOLS_PRIVATE_EXTERN=NO. This config file can be accessed from the configuration files folder in the .xcodeproj window of the project, and here is what it looks like:




config1




To see the actual GCC flags, have a look at GCC's build transcript after building the target (click the indicated icon to see the transcript). In this case, there is no general -fvisibility flag at all, only a visibility flag applying to inline functions:




buildresults




Apparently, the 'help' comments for the two C++ library templates were swapped relative to the actual template settings. For the C++ Dynamic Library template, settings are like this: 'Symbols Hidden by Default' is checked, the config file has GCC_SYMBOLS_PRIVATE_EXTERN=YES, and the GCC flag -fvisibility=hidden will appear in the build transcript.

Such confusion can be very time-consuming. On the other hand, without it, I might never have bothered about hidden symbols so much. Which symbols are in fact exported when the -fvisibility flag is not set at all? The GNU command nm (run in Terminal.app) lets you inspect the symbols table of an object file (.o file) or executable. nm can give you a complete symbols table, with the symbols prepended by characters indicating their local/global status and other characteristics. Upper case characters are reserved for global symbols, while lower case characters indicate local symbols. It is also possible to run nm with the -g option and list global symbols exclusively. Here is a fragment of the global symbols listed in soundtouch~.pd_darwin, as built without -fvisibility flag:



katja-vetters-macbook:~ katja$ nm -g /Applications/Pd-extended.app/Contents/Resources/extra/soundtouch\~.pd_darwin

/Applications/Pd-extended.app/Contents/Resources/extra/soundtouch~.pd_darwin(single module):
         U __Unwind_Resume
00001ea0 T __Z17disableExtensionsj
00001eb4 T __Z19detectCPUextensionsv
00002fda T __ZN10soundtouch10PeakFinder10detectPeakEPKfii
00002cfa T __ZN10soundtouch10PeakFinderC1Ev
00002ce4 T __ZN10soundtouch10PeakFinderC2Ev
00004aee T __ZN10soundtouch10SoundTouch10putSamplesEPKfj
0000445c T __ZN10soundtouch10SoundTouch10setSettingEii
000049e8 T __ZN10soundtouch10SoundTouch11setChannelsEj
00004384 T __ZN10soundtouch10SoundTouch12getVersionIdEv
0000490a T __ZN10soundtouch10SoundTouch13setRateChangeEf
0000438e T __ZN10soundtouch10SoundTouch13setSampleRateEj
000048c4 T __ZN10soundtouch10SoundTouch14setTempoChangeEf
...
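The mangled names in this listing can be turned back into readable C++ signatures. The sketch below does this with abi::__cxa_demangle from cxxabi.h, which is available with GCC and Clang. The helper name demangle_nm_symbol is hypothetical. It assumes the OSX convention that nm shows every symbol with one extra leading underscore, which must be removed before demangling.

```cpp
#include <cxxabi.h>
#include <cstdlib>
#include <string>

// Decode a symbol as printed by nm on OSX (hypothetical helper).
std::string demangle_nm_symbol(const std::string& nm_symbol)
{
    // Mach-O prepends one underscore to every symbol; drop it first.
    std::string name = nm_symbol;
    if (!name.empty() && name[0] == '_') name.erase(0, 1);

    // Mangled C++ names start with "_Z"; anything else is a plain C symbol.
    if (name.compare(0, 2, "_Z") != 0) return name;

    int status = 0;
    char* readable = abi::__cxa_demangle(name.c_str(), nullptr, nullptr, &status);
    if (status != 0 || readable == nullptr) return name;  // leave undecodable names as they are

    std::string result(readable);
    std::free(readable);  // the demangler allocates with malloc
    return result;
}
```

For example, __ZN10soundtouch10SoundTouch13setSampleRateEj decodes to soundtouch::SoundTouch::setSampleRate(unsigned int), which shows how namespace, class and argument type are all encoded in the symbol.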




This is only a small part of the complete list. Altogether, hundreds of symbols are exported. Only the C functions which I declared static do not appear in the global symbols list. For a single Pure Data object, this seems disproportionate. Most of it is redundant: exported symbols which are not intended to be global at all. For larger libraries, the list can get huge. For example, the Gem class for Pure Data in OSX binary distributions has over 40,000 symbols exported.

Having seen such irrational global symbols lists, I am now determined to find out how it should be done properly. Especially for Pure Data, where it is not unusual to load dozens of external classes within the main process, it seems relevant to keep their footprint and demands as low as possible. Checking the 'Symbols Hidden by Default' option in the project target settings will set the -fvisibility=hidden flag for the compiler. This will make all internal symbols invisible. I need to make an exception for the setup function, and this is done with an attribute:



attributeetc
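With GCC 4.0 or later, such an exception could be written as in the sketch below. The counter setup_calls and the function internal_helper are hypothetical and only serve to make the example complete without m_pd.h.

```cpp
static int setup_calls = 0;  // hypothetical counter, for illustration

// Not static, so other files in the same library can call it. When compiled
// with -fvisibility=hidden, it is still not exported from the library.
int internal_helper(int count)
{
    return count + 1;
}

// The attribute overrides -fvisibility=hidden for this one function,
// and extern "C" keeps its symbol unmangled.
extern "C" __attribute__((visibility("default")))
void soundtouch_tilde_setup(void)
{
    setup_calls = internal_helper(setup_calls);
}
```

The attribute has no effect when the code is compiled without -fvisibility=hidden, since default visibility is then already in force.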



Running nm -g again on soundtouch~.pd_darwin, which is now compiled with hidden symbols, gives a smaller list indeed. Below, I have copied the whole of it. Apart from _soundtouch_tilde_setup, they are all (undefined) references to functions outside soundtouch~.pd_darwin. These include symbols from the Pure Data framework like _pd_new and _post, and symbols from the standard C/C++ libraries. They are all truly global symbols.



katja-vetters-macbook:~ katja$ nm -g /Applications/Pd-extended.app/Contents/Resources/extra/soundtouch\~.pd_darwin

/Applications/Pd-extended.app/Contents/Resources/extra/soundtouch~.pd_darwin(single module):
         U __Unwind_Resume
         U __ZNSaIcEC2Ev
         U __ZNSaIcED2Ev
         U __ZNSsC1EPKcRKSaIcE
         U __ZNSsD2Ev
         U __ZNSt13runtime_errorC1ERKSs
         U __ZNSt13runtime_errorD1Ev
         U __ZNSt8ios_base4InitC1Ev
         U __ZNSt8ios_base4InitD1Ev
         U __ZSt9terminatev
         U __ZTISt13runtime_error
         U __ZTVN10__cxxabiv117__class_type_infoE
         U __ZTVN10__cxxabiv120__si_class_type_infoE
         U __ZdaPv
         U __ZdlPv
         U __Znam
         U __Znwm
         U ___assert_rtn
         U ___cxa_allocate_exception
         U ___cxa_atexit
         U ___cxa_free_exception
         U ___cxa_pure_virtual
         U ___cxa_throw
         U ___gxx_personality_v0
         U _class_addmethod
         U _class_domainsignalin
         U _class_new
         U _cos
         U _dsp_add
         U _expf
         U _freebytes
         U _gensym
         U _getbytes
         U _inlet_new
         U _memcpy
         U _memmove
         U _memset
         U _outlet_new
         U _pd_new
         U _post
         U _pow
         U _printf
         U _resizebytes
         U _s_signal
         U _sin
00000fe6 T _soundtouch_tilde_setup




The visibility flags and attributes are GCC-specific, and with the attribute included the code is no longer compiler-independent. Fortunately, there is GCC for any platform where Pure Data could run, including Windows. Alternatively, flag and attribute could be defined conditionally for GCC and for other compilers having a similar mechanism. This will introduce a lot of lines in the preprocessor section, even though it is only used for one setup function. Here is how you could conditionally define and use a macro EXPORT for GCC version 4.0 or higher:



export


export2
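A sketch of such a conditional macro is given below. It covers GCC 4.0 or higher only; for any other compiler EXPORT expands to nothing, which is an assumption made here for simplicity. The flag export_demo_ran is hypothetical.

```cpp
// Define EXPORT as the visibility attribute when the compiler is GCC 4.0 or
// higher (Clang also defines __GNUC__); otherwise let it expand to nothing.
#if defined(__GNUC__) && (__GNUC__ >= 4)
#define EXPORT __attribute__((visibility("default")))
#else
#define EXPORT
#endif

static bool export_demo_ran = false;  // hypothetical flag, for illustration

// Use of the macro on the declaration of the setup function.
extern "C" EXPORT void soundtouch_tilde_setup(void);

void soundtouch_tilde_setup(void)
{
    export_demo_ran = true;  // a real setup function would call class_new() here
}
```

Keeping the attribute behind a macro leaves a single place to add branches for other compilers later.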



For a Windows build (even when done with GNU tools) the attribute name is different. That is not shown here because I have not tested it yet, but the syntax can be found on the earlier mentioned page http://gcc.gnu.org/wiki/Visibility.



other configuration details


SoundTouch and the test utility SoundStretch come with scripts to build SoundStretch from the command line on platforms with GNU tools. On OSX, this can be done from Terminal.app, running ./configure and make. However, to build [soundtouch~] for Pd I use the Xcode environment, because that is where I wrote the code for the external. The Pure Data API header m_pd.h and the SoundTouch source files and includes must be added to the Xcode project, with the exception of some Windows-specific files. The includes directory should contain the header file soundtouch_config.h. Originally, there is only soundtouch_config.h.in. To generate the header file, run ./configure in the soundtouch directory as it is distributed.

In addition to compiler flags which are automatically set by Xcode, the -fcheck-new flag should be set:


compilerflag



Other modifications to the target settings are the usual ones for Pd externals: the linker flag -undefined dynamic_lookup for the missing Pd API definitions, the executable extension .pd_darwin instead of .dylib, and the dismissal of the executable prefix 'lib'.



SoundTouch and [soundtouch~]


The SoundTouch library can perform time stretching, samplerate conversion and (by combining these two) pitch shifting. When building the library into a Pd class [soundtouch~], I had to choose between processing an input stream or processing sound files. Well, actually I did not make a choice at all, opting for realtime processing right from the start. It means that my [soundtouch~] for Pd is a realtime pitch shifter, operating on signal vectors received at its inlet.



soundtouch~



API functions and explanation of the SoundTouch library are in the file SoundTouch.cpp. When an instance of type SoundTouch is created (with C++ new operator), samplerate and number of channels must be specified (obligatory), next to stretch/rate/pitch factor, and some technical parameters. Then you can start feeding blocks of samples to it and reading processed blocks of samples. When you are done with the SoundTouch instance, destroy it with the delete operator.

With stretch and rate factors kept at 1, and only the pitch factor modified, the SoundTouch API functions receive and return blocks of samples at equal rate. Pd's regular signal vector blocks can be conveniently used for this. With the explicit choice of transposing an input stream, [soundtouch~] can not directly time-stretch a soundfile or buffer, even though the library's stretch functions are used under the skin. However, with some extra Pure Data objects a time stretch routine is easily built, as I will illustrate further down on this page.
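The block-in, block-out exchange can be sketched as below. This is not the perform routine of [soundtouch~]. The function process_block is hypothetical, and PassThrough is a hypothetical stand-in that only mimics the put/receive shape of the library class, so that the sketch compiles without SoundTouch. The method names putSamples and receiveSamples follow the library's interface as I understand it.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for the library object: it buffers samples and
// hands them back unchanged.
struct PassThrough {
    std::vector<float> fifo;

    void putSamples(const float* in, unsigned n)
    {
        fifo.insert(fifo.end(), in, in + n);
    }

    unsigned receiveSamples(float* out, unsigned max_samples)
    {
        unsigned n = std::min<unsigned>(max_samples, static_cast<unsigned>(fifo.size()));
        std::copy(fifo.begin(), fifo.begin() + n, out);
        fifo.erase(fifo.begin(), fifo.begin() + n);
        return n;
    }
};

// Feed one signal vector in and read one vector out. If fewer samples are
// ready than requested, the rest of the output block is filled with silence.
template <typename Processor>
unsigned process_block(Processor& processor, const float* in, float* out, unsigned n)
{
    processor.putSamples(in, n);
    unsigned ready = processor.receiveSamples(out, n);
    for (unsigned i = ready; i < n; ++i) out[i] = 0.0f;
    return ready;
}
```

Because the input is copied into the processor before the output is written, the sketch also works when in and out point to the same memory.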

The pitch factor is the only user-definable parameter which I provided with its own inlet, since this is a musical parameter which must be accessible during performance. Rapid changes in pitch can sometimes result in audible clicks. It is possible to compile the SoundTouch library with PREVENT_CLICK_AT_RATE_CROSSOVER defined. Time stretching and samplerate conversion are then performed in a different order, and the clicks are gone. Unfortunately, this organisation induces considerable (extra) latency, unacceptable for realtime processing. There is no way of sailing between Scylla and Charybdis, you have to deal with one of the two. I opted for the clicks rather than the extra latency.

With messages to the left inlet of [soundtouch~], some technical parameters can be set. 'Sequence' is what could also be called segment, fragment or grain: the size of a signal portion that should be kept together in the transformation. This size should be in the order of fundamental frequencies to be found in the input signal, like 40 to 80 milliseconds. 'Seekwindow' is the size of the margin where SoundTouch will search for the best overlap position, using a cross-correlation procedure. 'Overlap' is the overlap region of two successive sequences, expressed in milliseconds. More details of these parameters are on Olli's page http://www.surina.net/soundtouch/README.html#SoundStretch.




settings





I found that for many sound sources and pitch settings, relatively short sequence sizes (like 40 ms) give best results. Seekwindow and overlap can be set in accordance, at 20 and 10 milliseconds respectively. However, when trying to process sounds with high polyphonic complexity, larger sequence and seekwindow sizes are required to find suitable overlap spots. Although results will not be great for such difficult sound material, expanding the working margins for the SoundTouch routines does help. The downside: increased processing latency and a more prominent echo effect.



time stretching


Since [soundtouch~] for Pd works on a stream, it can not time-stretch an audio file all by itself. Additional Pure Data objects are needed to load (part of) a soundfile into a named buffer and play it at fractional speed, with interpolation. [soundtouch~] is used to do pitch correction, and ends up with the original pitch if you want, while the signal is played faster or slower. Below is a patch illustrating the idea. The loaded soundfile is played as a loop.





timestretching
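The pitch correction in such a patch is simple arithmetic: playing a buffer at a fractional speed scales the pitch by that same factor, so the shifter must apply the reciprocal to restore the original pitch. A small sketch follows, with hypothetical function names. The conversion to semitones is added for illustration only.

```cpp
#include <cmath>

// Playing at 'speed' (1.0 = normal) multiplies all frequencies by 'speed'.
// The pitch factor that undoes this is the reciprocal.
double pitch_correction_factor(double speed)
{
    return 1.0 / speed;
}

// The same factor expressed in equal-tempered semitones.
double factor_to_semitones(double factor)
{
    return 12.0 * std::log2(factor);
}
```

At half speed the signal drops one octave, so the correction is a factor 2, or 12 semitones up.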





To provide phase-coherent stereo processing of soundfiles, [soundtouch~] can be initialized as an object with two signal inlets and outlets. Left and right channels are interleaved, and processed by the SoundTouch routines as a regular stereo signal.
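Interleaving means that the left and right samples of each frame are stored next to each other in one buffer. A minimal sketch of the conversion in both directions, with hypothetical function names:

```cpp
#include <cstddef>

// Merge two mono signal vectors into one buffer: L0 R0 L1 R1 ...
void interleave(const float* left, const float* right, float* stereo, std::size_t frames)
{
    for (std::size_t i = 0; i < frames; ++i) {
        stereo[2 * i] = left[i];
        stereo[2 * i + 1] = right[i];
    }
}

// Split an interleaved buffer back into two mono signal vectors.
void deinterleave(const float* stereo, float* left, float* right, std::size_t frames)
{
    for (std::size_t i = 0; i < frames; ++i) {
        left[i] = stereo[2 * i];
        right[i] = stereo[2 * i + 1];
    }
}
```

The interleaved buffer must hold twice the number of frames, and it must not overlap with the mono vectors.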

There is some operational and mathematical redundancy when using Pd objects to read a (stereo) signal, and then feeding this stream into the SoundTouch routines. However, in practice this redundancy does not translate to substantial waste of CPU time. The above patch is responsible for about 1% CPU load on my 2 GHz MacBook. With modest stretching and not too complex sound material, the output has fairly good quality. A more serious bottleneck in Pd is the limited precision of buffer index numbers which can be transferred among objects. The 32-bit floating point format offers over 8 million unique indexes, but if some resolution is to be reserved for interpolation, a buffer with about 20 seconds of audio is the maximum that can be handled. A specialized class, for instance [soundstretch~] for Pure Data, could be designed to operate on a large buffer directly and solve the imprecision problem. But is it worth the effort?
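The loss of index resolution can be made concrete by asking how far apart two neighbouring 32-bit float values are at a given index. The sketch below uses std::nextafter for this; the function name index_resolution is hypothetical, and a sample rate of 44.1 kHz is assumed in the example that follows.

```cpp
#include <cmath>
#include <limits>

// Distance from 'index' to the next larger representable 32-bit float.
// This is the finest fractional step available for interpolation there.
float index_resolution(float index)
{
    return std::nextafter(index, std::numeric_limits<float>::infinity()) - index;
}
```

At 20 seconds into a 44.1 kHz buffer (index 882000), the step is 1/16 of a sample, which leaves little room for interpolation between samples. From index 16777216 onward the step is 2, so even whole sample positions can no longer all be addressed.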

For me, Pure Data's appeal is in its unlimited options for processing input signals. Therefore, I am not going to worry about soundfiles and index imprecision. I am more than happy with SoundTouch built into [soundtouch~] as a realtime pitch shifter. My main purpose for it is the transformation of speech and solo instruments in live performance. Combined with the parametric Fourier filter described on an earlier page, and enhancements like filtering, compression and distortion, the pitch shifter should be able to do a huge variety of voice transformations. A single artist could impersonate a multitude of characters with the help of such a 'voicetrafo' module. More about that on the page Instant Decomposer.


If you want to try [soundtouch~] for Pure Data, download the .zip file below. It contains binaries for Linux, OSX and Windows, a help patch, and all the source code. This version of [soundtouch~] is developed with SoundTouch version 1.6 and Pure Data version 0.42.


soundtouch~.zip, 216 KB