
Parametric Fourier Filter

I have always been a fan of Cool Edit's built-in Fourier filter, which can do extreme yet quite precise filtering. For some time I daydreamed about making a similar filter for real-time audio processing. The curve in my filter should be controlled by a few parameters, and not by drawing innumerable points on the computer screen. The key to a musically effective multiband filter is a logarithmic distribution of center frequencies and bandwidths, which is better set by mathematical formulas than by eye and hand.


When starting out, I was still unaware of some issues related to spectral processing, notably the resolution/latency trade-off. Soon, it became clear that the linear frequency spacing of a DFT spectrum would not allow a 32-band filter if, at the same time, an acceptable latency for real-time purposes were to be retained.

The Fourier filter cannot replace a time domain filter (bank) in all situations. While learning about these limitations the hard way, by trial and error (because I am too stubborn to take wise people's words for granted), I worked out my parametric Fourier filter and made the best of it, given the conditions. The result so far is a relatively light-weight and flexible routine, combining the appreciated sound characteristic with a surprising ease of handling. Let me reveal some details of the concept.


The essence of my filter's amplitude spectrum is a plain logarithmic curve. On the left, part of the natural logarithm curve ln(x) is plotted. For a logarithm of any base, x[0] always has y=-infinity and x[1] has y=0.


A very famous one is the base-2 logarithm. It is not on most pocket calculators, and it was not in the older standard C library (C99 later added log2()). But it is easily computed with log2(x) = ln(x)/ln(2). I have set black dots at the base-2 logarithms of 1, 2, 4, and 8.
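As a quick check of the change-of-base formula above, here is a minimal Python sketch. The function name `log2_via_ln` is mine, chosen only for illustration.

```python
import math

def log2_via_ln(x):
    """Base-2 logarithm via the change-of-base rule ln(x) / ln(2)."""
    return math.log(x) / math.log(2.0)

for x in (1, 2, 4, 8):
    # Rounded to hide floating-point noise in the last digit.
    print(x, round(log2_via_ln(x), 6))
```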

Base 2 or radix 2 is the number system for computers, the binary system, so base 2 exponents and logarithms are implicitly used all the time. It also happens to be the 'octave-logarithm', describing how frequencies relate to octaves in pitch perception.

Spectrum coefficients represent frequencies in terms of harmonics of the DFT fundamental. The fundamental is a sinusoid with one cycle fitting exactly in the analysis frame. Coefficient x[0] represents the DC component, and x[1] represents the DFT fundamental. Its frequency in Hz depends on the DFT size and samplerate, but coefficient x[2] is always the second harmonic, being the octave of x[1]. x[4] has another octave, then x[8], x[16], x[32] and so on. The pattern is that the octaves are at x[2^n], n being an integer. In between coefficient x[0] and x[1] are infinitely many more lower octaves, none of which can be represented as such. An impression of that is plotted below. The black dots indicate spectrum coefficients and the continuous line shows peaks at octaves.
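The relation between bin index, frequency and octaves can be sketched in a few lines of Python. The function names and the example values (512 point DFT, 44100 Hz) are my own assumptions for illustration.

```python
def bin_frequency_hz(k, fft_size, sample_rate):
    """Frequency of coefficient x[k], the k-th harmonic of the DFT fundamental."""
    return k * sample_rate / fft_size

def octave_bins(fft_size):
    """Bin indices 1, 2, 4, 8, ... below fft_size/2, each one octave above the last."""
    bins = []
    k = 1
    while k < fft_size // 2:
        bins.append(k)
        k *= 2
    return bins

# Example with assumed values: 512 point DFT at 44100 Hz.
for k in octave_bins(512):
    print(k, round(bin_frequency_hz(k, 512, 44100), 1))
```

Only a handful of bins land on octaves of the fundamental, and there is nothing at all between x[0] and x[1].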


From this, it is clear that filterbands smaller than octaves are impossible in the lowest frequency region. Higher up, the pitch resolution gets better, and it is possible to create narrow peaks at octave intervals:


Upward from bin x[30], the resolution would allow isolated bands at third-octave intervals, as plotted below:


Especially the third-octave bands make a multiband filter 'sing' richly and harmonically, but larger intervals have interesting characters as well. By the way, a time domain third-octave filter with all its bands up is not supposed to sing at all, but due to phase cancellations such filters have that tendency, which has become an appreciated sound in itself. With Fourier filtering this can be pushed to the extreme, but the lowest bands are not controlled with precision and tend to just disappear in many cases.

I could now compute what FFT size is theoretically required to do such extreme filtering. Say I want to control third octaves from 40 Hz up with fair precision. My first guess is that a 2^15 point FFT would do. Let's see: 2^15 is 32768 points. With samplerate 44k1 this makes an FFT fundamental of 44100/32768 = 1.3458 Hz. Harmonic 30 is at 30*1.3458 = 40.4 Hz in that case, and that is where the filtering would start being more or less precise. A 32768 point FFT is not so extravagant in itself and may take a millisecond of CPU time, but all these samples have to be collected before a frame can be processed. That takes 743 milliseconds! Blah....
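The arithmetic above is easy to repeat for other FFT sizes. Below is a small Python sketch; the function names are hypothetical, and harmonic 30 is used as the point where third-octave filtering starts to be reasonably precise, as in the text.

```python
def fft_fundamental_hz(fft_size, sample_rate):
    """Frequency of the FFT fundamental (one cycle per analysis frame)."""
    return sample_rate / fft_size

def frame_latency_ms(fft_size, sample_rate):
    """Time needed to collect one full frame of samples, in milliseconds."""
    return 1000.0 * fft_size / sample_rate

sample_rate = 44100
for exponent in (10, 13, 15):
    n = 2 ** exponent
    f0 = fft_fundamental_hz(n, sample_rate)
    # Columns: FFT size, fundamental in Hz, harmonic 30 in Hz, latency in ms.
    print(n, round(f0, 4), round(30 * f0, 1), round(frame_latency_ms(n, sample_rate)))
```

Every doubling of the FFT size halves the frequency where precise filtering starts, and doubles the time spent waiting for samples.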

Now you see where the bottleneck is with spectral filtering. I could not find a compromise that pairs acceptable latency with the precision needed for my most extreme filtering desires. I will have to switch FFT sizes according to the purpose, and not use the third octaves (or the Fourier filter at all) when timing is crucial. In practice, I found a 2^13 point FFT sufficient for needle-thin bands at third-octave intervals. I typically use wide-band noisy input for such filtering, and the lower bands seem fairly well preserved. A 2^10 point FFT cannot do this, but will serve for most other purposes. This FFT length can also be abused for extreme filtering, but it may introduce artefacts, which are welcomed in some cases and abhorred in others. More on artefacts later; let me now illustrate the mathematical operations that I use to construct multiband filter spectra.


Indexing starts at x[0] but I want to avoid the minus infinity result of ln(0), which would propagate through all subsequent operations and might lead to Not A Number somewhere. I simply add 1 to x before taking the logarithm. This is a distortion of the perfect logarithmic flow, but it also makes the disappearing-band effects less systematic, which is an advantage.

The logarithmic curve will supply the phase arguments for a (co)sine function, to create a logarithmic sweep. In order to normalise the sweep somehow to the FFT size, the logarithmic curve is first multiplied by pi/ln(FFTsize/2). The FFT's half framesize equals the number of spectrum coefficients, with the conjugates left out. The multiplication does not take away the logarithmic character of the curve.
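The two steps just described (adding 1 before the logarithm, then scaling by pi/ln(FFTsize/2)) can be sketched as follows. This is my own minimal Python rendering with a hypothetical function name, not the actual routine.

```python
import math

def log_phase_curve(fft_size):
    """Phase argument per bin: ln(k + 1) scaled by pi / ln(fft_size / 2)."""
    half = fft_size // 2                  # number of coefficients, conjugates left out
    scale = math.pi / math.log(half)
    # The + 1 avoids ln(0) at bin 0.
    return [scale * math.log(k + 1) for k in range(half)]

curve = log_phase_curve(512)
print(len(curve), curve[0], curve[-1])
```

With this scaling the curve starts at 0 for bin 0 and reaches pi at the last bin, so a sine of it completes exactly half a cycle over the spectrum.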


Here is the example for one filterband on a 512 point spectrum. The asymmetric appearance matches the logarithmic nature of pitch perception.


Multiplying the one-band log curve by 2 before computing the sinusoid makes the sine rotate over a total of 2 pi within the 512 point spectrum.


With the abs() function the sweep is rectified, and exactly two filterbands result.


By simple multiplication of the initial log curve I can obtain any desired number of filterbands. It need not be an integer number. Here I have set 2.5 bands.
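The rectified sweep with an adjustable number of bands could look like this in Python. It is a sketch under the assumption that a sine (rather than a cosine) is used; the function name is hypothetical.

```python
import math

def rectified_sweep(fft_size, bands):
    """abs(sin()) of the scaled log curve; 'bands' need not be an integer."""
    half = fft_size // 2
    scale = math.pi / math.log(half)
    return [abs(math.sin(bands * scale * math.log(k + 1))) for k in range(half)]

two_bands = rectified_sweep(512, 2)
# For 2 bands on a 512 point spectrum the phase is pi/2 at bin 3,
# pi at bin 15 and 3*pi/2 at bin 63: peak, dip, peak.
print(two_bands[3], two_bands[15], two_bands[63])
```

The peaks sit at bins 3 and 63, more than four octaves apart in bin index yet symmetric on the logarithmic axis, which is the asymmetric look of the plots.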


Raising the complete function to a power alters the slopes. A power between 0 and 1 is actually root extraction and makes the slopes less steep, down to a flat spectrum at power 0. Be careful not to set a power below zero!


Powers above 1 make steeper slopes. Here is the rectified sweep raised to the power of 3.
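A minimal Python sketch of the power step follows, assuming the rectified sine sweep described earlier. The function name and the explicit guard against negative powers are my additions.

```python
import math

def shaped_sweep(fft_size, bands, power):
    """Rectified log sweep raised to 'power' (0 = flat, above 1 = steeper slopes)."""
    if power < 0:
        # A negative power would divide by zero at the dips of the sweep.
        raise ValueError("power must not be negative")
    half = fft_size // 2
    scale = math.pi / math.log(half)
    return [abs(math.sin(bands * scale * math.log(k + 1))) ** power
            for k in range(half)]

soft = shaped_sweep(512, 2, 0.5)   # root extraction, gentler slopes
steep = shaped_sweep(512, 2, 3)    # steeper slopes
print(soft[10], steep[10])
```

Since all sweep values lie between 0 and 1, a higher power pulls everything except the peaks closer to zero.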


Of course I also want to control the center frequency of the filterbands. This can be done by a simple phase shift of the sweep. Very conveniently, the bandwidth varies with position and it seems to retain its Q factor.


The constant Q works because the sweep originates from a logarithmic curve. The plot here illustrates how a -pi/2 phase shift makes dips exactly at the positions of the peaks in the original function, and vice versa.
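The peak/dip swap can be verified numerically. This Python sketch adds a phase shift to the rectified sine sweep; the function name and the choice of sine are assumptions of mine.

```python
import math

def shifted_sweep(fft_size, bands, shift=0.0):
    """Rectified log sweep with a phase shift (radians) that moves the bands."""
    half = fft_size // 2
    scale = math.pi / math.log(half)
    return [abs(math.sin(bands * scale * math.log(k + 1) + shift))
            for k in range(half)]

original = shifted_sweep(512, 2)
shifted = shifted_sweep(512, 2, -math.pi / 2)
# Bin 3 is a peak and bin 15 a dip in the original; the shift swaps them.
print(original[3], shifted[3])
print(original[15], shifted[15])
```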


Finally, I multiply the sweep by a factor of 1.5 and clip at y=1. Each band will then get a flat region, which gives it better definition.
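In Python the gain-and-clip step is a one-liner. The function name and parameter names below are hypothetical.

```python
def flatten_peaks(sweep, gain=1.5, ceiling=1.0):
    """Scale the sweep and clip it, so every peak gets a flat top."""
    return [min(ceiling, gain * v) for v in sweep]

print(flatten_peaks([0.0, 0.5, 0.8, 1.0]))
```

With a gain of 1.5, every sweep value from 2/3 upward ends up at exactly 1.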

The passive filtering process generally reduces the output energy, and with narrow peaks the loss can be substantial. Therefore I implemented a partial amplitude compensation, linked to the power function. The integrated area under the sweep diminishes roughly in inverse proportion to the square root of the power to which the sweep is raised. A power of 16 would theoretically require an amplitude compensation factor of 3.5, and a power of 64 (my maximum) a factor of 7. Due to perception issues and limited headroom, that is way too much in practice. For now, I have set a maximum amplitude gain factor of 2, which may be a little conservative.
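The square-root relation can be checked numerically. The Python sketch below averages |sin|^power over one half cycle of the phase, which is a simplification: it ignores how the bands map onto linear bins. The `compensation_gain` function is purely hypothetical, one possible way to cap the gain at 2, and not necessarily the mapping used in my filter.

```python
import math

def mean_sweep_area(power, steps=10000):
    """Mean of sin(theta)**power over one half cycle, by the midpoint rule."""
    total = 0.0
    for i in range(steps):
        theta = math.pi * (i + 0.5) / steps
        total += math.sin(theta) ** power
    return total / steps

def compensation_gain(power, max_gain=2.0):
    """Hypothetical partial compensation: sqrt(power), kept between 1 and max_gain."""
    return min(max_gain, max(1.0, math.sqrt(power)))

for power in (1, 4, 16, 64):
    print(power, round(mean_sweep_area(power), 3), compensation_gain(power))
```

Quadrupling the power roughly halves the area, which is what an inverse square root relation predicts.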

After some experimenting, I decided my filter could do with three user parameters:
- number of filterbands (from 1 to 32)
- Q factor adjustment (starting at flat spectrum)
- frequency position of filterband(s)
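Putting the pieces together, the three parameters could drive a single function like the one below. This is a Python sketch, not the actual routine: the names are hypothetical, the Q adjustment is assumed to be the power, the frequency position is assumed to be the phase shift, and applying the 1.5 gain and clip after the power is my guess at the order.

```python
import math

def parametric_spectrum(fft_size, bands, power, shift=0.0):
    """Amplitude spectrum from three parameters.

    bands: number of filterbands, need not be an integer
    power: slope / Q control, 0 gives a flat spectrum
    shift: phase shift in radians, moves the band positions
    """
    if power < 0:
        raise ValueError("power must not be negative")
    half = fft_size // 2
    scale = math.pi / math.log(half)
    spectrum = []
    for k in range(half):
        phase = bands * scale * math.log(k + 1) + shift
        value = abs(math.sin(phase)) ** power
        spectrum.append(min(1.0, 1.5 * value))   # gain and clip for flat tops
    return spectrum

spectrum = parametric_spectrum(1024, bands=8, power=4, shift=0.3)
print(len(spectrum), min(spectrum), max(spectrum))
```

Each value would then multiply the matching FFT coefficient, real and imaginary part alike, before the inverse transform.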

All settings in the examples below are done with just these parameters:

(1) almost flat spectrum
(2) increased Q factor

(3) further increased Q factor
(4) frequency shift, Q factor retained

(5) more filterbands
(6) frequency shift, Q factor retained

(7) further Q factor increase
(8) more filterbands

On artefacts

Creating steep slopes in a filter spectrum can introduce artefacts in the filtered signal, as I could clearly hear. How come? I think it is because a frequency in a signal is never captured by one single Fourier coefficient; its correlation is always spread over several neighbouring bins. Zeroing out part of these correlation coefficients and leaving the others will create frequencies which were not in the input signal.
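The spreading over neighbouring bins is easy to observe with a naive DFT. The Python sketch below uses a tiny 64 point frame without a window, and all names and the 0.01 threshold are my own choices for illustration. It compares a sinusoid that fits the frame exactly with one that falls halfway between two bins.

```python
import cmath
import math

def dft_magnitudes(signal):
    """Magnitudes of the first N/2 DFT coefficients (naive O(N^2) transform)."""
    n = len(signal)
    return [abs(sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))) / n
            for k in range(n // 2)]

def sine_frame(cycles, n):
    """One frame holding 'cycles' periods of a sine wave."""
    return [math.sin(2 * math.pi * cycles * t / n) for t in range(n)]

def count_significant_bins(magnitudes, threshold=0.01):
    return sum(1 for m in magnitudes if m > threshold)

on_bin = dft_magnitudes(sine_frame(8.0, 64))
between_bins = dft_magnitudes(sine_frame(8.5, 64))
print(count_significant_bins(on_bin), count_significant_bins(between_bins))
```

The sinusoid at 8.5 cycles per frame shows up in many coefficients at once, so a filter that keeps some of them and zeroes the rest no longer reconstructs a clean sinusoid.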

For a short while, I was seriously disappointed by the poor results of Fourier filtering. Then I remembered about FFT window types influencing the spectral smearing character. I tried overlapping 4 FFT frames instead of the regular 2 frames. The extra overlap turned out to be more decisive than the window type. This resolved my problem to a large extent, at the price of extra FFTs. But FFTs are not so expensive at all. At least I could now use the Fourier filter with more extreme settings for delicate signal types like voice processing. Later on, I did detailed experiments to find the best windowing options for the Fourier filter. These are documented on the page FFT Windowing & Filtering.

I prototyped the Fourier filter as a Max/Msp patch. The patch is extensively commented on the next page, Fourier filter in Max/Msp.