
IPPASS finds the missing fundamental

Instantaneous Power of Phase-Aligned Signal Segment

Finding periodicity and period length in a signal is essential for some pitch shifting algorithms. It is often very easy to identify periods by eye in a signal plot. The example below has an obvious repetitive character. But how can a computer routine determine the pattern length? The signal has several zero-crossings per period, and several peaks as well.

signal with four equally strong harmonics

Phases of the harmonics in the above test signal were randomly selected, which makes it hard to identify a period start. This may seem unrealistic, but I've found that acoustic sounds do not always have their phases aligned either. Anyhow, a process to align phases is often used in pitch detection: autocorrelation. Below is a plot of the autocorrelation function of the example signal, together with spikes indicating the autocorrelation peak heights and locations:

unbiased autocorrelation of signal with four equally strong harmonics

With the spikes being so different in height, it is now easy to identify the period length as the interval between two peaks of considerable height.

Short-time autocorrelation has some similarity with convolution. A segment of the signal is taken, then time-reversed, and used as a convolution kernel on that same segment. As a result, phases are set to zero, and the original amplitudes are squared. Autocorrelation output has length [input length * 2 - 1].

Just as autoconvolution of a block results in a triangle, autocorrelation functions have a triangularish shape, which is regularly undone by an inverse scaling function (unbiasing). Autocorrelation is most efficiently done via the frequency domain, as the inverse FFT of the power spectrum.
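The frequency-domain route with unbiasing can be sketched as follows. This is a minimal illustration assuming NumPy; the function name `unbiased_autocorrelation`, the period of 100 samples and the search range are my own choices for the demo, not taken from the original patch. Only the non-negative lags are returned, which is half of the full [input length * 2 - 1] output.

```python
import numpy as np

def unbiased_autocorrelation(x):
    """Autocorrelation via the power spectrum, lags 0..N-1, unbiased."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Zero-pad to 2N so the circular result has room and does not fold over.
    nfft = 2 * n
    spectrum = np.fft.rfft(x, nfft)
    power = spectrum.real ** 2 + spectrum.imag ** 2
    acf = np.fft.irfft(power, nfft)[:n]
    # Undo the triangular taper: lag k is a sum over only N - k products.
    return acf / (n - np.arange(n))

# Test signal: four equally strong harmonics with random phases.
rng = np.random.default_rng(0)
period = 100                      # samples, arbitrary choice for this sketch
t = np.arange(1000)
x = sum(np.cos(2 * np.pi * h * t / period + rng.uniform(0, 2 * np.pi))
        for h in range(1, 5))

acf = unbiased_autocorrelation(x)
# Skip the lags near zero and stop before the second period.
lag = 20 + int(np.argmax(acf[20:150]))
print(lag)
```

Without the division by `n - k`, later peaks would shrink along the triangle and the comparison of peak heights would be biased towards short lags.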

missing fundamentals

It is not uncommon for acoustic sounds to exhibit a harmonic recipe with a weak fundamental. Typically, this happens with low-register tones in vocals and instruments for which the resonator is too small to amplify the low notes. Below is an example test signal featuring three harmonics with amplitude ratios 0.2, 1.0 and 0.2. Phases are randomised again, and it is hard to imagine how a computer could tell the period length here, even though the signal lobes are still somewhat different for the eye.

signal with weak fundamental

Indeed, the autocorrelation function of the above signal is not very conclusive. Sure, the peaks are still a bit different in height, but natural sound level fluctuations and noise would marginalize such differences. This is a likely cause for errors in autocorrelation analysis. The dreadful octave errors!

autocorrelation of signal with weak fundamental

PASS and IPPASS help out

Using the amplitude spectrum instead of the power spectrum, a signal segment can be made with the original amplitudes of the components and all phases set to zero. Here is the PASS function (Phase-Aligned Signal Segment) corresponding to the signal with weak fundamental:

PASS of signal with weak fundamental
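A minimal sketch of this construction, assuming NumPy and a test signal whose harmonics fall exactly on FFT bins (frame size 512 and fundamental on bin 8 are arbitrary choices of mine). It compares how strongly the misleading half-period peak shows up in PASS and in the circular autocorrelation of the same frame.

```python
import numpy as np

N = 512                           # FFT frame size (assumed)
bin0 = 8                          # fundamental on bin 8: period = N / 8 = 64 samples
amps = {1: 0.2, 2: 1.0, 3: 0.2}   # weak-fundamental recipe from the text

rng = np.random.default_rng(1)
n = np.arange(N)
x = sum(a * np.cos(2 * np.pi * h * bin0 * n / N + rng.uniform(0, 2 * np.pi))
        for h, a in amps.items())

amplitude = np.abs(np.fft.rfft(x))   # amplitude spectrum, phases discarded
power = amplitude ** 2               # power spectrum

pass_segment = np.fft.irfft(amplitude, N)   # PASS: every component a cosine
autocorr = np.fft.irfft(power, N)           # circular autocorrelation

period = N // bin0
half = period // 2
# Height of the half-period peak relative to the true period peak.
pass_ratio = pass_segment[half] / pass_segment[period]
acf_ratio = autocorr[half] / autocorr[period]
print(pass_ratio, acf_ratio)
```

For this recipe the ratio is 0.6 / 1.4 for PASS against 0.92 / 1.08 for the autocorrelation, because squaring the amplitudes shrinks the weak harmonics from 0.2 to 0.04 relative to the strong one. No zero-padding is used here, which is only safe because the test signal is periodic in the frame.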

The PASS function's periodicity peaks are more pronounced than in the autocorrelation. From PASS, it is possible to construct another function which really stands out in the missing-fundamental case: IPPASS, Instantaneous Power of Phase-Aligned Signal Segment.

IPPASS of signal with weak fundamental

For comparison, here is the test signal once more from which the above IPPASS was derived:

signal with weak fundamental

analytic signals

The IPPASS function is derived from the two phases of an analytic signal, zero-phase and quadrature phase. If a test signal is made of cosine components plus sine components of the same weight, we already have an analytic signal. When the signal is connected to an x-y oscilloscope, you can see the amplitude curve which is followed by the signal. The amplitude is a radius on the complex plane. A pure sinewave describes a circle, but a harmonic recipe shows periodic amplitude modulations. It is funny that we do not hear the amplitude modulations as such. Only when a false note is played do we start to perceive explicit amplitude modulations: beats or roughness.

The x-y plots below show a few examples, with their harmonic recipes indicated as fractions again. The second plot has the weak-fundamental recipe from the earlier test signal. In each case, the amplitude plot describes several (sub)cycles, but there is only one large amplitude peak per full period.

1.0 + 1.0

0.2 + 1.0 + 0.2

1.0 + 1.0 + 1.0 + 1.0

The IPPASS function uses the square of the amplitude, the instantaneous power. The range of the power function is expanded, making it easier to distinguish small peaks from large peaks.

For some harmonic recipes, the instantaneous amplitude curve describes several equal traces per period. The case plotted below has two odd harmonics, a clarinet-like recipe. For such cases, the IPPASS function does not help in tracking the period, but instead it is confusing. And in the case of a pure sinusoid, IPPASS shows a flat line. Fortunately, such signals show unambiguous peaks in the autocorrelation or in the phase-aligned signal segment. PASS and IPPASS are sort of complementary in this sense.
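The difference between the helpful and the confusing cases can be worked out by hand for idealised, perfectly phase-aligned harmonics. The symbols are my own shorthand: a_h is the amplitude of harmonic h, T the period in samples, and P the instantaneous power as a function of the phase angle within one period.

```latex
\begin{align*}
P(\theta) &= \Bigl|\sum_h a_h e^{ih\theta}\Bigr|^2, \qquad \theta = 2\pi n / T \\
\text{recipe } 0.2 + 1.0 + 0.2:\quad
P(\theta) &= \bigl|e^{2i\theta}\,(1 + 0.4\cos\theta)\bigr|^2 = (1 + 0.4\cos\theta)^2 \\
\text{recipe } 1.0 + 0.0 + 0.8:\quad
P(\theta) &= \bigl|e^{i\theta}\,(1 + 0.8\,e^{2i\theta})\bigr|^2 = 1.64 + 1.6\cos 2\theta \\
\text{pure sinusoid}:\quad
P(\theta) &= \bigl|e^{i\theta}\bigr|^2 = 1
\end{align*}
```

The weak-fundamental recipe has one maximum per period (1.96 at the period start, 0.36 halfway). The odd-harmonic recipe depends on 2θ only, so it reaches the same maximum of 3.24 twice per period, and the pure sinusoid gives a constant.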

1.0 + 0.0 + 0.8

mathematical definition

IPPASS is the instantaneous power of a phase-aligned signal segment. Alignment of the phases is required to concentrate as much signal energy as possible in a main periodical peak, analogous to the purpose of autocorrelation. That aspect is best described as a frequency-domain characteristic, the amplitude spectrum of a signal segment:

where k are the frequency bin indexes

The PASS function is taken as the zerophase signal resulting from inverse Fourier transform of the amplitude spectrum:

where n are the PASS sample indexes

Likewise, a quadrature phase version of the phase-aligned segment is created as the inverse Fourier transform of the amplitude spectrum multiplied by -i:

Together the zerophase and quadrature phase signal segments form an analytic signal segment, of which the instantaneous power can be computed by squaring the phase samples and summing them pointwise:
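In symbols, the four steps might be written as below. The notation is an assumption of mine: x[n] is the input segment of N samples, X[k] its DFT, and Q[n] the quadrature segment. The multiplication by -i is read here as applying to the positive-frequency bins, with the negative-frequency bins mirrored so that the result stays real, which is what a real inverse FFT does implicitly.

```latex
\begin{align*}
A[k] &= \bigl|X[k]\bigr|, \qquad X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-2\pi i k n / N} \\
\mathrm{PASS}[n] &= \frac{1}{N} \sum_{k=0}^{N-1} A[k]\, e^{2\pi i k n / N} \\
Q[n] &= \frac{1}{N} \sum_{k=0}^{N-1} \bigl(-i\, s[k]\bigr)\, A[k]\, e^{2\pi i k n / N},
\qquad
s[k] = \begin{cases} +1 & 0 < k < N/2 \\ -1 & N/2 < k < N \\ 0 & \text{otherwise} \end{cases} \\
\mathrm{IPPASS}[n] &= \mathrm{PASS}[n]^2 + Q[n]^2
\end{align*}
```

With this sign convention each pair of mirrored bins contributes 2 A[k] cos(2πkn/N) / N to PASS and 2 A[k] sin(2πkn/N) / N to Q.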

practical implementation

So far, all seems simple and straightforward, but in practice it is not at all so easy to produce a neat phase-aligned signal segment. The example test signals above had frequencies harmonizing with the FFT size to make a nice demo. But a sinusoid not harmonizing with the FFT size has a lot of real and imaginary coefficients even if the signal segment is at phase zero with respect to the FFT frame. Illustrations of this can be found on the page 'FFT output'.
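The effect is easy to reproduce. The sketch below, assuming NumPy and an arbitrary FFT size of 256, compares a zero-phase cosine with exactly 8 cycles per frame against one with 8.5 cycles; the helper name `spectrum_summary` and the threshold are made up for the illustration.

```python
import numpy as np

N = 256                       # FFT size (arbitrary for this sketch)
n = np.arange(N)

def spectrum_summary(cycles, threshold=1e-6):
    """Count non-negligible bins and report the largest imaginary part."""
    x = np.cos(2 * np.pi * cycles * n / N)      # zero phase at frame start
    X = np.fft.rfft(x) / (N / 2)                # scale so a full bin reads 1.0
    return int(np.sum(np.abs(X) > threshold)), float(np.max(np.abs(X.imag)))

bins_on, imag_on = spectrum_summary(8.0)     # exactly 8 cycles per frame
bins_off, imag_off = spectrum_summary(8.5)   # halfway between two bins
print(bins_on, imag_on)
print(bins_off, imag_off)
```

The bin-aligned cosine occupies a single real coefficient. The half-bin case spreads over every bin of the real FFT, and the largest imaginary parts, in the two bins next to the true frequency, are more than half the size of a full-scale coefficient.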

Making all FFT coefficients real positive creates an output signal with cosines only, but this is not quite the zero-phase version of the input signal which I had naively hoped for. Here is an illustrative example:

zero-phase input signal

inverse FFT of amplitude spectrum of above signal

Wait, is it not the case that phase changes in the spectrum make signal tails fold over, like with circular convolution? For this reason autocorrelation is done with zero-padding so the output gets space to grow.

Let's have a look at the phase-aligned version of a rectangular window with zero-padding. Bah, it looks like an onion!

IFFT of magnitude spectrum of rectangular zero-padded window
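A small sketch, assuming NumPy, that reproduces this kind of experiment; the window length of 256 samples and the padding to twice that length are arbitrary choices of mine.

```python
import numpy as np

M = 256                       # rectangular window length (arbitrary)
N = 2 * M                     # frame length after zero-padding
rect = np.zeros(N)
rect[:M] = 1.0

# 'Phase-aligned' version: inverse FFT of the magnitude spectrum.
aligned = np.fft.irfft(np.abs(np.fft.rfft(rect)), N)

# The result is symmetric around sample 0 (circularly), as zero phase demands,
is_even = bool(np.allclose(aligned[1:], aligned[1:][::-1]))
# but it is neither a rectangle nor a triangle: it piles up around sample 0
# and swings negative around the middle of the frame.
print(is_even, aligned[0], aligned[M // 2], aligned[M])
```

Taking the magnitude flips the sign of every other sidelobe of the window spectrum, so the inverse transform is not simply a shifted copy of the window.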

There's really no way to factor that onion contour out of an arbitrary 'phase-aligned' signal segment. The input signal must be Hann-windowed, that's for sure. Now a new problem arises. Normally, windowing is undone at the output by overlapping segments. But the phase-aligned output segments cannot be integrated to form a continuous signal stream, since they start at phase zero every time, regardless of the input phase! That's why I'm consistently speaking of 'signal segment' all the time. There is nothing to overlap here. The segments must be 'unwindowed' by division, brrr...

That's a pity. This nice IPPASS function must be calculated with the help of such an ugly thing as division by the window. I've been puzzling for days to find a better solution, but this might be one of those impossible missions. I'm now using something in between a Hann and a Hamming window as a compromise, avoiding too small numbers in the denominator. It needs refinement.

Update: a Gaussian window seems to work better.
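One way to read 'unwindowing by division' is sketched below, assuming NumPy. The frame size, the Gaussian width `sigma`, the period of 100 samples and the lag range are all arbitrary choices for this illustration, and dividing by the window rotated to sample 0 is my interpretation, not necessarily what the demo patch does.

```python
import numpy as np

N = 2048                      # frame size (arbitrary choice)
period = 100                  # samples; deliberately not a divisor of N
sigma = N / 10                # Gaussian width in samples (arbitrary choice)
recipe = [0.2, 1.0, 0.2]      # weak-fundamental recipe from the text

rng = np.random.default_rng(2)
n = np.arange(N)
x = sum(a * np.cos(2 * np.pi * h * n / period + rng.uniform(0, 2 * np.pi))
        for h, a in enumerate(recipe, start=1))

# Gaussian analysis window centred on the middle of the frame.
window = np.exp(-0.5 * ((n - N / 2) / sigma) ** 2)

amplitude = np.abs(np.fft.rfft(x * window))
zero_phase = np.fft.irfft(amplitude, N)
quadrature = np.fft.irfft(-1j * amplitude, N)

# After phase alignment the window envelope ends up centred on sample 0,
# so divide by the window rotated to that position.
window_at_zero = np.fft.ifftshift(window)
max_lag = 3 * period          # stay where the window is not too small
pass_flat = zero_phase[:max_lag] / window_at_zero[:max_lag]
ippass_flat = ((zero_phase[:max_lag] ** 2 + quadrature[:max_lag] ** 2)
               / window_at_zero[:max_lag] ** 2)

# Largest IPPASS value within one period around the expected position.
lag = period // 2 + int(np.argmax(ippass_flat[period // 2 : 3 * period // 2]))
print(lag, pass_flat[0], ippass_flat[period])
```

A Gaussian has a Gaussian spectrum without sign-alternating sidelobes, so taking the magnitude removes only the phase, which may be part of why it behaves better here than the rectangular 'onion' case. The division still amplifies noise at large lags, where the window is small.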

Anyway, I can now use the functions on real-world input. Here is a plot showing autocorrelation and PASS/IPPASS as derived from an acoustic input signal. It is my voice saying a noisy 'uuuu'. Before analysis, the input signal was steeply low-pass filtered with cutoff at 1 kHz. Here, autocorrelation is on the brink of failure, while PASS and notably IPPASS have no problem indicating the correct period length.

Of course I produced that sound on purpose to demonstrate the qualities of IPPASS. But frankly, there are also cases where IPPASS is good for nothing. Like here, where the (filtered) input signal approaches a pure sinewave:

I have no clue if the function which I've now baptized 'IPPASS' is already in use for any purpose. In particular, I hope it is not taken hostage in the patent war on pitch-tracking. In my view, math functions should be free for all to use.

If you want to see the IPPASS function in action, download IPPASS05.pd, a demo patch for Pd-extended:

IPPASS05.pd.zip, 8 KB, patch for Pd-extended