<<home <<previous next>>

FFT Output

Since Fast Fourier Transform is complex Fourier Transform by nature, the output has real and imaginary parts for positive and negative frequencies. The input of forward transform can be real or complex. In the case of audio signals, the signal input is generally real, although the imaginary input is often 'abused' to transform another (part of a) real signal. For the moment, I will feed real vectors in the forward transform and plot the complex conjugated output. Below is a transform of 64 points for cos((2*pi*x/64))*10), which is the 10th harmonic of the FFT fundamental. The output has just two real coefficients of value 0.5 each.

The transform output was 'normalised' with factor 1/N. What do the Fourier coefficients express in this case? They are certainly not amplitudes, not even amplitudes of cosine and sine parts. Recall that the amplitude of cos(x) is 1/sqrt(2), the familiar value 0.707andsomething. The transform of the pure cosine gives real coefficients with total value 0.5+0.5=1. So, it shows amount-of-cosine rather than amplitude.

Below is the transform of the same cosine wave, except it had a phase shift of 0.4 radians:

Rounded to 2 decimals, the coefficient values are:

harmonic nr 10, real part: 0.46, imaginary part: 0.19
harmonic nr 54, real part: 0.46, imaginary part: -0.19

Doing Pythagoras with these coefficients will also give 0.5+0.5=1. Again, we get an amount of phase-shifted cosine, not the amplitude of the input vector. If you want amplitudes, they can simply be computed by applying the 0.707etc factor to the amount-of-sinusoid results.

The Fourier coefficients could be fed into an inverse FFT, with 'normalisation' factor 1. Then we get our cosine of harmonic nr 10 with phaseshift 0.4 back. Here it is, having real parts only, as should be the case:

As you see, there is much more energy in the signal than in the spectrum. That is because of the normalisation type. Therefore I write 'normalisation' factor 1/N, inbetween quote marks. It is rather a norma-normalisation factor, the square of the factor 1/sqrt(N) which would be normalisation proper in the mathematical sense. Anyway, we have this convenient method of deriving (co)sine parts for the analysis, and getting our input vector back if needed.

I suffer perpetual confusion about signs of sines, and now is a good moment to explore this topic with the help of plots. Recall that forward Fourier Transform is supposed to do correlation with negative frequencies, being considered a block-convolution rather than correlation. In the example plot below, a cosine vector with a positive phase-shift of pi/2 radians looks like a negative-signed sine vector, but the Fourier Transform shows a positive correlation for the imaginary part at harmonic 1. Is that logical?

I was specially concerned about this when I checked someone else's FFT library and found coefficients with the signs flipped. Was I mistaken all the time? After all I am a starter in this field, and prone to believe that Stanford profs know it better. But correlation with negative frequencies in the forward transform is also a matter of tradition, and it is legitimately done the other way round at some places. That adds even more to the confusion. Anyway, one should make a clear statement about what output can be expected from specific FFT code, that is what I have learned from it.

There is good reason to implement Fourier Transform as a convolution and not correlation. The transform gives information about the frequency content over the past period, so it is analysed as a reflection, starting at the most recent sample and working backwards. You can compute a phase shift of cosine directly from real and imaginary parts in the spectrum. In the example above, the correlation at harmonic 1 is (0+0.5i). In polar coordinates this will give phase 1.57 indeed. How convenient. I will stay with my convolution type FFT.

Next case. Below is a plot of cos((2*pi*x/64)*10.2) transformed. The input is a pure cosine, but it is not a harmonic of the basic FFT interval. There is a correlation with real and imaginary parts of nearly all frequencies, although it still peaks at harmonics 10-11 and 53-54.

A plot like the above is somewhat disquieting again. Does my FFT work properly? I think it does. But what is the use of coefficients that do not faithfully represent input frequencies?

That depends on the purpose of the transform. The fun thing is that the input vector can be perfectly reconstructed with an inverse FFT, even though the coefficients suggest dramatic inaccuracy.

I have experimented, recycling the abovementioned cosine vector one million times through forward and inverse FFT. It takes a couple of seconds, but after being transformed two million times the input vector comes out with no sign of degradation in the first six decimals. That makes me suspicious, so I checked the process, inserted a counter reporting the number of function calls etcetera. Hmm, it seems indeed that we need not worry about the accuracy of the reconstruction.

The eventual 'smearing' of correlation over the spectrum is inherent to the mathematical process of Discrete Fourier Transform. All texts on Fourier Transform treat this topic in detail. I wanted to plot some cases and see how bad it really is. The 64 point spectrum looked pretty much spoiled indeed. Below is a 1024 point transform of another cosine input that does not harmonize with the FFT fundamental:

For a larger frame like this, the smearing effect does not seem quite so disastrous. The coefficients outside the peak regions now correlate in the order 0.001.

Since the inverse transform supplies such a reliable reconstruction of the signal, I still wonder what the practical problems caused by spectral smearing look like. In practical applications, the spectrum would be manipulated somehow before being reverted into a signal. Imagine the case of a dynamically changing manipulation, or fast convolution with a response that stretches over more than one frame. Would you then see sudden jumps at the boundaries of the processed signal frames, where they are 'glued' together? This asks for more detailed tests for which I will need to write code. I will postpone that for a later page, and now continue with a related question.

From the moment I started experimenting with Max Msp's FFT objects one year ago, I wondered how they could reconstruct a real time signal so faithfully. Think of it: in a 1024 point FFT there are 512 frequency bins below Nyquist. For a regular sample rate of 44.1 KHz this means a fundamental frequency of about 43 Hz. And also an interval of 43 Hz between each of the harmonics in the transform. How can an FFT have knowledge of the inbetween frequencies, and restore these?

Slowly, I started to recognize that the information must be in the phase differences of subsequent frames. The frequency bins are discrete, like the signal samples, but the phase information is a continuum. At least, to the extent of the datatype's precision. I had a 'Fourierscope' patch in which the smooth exchange between real and imaginary coefficients showed up, when a sinusoid was fed in. With gnuplot connected to my own homebrew FFT utils, I can now check this in more detail.

phase0

phase1

phase2

phase3

phase4

A sinusoid that does 10.2 cycles per FFT frame, completes 60 cycles over 5 FFT frames. It will be phase shifted with 0.2 cycle over each new FFT frame. That is about 1.25 radians per frame, and a full cycle over 5 frames. I plotted the transforms of 5 successive frames, so you can follow how a steady input sinusoid makes the correlations rotate over the real and imaginary coefficients.

Halfway, there is a remarkable correlation with cosines exclusively. How come? I think it is because the input waveform is then positioned symmetrically within the FFT frame, and cosines are also symmetric.

Since the inverse Fourier Transform is so good at restoring the original input vector, the signal will also get it's phaseshift back. Therefore, it does not matter what frequency it has, whether it comes near to one of the FFT harmonics or not. The reconstruction is perfect in any case.

Plotting 60 cycles of a cosine will not render a helpful illustration, but the effect is similar to 6 cycles of a cosine with periodicity 1.2 as plotted below:

Now I want to explore a question that is related to the above topic. It is said that Fourier coefficients are not localized in time or space, in contrast with Wavelet coefficients. Indeed, Fourier Transform computes an average for the presence of each frequency, high or low, over the full width of a frame shared by all frequencies. The coefficients of the fundamental frequency of an FFT frame show exactly the detail there is to show about that range. But for all higher frequencies, details in time domain may exist, which are 'summarized' in the frequency coefficients.

Does that mean the transform wipes the time position of (sudden) signal events within the frame? Dzjee, that would make Fourier Transform a pretty lousy tool for signal processing! Of course it is not the case, but how does a spectrum retain precise information about time domain events? To find out, I transformed vectors containing single pulses at different time positions.

kronecker0

kronecker1

kronecker2

kronecker1617

The first transform is of a so-called Kronecker delta at time zero. The Kronecker delta means just a discrete value 1 at a certain position and zero elsewere. The output looks rather dull. I have learnt, and believed, that a single pulse contains all frequencies in equal amount. The plot supports this common knowledge. The energy of the pulse signal is spread over all 64 real frequency coefficients, positive, negative, DC and Nyquist. They correlate 0.015625 each after 1/N 'normalisation'.

The second transform is of a Kronecker delta at time x[1]. Does it still contain all frequencies in equal amounts? Yes, but with different phases. It has the full flat spectrum, now multiplied by a complex sinusoid unit vector.

Transform of a Kronecker delta at time x[2] shows another familiar pattern, of sinusoids with periodicity 2 this time.

The Fourier Transform is really indifferent about time domain or frequency domain, it just transforms one function to another, and it happens to be the case that pulses transform to sinusoids and vice versa.

It also shows that time delay translates to rotation of the phase spectrum. The position of a pulse in time domain can be accurately stored in complex frequency information.

Look for example how signal pulses x[16]=1 and x[17]=-1 are stored in a spectrum. These signal pulses are also perfectly reconstructed by an inverse FFT. I did not plot that here because it looks so dull, you just have to believe that I checked it.

Together, the spectrum coefficients keep a perfectly accurate record of the position of time/space events. It is just the way they store the information that makes them so different from Wavelet coefficients. You can not, by means of the Fourier coefficients, manipulate frequencies over intervals smaller than the framesize. Fourier coefficients do not lend themselves for meticulous data reduction. They are not helpful in keeping the fingerprint records of United States citizens manageable. However, you can do fast convolution with them. Fourier coefficients are supercool.

^top <<home <<previous next>>