Since Fast Fourier Transform is complex Fourier Transform by nature,
the output has real and imaginary parts for positive and negative
frequencies. The input of forward transform can be real or complex. In
the case of audio signals, the signal input is generally real, although
the imaginary input is often 'abused' to transform another (part of a)
real signal. For the moment, I will feed real vectors in the forward
transform and plot the complex conjugated output. Below is a transform
of 64 points for cos((2*pi*x/64))*10), which is the 10th harmonic of
the FFT fundamental. The output has just two real coefficients of value
0.5 each.
![]() |
The transform output was 'normalised' with factor 1/N. What do the
Fourier coefficients express in this case? They are certainly
not amplitudes, not even amplitudes of cosine and sine parts. Recall
that the amplitude of cos(x) is 1/sqrt(2), the familiar value
0.707andsomething. The transform of the pure cosine gives real
coefficients with total value 0.5+0.5=1. So, it shows amount-of-cosine
rather than amplitude.
Below is the transform of the same cosine wave, except it had a
phase shift of 0.4 radians:
![]() |
Rounded to 2 decimals, the coefficient values are:
harmonic nr 10, real part: 0.46, imaginary part: 0.19
harmonic nr 54, real part: 0.46, imaginary part: -0.19
Doing Pythagoras with these coefficients will also give 0.5+0.5=1.
Again, we get an amount of phase-shifted cosine, not the amplitude of
the input vector. If you want amplitudes, they can simply be computed
by applying the 0.707etc factor to the amount-of-sinusoid results.
The Fourier coefficients could be fed into an inverse FFT, with
'normalisation' factor 1. Then we get our cosine of harmonic nr 10 with
phaseshift 0.4 back. Here it is, having real parts only, as should be
the case:
![]() |
As you see, there is much more energy in the signal than in the
spectrum. That is because of the normalisation type. Therefore I write
'normalisation' factor 1/N, inbetween quote marks. It is rather a
norma-normalisation factor, the square of the factor 1/sqrt(N) which
would be normalisation proper in the mathematical sense. Anyway, we
have this convenient method of deriving (co)sine parts for the
analysis, and getting our input vector back if needed.
-
I suffer perpetual confusion about signs of sines, and now is a
good moment to explore this topic with the help of plots. Recall that
forward Fourier Transform is supposed to do correlation with negative
frequencies, being considered a block-convolution rather than
correlation. In the example plot below, a cosine vector with a positive
phase-shift of pi/2 radians looks like a negative-signed sine vector,
but the Fourier Transform shows a positive correlation for the
imaginary part at harmonic 1. Is that logical?
![]() |
![]() |
I was specially concerned about this when I checked someone else's FFT
library and found coefficients with the signs flipped. Was I mistaken
all the time? After all I am a starter in this field, and prone to
believe that Stanford profs know it better. But correlation with
negative frequencies in the forward transform is also a matter of
tradition, and it is legitimately done the other way round at some
places. That adds even more to the confusion. Anyway, one should make a
clear statement about what output can be expected from specific FFT
code, that is what I have learned from it.
There is good reason to implement Fourier Transform as a convolution
and not correlation. The transform gives information about the
frequency content over the past period, so it is analysed as a
reflection, starting at the most recent sample and working backwards.
You can compute a phase shift of cosine directly
from real and imaginary parts in the spectrum. In the example above,
the correlation at
harmonic 1 is (0+0.5i). In polar coordinates this will give phase 1.57
indeed. How convenient. I will stay with my convolution type FFT.
-
Next case. Below is a plot of cos((2*pi*x/64)*10.2) transformed. The
input is a
pure cosine, but it is not a harmonic of the basic FFT interval. There
is a correlation with real and imaginary parts of nearly all
frequencies, although it still peaks at harmonics 10-11 and 53-54.
![]() |
A plot like the above is somewhat disquieting again. Does my FFT
work
properly? I think it does. But what is the use of coefficients that do
not faithfully represent input frequencies?
That depends on the purpose
of the transform. The fun thing is that the input vector can be
perfectly reconstructed with an inverse FFT, even though the
coefficients
suggest dramatic inaccuracy.
I have experimented, recycling the abovementioned cosine vector one
million times
through forward and inverse FFT. It takes a couple of seconds, but
after being transformed two million times the input vector comes out
with no sign of degradation in the first six decimals. That makes me
suspicious, so I checked the process, inserted a counter reporting the
number of function calls etcetera. Hmm, it seems indeed that we need
not worry about the accuracy of the reconstruction.
The eventual 'smearing' of correlation over the spectrum is inherent
to the
mathematical process of Discrete Fourier Transform. All texts on
Fourier Transform treat this topic in detail. I wanted to plot some
cases and see how bad it really is. The 64 point spectrum looked pretty
much spoiled indeed. Below is a 1024 point transform of another cosine
input that does not harmonize with the FFT fundamental:
![]() |
For a larger frame like this, the smearing effect does not seem
quite so disastrous. The coefficients outside the peak regions now
correlate in the order 0.001.
Since the inverse
transform supplies such a reliable reconstruction of the signal, I
still wonder what the practical problems caused by spectral
smearing look like. In
practical applications, the spectrum would be manipulated somehow
before being reverted into a signal. Imagine the case of a dynamically
changing manipulation, or fast
convolution with a response that stretches over more than one frame.
Would you then see sudden jumps
at the boundaries of the processed signal frames, where they are
'glued'
together? This asks for more detailed tests for which I will need to
write code.
I will postpone that for a later page, and now continue with a related
question.
-
From the moment I started experimenting with Max Msp's FFT objects
one year ago, I wondered how they could reconstruct a real time signal
so faithfully. Think of it: in a 1024 point FFT there are 512
frequency bins below Nyquist. For a regular sample rate of 44.1 KHz
this means a fundamental frequency of about 43 Hz. And also an interval
of 43 Hz between each of the harmonics in the transform. How can an FFT
have knowledge of the inbetween frequencies, and restore these?
Slowly,
I started to recognize that the information must be in the phase
differences of subsequent frames. The frequency bins are discrete, like
the signal samples, but the phase information is a continuum. At least,
to the extent of the datatype's precision. I had a 'Fourierscope' patch
in which the smooth exchange between real and imaginary coefficients
showed up, when a sinusoid was fed in. With gnuplot connected to my own
homebrew FFT utils, I can now check this in more detail.
|
A sinusoid that does 10.2 cycles per FFT frame, completes 60
cycles over 5 FFT frames. It will be phase
shifted with 0.2 cycle over each new FFT frame. That is about 1.25
radians per frame, and a full cycle over 5 frames. I plotted the
transforms of 5 successive frames, so you can
follow how a steady input sinusoid makes the correlations rotate over
the real and imaginary coefficients.
|
Plotting 60 cycles of a cosine will not render a helpful
illustration, but the effect is similar to 6 cycles of a cosine with
periodicity 1.2 as plotted below:
![]() |
Now I want to explore a question that is related to the above topic.
It is said that Fourier coefficients are not localized in time or
space, in contrast with Wavelet coefficients. Indeed, Fourier Transform
computes an average for the presence of each frequency, high or low,
over the full width of a frame shared by all frequencies. The
coefficients of the
fundamental frequency
of an FFT frame show exactly the detail there is to show about that
range. But
for all higher frequencies, details in time domain may exist, which are
'summarized' in the frequency coefficients.
Does that mean the
transform wipes the time position of (sudden) signal events within the
frame? Dzjee, that would make Fourier Transform a pretty lousy tool for
signal processing! Of course it is not the case, but how does a
spectrum retain precise information about time domain events? To find
out, I transformed vectors containing single pulses at different time
positions.
|
The first transform is of a so-called Kronecker delta at time
zero. The Kronecker delta means just a discrete value 1 at a certain
position and zero elsewere. The output looks rather dull. I have
learnt,
and believed, that a single pulse contains all frequencies in equal
amount. The plot supports this common knowledge. The energy of the
pulse signal is spread over all 64 real frequency coefficients,
positive, negative, DC and Nyquist. They correlate 0.015625 each after
1/N 'normalisation'. The second transform is of a Kronecker delta at time x[1].
Does it
still contain all frequencies in equal amounts? Yes, but with different
phases. It has the full
flat spectrum, now multiplied by a complex sinusoid unit vector. Transform of a Kronecker delta at time x[2] shows another
familiar pattern, of sinusoids with periodicity 2 this time. The Fourier Transform is really indifferent about time domain
or frequency domain, it just transforms one function to another, and it
happens to be the case that pulses transform to sinusoids and vice
versa. It also shows that time delay translates to rotation of the
phase spectrum. The position of a pulse in time domain can be
accurately
stored in complex frequency information. Look for example how signal pulses x[16]=1 and x[17]=-1 are
stored in a spectrum. These signal pulses are also perfectly
reconstructed by an inverse FFT. I did not plot that here because it
looks so dull, you just have to believe that I checked it. |
Together, the spectrum coefficients keep a perfectly accurate record of
the
position of time/space events. It is just the way they store the
information that
makes them so different from Wavelet coefficients. You can not, by
means of the Fourier coefficients, manipulate
frequencies over intervals smaller than the framesize. Fourier
coefficients do not lend
themselves for meticulous data reduction. They are not helpful in
keeping the fingerprint records of United States citizens
manageable. However, you can do fast convolution with them. Fourier
coefficients are supercool.