Wednesday, September 14, 2011

Uncertainty principle and spectrogram with pylab

The Fourier transform does not give any information on the time at which a frequency component occurs. One approach which can give information on the time resolution of the spectrum is the Short Time Fourier Transform (STFT). Here a moving window is applied to the signal and the Fourier transform is applied to the signal within the window as the window is moved [Ref]. The choice of window is very important with respect to the performance of the STFT in practice. Since the STFT is simply applying the Fourier transform to pieces of the time series of interest, a drawback of the STFT is that it will not be able to resolve events if they happen to appear within the width of the window. In this case, the lack of time resolution of the Fourier transform is present. In general, one cannot achieve simultaneous time and frequency resolution because of the Heisenberg uncertain principle. In the field of particle physics, an elementary particle does not have precise position and momentum. The better one determines the position of the particle, the less precisely is know at that time, and vice versa. For signal processing, this rule translates into the fact that a signal does not simultaneously have a precise location in time and precise frequency [Ref].

The library pylab provides the function specgram(...) to compute the spectrogram of a signal using the STFT. The following script uses that function to show the spectrogram of a signal with different windows size:
from import read,write
from pylab import plot,show,subplot,specgram

# Open the Homer Simpson voice: "Ummm, Crumbled up cookie things."
# from
rate,data = read('mcrumble.wav') # reading

# NFFT is the number of data points used in each block for the FFT
# and noverlap is the number of points of overlap between blocks
specgram(data, NFFT=128, noverlap=0) # small window
specgram(data, NFFT=512, noverlap=0) 
specgram(data, NFFT=1024, noverlap=0) # big window

This image is the result:
The pictures shows that changing the number of data points used for each Fourier transform block, the spectrogram loses definition in frequency or in the time.


  1. Hi there,

    how can I create an "animated" spectrogram, by e.g. capture live data stream directly from the sound card? Can this be done using the "specgram" function as well?


  2. Hello Jan,
    Yes, you can use specgram(...) to create an animated spectrogram. You have to read periodically n samples from your sound card and put them in a vector y, each time you read n sample you can update the graph using specgram(y,...) and then show().

    I used a similar approach here:

    And a different one here:

  3. Hi...

    When im trying to run this code i get this error:

    /usr/local/lib/python2.7/dist-packages/scipy/io/ WavFileWarning: chunk not understood
    warnings.warn("chunk not understood", WavFileWarning)
    Traceback (most recent call last):
    File "", line 4, in
    rate,data = read('rafa.wav') # reading
    File "/usr/local/lib/python2.7/dist-packages/scipy/io/", line 127, in read
    size = struct.unpack(fmt, data)[0]
    struct.error: unpack requires a string argument of length 4

    You know why???

  4. Hi Rafael, your file could be corrupted. Try with this one:

  5. Hi JustGlowing, I am very new to audio processing. Please pardon me for the ignorance. Could you tell me what is on the y-axis of the spectrogram produced? What do the values 0.0 and 1.0 represent?

    1. On the x axis you have time, on the y there should be frequencies. Check the specgram help to interpret the values OK on the y.

    2. Thanks for the clarification.
      I checked the help given here -

      So frequencies are returned in an 1-D array. How do you convert the values (0-1.0) to its corresponding value in Hz and display the same on the y-axis.

      Also, how is the timeline in the first figure (Fig.1 - wav signal) coordinated with the time line on the second figure (Fig. 2 - spectrogram)?

      What I mean is Fig.1 has values from 0 to 40,000 and Fig.2 runs from 0 - 20,000. So how do they correlate in time?


    3. Each row of the spectrogram is computed on a window of the original signal. You can correlate the x axis of signal with the one of the x axis through the window size.

  6. Hi!
    I've tried your code with Equation 'song' by Aphex Twin. At the end of the song there should be a face in specgram plot but there is none!
    Do you know why?

  7. Hi JustGlowing,

    how can i change the x-axis of a spectogram in python?
    can you help with this problem?

    normally x axis shows the various time instants at which the the FFT is calculated but I want to change it with some other parameter.

    or maybe I can just change the display ?? even that would be ok. as only me and my colleague would be using the code.

    1. hi Janmejay, check the function xticks. This function changes the values displayed on the x-axis.