how to extract frequency associated with fft values in python

Question:

I used fft function in numpy which resulted in a complex array. How to get the exact frequency values?

Asked By: ria

||

Answers:

The frequency is just the index of the array. At index n, the frequency is 2πn / the array’s length (radians per unit). Consider:

>>> numpy.fft.fft([1,2,1,0,1,2,1,0])
array([ 8.+0.j,  0.+0.j,  0.-4.j,  0.+0.j,  0.+0.j,  0.+0.j,  0.+4.j,
        0.+0.j])

the result has nonzero values at indices 0, 2 and 6. There are 8 elements. This means

       2πit/8 × 0       2πit/8 × 2       2πit/8 × 6
    8 e           - 4i e           + 4i e
y ~ ———————————————————————————————————————————————
                          8
Answered By: kennytm

np.fft.fftfreq tells you the frequencies associated with the coefficients:

import numpy as np

x = np.array([1,2,1,0,1,2,1,0])
w = np.fft.fft(x)
freqs = np.fft.fftfreq(len(x))

for coef,freq in zip(w,freqs):
    if coef:
        print('{c:>6} * exp(2 pi i t * {f})'.format(c=coef,f=freq))

# (8+0j) * exp(2 pi i t * 0.0)
#    -4j * exp(2 pi i t * 0.25)
#     4j * exp(2 pi i t * -0.25)

The OP asks how to find the frequency in Hertz.
I believe the formula is frequency (Hz) = abs(fft_freq * frame_rate).

Here is some code that demonstrates that.

First, we make a wave file at 440 Hz:

import math
import wave
import struct

if __name__ == '__main__':
    # http://stackoverflow.com/questions/3637350/how-to-write-stereo-wav-files-in-python
    # http://www.sonicspot.com/guide/wavefiles.html
    freq = 440.0
    data_size = 40000
    fname = "test.wav"
    frate = 11025.0
    amp = 64000.0
    nchannels = 1
    sampwidth = 2
    framerate = int(frate)
    nframes = data_size
    comptype = "NONE"
    compname = "not compressed"
    data = [math.sin(2 * math.pi * freq * (x / frate))
            for x in range(data_size)]
    wav_file = wave.open(fname, 'w')
    wav_file.setparams(
        (nchannels, sampwidth, framerate, nframes, comptype, compname))
    for v in data:
        wav_file.writeframes(struct.pack('h', int(v * amp / 2)))
    wav_file.close()

This creates the file test.wav.
Now we read in the data, FFT it, find the coefficient with maximum power,
and find the corresponding fft frequency, and then convert to Hertz:

import wave
import struct
import numpy as np

if __name__ == '__main__':
    data_size = 40000
    fname = "test.wav"
    frate = 11025.0
    wav_file = wave.open(fname, 'r')
    data = wav_file.readframes(data_size)
    wav_file.close()
    data = struct.unpack('{n}h'.format(n=data_size), data)
    data = np.array(data)

    w = np.fft.fft(data)
    freqs = np.fft.fftfreq(len(w))
    print(freqs.min(), freqs.max())
    # (-0.5, 0.499975)

    # Find the peak in the coefficients
    idx = np.argmax(np.abs(w))
    freq = freqs[idx]
    freq_in_hertz = abs(freq * frate)
    print(freq_in_hertz)
    # 439.8975
Answered By: unutbu

Here we deal with the Numpy implementation of the fft.

Frequencies associated with DFT values (in python)

By fft, Fast Fourier Transform, we understand a member of a large family of algorithms that enable the fast computation of the DFT, Discrete Fourier Transform, of an equisampled signal.

A DFT converts an ordered sequence of N complex numbers to an ordered sequence of N complex numbers, with the understanding that both sequences are periodic with period N.

In many cases, you think of

  • a signal x, defined in the time domain, of length N, sampled at a constant interval dt
  • its DFT X (here specifically X = np.fft.fft(x)), whose elements are sampled on the frequency axis with a sample rate .

Some definition

  • the period (aka duration²) of the signal x, sampled at dt with N samples is is

     T = dt*N
    
  • the fundamental frequencies (in Hz and in rad/s) of X, your DFT are

     df = 1/T
     dω = 2*pi/T # =df*2*pi
    
  • the top frequency is the Nyquist frequency

     ny = dω*N/2
    

    (NB: the Nyquist frequency is not dω*N

The frequencies associated with a particular element in the DFT

The frequencies corresponding to the elements in X = np.fft.fft(x) for a given index 0<=n<N can be computed as follows:

def rad_on_s(n, N, dω):
    return dω*n if n<N/2 else dω*(n-N)

or in a single sweep

ω = np.array([dω*n if n<N/2 else dω*(n-N) for n in range(N)])

if you prefer to consider frequencies in Hz, s/ω/f/

f = np.array([df*n if n<N/2 else df*(n-N) for n in range(N)])

Using those frequencies

If you want to modify the original signal x -> y applying an operator in the frequency domain in the form of a function of frequency only, the way to go is computing the ω‘s and

Y = X*f(ω)
y = ifft(Y)

Introducing np.fft.fftfreq

Of course numpy has a convenience function np.fft.fftfreq that returns dimensionless frequencies rather than dimensional ones but it’s as easy as

f = np.fft.fftfreq(N)*N*df
ω = np.fft.fftfreq(N)*N*dω

Because df = 1/T and T = N/sps (sps being the number of samples per second) one can also write

f = np.fft.fftfreq(N)*sps

Notes

  1. Dual to the sampling interval dt there is the sampling rate, sr, or how many samples are taken during a unit of time; of course dt=1/sr and sr=1/dt.
  2. Speaking of a duration, even if it is rather common, hides the
    fundamental idea of periodicity.
  3. The concept of Nyquist frequency is clearly exposed in any textbook dealing with the analysis of time signals, and also in the linked Wikipedia article. Does it suffice to say that information cannot be created?
Answered By: gboffi
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.