Recreating time series data using FFT results without using ifft

Question:

I analyzed the sunspots.dat data (below) using fft which is a classic example in this area. I obtained results from fft in real and imaginery parts. Then I tried to use these coefficients (first 20) to recreate the data following the formula for Fourier transform. Thinking real parts correspond to a_n and imaginery to b_n, I have

import numpy as np
from scipy import *
from matplotlib import pyplot as gplt
from scipy import fftpack

def f(Y,x):
    total = 0
    for i in range(20):
        total += Y.real[i]*np.cos(i*x) + Y.imag[i]*np.sin(i*x)
    return total

tempdata = np.loadtxt("sunspots.dat")

year=tempdata[:,0]
wolfer=tempdata[:,1]

Y=fft(wolfer)
n=len(Y)
print n

xs = linspace(0, 2*pi,1000)
gplt.plot(xs, [f(Y, x) for x in xs], '.')
gplt.show()       

For some reason however, my plot does not mirror the one generated by ifft (I use the same number of coefficients on both sides). What could be wrong ?

Data:

http://linuxgazette.net/115/misc/andreasen/sunspots.dat

Asked By: BBSysDyn

||

Answers:

When you called fft(wolfer), you told the transform to assume a fundamental period equal to the length of the data. To reconstruct the data, you have to use basis functions of the same fundamental period = 2*pi/N. By the same token, your time index xs has to range over the time samples of the original signal.

Another mistake was in forgetting to do to the full complex multiplication. It’s easier to think of this as Y[omega]*exp(1j*n*omega/N).

Here’s the fixed code. Note I renamed i to ctr to avoid confusion with sqrt(-1), and n to N to follow the usual signal processing convention of using the lower case for a sample, and the upper case for total sample length. I also imported __future__ division to avoid confusion about integer division.

forgot to add earlier: Note that SciPy’s fft doesn’t divide by N after accumulating. I didn’t divide this out before using Y[n]; you should if you want to get back the same numbers, rather than just seeing the same shape.

And finally, note that I am summing over the full range of frequency coefficients. When I plotted np.abs(Y), it looked like there were significant values in the upper frequencies, at least until sample 70 or so. I figured it would be easier to understand the result by summing over the full range, seeing the correct result, then paring back coefficients and seeing what happens.

from __future__ import division
import numpy as np
from scipy import *
from matplotlib import pyplot as gplt
from scipy import fftpack

def f(Y,x, N):
    total = 0
    for ctr in range(len(Y)):
        total += Y[ctr] * (np.cos(x*ctr*2*np.pi/N) + 1j*np.sin(x*ctr*2*np.pi/N))
    return real(total)

tempdata = np.loadtxt("sunspots.dat")

year=tempdata[:,0]
wolfer=tempdata[:,1]

Y=fft(wolfer)
N=len(Y)
print(N)

xs = range(N)
gplt.plot(xs, [f(Y, x, N) for x in xs])
gplt.show()
Answered By: mtrw

The answer from mtrw was extremely helpful and helped me answer the same question as the OP, but my head almost exploded trying to understand the nested loop.

Here’s the last part but with numpy broadcasting (not sure if this even existed when the question was asked) rather than calling the f function:

xs = np.arange(N)
omega = 2*np.pi/N
phase = omega * xs[:,None] * xs[None,:]
reconstruct = Y[None,:] * (np.cos(phase) + 1j*np.sin(phase))
reconstruct = (reconstruct).sum(axis=1).real / N

# same output
plt.plot(reconstruct)
plt.plot(wolfer)
Answered By: SuperCodeBrah
Categories: questions Tags: , , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.