Scipy processing large data
Question:
I have a dataset which contains only one column (a pandas Series). The dataset is a .dat file with about 2,000,000 rows and 1 column (166 MB). Reading this data with pd.read_csv takes about 7-8 minutes. The data is a signal that needs to be processed (using scipy.signal). When I process the data I get a MemoryError. Is there a way to speed up loading the file, increase the processing speed (scipy.signal.ellip), and avoid the memory problem? Thank you in advance.
Loading the data:
import pandas as pd

data = pd.read_csv('C:/Users/HP/Desktop/Python and programming/Jupyter/Filter/3200_Hz.dat',
                   sep='\r\n', header=None, squeeze=True)
Data processing (takes about 7 minutes too):
from scipy import signal

# Wn (the critical frequencies) is defined elsewhere
b, a = signal.ellip(4, 5, 40, Wn, 'bandpass', analog=False)
output = signal.filtfilt(b, a, data)
# after that, 'output' is plotted with plt
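Not part of the question itself, but one common way to reduce numerical trouble with higher-order IIR filters like this one is to design the filter as second-order sections instead of (b, a) coefficients. A minimal sketch, assuming a 3200 Hz sampling rate (suggested by the file name) and hypothetical band edges:

```python
import numpy as np
from scipy import signal

fs = 3200.0          # assumed sampling rate, taken from the file name
wn = [50.0, 500.0]   # hypothetical band edges in Hz, not from the question

# Second-order sections are numerically more robust than (b, a)
# coefficients for higher-order elliptic designs.
sos = signal.ellip(4, 5, 40, wn, btype='bandpass', fs=fs, output='sos')

data = np.random.randn(10_000)          # stand-in for the loaded signal
output = signal.sosfiltfilt(sos, data)  # zero-phase filtering, like filtfilt
```

`sosfiltfilt` is the second-order-section counterpart of `filtfilt`, so the rest of the pipeline (plotting `output`) stays the same.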
Example of input data:
6954
25903
42882
17820
3485
-11456
4574
34594
25520
26533
9331
-22503
14950
30973
23398
41474
-860
-8528
Answers:
You set '\r\n' as the separator, which (if I understand correctly) means each line becomes a new column. That way you'll end up with millions of columns, and the squeeze argument does nothing.
Don't set the sep argument (leave it at its default): newlines will then separate the records, and squeeze will return the result as a Series.
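A minimal sketch of the corrected load, using an in-memory buffer as a stand-in for the real .dat path (note that the `squeeze=True` keyword was removed in pandas 2.0, so the `.squeeze("columns")` method is used instead):

```python
import io
import pandas as pd

# Stand-in for the real file; in practice pass the .dat path instead.
buf = io.StringIO("6954\n25903\n42882\n17820\n")

# Leave `sep` at its default so each line becomes one row, then
# squeeze the single-column frame into a Series.
data = pd.read_csv(buf, header=None).squeeze("columns")
```

With one value per row, `read_csv` also infers an integer dtype for the column, which keeps memory use far below the millions-of-columns layout produced by the '\r\n' separator.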