Scipy processing large data

Question:

I have a dataset which contains only one column (a pandas Series). The dataset is a .dat file with about 2,000,000 rows and 1 column (166 MB). Reading this data with pd.read_csv takes about 7-8 minutes. The data is a signal which needs to be processed (using scipy.signal). When I process the data I get a MemoryError. Is there a way to speed up loading the file, increase the processing speed (scipy.signal.ellip), and bypass the memory problem? Thank you in advance.
Loading the data:

data = pd.read_csv('C:/Users/HP/Desktop/Python and programming/Jupyter/Filter/3200_Hz.dat',
                   sep='\r\n', header=None, squeeze=True)

Data processing (takes about 7 minutes too):

b, a = signal.ellip(4, 5, 40, Wn, 'bandpass', analog=False)
output = signal.filtfilt(b, a, data)
#after that plotting 'output' with plt

Example of input data:

6954
 25903
 42882
 17820
  3485
-11456
  4574
 34594
 25520
 26533
  9331
-22503
 14950
 30973
 23398
 41474
  -860
 -8528
Asked By: Alex


Answers:

You set '\r\n' as the separator, which means (if I understand correctly) that each line becomes a new column. You therefore end up with millions of columns, and the squeeze argument does nothing.

Don't set the sep argument (leave it at its default): newlines will then separate the records, and squeeze will turn the resulting single-column DataFrame into a Series.
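
As a minimal sketch of the corrected loading call (same path as in the question): note that in newer pandas releases the squeeze keyword has been removed, so the sketch uses the DataFrame method .squeeze('columns') instead.

import pandas as pd

# Leave sep at its default so newlines separate the records; each row then
# becomes one value in a single column.
data = pd.read_csv(
    'C:/Users/HP/Desktop/Python and programming/Jupyter/Filter/3200_Hz.dat',
    header=None,
).squeeze('columns')  # collapse the one-column DataFrame into a Series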

Answered By: 9769953