"No columns to parse from file" error when trying to transform string into Pandas dataframe

Question:

I have a string object ("textData") which contains CSV data.

I’m able to save it as CSV by:

    with open(fileName, "w") as text_file:
        print(textData, file=text_file)

but I would like to work with the data in pandas before saving the CSV, so I'm trying to load it into a pandas DataFrame:

    import pandas as pd
    from io import StringIO

    df = pd.read_csv(StringIO(textData), sep=",")

I get this error: EmptyDataError: No columns to parse from file

This is the textData string:

R$M21,2021-01-26,1.3265,1.3265,1.3265,1.3265,0,0
R$M21,2021-01-27,1.3263,1.3263,1.3263,1.3263,0,0
R$M21,2021-01-28,1.3319,1.3319,1.3319,1.3319,0,0
R$M21,2021-01-29,1.3287,1.3287,1.3287,1.3287,0,0
R$M21,2021-02-01,1.3315,1.3315,1.3315,1.3315,0,0
R$M21,2021-02-02,1.3328,1.3328,1.3328,1.3328,0,0
R$M21,2021-02-03,1.3331,1.3331,1.3331,1.3331,0,0
R$M21,2021-02-04,1.3361,1.3361,1.3361,1.3361,0,0
R$M21,2021-02-05,1.3383,1.3383,1.3383,1.3383,0,0
R$M21,2021-02-08,1.3354,1.3354,1.3354,1.3354,0,0
R$M21,2021-02-09,1.3279,1.3279,1.3279,1.3279,0,0
R$M21,2021-02-10,1.3259,1.3259,1.3259,1.3259,0,0
R$M21,2021-02-11,1.3253,1.3253,1.3253,1.3253,0,0
R$M21,2021-02-12,1.3272,1.3272,1.3272,1.3272,0,0
R$M21,2021-02-15,1.3224,1.3224,1.3224,1.3224,0,0
R$M21,2021-02-16,1.3232,1.3232,1.3232,1.3232,0,0
R$M21,2021-02-17,1.329,1.329,1.329,1.329,0,0
R$M21,2021-02-18,1.3275,1.3275,1.3275,1.3275,0,0
R$M21,2021-02-19,1.3246,1.3246,1.3246,1.3246,0,0
R$M21,2021-02-22,1.3235,1.3235,1.3235,1.3235,0,0
R$M21,2021-02-23,1.3216,1.3216,1.3216,1.3216,0,0
R$M21,2021-02-24,1.321,1.321,1.321,1.321,0,0
R$M21,2021-02-25,1.3181,1.3181,1.3181,1.3181,0,0
R$M21,2021-02-26,1.3313,1.3313,1.3313,1.3313,0,0
R$M21,2021-03-01,1.3323,1.3323,1.3323,1.3323,0,0
R$M21,2021-03-02,1.3315,1.3315,1.3315,1.3315,0,0
R$M21,2021-03-03,1.3309,1.3309,1.3309,1.3309,0,0
R$M21,2021-03-04,1.3328,1.3328,1.3328,1.3328,0,0
R$M21,2021-03-05,1.3417,1.3417,1.3417,1.3417,0,0
R$M21,2021-03-08,1.3479,1.3479,1.3479,1.3479,0,0
R$M21,2021-03-09,1.345,1.345,1.345,1.345,0,0
R$M21,2021-03-10,1.3476,1.3476,1.3476,1.3476,0,0
R$M21,2021-03-11,1.3403,1.3403,1.3403,1.3403,0,0
R$M21,2021-03-12,1.3463,1.3463,1.3463,1.3463,0,0
R$M21,2021-03-15,1.3456,1.3456,1.3456,1.3456,35,35
R$M21,2021-03-16,1.3455,1.3456,1.3452,1.3454,85,20
R$M21,2021-03-17,1.3457,1.3479,1.3451,1.3479,0,20
R$M21,2021-03-18,1.3432,1.3432,1.3432,1.3432,0,20
R$M21,2021-03-19,1.3425,1.3425,1.3425,1.3425,20,0
R$M21,2021-03-22,1.3434,1.3434,1.3405,1.3405,20,0
R$M21,2021-03-23,1.3433,1.3433,1.3433,1.3433,0,0
R$M21,2021-03-24,1.3461,1.3461,1.3461,1.3461,6,6
R$M21,2021-03-25,1.3476,1.3476,1.3472,1.3472,0,6
R$M21,2021-03-26,1.3477,1.3477,1.3477,1.3477,0,6
R$M21,2021-03-29,1.3467,1.3467,1.3467,1.3467,0,6
R$M21,2021-03-30,1.3483,1.3483,1.3483,1.3483,0,6
R$M21,2021-03-31,1.3448,1.3448,1.3448,1.3448,0,6
R$M21,2021-04-01,1.3461,1.3461,1.3461,1.3461,0,6
R$M21,2021-04-02,1.3442,1.3442,1.3442,1.3442,0,6
R$M21,2021-04-05,1.3446,1.3446,1.3446,1.3446,0,6
R$M21,2021-04-06,1.3418,1.3418,1.3418,1.3418,10,11
R$M21,2021-04-07,1.339,1.3398,1.3389,1.3389,0,11
R$M21,2021-04-08,1.3406,1.3406,1.3406,1.3406,0,11
R$M21,2021-04-09,1.3411,1.3411,1.3411,1.3411,23,28
R$M21,2021-04-12,1.3427,1.3427,1.3406,1.3406,3,31
R$M21,2021-04-13,1.3425,1.3431,1.3425,1.3431,20,51
R$M21,2021-04-14,1.3374,1.3378,1.3374,1.3375,0,51
R$M21,2021-04-15,1.335,1.335,1.335,1.335,217,222
R$M21,2021-04-16,1.3358,1.3358,1.3337,1.3337,416,407
R$M21,2021-04-19,1.3344,1.3346,1.331,1.331,370,428
R$M21,2021-04-20,1.3305,1.3316,1.3265,1.3283,5,431
R$M21,2021-04-21,1.3291,1.3302,1.3291,1.3302,100,422
R$M21,2021-04-22,1.3304,1.3304,1.3279,1.3279,10,427
R$M21,2021-04-23,1.3277,1.3277,1.3274,1.3274,16,437
R$M21,2021-04-26,1.3273,1.3273,1.3256,1.326,204,438
R$M21,2021-04-27,1.3259,1.3267,1.3255,1.3257,79,429
R$M21,2021-04-28,1.3274,1.3278,1.3262,1.3262,22,441
R$M21,2021-04-29,1.326,1.3265,1.3245,1.3255,16,457
R$M21,2021-04-30,1.3266,1.3277,1.3266,1.3277,60,457
R$M21,2021-05-03,1.328,1.3341,1.328,1.3318,8,458
R$M21,2021-05-04,1.3298,1.3366,1.3298,1.3366,110,466
R$M21,2021-05-05,1.3376,1.3387,1.3351,1.3358,0,466
R$M21,2021-05-06,1.3349,1.3349,1.3349,1.3349,1,467
R$M21,2021-05-07,1.332,1.332,1.3316,1.3316,25,466
R$M21,2021-05-10,1.3263,1.3263,1.3247,1.3247,187,480
R$M21,2021-05-11,1.3244,1.3276,1.3244,1.3251,6,486
R$M21,2021-05-12,1.329,1.329,1.3287,1.3287,119,586
R$M21,2021-05-13,1.3312,1.3366,1.3294,1.3343,270,738
R$M21,2021-05-14,1.3346,1.3371,1.3338,1.3338,392,841
R$M21,2021-05-17,1.3332,1.3361,1.3319,1.3356,99,835
R$M21,2021-05-18,1.3358,1.3358,1.3295,1.33,93,785
R$M21,2021-05-19,1.3295,1.333,1.3287,1.3328,25,784
R$M21,2021-05-20,1.335,1.3354,1.3326,1.3329,26,773
R$M21,2021-05-21,1.3309,1.3309,1.3301,1.3301,25,777
R$M21,2021-05-24,1.3298,1.3318,1.3298,1.3301,39,767
R$M21,2021-05-25,1.3293,1.3293,1.3253,1.3254,28,782
R$M21,2021-05-26,1.3249,1.3249,1.323,1.3235,48,770
R$M21,2021-05-27,1.3245,1.3247,1.3229,1.3229,51,805
R$M21,2021-05-28,1.3238,1.3247,1.323,1.3244,76,826
R$M21,2021-05-31,1.3237,1.3237,1.3223,1.3226,16,826
R$M21,2021-06-01,1.3194,1.3227,1.3194,1.3227,34,808
R$M21,2021-06-02,1.323,1.3248,1.322,1.3248,50,785
R$M21,2021-06-03,1.3235,1.3245,1.3228,1.3244,137,720
R$M21,2021-06-04,1.3276,1.3285,1.3274,1.3285,219,564
R$M21,2021-06-07,1.3251,1.3252,1.3232,1.3232,42,544
R$M21,2021-06-08,1.3236,1.3238,1.3226,1.3237,290,343
R$M21,2021-06-09,1.3232,1.3243,1.3231,1.3233,48,343
R$M21,2021-06-10,1.3239,1.3253,1.3238,1.3244,406,292
R$M21,2021-06-11,1.3249,1.3261,1.3217,1.324,107,0
R$M21,2021-06-14,1.3252,1.3271,1.3252,1.3261,107,0

What am I doing wrong?
Thanks

Asked By: younggotti

||

Answers:

The error is in the parts you aren’t showing us, because your code works fine. I’m guessing you don’t have newlines separating the lines.
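One way to check what the string really contains is to feed a few candidate inputs to the same call. This is only a sketch: `try_parse` and the three sample strings are hypothetical stand-ins, not the asker's data. It suggests that an empty string triggers `EmptyDataError`, while text with missing newlines still parses (into a single row) instead of raising.

```python
from io import StringIO

import pandas as pd
from pandas.errors import EmptyDataError


def try_parse(text):
    """Hypothetical helper: return the DataFrame, or None if pandas finds nothing to parse."""
    try:
        return pd.read_csv(StringIO(text), sep=",", header=None)
    except EmptyDataError:
        return None


# Assumed sample inputs, shortened to three fields per record.
samples = {
    "empty": "",
    "no_newlines": "R$M21,2021-06-11,1.3249 R$M21,2021-06-14,1.3252",
    "with_newlines": "R$M21,2021-06-11,1.3249\nR$M21,2021-06-14,1.3252\n",
}

for name, text in samples.items():
    df = try_parse(text)
    shape = None if df is None else df.shape
    # repr() makes missing or unexpected line endings visible.
    print(name, repr(text[:40]), shape)
```

Printing `repr(textData[:200])` and `len(textData)` just before the `read_csv` call in the real script is a quick way to see which of these cases applies.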

C:\tmp>type x.py

textData="""
R$M21,2021-06-08,1.3236,1.3238,1.3226,1.3237,290,343
R$M21,2021-06-09,1.3232,1.3243,1.3231,1.3233,48,343
R$M21,2021-06-10,1.3239,1.3253,1.3238,1.3244,406,292
R$M21,2021-06-11,1.3249,1.3261,1.3217,1.324,107,0
R$M21,2021-06-14,1.3252,1.3271,1.3252,1.3261,107,0"""

import pandas as pd
from io import StringIO

df = pd.read_csv(StringIO(textData), sep=",")
print(df)

C:\tmp>python x.py
   R$M21  2021-06-08  1.3236  1.3238  1.3226  1.3237  290  343
0  R$M21  2021-06-09  1.3232  1.3243  1.3231  1.3233   48  343
1  R$M21  2021-06-10  1.3239  1.3253  1.3238  1.3244  406  292
2  R$M21  2021-06-11  1.3249  1.3261  1.3217  1.3240  107    0
3  R$M21  2021-06-14  1.3252  1.3271  1.3252  1.3261  107    0

C:\tmp>
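Note that in the output above the first record was consumed as the header row, because the data has no header line. A possible variation, assuming the eight fields are symbol, date, open, high, low, close, volume and open interest (these column names are a guess, the question does not name them), is to pass `header=None` together with `names`:

```python
from io import StringIO

import pandas as pd

# Two records copied from the question; the column names below are assumptions.
text_data = (
    "R$M21,2021-06-11,1.3249,1.3261,1.3217,1.324,107,0\n"
    "R$M21,2021-06-14,1.3252,1.3271,1.3252,1.3261,107,0\n"
)
columns = ["symbol", "date", "open", "high", "low", "close", "volume", "open_interest"]

# header=None keeps the first record as data instead of using it as column labels.
df = pd.read_csv(StringIO(text_data), header=None, names=columns, parse_dates=["date"])
print(df)

# After working with the data, it can be turned back into CSV text (or written to a file).
csv_text = df.to_csv(index=False)
```

With this, all records stay in the DataFrame and the date column is parsed as datetimes.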
Answered By: Tim Roberts

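First, make sure every line ends with a newline, for example via os.linesep.

Then rewind the StringIO buffer to the start (position 0) with seek(0) before passing it to pandas. After writing, the stream position sits at the end of the buffer, so pandas would otherwise find nothing to read: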

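import os
import pandas as pd
from io import StringIO

# Build the CSV text in memory, one newline-terminated line per record.
buffer = StringIO()
buffer.write('hello,23,2022,bye' + os.linesep)
buffer.write('world,43,2025,then' + os.linesep)

# Rewind to the start so read_csv sees the data that was just written.
buffer.seek(0)
df = pd.read_csv(buffer, sep=',', header=None)

print(df)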

This will yield:

       0   1     2     3
0  hello  23  2022   bye
1  world  43  2025  then
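To see why the rewind matters, the following sketch reads the same kind of buffer before and after `seek(0)`. It uses a plain `"\n"` instead of `os.linesep` for brevity; the variable names `position_after_write` and `raised_without_seek` are only illustrative.

```python
from io import StringIO

import pandas as pd
from pandas.errors import EmptyDataError

buffer = StringIO()
buffer.write("hello,23,2022,bye\n")
buffer.write("world,43,2025,then\n")

# After the writes, the stream position is at the end of the buffer.
position_after_write = buffer.tell()

# Reading from the end gives pandas nothing to parse.
try:
    pd.read_csv(buffer, header=None)
    raised_without_seek = False
except EmptyDataError:
    raised_without_seek = True

# Rewinding makes the written text readable again.
buffer.seek(0)
df = pd.read_csv(buffer, header=None)

# Alternative: a StringIO created from an existing string starts at position 0.
df_from_string = pd.read_csv(StringIO(buffer.getvalue()), header=None)

print(position_after_write, raised_without_seek, df.shape)
```

If the text already exists as a single string, as in the question, wrapping it with `StringIO(textData)` starts at position 0 and needs no `seek`.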

[Python-3.9]

Answered By: raoulsson