Python pandas to_csv causes OSError: [Errno 22] Invalid argument

Question:

My code is the following:

import pandas as pd
import numpy as np

df = pd.read_csv("path/to/my/infile.csv")
df = df.sort_values(['distance', 'time'])
df.to_csv("path/to/my/outfile.csv")

This code successfully reads infile.csv, which is a 3 GB CSV file, and sorts it, but fails when trying to write to outfile.csv with the following error:

OSError                                   Traceback (most recent call last)
<ipython-input-10-3a5c8279658d> in <module>
----> 1 df.to_csv('/Users/joaomatos/Desktop/cluster22_sorted_training.csv',index=False)

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/core/frame.py in to_csv(self, path_or_buf, sep, na_rep, float_format, columns, header, index, index_label, mode, encoding, compression, quoting, quotechar, line_terminator, chunksize, tupleize_cols, date_format, doublequote, escapechar, decimal)
   1743                                  doublequote=doublequote,
   1744                                  escapechar=escapechar, decimal=decimal)
-> 1745         formatter.save()
   1746 
   1747         if path_or_buf is None:

/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pandas/io/formats/csvs.py in save(self)
    164                                          encoding=encoding,
    165                                          compression=self.compression)
--> 166                 f.write(buf)
    167                 f.close()
    168                 for _fh in handles:

OSError: [Errno 22] Invalid argument

My question is why?

Thank you for your help

Asked By: João Matos


Answers:

Apparently this problem is caused by a known bug, reported here, in an earlier version of pandas. All I had to do was run pip3 install --upgrade pandas and then restart the computer.

Answered By: João Matos
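
If upgrading is not possible, one workaround that may help is to write the frame in smaller pieces, so that no single write call has to handle the whole multi-gigabyte buffer. This is a minimal sketch under that assumption; neither the answer above nor the linked bug report confirms it. write_csv_in_chunks and rows_per_chunk are hypothetical names.

```python
import pandas as pd


def write_csv_in_chunks(df, path, rows_per_chunk=100_000):
    """Write df to path in row slices (hypothetical helper, not a pandas API)."""
    if len(df) == 0:
        # Nothing to slice: write only the header row.
        df.to_csv(path, index=False)
        return
    for start in range(0, len(df), rows_per_chunk):
        chunk = df.iloc[start:start + rows_per_chunk]
        first = start == 0
        chunk.to_csv(
            path,
            mode="w" if first else "a",  # create the file once, then append
            header=first,                # write the header only once
            index=False,
        )
```

Usage would look like write_csv_in_chunks(df, "path/to/my/outfile.csv"). Each call to to_csv then handles only one slice of rows at a time.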

I just had a similar issue. I was using backslashes in the path, which had usually worked in the past, but this time it turned out I had to use / instead. That is extremely weird, but it worked.

Answered By: Vivian Ge
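
One plausible explanation for the backslash behaviour in the answer above, which the answer itself does not confirm, is that sequences such as \t and \n inside a normal Python string literal are escape sequences. A Windows-style path can then silently contain a tab or a newline. A minimal sketch with a hypothetical path:

```python
from pathlib import PureWindowsPath

# Hypothetical path: "\t" and "\n" here become a tab and a newline character.
broken = "C:\temp\new_output.csv"

# Two common ways to avoid accidental escapes:
raw = r"C:\temp\new_output.csv"     # raw string literal keeps the backslashes
forward = "C:/temp/new_output.csv"  # forward slashes need no escaping

print(repr(broken))  # shows the control characters hidden in the path
# Both safe spellings refer to the same Windows path.
print(PureWindowsPath(raw) == PureWindowsPath(forward))
```

Whether a given backslash path breaks depends on which letters happen to follow each backslash, which would explain why it "usually works".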

In my case (working on an external hard drive) it worked once I specified the absolute, rather than the relative, path.

Answered By: kjohnsen
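
For the absolute-path suggestion above, a relative path is interpreted against the current working directory, which may not be the directory you expect. A minimal sketch of making the target explicit before writing; to_absolute is a hypothetical helper name:

```python
import os


def to_absolute(path):
    """Expand '~' and anchor a relative path at the current working directory."""
    return os.path.abspath(os.path.expanduser(path))


print(os.getcwd())                            # where relative paths are resolved
print(to_absolute("path/to/my/outfile.csv"))  # the full path that would be written
# df.to_csv(to_absolute("path/to/my/outfile.csv"), index=False)
```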

After exploring a lot of options, including updating pandas to the latest version (1.2.4 as of today), changing the engine to "python" or "c", debugging, etc., I finally discovered what the issue was:

I had my CSV files stored in a folder that was constantly being synchronized in real-time with OneDrive.

YES! I discovered that the tray icon was going crazy and OneDrive was consuming resources while I was running algorithmic trading backtests for my pet project. I paused sync and then it never failed again!!

I guess you can also exclude the folder from OneDrive or simply change the location where the CSVs are stored/written/accessed.

Answered By: Nicolás M.
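
The last suggestion in the answer above, changing where the CSVs are written, could look roughly like this. It is a sketch that assumes the system temp directory is not inside a synced folder, which you should check on your machine; write_outside_synced_folder is a hypothetical name.

```python
import os
import tempfile

import pandas as pd


def write_outside_synced_folder(df, filename):
    """Write df into a fresh local temp directory and return the full path."""
    # Assumption: the temp directory is not watched by OneDrive or a similar tool.
    out_dir = tempfile.mkdtemp(prefix="csv_out_")
    path = os.path.join(out_dir, filename)
    df.to_csv(path, index=False)
    return path
```

The returned path can be copied into the synced folder afterwards, once the write has finished.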

Just remove any file in "path/to/my/" or try:

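import os

filename = "path/to/my/outfile.csv"
# Only write if the output file does not exist yet
if not os.path.isfile(filename):
    df.to_csv(filename)
else:
    print("WARNING: This file already exists!")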

Answered By: Everton Silva

I resolved this by removing unusual characters contained in the timestamp that made up part of the filename.

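Not working:

import datetime

# The timestamp contains a space and ':' characters, which end up in the filename
file_timestamp = datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")
df.to_csv(file_name + '_' + file_timestamp + '.csv')

Working:

# The timestamp contains only digits and an underscore
file_timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
df.to_csv(file_name + '_' + file_timestamp + '.csv')

Answered By: David Ebert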
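
The filename fix above can be generalized. On Windows, the characters < > : " / \ | ? * are not allowed in file names, so the ':' in the first timestamp format is the likely culprit there. A minimal sketch that builds a safe name; safe_timestamped_name is a hypothetical helper, and the replacement character "_" is an arbitrary choice:

```python
import datetime
import re


def safe_timestamped_name(base, ext=".csv", now=None):
    """Return base + '_' + timestamp + ext without characters Windows rejects."""
    now = now or datetime.datetime.now()
    stamp = now.strftime("%Y%m%d_%H%M%S")
    # Replace the characters Windows forbids in file names.
    clean_base = re.sub(r'[<>:"/\\|?*]', "_", base)
    return f"{clean_base}_{stamp}{ext}"


# df.to_csv(safe_timestamped_name("results"), index=False)
```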