CSV in Python adding an extra carriage return, on Windows
Question:
import csv
with open('test.csv', 'w') as outfile:
    writer = csv.writer(outfile, delimiter=',', quoting=csv.QUOTE_MINIMAL)
    writer.writerow(['hi', 'dude'])
    writer.writerow(['hi2', 'dude2'])
The above code generates a file, test.csv, with an extra \r at each row, like so:
hi,dude\r\r\nhi2,dude2\r\r\n
instead of the expected
hi,dude\r\nhi2,dude2\r\n
Why is this happening, or is this actually the desired behavior?
Answers:
Python 3:
The official csv documentation recommends opening the file with newline='' on all platforms to disable universal newlines translation:
with open('output.csv', 'w', newline='', encoding='utf-8') as f:
    writer = csv.writer(f)
    ...
The CSV writer terminates each line with the lineterminator of the dialect, which is '\r\n' for the default excel dialect on all platforms because that’s what RFC 4180 recommends.
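To check this on your own machine, here is a minimal sketch, assuming Python 3. It writes the two rows from the question with newline='' and reads the file back in binary mode, so nothing translates the bytes on the way in. The scratch directory and the names path and raw are only for illustration.

```python
import csv
import os
import tempfile

# Hypothetical scratch location, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "output.csv")

# newline='' stops the text layer from translating the writer's '\r\n'.
with open(path, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["hi", "dude"])
    writer.writerow(["hi2", "dude2"])

# Read the raw bytes so we see exactly what landed on disk.
with open(path, "rb") as f:
    raw = f.read()

print(repr(raw))
```

Each row should end with a single \r\n, on Windows and elsewhere.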
Python 2:
On Windows, always open your files in binary mode ("rb" or "wb"), before passing them to csv.reader or csv.writer.
Although the file is a text file, CSV is regarded as a binary format by the libraries involved, with \r\n separating records. If that separator is written in text mode, the Python runtime replaces the \n with \r\n, hence the \r\r\n observed in the file.
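The double translation can be reproduced on any platform. This is a rough sketch, assuming Python 3: it simulates Windows text mode by wrapping an in-memory byte buffer in io.TextIOWrapper with newline='\r\n', which turns every written '\n' into '\r\n' the way the Windows default does. The helper name write_rows is hypothetical.

```python
import csv
import io


def write_rows(newline):
    """Write the question's two rows through a text layer; return the raw bytes."""
    raw = io.BytesIO()
    text = io.TextIOWrapper(raw, encoding="utf-8", newline=newline)
    writer = csv.writer(text)  # default excel dialect, lineterminator '\r\n'
    writer.writerow(["hi", "dude"])
    writer.writerow(["hi2", "dude2"])
    text.flush()
    data = raw.getvalue()
    text.detach()  # keep the wrapper from closing the buffer behind our back
    return data


# newline='\r\n' mimics what text mode does on Windows by default.
windows_like = write_rows("\r\n")
# newline='' disables translation, as the csv documentation recommends.
untranslated = write_rows("")

print(repr(windows_like))
print(repr(untranslated))
```

The first result carries the extra \r before every \r\n; the second does not.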
See this previous answer.
While @john-machin gives a good answer, it’s not always the best approach. For example, it doesn’t work on Python 3 unless you encode all of your inputs to the CSV writer. Also, it doesn’t address the issue if the script wants to use sys.stdout as the stream.
I suggest instead setting the ‘lineterminator’ attribute when creating the writer:
import csv
import sys
doc = csv.writer(sys.stdout, lineterminator='\n')
doc.writerow('abc')
doc.writerow(range(3))
That example will work on Python 2 and Python 3 and won’t produce the unwanted extra \r characters. Note, however, that the line endings may not be what you want: on Unix operating systems the rows end in a bare LF, with the CR omitted, so the output does not use the \r\n terminator that RFC 4180 recommends.
In most cases, however, I believe that behavior is preferable and more natural than treating all CSV as a binary format. I provide this answer as an alternative for your consideration.
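To see what the writer itself emits with this setting, here is a small sketch, assuming Python 3, that swaps sys.stdout for an io.StringIO buffer so the result can be inspected. The buffer stands in for the stream only for illustration; a real sys.stdout on Windows would still translate each '\n' to '\r\n' on output.

```python
import csv
import io

# StringIO stands in for sys.stdout; by default it does not translate newlines.
buffer = io.StringIO()
doc = csv.writer(buffer, lineterminator="\n")
doc.writerow("abc")      # a string is iterated character by character
doc.writerow(range(3))   # non-string values are converted with str()

text = buffer.getvalue()
print(repr(text))
```

The writer produces one '\n' per row and never a '\r', so any carriage return in the final output comes from the stream's own text-mode translation.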
In Python 3 (I haven’t tried this in Python 2), you can also simply do
with open('output.csv','w',newline='') as f:
    writer = csv.writer(f)
    writer.writerow(mystuff)
    ...
as per documentation.
More on this in the doc’s footnote:
If newline='' is not specified, newlines embedded inside quoted fields
will not be interpreted correctly, and on platforms that use \r\n
linendings on write an extra \r will be added. It should always be
safe to specify newline='', since the csv module does its own
(universal) newline handling.
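The first half of that footnote is about fields that contain a line break themselves. A minimal sketch, assuming Python 3, of such a field surviving a round trip when both open calls pass newline=''. The file name and the sample rows are made up for illustration.

```python
import csv
import os
import tempfile

# Hypothetical scratch file and sample data, for illustration only.
path = os.path.join(tempfile.mkdtemp(), "multiline.csv")
rows = [["id", "note"], ["1", "first line\nsecond line"]]

# The writer quotes the field that contains a newline.
with open(path, "w", newline="", encoding="utf-8") as f:
    csv.writer(f).writerows(rows)

# newline='' on read hands the line endings to the csv module untouched.
with open(path, "r", newline="", encoding="utf-8") as f:
    read_back = list(csv.reader(f))

print(read_back)
```

The embedded '\n' comes back inside a single field rather than splitting the record in two.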
You have to pass newline="\n" to the open function, like this:
with open('file.csv','w',newline="\n") as out:
    csv_out = csv.writer(out, delimiter =';')
You can introduce the lineterminator='\n' parameter in the csv writer command.
import csv
delimiter='\t'
with open('tmp.csv', 'w+', encoding='utf-8') as stream:
    # quotechar=None: with QUOTE_NONE no quote character is needed (newer Python 3 releases reject an empty string here)
    writer = csv.writer(stream, delimiter=delimiter, quoting=csv.QUOTE_NONE, quotechar=None, lineterminator='\n')
    writer.writerow(['A1' , 'B1', 'C1'])
    writer.writerow(['A2' , 'B2', 'C2'])
    writer.writerow(['A3' , 'B3', 'C3'])
Note that if you use DictWriter, you will have a new line from the open function and a new line from the writerow function.
You can use newline='' within the open function to remove the extra newline.
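For the DictWriter case, a short sketch assuming Python 3. The field names, rows, and scratch file are invented for illustration. It passes newline='' to open and then inspects the bytes on disk.

```python
import csv
import os
import tempfile

# Hypothetical scratch file and column names, for illustration only.
path = os.path.join(tempfile.mkdtemp(), "dict_output.csv")

with open(path, "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["greeting", "name"])
    writer.writeheader()
    writer.writerow({"greeting": "hi", "name": "dude"})
    writer.writerow({"greeting": "hi2", "name": "dude2"})

# Binary read shows the exact line endings that were written.
with open(path, "rb") as f:
    dict_raw = f.read()

print(repr(dict_raw))
```

DictWriter uses the same default dialect as csv.writer, so the header and every row should end with exactly one \r\n.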