How can I use io.StringIO() with the csv module?

Question:

I tried to backport a Python 3 program to 2.7, and I’m stuck with a strange problem:

>>> import io
>>> import csv
>>> output = io.StringIO()
>>> output.write("Hello!")            # Fail: io.StringIO expects Unicode
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unicode argument expected, got 'str'
>>> output.write(u"Hello!")           # This works as expected.
6L
>>> writer = csv.writer(output)       # Now let's try this with the csv module:
>>> csvdata = [u"Hello", u"Goodbye"]  # Look ma, all Unicode! (?)
>>> writer.writerow(csvdata)          # Sadly, no.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unicode argument expected, got 'str'

According to the docs, io.StringIO() returns an in-memory stream for Unicode text. It works correctly when I try and feed it a Unicode string manually. Why does it fail in conjunction with the csv module, even if all the strings being written are Unicode strings? Where does the str come from that causes the Exception?

(I do know that I can use StringIO.StringIO() instead, but I’m wondering what’s wrong with io.StringIO() in this scenario)

Asked By: Tim Pietzcker

||

Answers:

The Python 2.7 csv module doesn’t support Unicode input: see the note at the beginning of the documentation.

It seems that you’ll have to encode the Unicode strings to byte strings, and use io.BytesIO, instead of io.StringIO.

The examples section of the documentation includes examples for a UnicodeReader and UnicodeWriter wrapper classes (thanks @AlexeyKachayev for the pointer).

Answered By: Pedro Romano

From csv documentation:

The csv module doesn’t directly support reading and writing Unicode,
but it is 8-bit-clean save for some problems with ASCII NUL
characters. So you can write functions or classes that handle the
encoding and decoding for you as long as you avoid encodings like
UTF-16 that use NULs. UTF-8 is recommended.

You can find example of UnicodeReader, UnicodeWriter here http://docs.python.org/2/library/csv.html

Answered By: Alexey Kachayev

Please use StringIO.StringIO().

http://docs.python.org/library/io.html#io.StringIO

http://docs.python.org/library/stringio.html

io.StringIO is a class. It handles Unicode. It reflects the preferred Python 3 library structure.

StringIO.StringIO is a class. It handles strings. It reflects the legacy Python 2 library structure.

Answered By: ernest

I found this when I tried to serve a CSV file via Flask directly without creating the CSV file on the file system. This works:

import io
import csv

data = [[u'cell one', u'cell two'], [u'cell three', u'cell four']]

output = io.BytesIO()
writer = csv.writer(output, delimiter=',')
writer.writerows(data)
your_csv_string = output.getvalue()

See also

Answered By: Martin Thoma

To use CSV reader/writer with ‘memory files’ in python 2.7:

from io import BytesIO
import csv

csv_data = """a,b,c
foo,bar,foo"""

# creates and stores your csv data into a file the csv reader can read (bytes)
memory_file_in = BytesIO(csv_data.encode(encoding='utf-8'))

# classic reader
reader = csv.DictReader(memory_file_in)

# writes a csv file
fieldnames = reader.fieldnames  # here we use the data from the above csv file
memory_file_out = BytesIO()     # create a memory file (bytes)

# classic writer (here we copy the first file in the second file)
writer = csv.DictWriter(memory_file_out, fieldnames)
for row in reader:
    print(row)
    writer.writerow(row)
Answered By: pangur
Categories: questions Tags: , , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.