How to write header row with csv.DictWriter?
Question:
Assume I have a csv.DictReader
object and I want to write it out as a CSV file. How can I do this?
I know that I can write the rows of data like this:
dr = csv.DictReader(open(f), delimiter='t')
# process my dr object
# ...
# write out object
output = csv.DictWriter(open(f2, 'w'), delimiter='t')
for item in dr:
output.writerow(item)
But how can I include the fieldnames?
Answers:
Edit:
In 2.7 / 3.2 there is a new writeheader()
method. Also, John Machin’s answer provides a simpler method of writing the header row.
Simple example of using the writeheader()
method now available in 2.7 / 3.2:
from collections import OrderedDict
ordered_fieldnames = OrderedDict([('field1',None),('field2',None)])
with open(outfile,'wb') as fou:
dw = csv.DictWriter(fou, delimiter='t', fieldnames=ordered_fieldnames)
dw.writeheader()
# continue on to write data
Instantiating DictWriter requires a fieldnames argument.
From the documentation:
The fieldnames parameter identifies
the order in which values in the
dictionary passed to the writerow()
method are written to the csvfile.
Put another way: The Fieldnames argument is required because Python dicts are inherently unordered.
Below is an example of how you’d write the header and data to a file.
Note: with
statement was added in 2.6. If using 2.5: from __future__ import with_statement
with open(infile,'rb') as fin:
dr = csv.DictReader(fin, delimiter='t')
# dr.fieldnames contains values from first row of `f`.
with open(outfile,'wb') as fou:
dw = csv.DictWriter(fou, delimiter='t', fieldnames=dr.fieldnames)
headers = {}
for n in dw.fieldnames:
headers[n] = n
dw.writerow(headers)
for row in dr:
dw.writerow(row)
As @FM mentions in a comment, you can condense header-writing to a one-liner, e.g.:
with open(outfile,'wb') as fou:
dw = csv.DictWriter(fou, delimiter='t', fieldnames=dr.fieldnames)
dw.writerow(dict((fn,fn) for fn in dr.fieldnames))
for row in dr:
dw.writerow(row)
A few options:
(1) Laboriously make an identity-mapping (i.e. do-nothing) dict out of your fieldnames so that csv.DictWriter can convert it back to a list and pass it to a csv.writer instance.
(2) The documentation mentions “the underlying writer
instance” … so just use it (example at the end).
dw.writer.writerow(dw.fieldnames)
(3) Avoid the csv.Dictwriter overhead and do it yourself with csv.writer
Writing data:
w.writerow([d[k] for k in fieldnames])
or
w.writerow([d.get(k, restval) for k in fieldnames])
Instead of the extrasaction
“functionality”, I’d prefer to code it myself; that way you can report ALL “extras” with the keys and values, not just the first extra key. What is a real nuisance with DictWriter is that if you’ve verified the keys yourself as each dict was being built, you need to remember to use extrasaction=’ignore’ otherwise it’s going to SLOWLY (fieldnames is a list) repeat the check:
wrong_fields = [k for k in rowdict if k not in self.fieldnames]
============
>>> f = open('csvtest.csv', 'wb')
>>> import csv
>>> fns = 'foo bar zot'.split()
>>> dw = csv.DictWriter(f, fns, restval='Huh?')
# dw.writefieldnames(fns) -- no such animal
>>> dw.writerow(fns) # no such luck, it can't imagine what to do with a list
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:python26libcsv.py", line 144, in writerow
return self.writer.writerow(self._dict_to_list(rowdict))
File "C:python26libcsv.py", line 141, in _dict_to_list
return [rowdict.get(key, self.restval) for key in self.fieldnames]
AttributeError: 'list' object has no attribute 'get'
>>> dir(dw)
['__doc__', '__init__', '__module__', '_dict_to_list', 'extrasaction', 'fieldnam
es', 'restval', 'writer', 'writerow', 'writerows']
# eureka
>>> dw.writer.writerow(dw.fieldnames)
>>> dw.writerow({'foo':'oof'})
>>> f.close()
>>> open('csvtest.csv', 'rb').read()
'foo,bar,zotrnoof,Huh?,Huh?rn'
>>>
Another way to do this would be to add before adding lines in your output, the following line :
output.writerow(dict(zip(dr.fieldnames, dr.fieldnames)))
The zip would return a list of doublet containing the same value. This list could be used to initiate a dictionary.
writer.writeheader()
Above works like charm. Just don’t forget to add your fieldnames
attribute when intializing your csv.DictWriter
Assume I have a csv.DictReader
object and I want to write it out as a CSV file. How can I do this?
I know that I can write the rows of data like this:
dr = csv.DictReader(open(f), delimiter='t')
# process my dr object
# ...
# write out object
output = csv.DictWriter(open(f2, 'w'), delimiter='t')
for item in dr:
output.writerow(item)
But how can I include the fieldnames?
Edit:
In 2.7 / 3.2 there is a new writeheader()
method. Also, John Machin’s answer provides a simpler method of writing the header row.
Simple example of using the writeheader()
method now available in 2.7 / 3.2:
from collections import OrderedDict
ordered_fieldnames = OrderedDict([('field1',None),('field2',None)])
with open(outfile,'wb') as fou:
dw = csv.DictWriter(fou, delimiter='t', fieldnames=ordered_fieldnames)
dw.writeheader()
# continue on to write data
Instantiating DictWriter requires a fieldnames argument.
From the documentation:
The fieldnames parameter identifies
the order in which values in the
dictionary passed to the writerow()
method are written to the csvfile.
Put another way: The Fieldnames argument is required because Python dicts are inherently unordered.
Below is an example of how you’d write the header and data to a file.
Note: with
statement was added in 2.6. If using 2.5: from __future__ import with_statement
with open(infile,'rb') as fin:
dr = csv.DictReader(fin, delimiter='t')
# dr.fieldnames contains values from first row of `f`.
with open(outfile,'wb') as fou:
dw = csv.DictWriter(fou, delimiter='t', fieldnames=dr.fieldnames)
headers = {}
for n in dw.fieldnames:
headers[n] = n
dw.writerow(headers)
for row in dr:
dw.writerow(row)
As @FM mentions in a comment, you can condense header-writing to a one-liner, e.g.:
with open(outfile,'wb') as fou:
dw = csv.DictWriter(fou, delimiter='t', fieldnames=dr.fieldnames)
dw.writerow(dict((fn,fn) for fn in dr.fieldnames))
for row in dr:
dw.writerow(row)
A few options:
(1) Laboriously make an identity-mapping (i.e. do-nothing) dict out of your fieldnames so that csv.DictWriter can convert it back to a list and pass it to a csv.writer instance.
(2) The documentation mentions “the underlying writer
instance” … so just use it (example at the end).
dw.writer.writerow(dw.fieldnames)
(3) Avoid the csv.Dictwriter overhead and do it yourself with csv.writer
Writing data:
w.writerow([d[k] for k in fieldnames])
or
w.writerow([d.get(k, restval) for k in fieldnames])
Instead of the extrasaction
“functionality”, I’d prefer to code it myself; that way you can report ALL “extras” with the keys and values, not just the first extra key. What is a real nuisance with DictWriter is that if you’ve verified the keys yourself as each dict was being built, you need to remember to use extrasaction=’ignore’ otherwise it’s going to SLOWLY (fieldnames is a list) repeat the check:
wrong_fields = [k for k in rowdict if k not in self.fieldnames]
============
>>> f = open('csvtest.csv', 'wb')
>>> import csv
>>> fns = 'foo bar zot'.split()
>>> dw = csv.DictWriter(f, fns, restval='Huh?')
# dw.writefieldnames(fns) -- no such animal
>>> dw.writerow(fns) # no such luck, it can't imagine what to do with a list
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:python26libcsv.py", line 144, in writerow
return self.writer.writerow(self._dict_to_list(rowdict))
File "C:python26libcsv.py", line 141, in _dict_to_list
return [rowdict.get(key, self.restval) for key in self.fieldnames]
AttributeError: 'list' object has no attribute 'get'
>>> dir(dw)
['__doc__', '__init__', '__module__', '_dict_to_list', 'extrasaction', 'fieldnam
es', 'restval', 'writer', 'writerow', 'writerows']
# eureka
>>> dw.writer.writerow(dw.fieldnames)
>>> dw.writerow({'foo':'oof'})
>>> f.close()
>>> open('csvtest.csv', 'rb').read()
'foo,bar,zotrnoof,Huh?,Huh?rn'
>>>
Another way to do this would be to add before adding lines in your output, the following line :
output.writerow(dict(zip(dr.fieldnames, dr.fieldnames)))
The zip would return a list of doublet containing the same value. This list could be used to initiate a dictionary.
writer.writeheader()
Above works like charm. Just don’t forget to add your fieldnames
attribute when intializing your csv.DictWriter