How does the quotechar parameter of the csv reader function work?

Question:

My current understanding of the quotechar parameter is that it surrounds the fields that are separated by a comma. I’m reading the csv documentation for Python and have written code similar to theirs:

import csv
with open("test.csv", newline="") as file:
    reader = csv.reader(file, delimiter=",", quotechar="|")
    for row in reader:
        print(row)

My csv file contains the following:

|Hello|,|My|,|name|,|is|,|John|

The output gives a list of strings as expected:

['Hello', 'My', 'name', 'is', 'John']

The problem arises when I have whitespace between a field and the following comma in my csv file.
For example, if I put a space after the closing | of each field:

|Hello| ,|My| ,|name| ,|is| ,|John|

It gives the same output as before, but now the trailing space is included in the strings in the list:

['Hello ', 'My ', 'name ', 'is ', 'John']

It was my understanding that the quotechar parameter would make the reader consider only what is between the | symbols.
Any help is greatly appreciated!

Asked By: KrabbyPatty


Answers:

The documentation describes the quotechar argument as:

A one-character string used to quote fields containing special
characters, such as the delimiter or quotechar, or which contain
new-line characters. It defaults to ‘"’.

For example, suppose your csv file contains data of the form

|Hello|,|My|,|name|,|is|,|"John"|
|Hello|,|My|,|name|,|is|,|"Tom"|

In that case the default quotechar, which is ", does not describe the file: the fields are wrapped in |, and " is already part of the data. To tell the csv reader that "John" should appear in the output as it is, specify a different quotechar. Here that is |, but it can be ; or any other single character, depending on how the file wraps its fields.

The output now includes John and Tom in quotation marks:

['Hello', 'My', 'name', 'is', '"John"']
['Hello', 'My', 'name', 'is', '"Tom"']
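
As a minimal sketch of the example above, the snippet below parses the same two lines from an in-memory string (an assumption made so it runs without a file on disk), once with quotechar="|" and once with the default quotechar. The variable names are illustrative only.

```python
import csv
from io import StringIO

# The two sample lines from above, held in memory instead of a file.
data = '|Hello|,|My|,|name|,|is|,|"John"|\n|Hello|,|My|,|name|,|is|,|"Tom"|\n'

# With quotechar="|", the pipes are removed and the double quotes are kept as data.
pipe_rows = list(csv.reader(StringIO(data), delimiter=",", quotechar="|"))
for row in pipe_rows:
    print(row)

# With the default quotechar ('"'), the pipes are ordinary characters and stay in the fields.
default_rows = list(csv.reader(StringIO(data), delimiter=","))
for row in default_rows:
    print(row)
```

In the second run no field starts with ", so nothing is treated as quoted and every character between the commas is kept.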

Consider another example, where a csv field itself contains the delimiter. Suppose the csv file contains

"Fruit","Quantity","Cost"
"Strawberry","1000","$2,200"
"Apple","500","$1,100"

Here the quotechar lets the csv reader distinguish between an actual delimiter (a control character) and a comma that is a literal character inside a field, such as the one in "$2,200". Because these fields are wrapped in ", which is the default quotechar, you do not have to pass quotechar explicitly in this case.
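
A short sketch of this second example, again reading from an in-memory string rather than a file (an assumption for the sake of a runnable snippet). The second parse uses quoting=csv.QUOTE_NONE only as a contrast, to show what happens when the reader does not treat " as a quote character.

```python
import csv
from io import StringIO

# The fruit table from above; the Cost column contains commas inside quotes.
data = '"Fruit","Quantity","Cost"\n"Strawberry","1000","$2,200"\n"Apple","500","$1,100"\n'

# Default settings: quotechar is '"', so the comma inside "$2,200" is literal.
quoted_rows = list(csv.reader(StringIO(data)))
for row in quoted_rows:
    print(row)

# Contrast: with quoting disabled, every comma splits and the quotes stay in the data.
unquoted_rows = list(csv.reader(StringIO(data), quoting=csv.QUOTE_NONE))
for row in unquoted_rows:
    print(row)
```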


Now coming to your code: as your output shows, the reader keeps the whitespace that follows the closing | as part of the field, so you have to replace the extra whitespace before the delimiter with the empty string before parsing. You can do this in the following way:

import csv
from io import StringIO

with open("test.csv", newline="") as f:
    # Drop the space before each delimiter, then parse the cleaned text.
    file = StringIO(f.read().replace(" ,", ","))
    reader = csv.reader(file, delimiter=",", quotechar="|")
    for row in reader:
        print(row)

This outputs:

['Hello', 'My', 'name', 'is', 'John']
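
An alternative sketch, not part of the approach above: parse the text unchanged and strip each field afterwards. It assumes that no field needs to keep its leading or trailing whitespace, since str.strip() also removes whitespace that was inside the | quotes. The sample line is held in a string so the snippet runs without test.csv.

```python
import csv
from io import StringIO

# The problematic line from the question, with a space after each closing pipe.
data = "|Hello| ,|My| ,|name| ,|is| ,|John|\n"

# Parsed as is, the space after the closing quote ends up in the field.
raw_rows = list(csv.reader(StringIO(data), delimiter=",", quotechar="|"))
print(raw_rows)

# Strip surrounding whitespace from every field after parsing.
reader = csv.reader(StringIO(data), delimiter=",", quotechar="|")
clean_rows = [[field.strip() for field in row] for row in reader]
print(clean_rows)
```

Compared with replacing " ," in the raw text, this leaves a literal " ," inside a quoted field untouched, at the cost of trimming every field.
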
Answered By: Shubham Sharma