Read in a CSV file in lower case with Python

Question:

I’m reading a CSV file into a namedtuple like so:

import csv
from collections import namedtuple

# So we can handle bad CSV files gracefully
def unfussy_reader(reader):
    while True:
        try:
            yield next(reader)

        # The underlying iterator is exhausted, so stop cleanly
        # (a StopIteration escaping a generator is a RuntimeError since Python 3.7)
        except StopIteration:
            return

        # This is a bad row that has an error in it (csv.Error)
        # Alternately it may be a line that doesn't map to the structure we've been given (TypeError)
        except (csv.Error, TypeError):
            pass

# Create the CSV reader object
csv_reader = csv.reader(file_stream, delimiter=' ', quotechar='"', escapechar='^')

# Set up the named tuple
csvline = namedtuple('csv_line', 'field1, field2, field3')

# Create the named tuple mapping object
map_to_tuple = map(csvline._make, csv_reader)

for line in unfussy_reader(map_to_tuple):
    pass  # do stuff

This works well, but my problem is that I want all of the content of the CSV to be read in lower case. As per this question, a simple lambda would do it:
map(lambda x:x.lower(),["A","B","C"])
but I can’t find anywhere to put it before the data ends up in the tuple (and is thus unchangeable).

Is there a way to do this within this structure (Python 3.5)?

Asked By: GIS-Jonathan

||

Answers:

How about this:

csv_reader = csv.reader(map(lambda line:line.lower(),file_stream), delimiter=' ', quotechar='"', escapechar='^')
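
As a rough, self-contained illustration of the line above, the sketch below feeds a small in-memory stream through the same reader settings. The io.StringIO object and its sample rows are made up here and merely stand in for the asker's file_stream.

```python
import csv
import io

# Hypothetical stand-in for the asker's file_stream: space-delimited rows.
file_stream = io.StringIO('Alpha "Bravo Charlie" DELTA\nEcho Foxtrot GOLF\n')

# Lower-case each raw line before the CSV parser sees it.
csv_reader = csv.reader(
    map(lambda line: line.lower(), file_stream),
    delimiter=' ', quotechar='"', escapechar='^',
)

# Collect the parsed rows; every field is already lower case.
rows = list(csv_reader)
print(rows)
```

Because the whole line is lower-cased before parsing, quoted fields are affected too. Passing str.lower to map in place of the lambda should behave the same way.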
Answered By: Iron Fist

You can apply the lower transform to the stream before you create a CSV reader for it.

lower_stream = (line.lower() for line in file_stream)
csv_reader = csv.reader(lower_stream, delimiter=' ', quotechar='"', escapechar='^')

The parentheses on the right-hand side of the lower_stream assignment create a generator expression. It is evaluated lazily: lines are read from file_stream only as the CSV reader asks for them, so the whole file is never pulled into memory at once.
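
A minimal sketch of how this might slot into the asker's namedtuple pipeline, and of the fact that nothing is read when the generator is created. The io.StringIO sample data and the position_before and records names are assumptions added for illustration only.

```python
import csv
import io
from collections import namedtuple

# Hypothetical sample data standing in for the asker's file_stream.
file_stream = io.StringIO('ONE Two THREE\nFour FIVE six\n')

# Creating the generator does not read anything from file_stream yet.
lower_stream = (line.lower() for line in file_stream)
position_before = file_stream.tell()  # stream position is still at the start

csv_reader = csv.reader(lower_stream, delimiter=' ', quotechar='"', escapechar='^')
csvline = namedtuple('csv_line', 'field1, field2, field3')

# Lines are pulled from file_stream one at a time as this loop advances.
records = [csvline._make(row) for row in csv_reader]
print(records)
```

The fields are lower case by the time they reach csvline._make, so the immutability of the resulting tuples is no longer a problem.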

Answered By: bbayles

This isn’t the most relevant approach for the asker’s use case, but for others who found this question based on the title, here is a solution using pandas and converters. It’s well suited to CSV files with mixed data types, because each converter applies only to the column it is keyed on. Note that if you specify both converters and dtype for the same column, the converter takes precedence and the dtype given for that column is ignored.

import pandas as pd

df = pd.read_csv(
    "./data/mydata.csv",
    converters={"col2": lambda x: x.lower()},
)
Answered By: BLT