Apply a function to each line of a CSV file in python

Question:

I have a regular expression that I want to apply to each line of a CSV file.

Here’s the function which basically removes all comma’s encountered before a single digit number. The function is working perfectly fine for the string.

Input : text = "52A, XYZ Street, ABC District, 2, M, Brown
Function : re.sub('(?<!s[dA-Z]),(?!s+d,?)', '', text)
Output : '52A XYZ Street ABC District, 2, M, Brown'

However, I have a CSV file containing hundreds of such lines. For instance

1, 5273249, 1061/72, 150-CF, S/O:XVZ, 1, ABX, 45, 0, Husband, 9213
1, 5272849, 1063/36, 150-AS, S/O:XVZ, 1, ABX, 45, 0, Wife, 9253
1, 5274549, 10626/12, 150-RT, S/O:XVZ, 1, ABX, 45, 0, Son, 9214

I tried to read it using CSV reader and apply the function but unfortunately, it’s not producing any output. What did I do wrong here:

def myFunction(text):
    return re.sub('(?<!s[dA-Z]),(?!s+d,?)', '', text)

import csv
with open('temp1.csv', 'r') as csvfile:
    spamreader = csv.reader(csvfile, delimiter=',')
    for row in spamreader:
        l = ','.join(row)    
        myFunction(l)
Asked By: Rohit Girdhar

||

Answers:

There’s a handy module called fileinput that allows you to easily edit files line by line.

Take a look at:

import re
import fileinput

# For each line
for line in fileinput.input(files='example.csv', inplace=True, backup='.bak'):
    # Replace it with the regex you provided
    print(re.sub('(?<!s[dA-Z]),(?!s+d,?)', '', line), end='')

Output:

Before:
1,  5273249, 1061/72, 150-CF, S/O:XVZ, 1, ABX, 45, 0, Husband, 9213
1,  5272849, 1063/36, 150-AS, S/O:XVZ, 1, ABX, 45, 0, Wife, 9253
1,  5274549, 10626/12, 150-RT, S/O:XVZ, 1, ABX, 45, 0, Son, 9214

After:
1,  5273249, 1061/72, 150-CF S/O:XVZ, 1, ABX, 45, 0, Husband, 9213
1,  5272849, 1063/36, 150-AS S/O:XVZ, 1, ABX, 45, 0, Wife, 9253
1,  5274549, 10626/12, 150-RT S/O:XVZ, 1, ABX, 45, 0, Son, 9214

This also creates a handy example.csv.bak file for you in case you need it!

Answered By: Aaron N. Brock
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.