Remove specific rows from csv file if matching the elements of a list – python / windows

Question:

I have a CSV file with a name and a URL on each row (the name is in the first column).
Separately, I have a list of names produced by a script.
I would like to remove the rows in the CSV file whose name appears in the list.
It sounds simple, but I tried several options and none of them works.

The csv format is:

John Doe, johndoe.blog.com
Jane Doe, janedoe.blog.com
Jim Foe, jimfoe.blog.com

The list:

not_ok_name = ["John Doe", "Jim Foe"]

The output of the csv file should be:

Jane Doe, janedoe.blog.com

In my last attempt I tried the following:

count= 0
while count< len(not_ok_name):
    writer = csv.writer(open('corrected.csv'))
    for row in csv.reader('myfile.csv.csv'):
        if not row[0].startswith(not_ok_name[count]):
            writer.writerow(row)
    writer.close()

Since I am still a newbie, I look forward to some simple suggestions.
Thanks.

EDIT:
Just in case there could be some formatting issues with the original data, I am pasting the result of:

print repr(open("myfile.csv", "rb").read())

John Doe ,johndoe.blog.com\r\nJane Doe , janedoe.blog.com

I hope this could help.
Thanks

EDIT 2:
Here’s some code that partially does the job. It removes ONE name. Maybe it helps in developing a version that handles the entire list.

reader = csv.reader(open("myfile.csv", "rb"), delimiter=',')
with open('corrected.csv', 'wb') as outfile:
    writer = csv.writer(outfile)
    for line in reader:
        #for item in Names:
        if not any ("Jim Foe" in x for x in line):
            writer.writerow(line)
            print line

Thanks again.

Asked By: Diego


Answers:

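Try this. It loops over the rows and writes out only those whose name is not in the not_ok_name list.

import csv

not_ok_name = ["John Doe", "Jim Foe"]
with open("C:/path/a.csv", "rU") as f, open("C:/path/des.csv", "wb") as w:
    reader = csv.reader(f)
    writer = csv.writer(w)  # a csv.writer is needed; a file object cannot write a list
    for row in reader:
        name = row[0].strip()  # the sample data has stray spaces around the names
        if name not in not_ok_name:
            writer.writerow(row)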
Answered By: Daniel

import csv

not_ok_name = ["John", "Jim"]  # first names to exclude
not_ok_name = set(not_ok_name)  # sets give us O(1) lookup times

with open('myfile.csv') as infile, open('corrected.csv', 'w') as outfile:
    writer = csv.writer(outfile)
    for name, url in csv.reader(infile):  # for each row in the input file
        fname = name.split(None, 1)[0]
        if fname in not_ok_name:
            continue  # if the first name is in the list, ignore the row
        writer.writerow([name, url])
Answered By: inspectorG4dget
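
For readers on Python 3, the same idea can be sketched as follows. This is only an illustration, not either answerer's code: the helper name drop_rows is hypothetical, and it assumes that a row should be dropped when its first column, with surrounding whitespace stripped, exactly matches a full name in the list. The demo runs on in-memory text shaped like the sample data in the question instead of real files.

```python
import csv
import io


def drop_rows(infile, outfile, names_to_drop):
    """Copy CSV rows from infile to outfile, skipping rows whose first
    column (whitespace stripped) is one of names_to_drop.
    Returns the number of rows kept. (Hypothetical helper.)"""
    blocked = {name.strip() for name in names_to_drop}
    writer = csv.writer(outfile)
    kept = 0
    for row in csv.reader(infile):
        if not row:
            continue  # ignore blank lines
        if row[0].strip() in blocked:
            continue  # the name is in the list, so drop the row
        writer.writerow(row)
        kept += 1
    return kept


# In-memory demo using data shaped like the question's sample.
source = io.StringIO(
    "John Doe ,johndoe.blog.com\r\n"
    "Jane Doe , janedoe.blog.com\r\n"
    "Jim Foe, jimfoe.blog.com\r\n",
    newline="",
)
result = io.StringIO(newline="")
kept = drop_rows(source, result, ["John Doe", "Jim Foe"])
print(result.getvalue())
```

With real files, the same function could be called with open("myfile.csv", newline="") as the input and open("corrected.csv", "w", newline="") as the output; passing newline="" is what the csv module documentation recommends, and it avoids extra blank lines on Windows.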