Deleting rows with Python in a CSV file
Question:
All I would like to do is delete a row if it has a value of ‘0’ in the third column. An example of the data would be something like:
6.5, 5.4, 0, 320
6.5, 5.4, 1, 320
So the first row would need to be deleted whereas the second would stay.
What I have so far is as follows:
import csv
input = open('first.csv', 'rb')
output = open('first_edit.csv', 'wb')
writer = csv.writer(output)
for row in csv.reader(input):
    if row[2]!=0:
        writer.writerow(row)
input.close()
output.close()
Any help would be great
Answers:
You should have if row[2] != "0". Otherwise it’s not checking whether the string value is equal to "0".
You are very close. Currently you compare row[2] with the integer 0; compare it with the string "0" instead. Data read from a file is a string, not an integer, which is why your integer check never matches and no row is dropped:
if row[2] != "0":
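To see why the integer check never matches, here is a minimal sketch, assuming Python 3 and an in-memory stand-in for a file that has no spaces after the commas (the names sample, rows and third are illustrative):

```python
import csv
import io

# In-memory stand-in for a comma-separated file (no spaces after the commas).
sample = io.StringIO("6.5,5.4,0,320\n6.5,5.4,1,320\n")

rows = list(csv.reader(sample))
third = rows[0][2]

print(type(third))   # <class 'str'>
print(third != 0)    # True: a str never equals an int, so no row is skipped
print(third != "0")  # False: the string comparison detects the zero
```

Every field that csv.reader yields is a string, so any numeric comparison needs either a string literal or an explicit conversion such as int(third).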
Also, you can use the with statement to make the code slightly more Pythonic: it gets shorter, and you can omit the .close() calls:
import csv
with open('first.csv', 'rb') as inp, open('first_edit.csv', 'wb') as out:
    writer = csv.writer(out)
    for row in csv.reader(inp):
        if row[2] != "0":
            writer.writerow(row)
Note that input is a Python builtin, so I’ve used another variable name instead.
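The 'rb' and 'wb' modes used here are the Python 2 convention for the csv module. On Python 3 the csv module expects text-mode files opened with newline=''. A minimal sketch of the same filter, assuming Python 3; the function name drop_zero_rows and its column parameter are illustrative and not part of the original answer:

```python
import csv

def drop_zero_rows(src_path, dst_path, column=2):
    """Copy src_path to dst_path, skipping rows whose `column` field is "0"."""
    # newline="" lets the csv module handle line endings itself.
    with open(src_path, newline="") as inp, open(dst_path, "w", newline="") as out:
        writer = csv.writer(out)
        for row in csv.reader(inp):
            if row[column] != "0":
                writer.writerow(row)
```

It would be called as drop_zero_rows('first.csv', 'first_edit.csv'). As written it compares against the exact string "0", so it assumes the fields carry no surrounding whitespace.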
Edit: The values in your CSV file’s rows are separated by a comma and a space. In a normal CSV they would be separated by commas only, and a check against "0" would work. With your format, you can either compare the stripped value, row[2].strip() != "0", or check against " 0".
The better solution would be to correct the csv format, but in case you want to persist with the current one, the following will work with your given csv file format:
$ cat test.py
import csv
with open('first.csv', 'rb') as inp, open('first_edit.csv', 'wb') as out:
    writer = csv.writer(out)
    for row in csv.reader(inp):
        if row[2] != " 0":
            writer.writerow(row)
$ cat first.csv
6.5, 5.4, 0, 320
6.5, 5.4, 1, 320
$ python test.py
$ cat first_edit.csv
6.5, 5.4, 1, 320
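If matching the literal " 0" feels fragile, two alternatives are to let the reader discard the space after each delimiter, or to strip the field before comparing. A small sketch, assuming Python 3 and an in-memory copy of the sample data (the names raw, kept_a and kept_b are illustrative):

```python
import csv
import io

raw = "6.5, 5.4, 0, 320\n6.5, 5.4, 1, 320\n"

# Option 1: skipinitialspace=True makes the reader drop the space
# that follows each delimiter, so the field is exactly "0".
kept_a = [row for row in csv.reader(io.StringIO(raw), skipinitialspace=True)
          if row[2] != "0"]

# Option 2: keep the fields as they are and strip only for the comparison.
kept_b = [row for row in csv.reader(io.StringIO(raw))
          if row[2].strip() != "0"]

print(kept_a)  # [['6.5', '5.4', '1', '320']]
print(kept_b)  # [['6.5', ' 5.4', ' 1', ' 320']]
```

Option 1 also normalizes the output, since the written rows no longer carry the leading spaces; option 2 preserves the original spacing.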
Use the pandas library.
The solution for the question:
import pandas as pd
df = pd.read_csv(file)  # file: path to the CSV
# Keep only the rows whose "name" column is not exactly "dog";
# the comparison is against the whole cell value
df = df[df["name"] != "dog"]
df.to_csv(file, index=False)
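Applied to the data from the question, which has no header row and a space after each comma, the same idea might look like the sketch below. It assumes Python 3 with pandas installed and uses an in-memory copy of the sample rows; header=None and skipinitialspace=True are existing read_csv options, while the variable names are illustrative:

```python
import io
import pandas as pd

raw = io.StringIO("6.5, 5.4, 0, 320\n6.5, 5.4, 1, 320\n")

# header=None: the file has no header row, so columns are labelled 0, 1, 2, 3.
# skipinitialspace=True: ignore the space after each comma.
df = pd.read_csv(raw, header=None, skipinitialspace=True)

# pandas parses the third column as integers, so compare with the number 0.
kept = df[df[2] != 0]

out = io.StringIO()
kept.to_csv(out, index=False, header=False)
print(out.getvalue())
```

Unlike the csv module, pandas infers column types, so here the comparison is against the integer 0 rather than the string "0".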
A more general solution is to use this function:
import pandas as pd

def remove_specific_row_from_csv(file, column_name, *args):
    '''
    :param file: CSV file to remove the rows from (rewritten in place)
    :param column_name: the column that determines which rows are deleted
        (e.g. with column_name "Name" and args containing "Gavri", every
        row whose Name is "Gavri" is deleted)
    :param args: cell values to match in that column
    '''
    df = pd.read_csv(file)
    for value in args:
        # Index by column name rather than building an attribute lookup with eval
        df = df[df[column_name] != value]
    df.to_csv(file, index=False)
Example usage:
remove_specific_row_from_csv(file_name, "column_name", "dog_for_example", "cat_for_example")
Note: You can pass any number of string values to this function, and every row whose value in the given column matches one of them will be deleted.
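Removing several values from one column can also be done in a single step with Series.isin, rather than filtering once per value. A minimal sketch, assuming pandas is installed; the DataFrame contents and the names to_drop and kept are made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({"name": ["dog", "cat", "bird"], "legs": [4, 4, 2]})

# Keep rows whose "name" is not in the list of values to drop.
to_drop = ["dog", "cat"]
kept = df[~df["name"].isin(to_drop)]

print(kept["name"].tolist())  # ['bird']
```

The ~ operator inverts the boolean mask, so rows that match any value in to_drop are excluded.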