How to get the numerical value for each matching value in a row in a csv file
Question:
I have a csv file with "years" in row[0] and I need to get a count of how many times each year occurs and pair it with that year in a dictionary. To be clear, the year is the key and the amount of times it occurs in the csv is the value.
Here’s what I have, but I am missing something. I just can’t figure out how to get the count of each year to pair it with that year.
import csv

def incidents_per_year():
    dict = {}
    count = 0
    with open("saved_data.csv") as f:
        reader = csv.reader(f)
        next(reader)
        for row in reader:
            count += 1
            year = row[0]
            dict[year] = count
    return dict
Here is the part of the csv file (overall it’s 50,000 rows so this is a small subset).
year,month,hour_of_day,incident_type_primary,day_of_week
2022,6,8,LARCENY/THEFT,Monday
2016,10,5,ASSAULT,Tuesday
2016,8,12,LARCENY/THEFT,Wednesday
2014,9,5,LARCENY/THEFT,Sunday
2015,8,7,ASSAULT,Wednesday
2016,11,2,LARCENY/THEFT,Tuesday
2015,7,11,ASSAULT,Friday
2015,4,12,LARCENY/THEFT,Friday
2016,3,2,BURGLARY,Wednesday
2014,10,4,LARCENY/THEFT,Thursday
2016,8,3,LARCENY/THEFT,Friday
2016,1,12,LARCENY/THEFT,Monday
2016,3,1,BURGLARY,Friday
2014,8,7,BURGLARY,Saturday
2017,4,10,UUV,Wednesday
2017,6,5,BURGLARY,Thursday
2017,1,4,BURGLARY,Wednesday
2016,7,9,BURGLARY,Thursday
2015,9,9,LARCENY/THEFT,Monday
2017,4,12,LARCENY/THEFT,Thursday
2016,3,4,LARCENY/THEFT,Friday
2016,4,5,BURGLARY,Thursday
2017,10,12,LARCENY/THEFT,Sunday
2015,7,11,ASSAULT,Monday
2012,5,12,LARCENY/THEFT,Friday
2014,12,11,LARCENY/THEFT,Thursday
2015,3,4,LARCENY/THEFT,Tuesday
2017,11,8,LARCENY/THEFT,Wednesday
2011,7,17,LARCENY/THEFT,Thursday
2015,9,17,ROBBERY,Wednesday
2015,5,12,BURGLARY,Thursday
2013,11,14,ASSAULT,Tuesday
2015,6,16,LARCENY/THEFT,Friday
2010,10,18,LARCENY/THEFT,Monday
2007,8,21,LARCENY/THEFT,Tuesday
2015,5,16,LARCENY/THEFT,Tuesday
2013,11,8,LARCENY/THEFT,Wednesday
2007,6,15,BURGLARY,Sunday
2012,5,19,ASSAULT,Tuesday
2007,8,20,LARCENY/THEFT,Thursday
2018,7,18,LARCENY/THEFT,Sunday
2019,2,2,BURGLARY,Tuesday
2012,11,20,BURGLARY,Monday
2012,6,15,ASSAULT,Wednesday
2011,10,21,THEFT OF SERVICES,Monday
2008,11,23,LARCENY/THEFT,Wednesday
2014,7,4,BURGLARY,Wednesday
2013,11,9,LARCENY/THEFT,Wednesday
2021,7,11,ASSAULT,Saturday
2018,7,12,LARCENY/THEFT,Friday
2013,2,0,LARCENY/THEFT,Friday
2007,1,0,ROBBERY,Thursday
2008,7,4,ASSAULT,Saturday
2007,8,4,BURGLARY,Saturday
2019,12,17,LARCENY/THEFT,Thursday
2018,7,0,LARCENY/THEFT,Wednesday
2010,2,12,BURGLARY,Sunday
2012,4,3,BURGLARY,Tuesday
2012,1,23,LARCENY/THEFT,Monday
2006,1,23,LARCENY/THEFT,Wednesday
2015,6,0,LARCENY/THEFT,Tuesday
2015,4,21,BURGLARY,Wednesday
2017,5,11,ROBBERY,Tuesday
2017,9,16,LARCENY/THEFT,Wednesday
2016,7,12,LARCENY/THEFT,Friday
2006,6,6,LARCENY/THEFT,Friday
2010,5,0,BURGLARY,Thursday
2010,11,21,ROBBERY,Wednesday
2011,2,3,BURGLARY,Friday
2017,8,0,LARCENY/THEFT,Wednesday
2011,12,23,BURGLARY,Friday
2012,8,0,LARCENY/THEFT,Saturday
2012,7,22,ROBBERY,Thursday
2016,9,8,ASSAULT,Saturday
2013,7,7,BURGLARY,Friday
2010,4,5,ASSAULT,Saturday
2022,3,13,LARCENY/THEFT,Saturday
2009,5,13,BURGLARY,Thursday
2010,2,12,ASSAULT,Saturday
2011,12,20,LARCENY/THEFT,Friday
2007,10,4,BURGLARY,Thursday
2007,8,19,LARCENY/THEFT,Saturday
2011,12,4,LARCENY/THEFT,Saturday
2012,10,23,UUV,Friday
2018,4,19,UUV,Sunday
2010,5,13,LARCENY/THEFT,Wednesday
2017,5,11,BURGLARY,Saturday
2009,9,2,ASSAULT,Thursday
2016,6,0,LARCENY/THEFT,Wednesday
2012,4,4,LARCENY/THEFT,Monday
2009,9,19,BURGLARY,Tuesday
2009,6,10,BURGLARY,Monday
2007,10,0,LARCENY/THEFT,Wednesday
2016,5,1,ASSAULT,Tuesday
2010,10,0,ROBBERY,Friday
2013,10,11,LARCENY/THEFT,Monday
2018,9,19,BURGLARY,Friday
2006,6,14,BURGLARY,Wednesday
2010,5,21,ASSAULT,Sunday
2010,6,10,LARCENY/THEFT,Monday
2018,9,10,LARCENY/THEFT,Friday
2007,11,0,LARCENY/THEFT,Tuesday
2008,8,22,ASSAULT,Thursday
2016,10,19,LARCENY/THEFT,Wednesday
2018,1,1,LARCENY/THEFT,Tuesday
2015,8,7,BURGLARY,Friday
2016,4,20,LARCENY/THEFT,Tuesday
2015,10,1,LARCENY/THEFT,Thursday
2010,3,15,ASSAULT,Monday
2014,4,4,ASSAULT,Monday
2011,10,21,LARCENY/THEFT,Friday
2016,9,12,LARCENY/THEFT,Thursday
2011,8,10,LARCENY/THEFT,Wednesday
2012,10,16,LARCENY/THEFT,Friday
2016,3,20,ASSAULT,Saturday
2020,11,11,UUV,Tuesday
2013,11,5,LARCENY/THEFT,Wednesday
2010,4,4,BURGLARY,Friday
2011,9,23,ASSAULT,Friday
2008,10,14,LARCENY/THEFT,Thursday
2015,6,0,UUV,Saturday
2010,12,23,LARCENY/THEFT,Saturday
2015,6,14,LARCENY/THEFT,Tuesday
2008,10,22,ASSAULT,Friday
2010,11,12,BURGLARY,Monday
2006,5,20,ASSAULT,Sunday
2012,9,16,BURGLARY,Sunday
2020,7,3,ASSAULT,Thursday
2014,1,5,BURGLARY,Tuesday
2015,4,1,ASSAULT,Thursday
2014,10,7,LARCENY/THEFT,Sunday
2007,11,9,LARCENY/THEFT,Wednesday
2008,7,17,BURGLARY,Sunday
2011,4,23,BURGLARY,Saturday
2014,7,17,LARCENY/THEFT,Wednesday
2008,10,10,LARCENY/THEFT,Tuesday
2007,7,18,LARCENY/THEFT,Sunday
2011,3,18,ROBBERY,Wednesday
2010,12,0,LARCENY/THEFT,Thursday
2013,5,0,LARCENY/THEFT,Tuesday
2006,9,14,LARCENY/THEFT,Friday
2014,2,1,ROBBERY,Thursday
2020,5,17,UUV,Sunday
2007,4,23,LARCENY/THEFT,Sunday
2015,6,12,LARCENY/THEFT,Monday
2010,1,5,ROBBERY,Monday
2011,11,18,LARCENY/THEFT,Tuesday
2008,10,23,LARCENY/THEFT,Thursday
2019,8,17,UUV,Friday
2006,9,17,LARCENY/THEFT,Friday
2015,7,9,LARCENY/THEFT,Monday
2013,2,23,ROBBERY,Sunday
2012,8,15,ASSAULT,Sunday
2015,3,0,LARCENY/THEFT,Friday
2006,12,15,BURGLARY,Thursday
2021,12,10,LARCENY/THEFT,Thursday
2006,11,11,BURGLARY,Sunday
2009,7,0,LARCENY/THEFT,Tuesday
2006,5,17,LARCENY/THEFT,Thursday
2016,7,0,BURGLARY,Wednesday
2017,1,14,LARCENY/THEFT,Tuesday
2010,11,13,LARCENY/THEFT,Tuesday
2015,9,13,BURGLARY,Wednesday
2008,10,1,BURGLARY,Wednesday
2009,4,22,LARCENY/THEFT,Thursday
2016,5,20,ASSAULT,Wednesday
2009,7,12,LARCENY/THEFT,Thursday
2021,6,20,LARCENY/THEFT,Sunday
Answers:
You are using the count variable incorrectly: it counts every row in the file rather than each year separately. Try this:
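import csv

def incidents_per_year():
    counts = {}
    with open("saved_data.csv") as f:
        reader = csv.reader(f)
        next(reader)  # skip the header row
        for row in reader:
            year = row[0]
            # .get() returns None for an unseen year, so "or 0" starts it at 0
            counts[year] = (counts.get(year) or 0) + 1
    return counts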
For every row in the file, the .get(year) lookup returns None if the dictionary doesn’t contain that year yet, so the "or 0" makes the count start from 0. Either way, 1 is then added to the count for that specific year.
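As a rough sketch of the difference, the two approaches can be run side by side on an in-memory sample. The helper names and the use of io.StringIO are made up for illustration; the rows are the first five from the question.

```python
import csv
import io

# Header plus the first five data rows of the sample in the question.
SAMPLE = """year,month,hour_of_day,incident_type_primary,day_of_week
2022,6,8,LARCENY/THEFT,Monday
2016,10,5,ASSAULT,Tuesday
2016,8,12,LARCENY/THEFT,Wednesday
2014,9,5,LARCENY/THEFT,Sunday
2015,8,7,ASSAULT,Wednesday
"""

def running_total(lines):
    """Hypothetical helper: the question's approach, one counter for all years."""
    result = {}
    count = 0
    reader = csv.reader(lines)
    next(reader)  # skip the header row
    for row in reader:
        count += 1
        result[row[0]] = count  # stores the row number, not a per-year count
    return result

def per_year(lines):
    """Hypothetical helper: a separate count kept for each year."""
    result = {}
    reader = csv.reader(lines)
    next(reader)  # skip the header row
    for row in reader:
        year = row[0]
        result[year] = (result.get(year) or 0) + 1
    return result

shared = running_total(io.StringIO(SAMPLE))
separate = per_year(io.StringIO(SAMPLE))
print(shared)    # {'2022': 1, '2016': 3, '2014': 4, '2015': 5}
print(separate)  # {'2022': 1, '2016': 2, '2014': 1, '2015': 1}
```

Note that the keys are strings, because csv.reader does not convert values.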
First, I’d suggest not using built-in names such as dict as variable names, because that shadows the built-in.
Second, your code uses a single counter for the whole file, and therefore the count is not unique for each year.
import csv

def incidents_per_year():
    year_dict = {}
    with open("saved_data.csv") as f:
        reader = csv.reader(f)
        next(reader)  # skip the header row
        for row in reader:
            year = row[0]
            # the default must be passed positionally; dict.get takes no keyword arguments
            year_dict[year] = year_dict.get(year, 0) + 1
    return year_dict
In this code, I’m using the dict.get method, which looks up the key and returns its value if the key is in the dict; otherwise it returns a default value (None if no default is passed).
This way, I’m making sure that each year is counted separately with its own counter.
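A minimal sketch of how dict.get behaves in the three cases described above (the dictionary contents and variable names are made up for illustration):

```python
counts = {"2016": 2}

present = counts.get("2016", 0)   # key exists: the stored value is returned
missing = counts.get("1999", 0)   # key absent: the supplied default is returned
no_default = counts.get("1999")   # key absent and no default given: None

# The default can only be passed positionally. Passing it as a keyword
# raises TypeError, which is caught here to show the behaviour.
try:
    counts.get("1999", default=0)
    keyword_accepted = True
except TypeError:
    keyword_accepted = False

print(present, missing, no_default, keyword_accepted)  # 2 0 None False
```

Neither call modifies the dictionary; the year is only stored when the result is assigned back.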
You can try using DictReader from csv, which uses the CSV header row as the keys for each row:
from csv import DictReader

def incidents_per_year():
    res = {}
    with open("saved_data.csv") as f:
        reader = DictReader(f)  # consumes the header row itself
        for row in reader:
            if row['year'] in res:
                res[row['year']] += 1
            else:
                res[row['year']] = 1
    return res
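To see what DictReader yields, here is a small sketch using an in-memory sample (the two-row sample is made up for illustration). Each row comes back as a mapping keyed by the header names, with every value as a string.

```python
import csv
import io

# Hypothetical two-row sample with a header line.
sample = io.StringIO("year,month\n2022,6\n2016,10\n")

rows = list(csv.DictReader(sample))
first_year = rows[0]["year"]  # look up a column by its header name

print(rows[0])     # {'year': '2022', 'month': '6'}
print(first_year)  # 2022
```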
Alternatively, use Counter from collections to count how often each value occurs:
from csv import DictReader
from collections import Counter

def incidents_per_year():
    # Use a with block so the file is closed once counting is done.
    with open("saved_data.csv") as f:
        return dict(Counter(row['year'] for row in DictReader(f)))
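If you keep the Counter instead of converting it straight to a dict, a couple of follow-up steps become easy. This is only a sketch: the list of years is made up, and converting the keys to int is an assumption about what you might want downstream.

```python
from collections import Counter

# Hypothetical list of year strings, as they would come out of the CSV.
years = ["2016", "2015", "2016", "2014", "2016", "2015"]
counts = Counter(years)

# The two most frequent years, highest count first.
top_two = counts.most_common(2)

# A plain dict ordered by year, with the keys converted to integers.
by_year = {int(year): n for year, n in sorted(counts.items())}

print(top_two)  # [('2016', 3), ('2015', 2)]
print(by_year)  # {2014: 1, 2015: 2, 2016: 3}
```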