How to get the numerical value for each matching value in a row in a csv file

Question:

I have a CSV file with years in row[0], and I need to count how many times each year occurs and pair that count with the year in a dictionary. To be clear, the year is the key and the number of times it occurs in the CSV is the value.

Here’s what I have, but I am missing something. I just can’t figure out how to get the count of each year to pair it with that year.

def incidents_per_year():
  dict = {}
  count = 0
  with open("saved_data.csv") as f:
    reader = csv.reader(f)
    next(reader)
    for row in reader:
      count += 1
      year = row[0]
      dict[year] = count
  return dict

Here is part of the CSV file (overall it’s 50,000 rows, so this is a small subset).

year,month,hour_of_day,incident_type_primary,day_of_week
2022,6,8,LARCENY/THEFT,Monday
2016,10,5,ASSAULT,Tuesday
2016,8,12,LARCENY/THEFT,Wednesday
2014,9,5,LARCENY/THEFT,Sunday
2015,8,7,ASSAULT,Wednesday
2016,11,2,LARCENY/THEFT,Tuesday
2015,7,11,ASSAULT,Friday
2015,4,12,LARCENY/THEFT,Friday
2016,3,2,BURGLARY,Wednesday
2014,10,4,LARCENY/THEFT,Thursday
2016,8,3,LARCENY/THEFT,Friday
2016,1,12,LARCENY/THEFT,Monday
2016,3,1,BURGLARY,Friday
2014,8,7,BURGLARY,Saturday
2017,4,10,UUV,Wednesday
2017,6,5,BURGLARY,Thursday
2017,1,4,BURGLARY,Wednesday
2016,7,9,BURGLARY,Thursday
2015,9,9,LARCENY/THEFT,Monday
2017,4,12,LARCENY/THEFT,Thursday
2016,3,4,LARCENY/THEFT,Friday
2016,4,5,BURGLARY,Thursday
2017,10,12,LARCENY/THEFT,Sunday
2015,7,11,ASSAULT,Monday
2012,5,12,LARCENY/THEFT,Friday
2014,12,11,LARCENY/THEFT,Thursday
2015,3,4,LARCENY/THEFT,Tuesday
2017,11,8,LARCENY/THEFT,Wednesday
2011,7,17,LARCENY/THEFT,Thursday
2015,9,17,ROBBERY,Wednesday
2015,5,12,BURGLARY,Thursday
2013,11,14,ASSAULT,Tuesday
2015,6,16,LARCENY/THEFT,Friday
2010,10,18,LARCENY/THEFT,Monday
2007,8,21,LARCENY/THEFT,Tuesday
2015,5,16,LARCENY/THEFT,Tuesday
2013,11,8,LARCENY/THEFT,Wednesday
2007,6,15,BURGLARY,Sunday
2012,5,19,ASSAULT,Tuesday
2007,8,20,LARCENY/THEFT,Thursday
2018,7,18,LARCENY/THEFT,Sunday
2019,2,2,BURGLARY,Tuesday
2012,11,20,BURGLARY,Monday
2012,6,15,ASSAULT,Wednesday
2011,10,21,THEFT OF SERVICES,Monday
2008,11,23,LARCENY/THEFT,Wednesday
2014,7,4,BURGLARY,Wednesday
2013,11,9,LARCENY/THEFT,Wednesday
2021,7,11,ASSAULT,Saturday
2018,7,12,LARCENY/THEFT,Friday
2013,2,0,LARCENY/THEFT,Friday
2007,1,0,ROBBERY,Thursday
2008,7,4,ASSAULT,Saturday
2007,8,4,BURGLARY,Saturday
2019,12,17,LARCENY/THEFT,Thursday
2018,7,0,LARCENY/THEFT,Wednesday
2010,2,12,BURGLARY,Sunday
2012,4,3,BURGLARY,Tuesday
2012,1,23,LARCENY/THEFT,Monday
2006,1,23,LARCENY/THEFT,Wednesday
2015,6,0,LARCENY/THEFT,Tuesday
2015,4,21,BURGLARY,Wednesday
2017,5,11,ROBBERY,Tuesday
2017,9,16,LARCENY/THEFT,Wednesday
2016,7,12,LARCENY/THEFT,Friday
2006,6,6,LARCENY/THEFT,Friday
2010,5,0,BURGLARY,Thursday
2010,11,21,ROBBERY,Wednesday
2011,2,3,BURGLARY,Friday
2017,8,0,LARCENY/THEFT,Wednesday
2011,12,23,BURGLARY,Friday
2012,8,0,LARCENY/THEFT,Saturday
2012,7,22,ROBBERY,Thursday
2016,9,8,ASSAULT,Saturday
2013,7,7,BURGLARY,Friday
2010,4,5,ASSAULT,Saturday
2022,3,13,LARCENY/THEFT,Saturday
2009,5,13,BURGLARY,Thursday
2010,2,12,ASSAULT,Saturday
2011,12,20,LARCENY/THEFT,Friday
2007,10,4,BURGLARY,Thursday
2007,8,19,LARCENY/THEFT,Saturday
2011,12,4,LARCENY/THEFT,Saturday
2012,10,23,UUV,Friday
2018,4,19,UUV,Sunday
2010,5,13,LARCENY/THEFT,Wednesday
2017,5,11,BURGLARY,Saturday
2009,9,2,ASSAULT,Thursday
2016,6,0,LARCENY/THEFT,Wednesday
2012,4,4,LARCENY/THEFT,Monday
2009,9,19,BURGLARY,Tuesday
2009,6,10,BURGLARY,Monday
2007,10,0,LARCENY/THEFT,Wednesday
2016,5,1,ASSAULT,Tuesday
2010,10,0,ROBBERY,Friday
2013,10,11,LARCENY/THEFT,Monday
2018,9,19,BURGLARY,Friday
2006,6,14,BURGLARY,Wednesday
2010,5,21,ASSAULT,Sunday
2010,6,10,LARCENY/THEFT,Monday
2018,9,10,LARCENY/THEFT,Friday
2007,11,0,LARCENY/THEFT,Tuesday
2008,8,22,ASSAULT,Thursday
2016,10,19,LARCENY/THEFT,Wednesday
2018,1,1,LARCENY/THEFT,Tuesday
2015,8,7,BURGLARY,Friday
2016,4,20,LARCENY/THEFT,Tuesday
2015,10,1,LARCENY/THEFT,Thursday
2010,3,15,ASSAULT,Monday
2014,4,4,ASSAULT,Monday
2011,10,21,LARCENY/THEFT,Friday
2016,9,12,LARCENY/THEFT,Thursday
2011,8,10,LARCENY/THEFT,Wednesday
2012,10,16,LARCENY/THEFT,Friday
2016,3,20,ASSAULT,Saturday
2020,11,11,UUV,Tuesday
2013,11,5,LARCENY/THEFT,Wednesday
2010,4,4,BURGLARY,Friday
2011,9,23,ASSAULT,Friday
2008,10,14,LARCENY/THEFT,Thursday
2015,6,0,UUV,Saturday
2010,12,23,LARCENY/THEFT,Saturday
2015,6,14,LARCENY/THEFT,Tuesday
2008,10,22,ASSAULT,Friday
2010,11,12,BURGLARY,Monday
2006,5,20,ASSAULT,Sunday
2012,9,16,BURGLARY,Sunday
2020,7,3,ASSAULT,Thursday
2014,1,5,BURGLARY,Tuesday
2015,4,1,ASSAULT,Thursday
2014,10,7,LARCENY/THEFT,Sunday
2007,11,9,LARCENY/THEFT,Wednesday
2008,7,17,BURGLARY,Sunday
2011,4,23,BURGLARY,Saturday
2014,7,17,LARCENY/THEFT,Wednesday
2008,10,10,LARCENY/THEFT,Tuesday
2007,7,18,LARCENY/THEFT,Sunday
2011,3,18,ROBBERY,Wednesday
2010,12,0,LARCENY/THEFT,Thursday
2013,5,0,LARCENY/THEFT,Tuesday
2006,9,14,LARCENY/THEFT,Friday
2014,2,1,ROBBERY,Thursday
2020,5,17,UUV,Sunday
2007,4,23,LARCENY/THEFT,Sunday
2015,6,12,LARCENY/THEFT,Monday
2010,1,5,ROBBERY,Monday
2011,11,18,LARCENY/THEFT,Tuesday
2008,10,23,LARCENY/THEFT,Thursday
2019,8,17,UUV,Friday
2006,9,17,LARCENY/THEFT,Friday
2015,7,9,LARCENY/THEFT,Monday
2013,2,23,ROBBERY,Sunday
2012,8,15,ASSAULT,Sunday
2015,3,0,LARCENY/THEFT,Friday
2006,12,15,BURGLARY,Thursday
2021,12,10,LARCENY/THEFT,Thursday
2006,11,11,BURGLARY,Sunday
2009,7,0,LARCENY/THEFT,Tuesday
2006,5,17,LARCENY/THEFT,Thursday
2016,7,0,BURGLARY,Wednesday
2017,1,14,LARCENY/THEFT,Tuesday
2010,11,13,LARCENY/THEFT,Tuesday
2015,9,13,BURGLARY,Wednesday
2008,10,1,BURGLARY,Wednesday
2009,4,22,LARCENY/THEFT,Thursday
2016,5,20,ASSAULT,Wednesday
2009,7,12,LARCENY/THEFT,Thursday
2021,6,20,LARCENY/THEFT,Sunday
Asked By: P Mastr


Answers:

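Your count variable is shared by every row, so each year ends up holding the number of the last row it appeared on rather than its own total. Try this: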

import csv

def incidents_per_year():
  dict = {}
  with open("saved_data.csv") as f:
    reader = csv.reader(f)
    next(reader)  # skip the header row
    for row in reader:
      year = row[0]
      # a missing year yields None, which "or 0" turns into 0
      dict[year] = (dict.get(year) or 0) + 1
  return dict

For every row in the file, this looks up the current count for that row's year, treats it as 0 if the dict doesn’t contain that year yet, and then adds 1, so each year keeps its own total.
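To see the difference between the two loops without the file, here is a small sketch that runs both on the years from the first six data rows of the question's sample. The function names running_count and per_year_count are made up for this illustration.

```python
def running_count(years):
    """Mimics the question's loop: one counter shared by all years."""
    result = {}
    count = 0
    for year in years:
        count += 1
        result[year] = count  # stores the row number, not a per-year total
    return result


def per_year_count(years):
    """Mimics the fix: each year keeps its own total."""
    result = {}
    for year in years:
        result[year] = (result.get(year) or 0) + 1
    return result


# Years from the first six data rows of the sample CSV in the question
sample = ["2022", "2016", "2016", "2014", "2015", "2016"]

print(running_count(sample))   # {'2022': 1, '2016': 6, '2014': 4, '2015': 5}
print(per_year_count(sample))  # {'2022': 1, '2016': 3, '2014': 1, '2015': 1}
```

In the first result, 2016 maps to 6 only because its last occurrence was on the sixth row; in the second it maps to 3, the actual number of occurrences.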

Answered By: Yip

First, I’d suggest not using built-in names such as dict as variable names, because that shadows the built-in.
Second, your code uses count as a single counter shared by every row, and therefore it is not unique to each year.

import csv

def incidents_per_year():
    year_dict = {}
    with open("saved_data.csv") as f:
        reader = csv.reader(f)
        next(reader)  # skip the header row
        for row in reader:
            year = row[0]
            # the default must be passed positionally; dict.get takes no keyword arguments
            year_dict[year] = year_dict.get(year, 0) + 1
    return year_dict

In this code, I’m using the dict.get method, which takes a key and returns its value if the key is in the dict, or else a default value (None if no default is passed).

This way, I’m making sure that each year is counted separately with its own counter.
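As a quick illustration of how dict.get behaves, here is a minimal sketch on a hand-made dictionary (the counts variable and its contents are invented for the example). It also shows that, at least in CPython, the default has to be given positionally.

```python
counts = {"2016": 2}

present = counts.get("2016", 0)       # key exists, so its value is returned
missing = counts.get("2011", 0)       # key is absent, so the default 0 is returned
missing_no_default = counts.get("2011")  # no default given, so None is returned

print(present, missing, missing_no_default)  # 2 0 None

# Passing the default by keyword is rejected in CPython
keyword_rejected = False
try:
    counts.get("2011", default=0)
except TypeError:
    keyword_rejected = True

print(keyword_rejected)  # True
```

Because the missing case returns 0 instead of None, adding 1 to the result works the first time a year is seen.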

Answered By: DjLegolas

You can try using DictReader from csv, which reads the CSV header and uses the column names as keys for each row:

from csv import DictReader

def incidents_per_year():
    res = {}
    with open("saved_data.csv") as f:
        reader = DictReader(f)
        for k in reader:
            if k['year'] in res:
                res[k['year']] += 1
            else:
                res[k['year']] = 1
    return res

Or use Counter from collections to count how often each value occurs:

from csv import DictReader
from collections import Counter

def incidents_per_year():
    # open the file in a with block so it is closed once counting is done
    with open("saved_data.csv") as f:
        return dict(Counter(k['year'] for k in DictReader(f)))
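If you keep the Counter instead of converting it to a dict, you also get helpers such as most_common. The sketch below is only an illustration: it reads the header and the first five data rows of the question's sample from an in-memory io.StringIO rather than from saved_data.csv.

```python
import csv
import io
from collections import Counter

# Header and first five data rows from the question, held in memory
sample = io.StringIO(
    "year,month,hour_of_day,incident_type_primary,day_of_week\n"
    "2022,6,8,LARCENY/THEFT,Monday\n"
    "2016,10,5,ASSAULT,Tuesday\n"
    "2016,8,12,LARCENY/THEFT,Wednesday\n"
    "2014,9,5,LARCENY/THEFT,Sunday\n"
    "2015,8,7,ASSAULT,Wednesday\n"
)

# DictReader consumes the header, so every row is a dict keyed by column name
counts = Counter(row["year"] for row in csv.DictReader(sample))

print(counts.most_common(1))   # [('2016', 2)]
print(sorted(counts.items()))  # [('2014', 1), ('2015', 1), ('2016', 2), ('2022', 1)]
```

Note that the years are still strings here; convert them with int(row["year"]) if you need numeric keys.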
Answered By: Arifa Chan