Python: Trying to create nested dictionary from CSV File, but getting KeyError

Question:

I am currently trying to create a nested dictionary from a CSV file.

The CSV file shows how many people of each demographic there are in each region. In the nested dictionary, each key is a region and its value is another dictionary. The inner dictionary uses the demographic as the key and the number of people as the value.

Region,American,Asian,Black
midwest,2500,2300,2150
north,1200,2300,2300
south,1211,211,2100

Currently have:

def load_csv(filename):
    data={}
    with open(filename) as csvfile:
        fh = csv.DictReader(csvfile)
        for row in fh:
            key = row.pop('Region')
            data[key] = row
        return data

Expected Output (must convert the numbers from strings to integers):

{'midwest': {'American': 2500, 'Asian': 2300, ...}, 'north': {'American': 1200, ...}, ...}

I’m stuck because running my code raises "KeyError: 'Region'".

Asked By: Ak2012


Answers:

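The KeyError happens because your file starts with 3 invisible characters (most likely a UTF-8 byte order mark), so the first column name is not exactly 'Region'. Skip those characters, then use a dict comprehension to convert the string values to integers: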

import csv

def load_csv(filename):
    data = {}
    with open(filename) as csvfile:
        # Your file has 3 invisible characters at the beginning, skip them
        csvfile.seek(3)
        fh = csv.DictReader(csvfile)
        for row in fh:
            key = row.pop('Region')
            data[key] = {k: int(v) for k, v in row.items()}  # <- HERE
        return data

data = load_csv('data.csv')

Output:

>>> data
{'midwest': {'American': 2500, 'Asian': 2300, 'Black': 2150},
 'north': {'American': 1200, 'Asian': 2300, 'Black': 2300},
 'south': {'American': 1211, 'Asian': 211, 'Black': 2100}}
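Note that seek(3) only works if the file really starts with exactly three extra bytes. If those bytes are a UTF-8 byte order mark (an assumption, since the file itself is not shown), a more forgiving sketch is to open the file with encoding='utf-8-sig', which drops the mark when it is present and reads the file normally when it is not. The sample file written below is a stand-in for your data.csv, created only so the example runs on its own:

```python
import csv
import os
import tempfile

def load_csv(filename):
    data = {}
    # 'utf-8-sig' strips a leading UTF-8 BOM if there is one;
    # newline='' is what the csv module docs recommend for open()
    with open(filename, encoding='utf-8-sig', newline='') as csvfile:
        for row in csv.DictReader(csvfile):
            key = row.pop('Region')
            data[key] = {k: int(v) for k, v in row.items()}
    return data

# Stand-in for data.csv: writing with 'utf-8-sig' puts a BOM at the start
sample = (
    "Region,American,Asian,Black\n"
    "midwest,2500,2300,2150\n"
    "north,1200,2300,2300\n"
    "south,1211,211,2100\n"
)
with tempfile.NamedTemporaryFile('w', suffix='.csv', delete=False,
                                 encoding='utf-8-sig') as tmp:
    tmp.write(sample)
    path = tmp.name

# Reading as plain UTF-8 exposes the hidden character in front of 'Region',
# which is why row.pop('Region') fails in the original code
with open(path, encoding='utf-8', newline='') as f:
    first_header = csv.DictReader(f).fieldnames[0]
print(repr(first_header))

data = load_csv(path)
os.remove(path)
print(data['south'])  # {'American': 1211, 'Asian': 211, 'Black': 2100}
```

Printing reader.fieldnames like this is also a quick way to check what the header of your own file actually contains before deciding how to open it.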

Bonus: The same operation with Pandas:

import pandas as pd

data = pd.read_csv('data.csv', index_col='Region').T.to_dict()
print(data)

# Output
{'midwest': {'American': 2500, 'Asian': 2300, 'Black': 2150},
 'north': {'American': 1200, 'Asian': 2300, 'Black': 2300},
 'south': {'American': 1211, 'Asian': 211, 'Black': 2100}}
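
As a small variation on the Pandas version, to_dict(orient='index') builds the same region-keyed dictionary without transposing first. This is a sketch that reads the sample rows from an in-memory string (csv_text is a stand-in for data.csv) so it can be run as is:

```python
import io

import pandas as pd

# Stand-in for data.csv, using the rows from the question
csv_text = """Region,American,Asian,Black
midwest,2500,2300,2150
north,1200,2300,2300
south,1211,211,2100
"""

df = pd.read_csv(io.StringIO(csv_text), index_col='Region')

# orient='index' maps each index label (region) to a {column: value} dict
by_region = df.to_dict(orient='index')

# Same result as the transpose-based version above
assert by_region == df.T.to_dict()
print(by_region['north'])
```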
Answered By: Corralien