Copy keys to a new dictionary (Python)
Question:
I’m reading a csv file, using DictReader(). The function returns a dictionary, where the header items are the keys and the cells are the values. Pretty cool.
But I’m trying to account for rows where the data may not be what I expect it to be. In that case (I’m catching a ValueError exception), I would like the rows that are ‘suspect’ to go into a separate dictionary, for manual processing.
My question is this: since my first dictionary (the object returned by DictReader) has all of its keys set up properly, how do I copy just the keys into my second dictionary, the one which I want to be just a dictionary of suspect rows, to be manually processed?
I’ve been toying around with dict.fromkeys() and such for a while now and I’m just not getting anywhere. Halp!
EDIT: Pasting some of my erroneous code. Going to go hide in shame of my code. Don’t judge me! 😉
unsure_rows = dict.fromkeys(dict(csv_reader).keys(), [])
for row in csv_reader:
    # if row['Start Time'] != 'None':
    try:
        if before_date > strptime(row['Start Time'], '%Y-%m-%d %H:%M:%S') > after_date:
            continue
    except ValueError:
        unsure_rows += row

ValueError: dictionary update sequence element #0 has length 13; 2 is required
Answers:
You wouldn’t want to copy “just the keys” into another dictionary; if you have “just the keys”, what you have is a set.
To get the keys from a dict d, you need only say d.keys().
In Python 2 this returns a list (with keys in arbitrary order); in Python 3 it returns a view object. Either way, you can copy the keys into a set with
set(d.keys())
Example (output as printed by Python 2):
>>> d = {'one': 1, 'three': 3, 'two': 2, 'four': 4}
>>> set(d.keys())
set(['four', 'one', 'three', 'two'])
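For comparison, here is a minimal sketch of the same idea on Python 3, where d.keys() is a view rather than a list. The dictionary is the one from the example above; the names keys_view, key_list and key_set are only illustrative.

```python
d = {'one': 1, 'three': 3, 'two': 2, 'four': 4}

keys_view = d.keys()   # a dynamic view of the keys, not a list
key_list = list(d)     # a snapshot of the keys as a list
key_set = set(d)       # iterating a dict yields its keys, so this is a set of keys

d['five'] = 5          # the view follows later changes, the snapshots do not
print('five' in keys_view)   # True
print('five' in key_set)     # False
```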
EDIT:
Now I see that you intend to capture suspect key-value pairs as you catch exceptions. In this case, just start with an empty dictionary:
suspect = {}
And inside your code, which I would imagine is some kind of loop, add the suspect key-value pairs like so:
while something():
    k, v = generate_pair()
    try:
        # placeholders for whatever check and bookkeeping you actually do
        analyze_something_with_k_and_v_that_might_throw_an_exception(k, v)
        add_it_to_regular_dict(k, v)
    except ValueError:
        suspect[k] = v
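Applied to the CSV case from the question, the pattern might look roughly like the sketch below. The sample data, the column names and the choice of the reader's line number as the key are all assumptions made for illustration; only csv.DictReader and the 'Start Time' format come from the question.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample data; the second data row has an unparseable Start Time.
sample = io.StringIO(
    "Name,Start Time\n"
    "a,2012-01-05 10:00:00\n"
    "b,None\n"
    "c,2012-03-01 09:30:00\n"
)

reader = csv.DictReader(sample)
good = []
suspect = {}   # maps a line number to the whole suspect row

for row in reader:
    try:
        datetime.strptime(row['Start Time'], '%Y-%m-%d %H:%M:%S')
    except ValueError:
        # each row is already a dict keyed by the header, so store it as-is
        suspect[reader.line_num] = row
    else:
        good.append(row)

print(len(good), len(suspect))   # 2 1
```

Because every row yielded by DictReader already carries the header names as keys, there is no need to copy the keys anywhere first.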
Have you tried the following?
newd = dict.fromkeys(origdict)
If that doesn’t work for you, please add more details about the error you are getting.
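As a quick illustration of what that call produces: with no second argument, dict.fromkeys gives every key the value None. The contents of origdict here are made up.

```python
origdict = {'Name': 'a', 'Start Time': '2012-01-05 10:00:00'}  # hypothetical row

newd = dict.fromkeys(origdict)   # same keys, every value is None
print(newd)                      # {'Name': None, 'Start Time': None}
```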
You are close. Try:
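dict.fromkeys(my_csv_dict.keys(), [])
This will initialize a dictionary with the same keys that you parsed from your CSV file, and each one will map to an empty list (to which, I assume, you will append your suspect row values). Be aware that every key maps to the same list object, so a value appended under one key shows up under all of them.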
Try this. (There are several subtler changes here that are all necessary, like how you can’t initialize unsure_rows before you start reading the CSV.)
unsure_rows = None
for row in csv_reader:
    # if row['Start Time'] != 'None':
    try:
        if before_date > strptime(row['Start Time'], '%Y-%m-%d %H:%M:%S') > after_date:
            continue
    except ValueError:
        if unsure_rows is None:
            # Initialize the unsure rows dictionary with one separate list per column
            # (dict.fromkeys(..., []) would share a single list across all keys)
            unsure_rows = {key: [] for key in csv_reader.fieldnames}
        for key in unsure_rows:
            unsure_rows[key].append(row[key])
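To try the column-wise layout end to end, a self-contained sketch along these lines may help. The sample data is invented, the date-range check is left out, and datetime.strptime is assumed as the parser since the question does not show its imports.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample data with two rows that fail to parse.
sample = io.StringIO(
    "Name,Start Time\n"
    "a,2012-01-05 10:00:00\n"
    "b,None\n"
    "c,not a date\n"
)
csv_reader = csv.DictReader(sample)

# Accessing fieldnames reads the header line; build one separate list per column.
unsure_rows = {key: [] for key in csv_reader.fieldnames}

for row in csv_reader:
    try:
        datetime.strptime(row['Start Time'], '%Y-%m-%d %H:%M:%S')
    except ValueError:
        for key in unsure_rows:
            unsure_rows[key].append(row[key])

print(unsure_rows)
# {'Name': ['b', 'c'], 'Start Time': ['None', 'not a date']}
```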
The solution from cheeken does not work as intended, because dict.fromkeys creates the dictionary with all values pointing to the same list. Use {k: [] for k in my_csv_dict.keys()} instead.
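The difference is easy to see in a short sketch; the key names are hypothetical stand-ins for CSV header fields.

```python
keys = ['Name', 'Start Time']   # hypothetical header names

shared = dict.fromkeys(keys, [])   # one list object, referenced by every key
shared['Name'].append('b')
print(shared)     # {'Name': ['b'], 'Start Time': ['b']}

separate = {k: [] for k in keys}   # a fresh list for each key
separate['Name'].append('b')
print(separate)   # {'Name': ['b'], 'Start Time': []}
```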