Extract all keys from a list of dictionaries

Question:

I’m trying to get a list of all keys in a list of dictionaries in order to fill out the fieldnames argument for csv.DictWriter.

Previously, I had something like this:

[
{"name": "Tom", "age": 10},
{"name": "Mark", "age": 5},
{"name": "Pam", "age": 7}
]

and I was using fieldnames = list[0].keys() to take the first dictionary in the list and extract its keys.

Now I have something like this where one of the dictionaries has more key:value pairs than the others (could be any of the results). The new keys are added dynamically based on information coming from an API so they may or may not occur in each dictionary and I don’t know in advance how many new keys there will be.

[
{"name": "Tom", "age": 10},
{"name": "Mark", "age": 5, "height":4},
{"name": "Pam", "age": 7}
]

I can’t just use fieldnames = list[1].keys() since it isn’t necessarily the second element that will have extra keys.

A simple solution would be to find the dictionary with the greatest number of keys and use it for the fieldnames, but that won’t work if you have an example like this:

[
{"name": "Tom", "age": 10},
{"name": "Mark", "age": 5, "height":4},
{"name": "Pam", "age": 7, "weight":90}
]

where both the second and third dictionaries have 3 keys, but the end result should really be the list ["name", "age", "height", "weight"].

Asked By: orh


Answers:

all_keys = set().union(*(d.keys() for d in mylist))

Edit: have to unpack the list. Now fixed.
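To tie this back to the original goal, here is a minimal sketch that feeds the union of keys into csv.DictWriter. The sample data in mylist, the sorted() column order, and the in-memory io.StringIO buffer (standing in for a real file) are assumptions for illustration. Rows that lack a key are filled with DictWriter's restval, which defaults to an empty string.

```python
import csv
import io

mylist = [
    {"name": "Tom", "age": 10},
    {"name": "Mark", "age": 5, "height": 4},
    {"name": "Pam", "age": 7, "weight": 90},
]

# Union of every dict's keys; sorted() gives a deterministic column order,
# since a set has no guaranteed order.
fieldnames = sorted(set().union(*(d.keys() for d in mylist)))

# Assumption: an in-memory buffer stands in for a real output file.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
writer.writeheader()
writer.writerows(mylist)  # keys a row lacks are written as restval

csv_text = buffer.getvalue()
```

If the column order should follow the order in which keys first appear rather than alphabetical order, the set has to be replaced with an order-preserving approach.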

Answered By: Hugh Bothwell

The following example will extract the keys:

set_ = set()
for dict_ in dictionaries:
    set_.update(dict_.keys())
print(set_)
Answered By: user1277476

>>> lis=[
{"name": "Tom", "age": 10},
{"name": "Mark", "age": 5, "height":4},
{"name": "Pam", "age": 7, "weight":90}
]
>>> {z for y in (x.keys() for x in lis) for z in y}
set(['age', 'name', 'weight', 'height'])
Answered By: Ashwini Chaudhary

Your data:

>>> LoD
[{'age': 10, 'name': 'Tom'}, 
 {'age': 5, 'name': 'Mark', 'height': 4}, 
 {'age': 7, 'name': 'Pam', 'weight': 90}]

This set comprehension will do it:

>>> {k for d in LoD for k in d.keys()}
{'age', 'name', 'weight', 'height'}

It works this way. First, create a list of lists of the dict keys:

>>> [list(d.keys()) for d in LoD]
[['age', 'name'], ['age', 'name', 'height'], ['age', 'name', 'weight']]

Then create a flattened version of this list of lists:

>>> [i for s in [d.keys() for d in LoD] for i in s]
['age', 'name', 'age', 'name', 'height', 'age', 'name', 'weight']

And create a set to eliminate duplicates:

>>> set([i for s in [d.keys() for d in LoD] for i in s])
{'age', 'name', 'weight', 'height'}

Which can be simplified to:

{k for d in LoD for k in d.keys()}
Answered By: dawg

Borrowing lis from @AshwiniChaudhary’s answer, here is an explanation of how you could solve your problem.

>>> lis=[
{"name": "Tom", "age": 10},
{"name": "Mark", "age": 5, "height":4},
{"name": "Pam", "age": 7, "weight":90}
]

Iterating directly over a dict yields its keys, so you don’t have to call keys() to get them. That saves a function call per dictionary and, in Python 2 (where keys() builds a list), a list construction as well.

>>> {k for d in lis for k in d}
set(['age', 'name', 'weight', 'height'])

or use itertools.chain:

>>> from itertools import chain
>>> {k for k in chain(*lis)}
set(['age', 'name', 'weight', 'height'])
Answered By: PaulMcG

from itertools import chain

lis = [
    {"name": "Tom", "age": 10},
    {"name": "Mark", "age": 5, "height":4},
    {"name": "Pam", "age": 7, "weight":90}
]

# without qualification a dict iterates over its keys
# and set takes any iterable in its constructor
headers_as_set = set(chain.from_iterable(lis))

# you asked for a list
headers = list(
    set(chain.from_iterable(lis))
)
Answered By: bwv549

If order matters to you, read on…

Input your data:

>>> list_of_dicts = [{'age': 10, 'name': 'Tom'},{'age': 5, 'name': 'Mark', 'height': 4}, {'age': 7, 'name': 'Pam', 'weight': 90}]

Define your function:

>>> def get_all_keys_in_order(list_of_dicts):
        ordered_keys = []
        for dict_ in list_of_dicts:
            for key in dict_:
                if key not in ordered_keys:
                    ordered_keys.append(key)
        return ordered_keys

Run your function to get output:

>>> get_all_keys_in_order(list_of_dicts)
['age', 'name', 'height', 'weight']
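
On Python 3.7 and later, where dicts preserve insertion order, the same first-seen ordering can be sketched with dict.fromkeys, which avoids the repeated list membership test. This is an alternative to the function above, not a replacement for it; get_all_keys_ordered is a hypothetical name.

```python
def get_all_keys_ordered(list_of_dicts):
    """Return every key in first-seen order (assumes Python 3.7+ dict ordering)."""
    # dict.fromkeys keeps the first occurrence of each key and ignores repeats.
    return list(dict.fromkeys(key for d in list_of_dicts for key in d))


list_of_dicts = [
    {"age": 10, "name": "Tom"},
    {"age": 5, "name": "Mark", "height": 4},
    {"age": 7, "name": "Pam", "weight": 90},
]
ordered_keys = get_all_keys_ordered(list_of_dicts)
```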
Answered By: mareoraft