Get keys of dictionary based on rules
Question:
Given a dictionary:
dictionary = {'Animal 1': {'Dog': 'Yes', 'Cat': 'No', 'Color': 'Black'},
'Animal 2': {'Dog': 'Yes', 'Cat': 'No', 'Color': 'Brown'},
'Animal 3': {'Dog': 'No', 'Cat': 'Yes', 'Color': 'Grey'}}
How do I select the Animals that are dogs?
Expected output: ['Animal 1', 'Animal 2']
I could use:
pd.DataFrame.from_dict(dictionary).T.loc[pd.DataFrame.from_dict(dictionary).T["Dog"]=='Yes',:].index.to_list()
but it looks very ugly.
Answers:
You can use a list comprehension:
dictionary = {
"Animal 1": {"Dog": "Yes", "Cat": "No", "Color": "Black"},
"Animal 2": {"Dog": "Yes", "Cat": "No", "Color": "Brown"},
"Animal 3": {"Dog": "No", "Cat": "Yes", "Color": "Grey"},
}
out = [k for k, d in dictionary.items() if d.get("Dog") == "Yes"]
print(out)
Prints:
['Animal 1', 'Animal 2']
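If you later need more than one rule, the same comprehension generalizes. Here is a minimal sketch, assuming a hypothetical helper named select_keys (not part of the original answer) that keeps a key only when its inner dict matches every field/value pair passed in:

```python
def select_keys(data, **rules):
    """Hypothetical helper: return the keys of `data` whose inner dict
    matches every field=value pair given in `rules`."""
    return [
        key
        for key, attrs in data.items()
        # .get() returns None for a missing field, so such entries are skipped
        if all(attrs.get(field) == value for field, value in rules.items())
    ]


dictionary = {
    "Animal 1": {"Dog": "Yes", "Cat": "No", "Color": "Black"},
    "Animal 2": {"Dog": "Yes", "Cat": "No", "Color": "Brown"},
    "Animal 3": {"Dog": "No", "Cat": "Yes", "Color": "Grey"},
}

dogs = select_keys(dictionary, Dog="Yes")
black_dogs = select_keys(dictionary, Dog="Yes", Color="Black")
print(dogs)        # ['Animal 1', 'Animal 2']
print(black_dogs)  # ['Animal 1']
```

Keyword arguments only work here because the field names happen to be valid Python identifiers; for other field names you could pass a plain dict of rules instead.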
The pandas version can be trimmed by storing the DataFrame in an intermediate variable so that you don't build it twice. You also don't need .loc for this filter.
import pandas as pd
df = pd.DataFrame.from_dict(dictionary).T
dogs = df[df["Dog"] == "Yes"].index.to_list()
This is still more complex than iterating over the dict items, as in the other answer to this question. It is only worthwhile if you need the DataFrame for something else later.
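If you do keep the DataFrame around, a slightly shorter variant is possible. This is a sketch rather than the original answer's code: it assumes you are happy to pass orient="index" to from_dict so the outer keys become the row labels directly (no transpose), and it applies the boolean mask to the index instead of to the whole frame.

```python
import pandas as pd

dictionary = {
    "Animal 1": {"Dog": "Yes", "Cat": "No", "Color": "Black"},
    "Animal 2": {"Dog": "Yes", "Cat": "No", "Color": "Brown"},
    "Animal 3": {"Dog": "No", "Cat": "Yes", "Color": "Grey"},
}

# orient="index" makes each outer key a row label, so .T is not needed
df = pd.DataFrame.from_dict(dictionary, orient="index")

# Filter the index itself with the boolean mask, then convert to a list
dogs = df.index[df["Dog"] == "Yes"].tolist()
print(dogs)  # ['Animal 1', 'Animal 2']
```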