Converting a dataframe to dictionary with multiple values
Question:
I have a dataframe like
Sr.No  ID       A        B          C         D
1      Tom      Earth    English    BMW       NaN
2      Tom      Mars     Spanish    BMW       Green
3      Michael  Mercury  Hindi      Audi      Yellow
4      John     Venus    Portugese  Mercedes  Blue
5      John     NaN      German     Audi      Red
I am trying to convert this to a dictionary by ID, like:
{'ID': 'Tom', 'A': ['Earth', 'Mars'], 'B': ['English', 'Spanish'],
 'C': ['BMW', 'BMW'], 'D': ['Green']},
{'ID': 'Michael', 'A': ['Mercury'], 'B': ['Hindi'],
 'C': ['Audi'], 'D': ['Yellow']},
{'ID': 'John', 'A': ['Venus'], 'B': ['Portugese', 'German'],
 'C': ['Mercedes', 'Audi'], 'D': ['Blue', 'Red']}
This is somewhat similar to what I want.
I also tried
df.set_index('ID').to_dict()
but this gives me a dictionary of length 5 instead of 3. Any help would be appreciated.
Answers:
You can use groupby, apply to_dict with orient='list' to each group, and convert the resulting Series to a dictionary.
df.set_index('Sr.No', inplace=True)
df.groupby('ID').apply(lambda x: x.to_dict('list')).reset_index(drop=True).to_dict()
{0: {'C': ['Mercedes', 'Audi'], 'ID': ['John', 'John'], 'A': ['Venus', nan],
'B': ['Portugese', 'German'], 'D': ['Blue', 'Red']},
1: {'C': ['Audi'], 'ID': ['Michael'], 'A': ['Mercury'], 'B': ['Hindi'], 'D': ['Yellow']},
2: {'C': ['BMW', 'BMW'], 'ID': ['Tom', 'Tom'], 'A': ['Earth', 'Mars'],
'B': ['English', 'Spanish'], 'D': [nan, 'Green']}}
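In order to remove ID, you can also select the other columns first (note the double brackets, which newer pandas versions require for selecting several columns):
df.groupby('ID')[['A', 'B', 'C', 'D']].apply(lambda x: x.to_dict('list')).reset_index(drop=True).to_dict()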
Grouping by 'ID' and applying to_dict to each group with orient='list' comes pretty close:
df.groupby('ID').apply(lambda dfg: dfg.to_dict(orient='list')).to_dict()
Out[25]:
{'John': {'A': ['Venus', nan],
          'B': ['Portugese', 'German'],
          'C': ['Mercedes', 'Audi'],
          'D': ['Blue', 'Red'],
          'ID': ['John', 'John'],
          'Sr.No': [4, 5]},
 'Michael': {'A': ['Mercury'],
             'B': ['Hindi'],
             'C': ['Audi'],
             'D': ['Yellow'],
             'ID': ['Michael'],
             'Sr.No': [3]},
 'Tom': {'A': ['Earth', 'Mars'],
         'B': ['English', 'Spanish'],
         'C': ['BMW', 'BMW'],
         'D': [nan, 'Green'],
         'ID': ['Tom', 'Tom'],
         'Sr.No': [1, 2]}}
It should just be a matter of formatting the result slightly.
Edit: to remove 'ID' from the dictionaries:
df.groupby('ID').apply(lambda dfg: dfg.drop('ID', axis=1).to_dict(orient='list')).to_dict()
Out[5]:
{'John': {'A': ['Venus', nan],
          'B': ['Portugese', 'German'],
          'C': ['Mercedes', 'Audi'],
          'D': ['Blue', 'Red'],
          'Sr.No': [4, 5]},
 'Michael': {'A': ['Mercury'],
             'B': ['Hindi'],
             'C': ['Audi'],
             'D': ['Yellow'],
             'Sr.No': [3]},
 'Tom': {'A': ['Earth', 'Mars'],
         'B': ['English', 'Spanish'],
         'C': ['BMW', 'BMW'],
         'D': [nan, 'Green'],
         'Sr.No': [1, 2]}}
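To get from there to the exact shape asked for in the question (a scalar 'ID' and no NaN entries in the lists), something along these lines might work. This is only a sketch: the DataFrame is rebuilt from the question on the assumption that the blank cells are NaN, and value_cols and records are names picked for illustration.

```python
import numpy as np
import pandas as pd

# Data rebuilt from the question; blank cells are assumed to be NaN.
df = pd.DataFrame({
    'Sr.No': [1, 2, 3, 4, 5],
    'ID': ['Tom', 'Tom', 'Michael', 'John', 'John'],
    'A': ['Earth', 'Mars', 'Mercury', 'Venus', np.nan],
    'B': ['English', 'Spanish', 'Hindi', 'Portugese', 'German'],
    'C': ['BMW', 'BMW', 'Audi', 'Mercedes', 'Audi'],
    'D': [np.nan, 'Green', 'Yellow', 'Blue', 'Red'],
})

value_cols = ['A', 'B', 'C', 'D']  # columns to collect into lists

records = []
# sort=False keeps the groups in order of first appearance (Tom, Michael, John).
for name, group in df.groupby('ID', sort=False):
    record = {'ID': name}
    for col in value_cols:
        # Drop missing values so that e.g. Tom's 'D' becomes ['Green'].
        record[col] = group[col].dropna().tolist()
    records.append(record)

print(records[0])
# {'ID': 'Tom', 'A': ['Earth', 'Mars'], 'B': ['English', 'Spanish'], 'C': ['BMW', 'BMW'], 'D': ['Green']}
```

The result is a list of three dictionaries rather than one dictionary, which matches the layout written out in the question.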
Hope this can help.
# sample data
import pandas as pd
df = pd.DataFrame([[1, 'a'], [1, 'b'], [2, 'c']], columns=['key', 'value'])
df
   key value
0    1     a
1    1     b
2    2     c
df.groupby('key')['value'].agg(list).to_dict()
{1: ['a', 'b'], 2: ['c']}
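The same agg(list) idea should carry over to the frame in the question by selecting several columns at once. A minimal sketch, assuming the blank cells are NaN and a reasonably recent pandas that accepts list as an aggregation function across multiple columns; result is just an illustrative name.

```python
import numpy as np
import pandas as pd

# Data rebuilt from the question; blank cells are assumed to be NaN.
df = pd.DataFrame({
    'Sr.No': [1, 2, 3, 4, 5],
    'ID': ['Tom', 'Tom', 'Michael', 'John', 'John'],
    'A': ['Earth', 'Mars', 'Mercury', 'Venus', np.nan],
    'B': ['English', 'Spanish', 'Hindi', 'Portugese', 'German'],
    'C': ['BMW', 'BMW', 'Audi', 'Mercedes', 'Audi'],
    'D': [np.nan, 'Green', 'Yellow', 'Blue', 'Red'],
})

# Collect each column into one list per ID, then key the outer dict by ID.
result = (
    df.groupby('ID')[['A', 'B', 'C', 'D']]
      .agg(list)
      .to_dict(orient='index')
)

print(result['Tom']['A'])
```

Note that the lists still contain NaN where a cell was empty (for example John's 'A'), so they would need filtering if those entries are unwanted.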