Iterating through dicts, grabbing their values and converting them to a pandas DataFrame
Question:
I have a list of dicts:
l = [{'a': 'jeffrey', 'b': 'pineapple', 'c': 'apple'}, {'a': 'epstein', 'c': 'banana'}, {'a': 'didnt kill'}, {'a': 'himself', 'b': 'jebus'}]
What I want to do is move those values into a pandas DataFrame. But as you can see, some dicts are missing a few keys and therefore the corresponding values. So I looked at defaultdict to turn the list into something that pandas can interpret and then transform into a DataFrame.
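from collections import defaultdict

dd = defaultdict(list)
# First pass: register every key that appears in any dict
for d in l:
    for k in d.keys():
        dd[k]
# Second pass: append each value, or 0 where the key is missing
for d in l:
    for k in dd.keys():
        try:
            dd[k].append(d[k])
        except KeyError:
            dd[k].append(0)
# Self-adapting dict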
The code works and preserves the order of the records: where a key is missing, a 0 is inserted. But I was wondering whether there is a better alternative, or code with better time complexity.
Wanted result:
defaultdict(<class 'list'>, {'a': ['jeffrey', 'epstein', 'didnt kill', 'himself'], 'b': ['pineapple', 0, 0, 'jebus'], 'c': ['apple', 'banana', 0, 0]})
Answers:
Why use a defaultdict? Just use the get method from dict with a default value:
d_list = [{'a': 'jeffrey', 'b': 'pineapple', 'c': 'apple'}, {'a': 'epstein', 'c': 'banana'}, {'a': 'didnt kill'}, {'a': 'himself', 'b': 'jebus'}]
dd = dict()
for key in ['a', 'b', 'c']:
    dd[key] = [d.get(key, 0) for d in d_list]
print(dd)
Output:
{'a': ['jeffrey', 'epstein', 'didnt kill', 'himself'], 'b': ['pineapple', 0, 0, 'jebus'], 'c': ['apple', 'banana', 0, 0]}
You may also use something other than the hard-coded ['a', 'b', 'c'], but I can't guarantee that I know all the keys in the list you've presented (not in a nice short way).
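If the keys are not known in advance, one possible way to collect them is sketched below. This is only an illustration of the get-based approach above; the names records and columns_from_records are hypothetical and not part of the original answer.

```python
# Sample data from the question, written with string keys.
records = [
    {'a': 'jeffrey', 'b': 'pineapple', 'c': 'apple'},
    {'a': 'epstein', 'c': 'banana'},
    {'a': 'didnt kill'},
    {'a': 'himself', 'b': 'jebus'},
]


def columns_from_records(records, default=0):
    """Return a dict of lists, one list per key seen in any record."""
    # dict.fromkeys keeps first-seen order and drops duplicate keys.
    keys = dict.fromkeys(k for record in records for k in record)
    # Missing entries are replaced by `default` via dict.get.
    return {k: [record.get(k, default) for record in records] for k in keys}


columns = columns_from_records(records)
print(columns)
```

This visits each (record, key) pair once, so the work grows with the number of records times the number of distinct keys, which is the same order as the two-loop version in the question.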
You can use the DataFrame constructor, fill missing values with 0, then use the to_dict method to export the dataframe as a dict of lists:
>>> pd.DataFrame(l).fillna(0).to_dict('list')
{'a': ['jeffrey', 'epstein', 'didnt kill', 'himself'],
'b': ['pineapple', 0, 0, 'jebus'],
'c': ['apple', 'banana', 0, 0]}
Intermediate result:
>>> pd.DataFrame(l)
            a          b       c
0     jeffrey  pineapple   apple
1     epstein        NaN  banana
2  didnt kill        NaN     NaN
3     himself      jebus     NaN
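For reference, here is a self-contained sketch of the same chain. The variable names are hypothetical, and passing dtype=object is my own assumption rather than part of the answer: it keeps the columns as plain Python objects so that the integer 0 can sit next to strings, which may matter on pandas versions that infer a dedicated string dtype.

```python
import pandas as pd

# Sample data from the question, written with string keys.
records = [
    {'a': 'jeffrey', 'b': 'pineapple', 'c': 'apple'},
    {'a': 'epstein', 'c': 'banana'},
    {'a': 'didnt kill'},
    {'a': 'himself', 'b': 'jebus'},
]

# Assumption: dtype=object keeps every column as generic Python objects.
df = pd.DataFrame(records, dtype=object)  # keys missing from a dict become NaN
filled = df.fillna(0)                     # returns a new DataFrame; df is unchanged
as_lists = filled.to_dict('list')         # one list per column
print(as_lists)
```

Note that filling string columns with 0 leaves them with mixed types; if that is a problem downstream, a different fill value such as an empty string may be a better fit.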
I am not sure I fully understand your question, but given a list of dictionaries (list_of_dicts), you can convert it to a pandas DataFrame using the pd.DataFrame() constructor.
df = pd.DataFrame(list_of_dicts)
You can then fill the dataframe with 0 wherever a key is missing, using the .fillna() method:
df.fillna(0, inplace=True)
While @B Remmelzwaal's answer is also correct, looping over lists and dictionaries in plain Python is not good practice, especially when dealing with large amounts of data.