Iterating through dicts, grabbing their values and converting them to a pandas df

Question

I have a list of dicts

l = [{'a': 'jeffrey', 'b': 'pineapple', 'c': 'apple'}, {'a': 'epstein', 'c': 'banana'}, {'a': 'didnt kill'}, {'a': 'himself', 'b': 'jebus'}]

What I want to do is convert those values to a pandas df. But as you can see, some dicts lack a few keys and therefore lack the corresponding values. So I looked at the defaultdict object, so that I could transform the list into something pandas is actually able to interpret, and then turn it into a dataframe.

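from collections import defaultdict

dd = defaultdict(list)

# First pass: touch every key so each one gets an empty list
for d in l:
    for k in d.keys():
        dd[k]

# Second pass: append each dict's value, or 0 where the key is missing
for d in l:
    for k in dd.keys():
        try:
            dd[k].append(d[k])
        except KeyError:
            dd[k].append(0)
# The dict adapts automatically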

The code works and follows the order of the entries, meaning that when a key is missing it returns a 0. But I was wondering whether there is a better alternative, or code with better time complexity.

Wanted result:

defaultdict(<class 'list'>, {'a': ['jeffrey', 'epstein', 'didnt kill', 'himself'], 'b': ['pineapple', 0, 0, 'jebus'], 'c': ['apple', 'banana', 0, 0]})

Asked By: INGl0R1AM0R1


Answer 1

Why use a defaultdict? Just use the get method from dict and a default value:

d_list = [{'a': 'jeffrey', 'b': 'pineapple', 'c': 'apple'}, {'a': 'epstein', 'c': 'banana'}, {'a': 'didnt kill'}, {'a': 'himself', 'b': 'jebus'}]

dd = dict()

# For each key, take the value from every dict, or 0 where the key is missing
for key in ['a', 'b', 'c']:
    dd[key] = [d.get(key, 0) for d in d_list]

print(dd)

Output:

{'a': ['jeffrey', 'epstein', 'didnt kill', 'himself'], 'b': ['pineapple', 0, 0, 'jebus'], 'c': ['apple', 'banana', 0, 0]}

You may also use something other than the hard-coded ['a', 'b', 'c'], but given the list you've presented I can't see a nice short way to guarantee that all keys are known.
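One possible way to avoid hard-coding the keys, as a minimal sketch: collect every key that appears in any dict first, then apply the same get-with-default idea. The name all_keys is introduced here for illustration only, and the sketch assumes Python 3.7+, where dicts preserve insertion order.

```python
d_list = [
    {'a': 'jeffrey', 'b': 'pineapple', 'c': 'apple'},
    {'a': 'epstein', 'c': 'banana'},
    {'a': 'didnt kill'},
    {'a': 'himself', 'b': 'jebus'},
]

# dict.fromkeys acts as an ordered set here: duplicates collapse and
# keys stay in the order they are first seen (hypothetical helper name).
all_keys = list(dict.fromkeys(k for d in d_list for k in d))

# Same approach as above: one list per key, 0 where a dict lacks the key.
dd = {key: [d.get(key, 0) for d in d_list] for key in all_keys}

print(dd)
```

This walks the list once to gather the keys and then once per key, so the work is proportional to the number of dicts times the number of distinct keys.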

Answered By: B Remmelzwaal

Answer 2

You can use the DataFrame constructor, fill the missing values with 0, then use the to_dict method to export the dataframe as a dict of lists:

>>> pd.DataFrame(l).fillna(0).to_dict('list')
{'a': ['jeffrey', 'epstein', 'didnt kill', 'himself'],
 'b': ['pineapple', 0, 0, 'jebus'],
 'c': ['apple', 'banana', 0, 0]}

Intermediate result:

>>> pd.DataFrame(l)
            a          b       c
0     jeffrey  pineapple   apple
1     epstein        NaN  banana
2  didnt kill        NaN     NaN
3     himself      jebus     NaN

Answered By: Corralien

Answer 3

I am not sure I fully understand your question, but given a list_of_dicts you can convert it to a pandas dataframe using the pd.DataFrame() constructor.

df = pd.DataFrame(list_of_dicts)

You can then fill the dataframe with 0 value where a key is missing using the .fillna() method:

df.fillna(0, inplace=True)
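Put together, the two steps might look like the sketch below, using the sample data from the question and assuming pandas is installed. The astype(object) call is an addition not in the original answer: it is there on the assumption that some pandas versions infer a dedicated string dtype for these columns, which may not accept the integer 0 as a fill value.

```python
import pandas as pd

list_of_dicts = [
    {'a': 'jeffrey', 'b': 'pineapple', 'c': 'apple'},
    {'a': 'epstein', 'c': 'banana'},
    {'a': 'didnt kill'},
    {'a': 'himself', 'b': 'jebus'},
]

# Missing keys become NaN in the resulting dataframe.
df = pd.DataFrame(list_of_dicts)

# Assumption: cast to object so a column can hold both strings and 0,
# then replace every missing value with 0.
filled = df.astype(object).fillna(0)

print(filled.to_dict('list'))
```

Here the result is assigned to a new name rather than filled in place, which keeps the original df with its NaN values available for inspection.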

While @B Remmelzwaal's answer is also correct, looping over lists and dictionaries in plain Python is not good practice, especially when dealing with large amounts of data.

Answered By: corvusMidnight
