How can I append each key and value of dictionary to columns?

Question:

In Python, I have a dataframe that has json variables.

For example, if keys and values of a column are like:

{'a': 3, 'b': 2, 'c': 1}
{'d': 287, 'e': 56, 'f': 17, 'g': 9}
{'h': 22, 'i': 15, 'a': 7}
{'g': 17, 'd': 15, 'e': 15, 'f': 7, 'h': 15}

I want to append columns to be like

key_1 value_1 key_2 value_2 key_3 value_3 key_4 value_4 key_5 value_5
a      3       b      2       c      1
d      287     e      56      f      17     g      9
h      22      i      15      a       7       
g      17      d      15      e      15     f      7      h      15

i.e.,

  1. One pair generates two columns. (key_, value_)

  2. Each row has different numbers of key-value pairs.

  3. Though keys are the same, the values of the keys can be different.

  4. Though keys are different, the values of the keys can be the same.

  5. I need to make only up to 600 columns (300 pairs * 2).

Asked By: Peter Park

||

Answers:

Try:

# if type of the column is string, convert it to python object first:

#from ast import literal_eval
#df["column"] = df["column"].apply(literal_eval)

x = (
    df["column"]
    .apply(
        lambda x: {
            f"{k}{i}": v
            for i, t in enumerate(x.items(), 1)
            for k, v in zip(["key_", "value_"], t)
        },
    )
    .to_list()
)

print(pd.DataFrame(x).fillna(""))

Prints:

  key_1  value_1 key_2  value_2 key_3  value_3 key_4 value_4 key_5 value_5
0     a        3     b        2     c        1                            
1     d      287     e       56     f       17     g     9.0              
2     h       22     i       15     a        7                            
3     g       17     d       15     e       15     f     7.0     h    15.0

Dataframe used:

                                         column
0                      {'a': 3, 'b': 2, 'c': 1}
1          {'d': 287, 'e': 56, 'f': 17, 'g': 9}
2                    {'h': 22, 'i': 15, 'a': 7}
3  {'g': 17, 'd': 15, 'e': 15, 'f': 7, 'h': 15}
Answered By: Andrej Kesely

For fun (and as a "challenge", I would have used something similar to @Andrej’s approach in real life that is more concise and efficient), here is a pure pandas solution:

s = df['col'].apply(list).explode().rename('key')
idx = pd.MultiIndex.from_arrays([s.index, s])

(pd
 .json_normalize(df['col']).stack()
 .to_frame(name='value')
 .reindex(idx)
 .assign(pos=lambda d: d.groupby(level=0).cumcount().add(1))
 .reset_index()
 .pivot(index='level_0', columns='pos')
 .sort_index(level=1, axis=1, sort_remaining=False)
 .pipe(lambda d: d.set_axis(d.columns.map(lambda x: f'{x[0]}_{x[1]}'), axis=1))
)

output:

        key_1  value_1 key_2  value_2 key_3  value_3 key_4  value_4 key_5  value_5
level_0                                                                           
0           a      3.0     b      2.0     c      1.0   NaN      NaN   NaN      NaN
1           d    287.0     e     56.0     f     17.0     g      9.0   NaN      NaN
2           h     22.0     i     15.0     a      7.0   NaN      NaN   NaN      NaN
3           g     17.0     d     15.0     e     15.0     f      7.0     h     15.0
Answered By: mozway
Categories: questions Tags: , , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.