How to add a list of irregular dictionaries to a DataFrame

Question:

I am trying to add the information in a list of dictionaries to a DataFrame, but I don’t know how.

For example, I have a DataFrame shown below.

    Name    City  Age
0   John      NY   25
1    Ken  London   32
2  Smith  Boston   29
3   Kate    York   21
4    Tom   Paris   42

At the same time, I have a list of dictionaries shown below.

[{'A': 15, 'B': 35, 'D': 10},
 {'C': 124, 'E': 36},
 {'A': 3, 'F': 10},
 {},
 {'B': 4, 'A': 8, 'C': 1}]

Each dictionary corresponds to one row of the DataFrame above, in order.
For example, the first dictionary belongs to the first row of the DataFrame.

Thus, I’d like to add the list information to the DataFrame to produce the modified DataFrame below, with missing keys filled in as 0, but I don’t know how. I would be grateful if anyone could tell me how to write code that combines the information. (I made the data below manually.)

    Name    City  Age   A   B    C   D   E   F
0   John      NY   25  15  35    0  10   0   0
1    Ken  London   32   0   0  124   0  36   0
2  Smith  Boston   29   3   0    0   0   0  10
3   Kate    York   21   0   0    0   0   0   0
4    Tom   Paris   42   8   4    1   0   0   0

The points that I think are difficult are:

  • The dictionaries have different lengths. In the real data I’d like to analyse, each dictionary has a variety of keys, and some dictionaries are empty.
  • Some keys appear in several dictionaries, like "A", "B", and "C" above. In these cases, I’d like a single "A", "B", or "C" column that collects the values from all of them.
  • The example DataFrame has only five rows and the list has only five dictionaries, so I was able to combine the information manually. However, the real DataFrame and list have a huge number of rows and dictionaries, so it is impossible to organise the information without writing code.

I searched online for the same question and tried writing code myself, but I could not find a solution. I would like to know what code solves my problem.

Asked By: hiro


Answers:

To convert the irregular list of dictionaries (let’s name it ild) to a dataframe on its own, use

df2 = pd.DataFrame(ild, dtype=object).fillna(0).astype(int)

After that, you only have to append the columns of df2 to the other DataFrame, for example with pd.concat([df, df2], axis=1), where df stands for the original DataFrame.
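As a minimal sketch of the whole procedure, the snippet below rebuilds the question's example data and attaches the new columns. The names df and result are my own placeholders, not from the question. It assumes the list has exactly one dictionary per row, in row order. Sorting the columns is an extra step I added, since pandas keeps the keys in the order it first sees them rather than alphabetically.

```python
import pandas as pd

# Example data from the question; the variable name df is an assumption.
df = pd.DataFrame({
    "Name": ["John", "Ken", "Smith", "Kate", "Tom"],
    "City": ["NY", "London", "Boston", "York", "Paris"],
    "Age": [25, 32, 29, 21, 42],
})
ild = [
    {"A": 15, "B": 35, "D": 10},
    {"C": 124, "E": 36},
    {"A": 3, "F": 10},
    {},
    {"B": 4, "A": 8, "C": 1},
]

# One row per dictionary; keys missing from a dictionary become NaN, then 0.
df2 = pd.DataFrame(ild, dtype=object).fillna(0).astype(int)

# Optional: order the new columns alphabetically (A, B, C, ...).
df2 = df2.sort_index(axis=1)

# concat with axis=1 aligns rows by index label, so reuse the index of df
# to make sure the i-th dictionary lands on the i-th row.
df2.index = df.index

result = pd.concat([df, df2], axis=1)
print(result)
```

If df has a non-default index (for example after filtering), the df2.index = df.index line is what keeps the rows matched by position instead of by label.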

The code first creates a DataFrame from ild. pandas does most of the work on its own and fills missing entries with NaN. Without dtype=object, any column containing NaN would automatically be stored as floats (a NumPy integer column cannot hold NaN), which could introduce rounding errors for integers too large to be represented exactly as a 64-bit float (magnitudes above 2**53).

The NaN values are then replaced by zeros with fillna, and the columns of Python int objects are finally converted to an integer dtype with astype.
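To illustrate the rounding concern, here is a small sketch with a made-up value, 2**53 + 1, which a 64-bit float cannot represent exactly. The data and the variable names (big, as_float, exact) are my own and not from the question. For the small numbers in the question's example, both routes would give the same result.

```python
import pandas as pd

# Hypothetical data: one large integer and one empty dictionary.
big = 2**53 + 1
ild = [{"A": big}, {}]

# Default route: the NaN from the empty dictionary forces a float column.
as_float = pd.DataFrame(ild)
print(as_float["A"].dtype)

# The float nearest to big is 2**53, so the last digit is lost.
print(float(big) == 2**53)

# Object route: the Python int is kept as-is until NaN has been replaced.
# "int64" is spelled out so the result does not depend on the platform.
exact = pd.DataFrame(ild, dtype=object).fillna(0).astype("int64")
print(exact.loc[0, "A"] == big)
```

Recent pandas versions may emit a FutureWarning about downcasting in fillna on object columns; the conversion still works as far as I can tell.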

Answered By: Michael Butscher