Convert Dataframe to Dictionary where Some Cells Contain Lists

Question:

I have a dataframe that is structured like this:

ID CustRef
0 111
1 222
2 333, 444, 555, 666

Converting it to a dictionary with to_dict is simple enough, but where a row has multiple CustRefs I would like those values to be converted to a list.

So, in this example the dict would be:

result_dict = {'0': 111, '1': 222, '2': [333, 444, 555, 666]}

Is that possible?

Asked By: Robsmith


Answers:

You can split each value on ', ' and then unwrap the rows that contain only a single item:

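# split every cell into a list of strings
s = df['CustRef'].str.split(', ')
# where the list has a single item, replace the list with that item
s.loc[s.str.len().le(1)] = s.str[0]

# pair each ID with its (possibly unwrapped) value
out = dict(zip(df['ID'], s))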

Or, using pure Python (the := operator requires Python 3.8+):

out = {k: next(iter(l), None) if len(l:=v.split(', '))<2 else l
       for k,v in zip(df['ID'], df['CustRef'])}

Output (note that the values remain strings):

{0: '111', 1: '222', 2: ['333', '444', '555', '666']}
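
The question shows integer values, while the output above keeps strings. As a minimal sketch of one way to get integers, assuming CustRef holds comma-separated strings as in the example (the helper name parse_refs is hypothetical, not part of the answer above):

```python
import pandas as pd

# Assumed reconstruction of the question's data:
# CustRef is stored as comma-separated strings.
df = pd.DataFrame({
    "ID": [0, 1, 2],
    "CustRef": ["111", "222", "333, 444, 555, 666"],
})

def parse_refs(cell):
    """Return an int for a single reference, or a list of ints for several."""
    # int() tolerates the leading space left after splitting on ","
    refs = [int(part) for part in cell.split(",")]
    return refs[0] if len(refs) == 1 else refs

# int(key) keeps the dictionary keys as plain Python ints
result_dict = {int(key): parse_refs(value)
               for key, value in zip(df["ID"], df["CustRef"])}
print(result_dict)
# {0: 111, 1: 222, 2: [333, 444, 555, 666]}
```

If string keys are wanted, as written in the question, str(key) can be used in place of int(key).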

If you really need to use to_dict (e.g., so the code can be adapted to handle more columns):

s = df['CustRef'].str.split(', ')
out = (df
   .assign(CustRef=df['CustRef'].where(s.str.len().le(1), s))
   .set_index('ID')['CustRef']
   .to_dict()
)
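
As a rough sketch of what handling more columns might look like, the same split-and-unwrap step can be applied to each column before calling to_dict. The OrderRef column and the split_cell helper are hypothetical and only serve the illustration:

```python
import pandas as pd

df = pd.DataFrame({
    "ID": [0, 1, 2],
    "CustRef": ["111", "222", "333, 444, 555, 666"],
    "OrderRef": ["A1, A2", "B1", "C1"],  # hypothetical second column
})

def split_cell(cell):
    """Keep a single item as a string, turn several items into a list."""
    parts = cell.split(", ")
    return parts[0] if len(parts) == 1 else parts

cols = ["CustRef", "OrderRef"]

# Apply the helper to every cell of the selected columns, then convert.
# With the default orient, to_dict returns {column: {ID: value}}.
out = (df
   .set_index("ID")[cols]
   .apply(lambda col: col.map(split_cell))
   .to_dict()
)
print(out)
```

Here out['CustRef'] matches the single-column result shown earlier, and each additional column gets its own inner dictionary keyed by ID.
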
Answered By: mozway