Making long-to-wide transformation by grouping/separating rows by a delimiter

Question:

I need to make a long-to-wide transformation (see image below) using Pandas.


I wrote this code, but unfortunately it does not work!

Code:

import numpy as np
import pandas as pd

df = pd.DataFrame({'Id': ['Id001', 'Id001', 'Id002', 'Id003', 'Id003', 'Id003'],
                   'val1': [np.nan, 'B', 3, 'H', np.nan, 'J'],
                   'val2': ['N', np.nan, 'M', 2, 'K', 'I'],
                   'val3': [5, 'E', 'P', 'L', np.nan, 'R']})

df = (df.groupby('Id')
        .agg(
            val1=('val1', ' | '.join),
            val2=('val2', ' | '.join),
            val3=('val3', ' | '.join))
        .rename_axis(None))

df

Here is the error I'm getting:

Error:

TypeError: sequence item 0: expected str instance, float found

Do you have any suggestions/solutions?

Asked By: L'Artiste

||

Answers:

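The error is due to the presence of NaN values. NaN is a floating-point value, and str.join only accepts strings, so joining a group that contains NaN raises the TypeError (the integers in your columns would trigger the same error). The solution is to explicitly cast all values, including the NaNs, to strings before joining: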

df.filter(like='val').astype(str).groupby(df['Id']).agg('|'.join)

          val1   val2     val3
Id                            
Id001    nan|B  N|nan      5|E
Id002        3      M        P
Id003  H|nan|J  2|K|I  L|nan|R
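
If the literal "nan" entries in the output above are not wanted, one possible variant (a sketch, not part of the original answer) is to drop the missing values in each group before casting and joining. The helper name join_non_null and the result name wide are made up for illustration; the ' | ' separator is taken from the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Id': ['Id001', 'Id001', 'Id002', 'Id003', 'Id003', 'Id003'],
    'val1': [np.nan, 'B', 3, 'H', np.nan, 'J'],
    'val2': ['N', np.nan, 'M', 2, 'K', 'I'],
    'val3': [5, 'E', 'P', 'L', np.nan, 'R'],
})


def join_non_null(values, sep=' | '):
    # Hypothetical helper: drop NaN first, then cast the remaining
    # values (strings and integers) to str so that str.join accepts them.
    return sep.join(values.dropna().astype(str))


# Apply the helper to every non-key column of each 'Id' group.
wide = df.groupby('Id').agg(join_non_null).rename_axis(None)
print(wide)
```

With the sample data, val1 for Id001 should come out as "B" instead of "nan|B". A group whose values are all NaN would produce an empty string.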
Answered By: Shubham Sharma