Pandas : Concat rows of a dataframe with same index to form custom string in pairs

Question:

Say I have a dataframe

df = pd.DataFrame({'colA' : ['ABC', 'JKL', 'STU', '123'],
                   'colB' : ['DEF', 'MNO', 'VWX', '456'],
                   'colC' : ['GHI', 'PQR', 'YZ', '789'],}, index = [0,0,1,1])
   colA colB colC
0  ABC   DEF  GHI 
0  JKL   MNO  PQR
1  STU   VWX   YZ
1  123   456  789

Its guranteed that every pair will have the same index, so we would like the end result to be :

     colA        colB       colC
0  ABC_JKL_0   DEF_MNO_0  GHI_PQR_0 
1  STU_123_1   VWX_456_1   YZ_789_1

where the suffix _number is the index of that group.

I tried doing this by iterating through rows but that’s taking a lot of time.
I was thinking of something like .groupby(level=0) but can’t figure out the next aggregation apply part

Asked By: Himanshu Poddar

||

Answers:

df.groupby(level=0).agg(lambda x: f"{'_'.join(x)}_{x.index[0]}")

Output:

        colA       colB       colC
0  ABC_JKL_0  DEF_MNO_0  GHI_PQR_0
1  STU_123_1  VWX_456_1   YZ_789_1
Answered By: BeRT2me

You can do:

df.groupby(level=0).agg('_'.join).transform(lambda x:x+'_'+str(x.index[0]))
Answered By: SomeDude

IIUC, you can try something like this using .agg and a lambda function or you can add it into the dataframe after the groupby:

df_out=df.groupby(level=0).agg(lambda x: '_'.join(x)+'_'+str(x.index[0]))

Output:

        colA       colB       colC
0  ABC_JKL_0  DEF_MNO_0  GHI_pQR_0
1  STU_123_1  VWX_456_1   YZ_789_1

Or

df_out=df.groupby(level=0).agg('_'.join)
df_out = df_out.add('_'+df_out.index.to_series().astype(str), axis=0)
print(df_out)
Answered By: Scott Boston