concatenate features with identical ID in DataFrame

Question

I have tables that have many features, and these features can have the same ID.

How can I check each ID, then concatenate the features that share an ID into one row?

For example, here is a simple table stored in a DataFrame with several features and one ID column. The output should concatenate all features that share an ID and place them in new feature columns; IDs that have no additional features should get zero values, as in this table:

result

Asked By: Hiss


Answer 1

join is what you need. Specify how='outer' if you don't want to lose any rows.

df1.set_index('ID').join(df2.set_index('ID'), how='outer')
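As a minimal sketch of how this join behaves, assuming the features live in two separate frames: the frames df1 and df2 and the column names dat1 and dat2 below are hypothetical, made up for illustration. IDs present in only one frame come back as NaN after the outer join, so the sketch fills them with 0 to match what the question asks for.

```python
import pandas as pd

# Hypothetical sample frames; the column names are assumptions for illustration.
df1 = pd.DataFrame({"ID": [1, 2, 3], "dat1": [9, 5, 3]})
df2 = pd.DataFrame({"ID": [1, 2], "dat2": [5, 6]})

# Outer join on ID keeps every ID that appears in either frame.
joined = df1.set_index("ID").join(df2.set_index("ID"), how="outer")

# IDs missing from one side are NaN after the join; replace them with 0.
joined = joined.fillna(0).astype(int)
print(joined)
```

Note that join raises an error when both frames share a column name other than the index, unless lsuffix or rsuffix is passed, so this only applies directly when the feature columns are named differently.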

Answered By: Bulat Ibragimov

Answer 2

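IIUC, you can use:

import pandas as pd
# Collect each column's values into one list per ID.
dfx = df.groupby('ID').agg(list)
# Size of the largest group (all columns share the same list length per ID).
max_list_length = dfx.iloc[:, 0].map(len).max()
# Flatten each ID's lists row by row, padding missing positions with 0.
final = pd.DataFrame(dfx.apply(lambda x: [x[i][j] if len(x[i]) > j else 0 for j in range(max_list_length) for i in dfx.columns], axis=1).tolist(), index=dfx.index)
final.columns = ['dat' + str(i) for i in range(1, len(final.columns) + 1)]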

Output:

    dat1  dat2  dat3  dat4  dat5  dat6
ID                                    
1      9     3     6     5     7     7
2      5     5     5     6     5     5
3      3     0     5     0     0     0
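For comparison, here is a sketch of a different technique (not the one in this answer) that gives the same layout: number the rows within each ID with cumcount, then unstack that number into the columns. The input frame and its column names a, b, c are hypothetical; the values were chosen only to be consistent with the output shown above, since the original input table is not available.

```python
import pandas as pd

# Hypothetical input, chosen to be consistent with the output shown above.
df = pd.DataFrame({
    "ID": [1, 1, 2, 2, 3],
    "a":  [9, 5, 5, 6, 3],
    "b":  [3, 7, 5, 5, 0],
    "c":  [6, 7, 5, 5, 5],
})

# Number the rows within each ID: 0 for the first occurrence, 1 for the second, ...
occurrence = df.groupby("ID").cumcount()

# Pivot the occurrence number into the columns; missing slots are filled with 0.
wide = df.set_index(["ID", occurrence]).unstack(fill_value=0)

# Reorder so that all features of the first row come first, then the second row, ...
features = df.columns.drop("ID")
order = [(col, k) for k in range(occurrence.max() + 1) for col in features]
wide = wide.reindex(columns=pd.MultiIndex.from_tuples(order))

# Flatten the two-level column index into dat1, dat2, ...
wide.columns = ["dat" + str(i) for i in range(1, wide.shape[1] + 1)]
print(wide)
```

Under the assumed input, this should yield one row per ID with six dat columns, matching the table above. It avoids the Python-level loop over list positions, which may matter on larger frames.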

Answered By: Bushmaster
