Pandas – syntax of merging data from multiple rows into a single row

Question:

I am trying merge data from multiple rows into a single row with multiple columns. I can only manage one column to work when I want multiple.

Given below is how the data looks like :

column1 column2 column3
1       a       Y
1       b       Z
2       d       Z
2       c       Y

Expected output:

column1 column2 column3
1       a, b       Y, Z
2       d, c       Z, Y

Using df.groupby('column1')['column2'].apply(', '.join).reset_index() I’ve got the output:

column1 column2 
1       a, b      
2       d, c    

But unsure the syntax for adding column3.

df.groupby('column1')['column2', 'column3'].apply(', '.join).reset_index() doesn’t work.

Neither does df.groupby('column1')[['column2', 'column3']].apply(', '.join).reset_index().

Asked By: user19830155

||

Answers:

You can try agg for multiple concatenations:

df.groupby('column1').agg(lambda x: ', '.join(x)).reset_index()
Answered By: Nuri Taş
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.