Group by a DataFrame in Python and concatenate strings across multiple columns

Question:

I have a DataFrame like the one below:

A,B,C,D
91102,1,john,
91102,2,john,
91102,3,john,
91102,1,,mary
91102,2,,mary
91102,3,,mary
91103,1,sarah,
91103,2,sarah,
91103,3,sarah,
91103,1,,khan
91103,2,,khan
91103,3,,khan

I want to group by columns A and B and get the desired output below:

A,B,C,D
91102,1,john,mary
91102,2,john,mary
91102,3,john,mary
91103,1,sarah,khan
91103,2,sarah,khan
91103,3,sarah,khan

I tried the following, but it does not give the desired output:

df = df.groupby(['A', 'B'], as_index=False).agg(''.join)
Asked By: AB SEA

Answers:

Within each group, you can back-fill the missing values and then take the first row. This assumes the empty cells are NaN rather than empty strings.

df.groupby(['A','B'], as_index=False).apply(lambda x: x.bfill().iloc[0])

Result

       A  B      C     D
0  91102  1   john  mary
1  91102  2   john  mary
2  91102  3   john  mary
3  91103  1  sarah  khan
4  91103  2  sarah  khan
5  91103  3  sarah  khan
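
For comparison, here is a minimal self-contained sketch of two related approaches, assuming the empty cells are NaN (as they are when the sample is loaded with read_csv). The first uses GroupBy.first(), which skips missing values by default. The second repairs the attempt from the question by dropping NaN before joining, since ''.join only accepts strings. The names csv_text, out and joined are illustrative and not from the original post.

```python
import io

import pandas as pd

# Sample data copied from the question; empty fields are parsed as NaN.
csv_text = """A,B,C,D
91102,1,john,
91102,2,john,
91102,3,john,
91102,1,,mary
91102,2,,mary
91102,3,,mary
91103,1,sarah,
91103,2,sarah,
91103,3,sarah,
91103,1,,khan
91103,2,,khan
91103,3,,khan
"""
df = pd.read_csv(io.StringIO(csv_text))

# GroupBy.first() takes the first non-null value of each column per group.
out = df.groupby(["A", "B"], as_index=False).first()

# Repaired version of the question's attempt: drop NaN before joining,
# because "".join() raises a TypeError on non-string items such as NaN.
joined = df.groupby(["A", "B"], as_index=False).agg(
    lambda s: "".join(s.dropna())
)

print(out)
print(joined)
```

Note that the two differ when a group holds several non-missing values in one column: first() keeps only the first, while the join concatenates all of them.
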
Answered By: jch

Try reshaping with stack() and unstack():

x = df.set_index(["A", "B"]).stack().unstack().reset_index()
print(x)

Prints:

       A  B      C     D
0  91102  1   john  mary
1  91102  2   john  mary
2  91102  3   john  mary
3  91103  1  sarah  khan
4  91103  2  sarah  khan
5  91103  3  sarah  khan
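
To see why the reshape works: stack() moves C and D into one long Series indexed by (A, B, column name). Once the missing values are gone, every (A, B, column name) key is unique, so unstack() can pivot the column names back out into one row per (A, B) pair. The sketch below spells out those steps, assuming the empty cells are NaN. The explicit dropna() is an addition that is not in the one-liner above; it is there so the result does not depend on stack() discarding missing values by default. The names csv_text and stacked are illustrative.

```python
import io

import pandas as pd

# Sample data copied from the question; empty fields are parsed as NaN.
csv_text = """A,B,C,D
91102,1,john,
91102,2,john,
91102,3,john,
91102,1,,mary
91102,2,,mary
91102,3,,mary
91103,1,sarah,
91103,2,sarah,
91103,3,sarah,
91103,1,,khan
91103,2,,khan
91103,3,,khan
"""
df = pd.read_csv(io.StringIO(csv_text))

# Long format: one entry per (A, B, column name); missing values removed.
stacked = df.set_index(["A", "B"]).stack().dropna()

# Pivot the innermost index level (the column names) back into columns.
x = stacked.unstack().reset_index()
print(x)
```

This only works while each (A, B) pair has at most one non-missing value per column; if there are duplicates, unstack() raises an error because the index is no longer unique.
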
Answered By: Andrej Kesely