Unstack dataframe column

Question

I have this dataframe:

df = pd.DataFrame()
df["A"] = [2,2,4,4,4,8,9]
df["B"] = [2,2,4,4,4,7,9]
df["C"] = list("axcdxef")
print(df.to_string(index=False))

 A  B  C
 2  2  a
 2  2  x
 4  4  c
 4  4  d
 4  4  x
 8  7  e
 9  9  f

I want to convert to this, unstacking column C for rows where columns A and B are duplicates:

df2 = pd.DataFrame()
df2["A"] = [2,4,8,9]
df2["B"] = [2,4,7,9]
df2["C"] = ["a,x", "c,d,x", "e", "f"]

print()
print(df2.to_string(index=False))

 A  B      C
 2  2    a,x
 4  4  c,d,x
 8  7      e
 9  9      f

I’ve looked at pivot() and unstack(), but haven’t found the right recipe yet. Any ideas?

Asked By: Steve

||

Source

Answer 1

use .groupby and .agg(','.join)

df1 = df.groupby(['A','B'],as_index=False)['C'].agg(','.join)

print(df1)

   A  B      C
0  2  2    a,x
1  4  4  c,d,x
2  8  7      e
3  9  9      f

Answered By: Umar.H

Unstack dataframe column

Question:

Answers: