pandas dataframe reshape cast

Question:

I have a dataframe like this:

import pandas
df=pandas.DataFrame([['a','b'],['a','c'],['b','c'],['b','d'],['c','f']],columns=['id','key'])
print(df)

  id key
0  a   b
1  a   c
2  b   c
3  b   d
4  c   f

The result I want:

  id  key
0  a  b,c
1  b  c,d
2  c    f

I tried the pivot function, but it did not give me this result. In R, cast seems to handle this kind of problem. Thanks!

Asked By: pang2016


Answers:

You need groupby on id, then apply ','.join to the key column of each group:

df1 = df.groupby('id')['key'].apply(','.join).reset_index()
print(df1)
  id  key
0  a  b,c
1  b  c,d
2  c    f
Answered By: jezrael
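
A variant sketch of the groupby approach, assuming the key column might hold non-string values (in which case ','.join would raise a TypeError). The helper name join_keys and its parameters are hypothetical, and casting with astype(str) first is only one possible way to handle that case:

```python
import pandas as pd

df = pd.DataFrame(
    [['a', 'b'], ['a', 'c'], ['b', 'c'], ['b', 'd'], ['c', 'f']],
    columns=['id', 'key'],
)

def join_keys(frame, by='id', col='key', sep=','):
    """Collapse `col` into one separator-joined string per `by` group.

    Hypothetical helper for illustration; not part of pandas.
    """
    # astype(str) guards against non-string values, which str.join rejects.
    joined = frame[col].astype(str).groupby(frame[by]).agg(sep.join)
    # Turn the group labels back into an ordinary column.
    return joined.reset_index()

result = join_keys(df)
print(result)
```

With string keys this should give the same frame as the one-liner; the cast only matters when the column holds numbers or mixed types.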

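A NumPy approach:

import numpy as np
import pandas as pd

g = df.id.values
k = df.key.values
# stable sort, so keys keep their original order within each id
a = g.argsort(kind='mergesort')
gg = g[a]
kg = k[a]

# positions where the sorted id changes (last row of every group except the final one)
w = np.where(gg[:-1] != gg[1:])[0]

pd.DataFrame(dict(
        id=gg[np.append(w, len(a) - 1)],
        key=[','.join(l.tolist()) for l in np.split(kg, w + 1)]
    ))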

  id  key
0  a  b,c
1  b  c,d
2  c    f

[Figure: speed versus intuition]

Answered By: piRSquared
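
To make the intermediate arrays in the NumPy approach easier to follow, here is a sketch of the same steps on the question's sample data, with each intermediate value noted in a comment. The variable names order, boundaries, chunks and last_rows are chosen here for illustration and do not appear in the answer:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [['a', 'b'], ['a', 'c'], ['b', 'c'], ['b', 'd'], ['c', 'f']],
    columns=['id', 'key'],
)

ids = df['id'].to_numpy()
keys = df['key'].to_numpy()

# Stable sort by id, so keys keep their original order inside each group.
order = ids.argsort(kind='mergesort')
sorted_ids = ids[order]    # ['a' 'a' 'b' 'b' 'c']
sorted_keys = keys[order]  # ['b' 'c' 'c' 'd' 'f']

# Compare each id with its successor; True marks the last row of a group.
# [F, T, F, T] -> boundaries = [1, 3]
boundaries = np.where(sorted_ids[:-1] != sorted_ids[1:])[0]

# Split the keys just after each boundary: indices [2, 4].
chunks = np.split(sorted_keys, boundaries + 1)

# The last row of every group, including the final one: [1, 3, 4].
last_rows = np.append(boundaries, len(order) - 1)

out = pd.DataFrame({
    'id': sorted_ids[last_rows],
    'key': [','.join(chunk.tolist()) for chunk in chunks],
})
print(out)
```

The stable mergesort matters only for the order of keys inside each joined string; the grouping itself would work with any sort.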