pandas dataframe reshape cast
Question:
I have a dataframe like this:
import pandas
df=pandas.DataFrame([['a','b'],['a','c'],['b','c'],['b','d'],['c','f']],columns=['id','key'])
print(df)
id key
0 a b
1 a c
2 b c
3 b d
4 c f
The result I want:
id key
0 a b,c
1 b c,d
2 c f
I tried the pivot function, but it doesn't give me this result. In R, cast seems to handle this kind of problem. Thanks!
Answers:
You need groupby with apply and join:
df1 = df.groupby('id')['key'].apply(','.join).reset_index()
print (df1)
id key
0 a b,c
1 b c,d
2 c f
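A slightly more defensive variant of the same idea, as a minimal sketch: it assumes the key column might hold non-string values (so it casts to str before joining) and that you want groups in order of first appearance (sort=False). The name result is just illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [['a', 'b'], ['a', 'c'], ['b', 'c'], ['b', 'd'], ['c', 'f']],
    columns=['id', 'key'],
)

# Cast to str first so ','.join also works if 'key' holds non-string values.
result = (
    df.assign(key=df['key'].astype(str))
      .groupby('id', sort=False)['key']  # sort=False keeps first-appearance order
      .agg(','.join)                     # one joined string per id
      .reset_index()                     # turn the 'id' index back into a column
)
print(result)
```

Here agg and apply give the same output, because ','.join reduces each group to a single string.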
A numpy approach:
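import numpy as np
import pandas as pd

g = df.id.values
k = df.key.values
# stable sort, so keys keep their original order within each id
a = g.argsort(kind='mergesort')
gg = g[a]
kg = k[a]
# positions where the sorted id changes (last row of every group except the final one)
w = np.where(gg[:-1] != gg[1:])[0]
pd.DataFrame(dict(
    id=gg[np.append(w, len(a) - 1)],
    key=[','.join(l.tolist()) for l in np.split(kg, w + 1)]
))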
id key
0 a b,c
1 b c,d
2 c f
speed versus intuition
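To weigh the two approaches yourself, here is a rough timing sketch. The function names with_groupby and with_numpy, the use of to_numpy(), and the run count of 200 are my own choices, not from the answers above. No timings are shown because they depend on the data size, machine, and pandas version.

```python
import timeit

import numpy as np
import pandas as pd


def with_groupby(df):
    # The groupby/apply/join answer.
    return df.groupby('id')['key'].apply(','.join).reset_index()


def with_numpy(df):
    # The numpy answer; to_numpy() is used here to get plain numpy arrays.
    g = df['id'].to_numpy()
    k = df['key'].to_numpy()
    a = g.argsort(kind='mergesort')     # stable sort by id
    gg, kg = g[a], k[a]
    w = np.where(gg[:-1] != gg[1:])[0]  # boundaries between groups
    return pd.DataFrame(dict(
        id=gg[np.append(w, len(a) - 1)],
        key=[','.join(l.tolist()) for l in np.split(kg, w + 1)],
    ))


df = pd.DataFrame(
    [['a', 'b'], ['a', 'c'], ['b', 'c'], ['b', 'd'], ['c', 'f']],
    columns=['id', 'key'],
)

# Check that both approaches agree before timing them.
assert with_groupby(df).to_dict('list') == with_numpy(df).to_dict('list')

for fn in (with_groupby, with_numpy):
    seconds = timeit.timeit(lambda: fn(df), number=200)
    print(f'{fn.__name__}: {seconds:.4f} s for 200 runs')
```

On a five-row frame the difference is unlikely to matter; rerun it on data closer to your real size before choosing.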