How to get a DataFrame (not a list or string) after applying groupby in Python
Question:
Phone_no Date Problem
1 2020-01-19 G
1 2020-01-19 A
1 2020-01-27 B
1 2020-01-28 C
1 2020-01-28 H
2 2020-01-19 T
3 2020-01-04 U
3 2020-01-22 P
4 2020-01-12 E
5 2020-01-01 G
5 2020-01-11 A
2 2020-01-31 I
2 2020-01-31 E
**I want to group by Phone_no and Date and get the result as a DataFrame, as shown below:**
Phone_no Date Problem
1 2020-01-19 G,A
1 2020-01-27 B
1 2020-01-28 C,H
2 2020-01-19 T
2 2020-01-31 I,E
3 2020-01-04 U
3 2020-01-22 P
4 2020-01-12 E
5 2020-01-01 G
5 2020-01-11 A
Answers:
You can group by "Phone_no" and "Date" and use agg to apply a lambda function that joins the "Problem" values in each group:
out = df.groupby(['Phone_no','Date'])['Problem'].agg(lambda x: ','.join(x)).reset_index()
or in a simpler way (thanks @Corralien):
out = df.groupby(['Phone_no','Date'], as_index=False)['Problem'].apply(','.join)
if you want tuples:
out = df.groupby(['Phone_no','Date'])['Problem'].apply(tuple).reset_index()
Output:
Phone_no Date Problem
0 1 2020-01-19 G,A
1 1 2020-01-27 B
2 1 2020-01-28 C,H
3 2 2020-01-19 T
4 2 2020-01-31 I,E
5 3 2020-01-04 U
6 3 2020-01-22 P
7 4 2020-01-12 E
8 5 2020-01-01 G
9 5 2020-01-11 A
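For reference, here is a minimal self-contained sketch that rebuilds the sample data from the question and runs the first approach end to end. The dates are kept as plain strings here, which is an assumption; the original post does not say whether "Date" is a string or a datetime column.

```python
import pandas as pd

# Sample data from the question; dates are kept as strings (an assumption).
df = pd.DataFrame({
    "Phone_no": [1, 1, 1, 1, 1, 2, 3, 3, 4, 5, 5, 2, 2],
    "Date": [
        "2020-01-19", "2020-01-19", "2020-01-27", "2020-01-28", "2020-01-28",
        "2020-01-19", "2020-01-04", "2020-01-22", "2020-01-12", "2020-01-01",
        "2020-01-11", "2020-01-31", "2020-01-31",
    ],
    "Problem": list("GABCHTUPEGAIE"),
})

# Group on both keys, join each group's "Problem" values with a comma,
# then turn the group keys back into ordinary columns.
out = (
    df.groupby(["Phone_no", "Date"])["Problem"]
      .agg(",".join)
      .reset_index()
)

print(out)
```

The result is a regular DataFrame with one row per ("Phone_no", "Date") pair. If "Problem" could contain non-string values, `','.join` would raise a TypeError, so converting the column first with `df["Problem"].astype(str)` may be needed.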