Rename column names of groupby and count result with Pandas
Question:
Given the following dataframe:
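import numpy as np
import pandas as pd

# np.random.random_integers is deprecated; randint excludes the upper bound, so use 101 to allow 100
df = pd.DataFrame({'price': np.random.randint(0, 101, size=100)})
ranges = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
df.groupby(pd.cut(df.price, ranges)).count()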
Out:
           price
price
(0, 10]        9
(10, 20]      11
(20, 30]      11
(30, 40]       9
(40, 50]      16
(50, 60]       7
(60, 70]      10
(70, 80]       9
(80, 90]      14
(90, 100]      4
How can I call reset_index on the result and rename the columns to bins and counts, as shown below? Thanks.
        bins  counts
0    (0, 10]       9
1   (10, 20]      11
2   (20, 30]      11
3   (30, 40]       9
4   (40, 50]      16
5   (50, 60]       7
6   (60, 70]      10
7   (70, 80]       9
8   (80, 90]      14
9  (90, 100]       4
Answers:
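This code works but is not concise enough. If you have other options, please share them:
# Parentheses are needed so the chained calls can span several lines
(df.groupby(pd.cut(df.price, ranges)).count()
   .rename(columns={'price': 'counts'})  # rename the count column first
   .reset_index()                         # the index, also named 'price', becomes a column
   .rename(columns={'price': 'bins'}))   # rename that former index column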
Out:
        bins  counts
0    (0, 10]       9
1   (10, 20]      11
2   (20, 30]      11
3   (30, 40]       9
4   (40, 50]      16
5   (50, 60]       7
6   (60, 70]      10
7   (70, 80]       9
8   (80, 90]      14
9  (90, 100]       4
One idea is to use rename on the Series returned by pd.cut. If you then select the price column before aggregating, the groupby output is a Series, so you can call Series.reset_index with the name parameter to get a two-column DataFrame:
df1 = (df.groupby(pd.cut(df.price, ranges).rename('bins'))['price'].count()
         .reset_index(name='counts'))
print(df1)
        bins  counts
0    (0, 10]      13
1   (10, 20]      13
2   (20, 30]       9
3   (30, 40]       9
4   (40, 50]       7
5   (50, 60]       9
6   (60, 70]       9
7   (70, 80]      12
8   (80, 90]       9
9  (90, 100]       9
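Since only a per-bin count is needed, the groupby may be avoidable altogether. The following is a minimal sketch, not taken from the answers above, that counts the binned Series directly with value_counts. The seeded generator and its seed value are assumptions added only to make the example reproducible.

```python
import numpy as np
import pandas as pd

# Seeded generator so the example is reproducible (the seed value is arbitrary)
rng = np.random.default_rng(0)
df = pd.DataFrame({'price': rng.integers(0, 101, size=100)})
ranges = list(range(0, 101, 10))

# Count rows per bin directly on the binned Series; sort=False avoids sorting by count
df1 = (pd.cut(df['price'], ranges)
         .value_counts(sort=False)
         .rename_axis('bins')          # name the index so reset_index yields a 'bins' column
         .reset_index(name='counts'))  # name the column holding the counts
print(df1)
```

As with the groupby versions, a price of exactly 0 falls outside the first bin (0, 10] and is not counted.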
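The output column names can also be set directly in agg with named aggregation, shown here on a different DataFrame that has team and points columns:
df.groupby('team', as_index=False).agg(my_sum=('points', 'sum'), my_max=('points', 'max'))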
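The named-aggregation call above uses team and points columns that are not part of the question. A sketch of the same idea applied to the price data might look like this, assuming pandas 0.25 or later (where named aggregation is available). The seeded generator and the explicit observed=False are assumptions added for reproducibility and to keep every bin in the result.

```python
import numpy as np
import pandas as pd

# Seeded generator so the example is reproducible (the seed value is arbitrary)
rng = np.random.default_rng(0)
df = pd.DataFrame({'price': rng.integers(0, 101, size=100)})
ranges = list(range(0, 101, 10))

# Name the grouping Series 'bins' so the group key column gets that name
bins = pd.cut(df['price'], ranges).rename('bins')

# Named aggregation: the keyword 'counts' becomes the output column name
df1 = (df.groupby(bins, observed=False)
         .agg(counts=('price', 'count'))
         .reset_index())
print(df1)
```

This form extends naturally if more statistics per bin are needed, since each extra keyword in agg adds one named column.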