How to count the number of group elements with pandas
Question:
I have a dataframe and just want to count the number of elements in each group. I know I can use groupby().count() to get the counts for every column, but that is more than I need; I just want the number of elements in each group. How can I do this?
Here is the example:
import pandas as pd
mydf = pd.DataFrame({"fruit": ["apple", "banana", "apple"], "weight": [7, 8, 3], "price": [4, 5, 6]})
mydf
>>     fruit  price  weight
>> 0   apple      4       7
>> 1  banana      5       8
>> 2   apple      6       3
If I use groupby("fruit").count(), I get a count for every column:
mydf.groupby("fruit").count()
>>         price  weight
>> fruit
>> apple       2       2
>> banana      1       1
But my expected output is:
>>         number_of_fruit
>> fruit
>> apple                 2
>> banana                1
How can I do this?
Answers:
You want size() to count the number of each fruit:
In [102]:
mydf.groupby('fruit').size()
Out[102]:
fruit
apple 2
banana 1
dtype: int64
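size() returns a Series, so the result has no column name. To get a one-column frame labelled number_of_fruit, as in the expected output, one option is to_frame(). A minimal sketch, assuming the mydf from the question:

```python
import pandas as pd

mydf = pd.DataFrame({"fruit": ["apple", "banana", "apple"],
                     "weight": [7, 8, 3],
                     "price": [4, 5, 6]})

# size() gives one row count per group, indexed by fruit;
# to_frame() turns that Series into a DataFrame with a named column.
counts = mydf.groupby("fruit").size().to_frame("number_of_fruit")
print(counts)
```

If you would rather have fruit as an ordinary column instead of the index, size().reset_index(name="number_of_fruit") should give the same counts in that shape.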
The answer by EdChum is great; I just want to give an alternative solution using the value_counts function:
mydf["fruit"].value_counts()
Output:
apple 2
banana 1
Name: fruit, dtype: int64
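Both approaches give the same numbers on this data, but they differ from count() once values are missing: size() and value_counts() count rows, while count() counts non-null entries per column. A small sketch, using a hypothetical variant of the question's frame with one missing weight (the NaN is an assumption added for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical variant of mydf: the banana row has no weight recorded.
df = pd.DataFrame({"fruit": ["apple", "banana", "apple"],
                   "weight": [7, np.nan, 3]})

sizes = df.groupby("fruit").size()                       # rows per group, NaN rows included
non_null_weights = df.groupby("fruit")["weight"].count() # non-null weights per group
by_value = df["fruit"].value_counts()                    # rows per fruit, most frequent first

print(sizes)
print(non_null_weights)
print(by_value)
```

Here banana has a size of 1 but a weight count of 0, which is why size() is the safer choice when you only want the number of rows in each group.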