Python pandas: Add a column to my dataframe that counts a variable
Question:
I have a dataframe ‘gt’ like this:
org group
org1 1
org2 1
org3 2
org4 3
org5 3
org6 3
and I would like to add a column ‘count’ to the gt dataframe that holds the number of members in each group. The expected result looks like this:
org group count
org1 1 2
org2 1 2
org3 2 1
org4 3 3
org5 3 3
org6 3 3
I know how to get the count once per group, but I do not know how to repeat that count on every row of the group. Here is the code I have used:
gtcounts = gt.groupby('group').count()
Answers:
Call transform on the grouped column. Unlike a plain count(), which returns one row per group, this returns a Series aligned with the original df (the question's gt):
In [223]:
df['count'] = df.groupby('group')['group'].transform('count')
df
Out[223]:
org group count
0 org1 1 2
1 org2 1 2
2 org3 2 1
3 org4 3 3
4 org5 3 3
5 org6 3 3
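For anyone who wants to run this outside the IPython session, here is a minimal self-contained sketch. It assumes the frame from the question is rebuilt by hand and is called df, as in the answer above.

```python
import pandas as pd

# Rebuild the example frame from the question (named gt there, df here).
df = pd.DataFrame({
    "org": ["org1", "org2", "org3", "org4", "org5", "org6"],
    "group": [1, 1, 2, 3, 3, 3],
})

# A plain groupby count returns one value per group (3 values here) ...
per_group = df.groupby("group")["group"].count()

# ... whereas transform broadcasts that value back onto every row of the
# group, so the result has the same index as df and can be assigned directly.
df["count"] = df.groupby("group")["group"].transform("count")

print(per_group)
print(df)
```

Note that 'count' skips missing values in the column being counted. If that column can contain NaN and every row should be counted, transform('size') is probably the better fit.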
It can also be done with a combination of value_counts() and map. The idea is to compute the count for each group, then map those counts back onto the group column.
df['count'] = df['group'].map(df['group'].value_counts())
# or
df['count'] = df['group'].map(df.groupby('group')['group'].count())
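The two steps can be written out separately to make the lookup visible. This is only a sketch on the question's sample data; the intermediate name counts is introduced here for illustration.

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "org": ["org1", "org2", "org3", "org4", "org5", "org6"],
    "group": [1, 1, 2, 3, 3, 3],
})

# Step 1: one count per distinct group value.
# The result is a Series whose index holds the group values.
counts = df["group"].value_counts()

# Step 2: look up each row's group value in that Series.
# map accepts a Series and uses its index as the lookup key.
df["count"] = df["group"].map(counts)

print(counts)
print(df)
```

Both the map and the transform approaches give the same column on this data, so the choice is mostly a matter of taste.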