Count the group occurrences

Question:

I have a dataframe:

      id webpage
       1   google
       2    bing
       3   google
       4   google
       5   yahoo
       6   yahoo
       7   google
       8   google

I would like to number each run of consecutive identical values, like this:

       id webpage count
       1   google  1
       2    bing   2
       3   google  3
       4   google  3
       5   yahoo   4
       6   yahoo   4
       7   google  5
       8   google  5

I have tried using cumcount and ngroup with groupby, but groupby puts all occurrences of a value into one group, not just the consecutive ones.

Asked By: AAk


Answers:

I believe you need to cumsum() over the state transitions: every time webpage differs from the previous row, the count increases by one.

df["count"] = (df.webpage != df.webpage.shift()).cumsum()
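
To see why this works, the one-liner can be split into its two steps. This is only a sketch: the dataframe is rebuilt from the sample in the question, and the intermediate name `changed` is made up for illustration.

```python
import pandas as pd

# Sample data rebuilt from the question (assumed to match the asker's dataframe).
df = pd.DataFrame({
    "id": range(1, 9),
    "webpage": ["google", "bing", "google", "google",
                "yahoo", "yahoo", "google", "google"],
})

# Step 1: True wherever the value differs from the previous row.
# shift() puts NaN in the first row, so the first comparison is always True.
changed = df["webpage"] != df["webpage"].shift()

# Step 2: summing the booleans cumulatively bumps the counter at every change.
df["count"] = changed.cumsum()

print(df)
```

Because the comparison is only against the immediately preceding row, a value that reappears later (google in rows 3 and 7) starts a new run instead of rejoining the earlier one.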
Answered By: filippo

I'm not that used to pandas, but I built a quick dataframe and wrote a short program that produces the expected result. A variable count is incremented whenever the current value differs from the previous one (_data). The current count is appended to a list for every row, and once all rows have been processed, the list is added to the dataframe as the count column.

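import pandas as pd

webpage_list = ['google', 'bing', 'google', 'google', 'yahoo', 'yahoo', 'google', 'google']
df = pd.DataFrame(webpage_list, columns=['webpage'])

count = 0      # number of runs seen so far
counts = []    # run number for each row
_data = None   # previous row's value; None so the first row always starts a run
for data in df['webpage']:
    if data != _data:  # value changed, so a new run starts
        count += 1
    _data = data
    counts.append(count)

df['count'] = counts
print(df)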
Answered By: Jeson Pun