How can I do a cumulative count of IDs, sorted, in a dataframe?

Question:

I have a dataframe that looks like this.

import pandas as pd

data = {'ID':[29951,29952,29953,29951,29951],'DESCRIPTION':['IPHONE 15','SAMSUNG S40','MOTOROLA G1000','IPHONE 15','IPHONE 15'],'PRICE_PROVIDER1':[1000.00,1200.00,1100.00,1000.00,1000.00]}

df = pd.DataFrame(data)
df

I want to add a new column that numbers the unique IDs, so that every row with the same ID gets the same number. The final DF should look like this.

[image: expected output]

I thought it would be as simple as this:

df['cnt'] = df.groupby('ID').cumcount(ascending=True)
df

That’s not doing what I want: it numbers the rows within each ID instead of numbering the IDs themselves.

Asked By: ASH

Answers:

Try this:

df['cnt'], _ = pd.factorize(df['ID'])
print(df)

Output:

      ID     DESCRIPTION  PRICE_PROVIDER1  cnt
0  29951       IPHONE 15             1000    0
1  29952     SAMSUNG S40             1200    1
2  29953  MOTOROLA G1000             1100    2
3  29951       IPHONE 15             1000    0
4  29951       IPHONE 15             1000    0
Answered By: ziying35
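
A note on ordering, since the title asks for a sorted count: by default pd.factorize numbers the IDs in order of first appearance, which happens to match ascending order in the sample data. The sketch below assumes a hypothetical frame whose IDs are deliberately out of order (not the asker's data) and compares the default with the sort=True option.

```python
import pandas as pd

# Hypothetical data: IDs deliberately NOT in ascending order
df = pd.DataFrame({'ID': [29953, 29951, 29952, 29951]})

# Default behaviour: codes follow the order in which each ID first appears
df['cnt_appearance'], _ = pd.factorize(df['ID'])

# sort=True: codes follow the sorted order of the unique IDs
df['cnt_sorted'], uniques = pd.factorize(df['ID'], sort=True)

print(df)
print(list(uniques))
```

If the numbering should follow the ID values rather than row order, sort=True is probably the safer choice.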

Here’s another way:

df['cnt'] = df['ID'].astype('category').cat.codes

Output:

      ID     DESCRIPTION  PRICE_PROVIDER1  cnt
0  29951       IPHONE 15           1000.0    0
1  29952     SAMSUNG S40           1200.0    1
2  29953  MOTOROLA G1000           1100.0    2
3  29951       IPHONE 15           1000.0    0
4  29951       IPHONE 15           1000.0    0
Answered By: Scott Boston
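
For comparison with the attempt in the question, a minimal sketch using only the ID column from the sample data. It puts groupby(...).cumcount() next to groupby(...).ngroup(), which stays close to the original groupby idea, and next to the category codes. The column names cumcount, ngroup and codes are illustrative only.

```python
import pandas as pd

# ID column taken from the question's sample data
df = pd.DataFrame({'ID': [29951, 29952, 29953, 29951, 29951]})

# cumcount: position of each row inside its own ID group
df['cumcount'] = df.groupby('ID').cumcount()

# ngroup: one number per ID group, in sorted key order by default
df['ngroup'] = df.groupby('ID').ngroup()

# category codes: also one number per ID, ordered by sorted ID value
df['codes'] = df['ID'].astype('category').cat.codes

print(df)
```

Here cumcount restarts at 0 for every ID, while ngroup and the category codes give every row of the same ID the same number.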