How can I do a cumulative count of IDs, sorted, in a dataframe?

Question

I have a dataframe that looks like this.

import pandas as pd

data = {
    'ID': [29951, 29952, 29953, 29951, 29951],
    'DESCRIPTION': ['IPHONE 15', 'SAMSUNG S40', 'MOTOROLA G1000', 'IPHONE 15', 'IPHONE 15'],
    'PRICE_PROVIDER1': [1000.00, 1200.00, 1100.00, 1000.00, 1000.00],
}

df = pd.DataFrame(data)
df

I want to add a new column that numbers the unique IDs, so that every row with the same ID gets the same number. The final DF should look like this.

I thought it would be as simple as this:

df['cnt'] = df.groupby('ID').cumcount(ascending=True)
df

That’s not doing what I want.
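
For reference, here is a minimal sketch of what cumcount returns on the sample IDs. It numbers the rows inside each ID group rather than numbering the IDs themselves, which is probably why the result looks wrong. The variable name within_group is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'ID': [29951, 29952, 29953, 29951, 29951]})

# cumcount numbers the rows *within* each ID group, starting at 0,
# so the three 29951 rows get 0, 1, 2 instead of sharing one number.
within_group = df.groupby('ID').cumcount(ascending=True).tolist()
print(within_group)  # [0, 0, 0, 1, 2]
```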

Asked By: ASH

Answer 1

Try pd.factorize, which gives each distinct ID an integer code in order of first appearance:

df['cnt'], _ = pd.factorize(df['ID'])
print(df)

      ID     DESCRIPTION  PRICE_PROVIDER1  cnt
0  29951       IPHONE 15           1000.0    0
1  29952     SAMSUNG S40           1200.0    1
2  29953  MOTOROLA G1000           1100.0    2
3  29951       IPHONE 15           1000.0    0
4  29951       IPHONE 15           1000.0    0
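
Since the question asks for a sorted numbering, it may help to see how the sort parameter of pd.factorize changes the codes. The sketch below uses a made-up, unsorted list of IDs (not the data from the question) so that the two orderings differ. Adding 1 is only needed if the numbering should start at 1.

```python
import pandas as pd

# Hypothetical unsorted IDs, chosen so the two orderings differ
ids = pd.Series([29953, 29951, 29952, 29951])

# Default: codes follow the order in which each ID first appears
codes_first_seen, uniques = pd.factorize(ids)

# sort=True: codes follow the sorted order of the unique IDs
codes_sorted, uniques_sorted = pd.factorize(ids, sort=True)

print(codes_first_seen.tolist())    # [0, 1, 2, 1]
print(codes_sorted.tolist())        # [2, 0, 1, 0]
print((codes_sorted + 1).tolist())  # [3, 1, 2, 1]
```

On the question's data both variants give the same result, because the IDs already first appear in ascending order.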

Answered By: ziying35

Answer 2

Here’s another way:

df['cnt'] = df['ID'].astype('category').cat.codes

Output:

      ID     DESCRIPTION  PRICE_PROVIDER1  cnt
0  29951       IPHONE 15           1000.0    0
1  29952     SAMSUNG S40           1200.0    1
2  29953  MOTOROLA G1000           1100.0    2
3  29951       IPHONE 15           1000.0    0
4  29951       IPHONE 15           1000.0    0
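
One thing to be aware of: for integer IDs like these, the category codes follow the sorted order of the unique values, not the order of first appearance. A rough sketch on made-up, unsorted IDs (not the question's data) shows this, alongside groupby(...).ngroup() as a further option that can produce either ordering:

```python
import pandas as pd

# Hypothetical unsorted IDs, chosen so the orderings differ
df = pd.DataFrame({'ID': [29953, 29951, 29952, 29951]})

# Category codes: numbered by sorted unique value
cat_codes = df['ID'].astype('category').cat.codes.tolist()

# ngroup: numbers each group; sorted by key by default
ngroup_sorted = df.groupby('ID').ngroup().tolist()

# sort=False: groups are numbered in order of first appearance
ngroup_first_seen = df.groupby('ID', sort=False).ngroup().tolist()

print(cat_codes)          # [2, 0, 1, 0]
print(ngroup_sorted)      # [2, 0, 1, 0]
print(ngroup_first_seen)  # [0, 1, 2, 1]
```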

Answered By: Scott Boston
