Adding a numerical index to a categorical index in Pandas

Question:

Say for example that we have the data set that looks like

client client case client payment
foo 1 200
foo 1 250
foo 1 225
foo 2 200
bar 1 500
bar 1 500

Within Pandas, I can set client and client case to be indexes, but I would like to number these indexes within groups to have an additional column that becomes something like below.

client client case client payment payment number
foo 1 200 1
foo 1 250 2
foo 1 225 3
foo 2 200 1
bar 1 500 1
bar 1 500 2

Essentially just creating some kind of payment numbering system within each client case.

I could theoretically do this by creating some kind of list by iterating over the rows but I know that this isn’t the optimal or suggested way of using Pandas.

Asked By: Ryan Honea

||

Answers:

this would do

df['payment number'] = 1 + df.groupby(['client', 'client case']).cumcount()

output

  client  client case  client payment  payment number
0    foo            1             200               1
1    foo            1             250               2
2    foo            1             225               3
3    foo            2             200               1
4    bar            1             500               1
5    bar            1             500               2
Answered By: Yuca
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.