Adding a numerical index to a categorical index in Pandas
Question:
Say for example that we have the data set that looks like
client
client case
client payment
foo
1
200
foo
1
250
foo
1
225
foo
2
200
bar
1
500
bar
1
500
Within Pandas, I can set client and client case to be indexes, but I would like to number these indexes within groups to have an additional column that becomes something like below.
client
client case
client payment
payment number
foo
1
200
1
foo
1
250
2
foo
1
225
3
foo
2
200
1
bar
1
500
1
bar
1
500
2
Essentially just creating some kind of payment numbering system within each client case.
I could theoretically do this by creating some kind of list by iterating over the rows but I know that this isn’t the optimal or suggested way of using Pandas.
Answers:
this would do
df['payment number'] = 1 + df.groupby(['client', 'client case']).cumcount()
output
client client case client payment payment number
0 foo 1 200 1
1 foo 1 250 2
2 foo 1 225 3
3 foo 2 200 1
4 bar 1 500 1
5 bar 1 500 2
Say for example that we have the data set that looks like
client | client case | client payment |
---|---|---|
foo | 1 | 200 |
foo | 1 | 250 |
foo | 1 | 225 |
foo | 2 | 200 |
bar | 1 | 500 |
bar | 1 | 500 |
Within Pandas, I can set client and client case to be indexes, but I would like to number these indexes within groups to have an additional column that becomes something like below.
client | client case | client payment | payment number |
---|---|---|---|
foo | 1 | 200 | 1 |
foo | 1 | 250 | 2 |
foo | 1 | 225 | 3 |
foo | 2 | 200 | 1 |
bar | 1 | 500 | 1 |
bar | 1 | 500 | 2 |
Essentially just creating some kind of payment numbering system within each client case.
I could theoretically do this by creating some kind of list by iterating over the rows but I know that this isn’t the optimal or suggested way of using Pandas.
this would do
df['payment number'] = 1 + df.groupby(['client', 'client case']).cumcount()
output
client client case client payment payment number
0 foo 1 200 1
1 foo 1 250 2
2 foo 1 225 3
3 foo 2 200 1
4 bar 1 500 1
5 bar 1 500 2