How to create rank column in Python based on other columns
Question:
I have a python dataframe that looks like the following:
This dataframe has been sorted in descending order by 'transaction_count'
. I want to create another column in that dataframe called 'rank'
that contains the count of occurrences of cust_ID
. My desired output would look something like the following:
For cust_ID = 1234
with transaction_count = 4
, the rank would be 1, for the next appearance of cust_ID = 1234
, the rank would be 2 and so on.
I tried the following among other things:
df['rank'] = df["cust_ID"].value_counts()
df.head(10)
But the rank column gets created as all NaN values
Any suggestions on how to approach this would be greatly appreciated!
Answers:
you can do:
df['rank'] = df.groupby('cust_ID')['transaction_count'].rank(ascending=False)
Output:
cust_ID associate_ID transaction_count rank
0 1234 608 4 1.0
1 1234 785 1 2.0
2 4789 345 2 1.0
3 3456 268 5 1.0
4 3456 725 3 2.0
5 3456 795 1 3.0
Note that this gives not only the counts, but also the rank of the transaction, based on the transaction_count
value.
I have a python dataframe that looks like the following:
This dataframe has been sorted in descending order by 'transaction_count'
. I want to create another column in that dataframe called 'rank'
that contains the count of occurrences of cust_ID
. My desired output would look something like the following:
For cust_ID = 1234
with transaction_count = 4
, the rank would be 1, for the next appearance of cust_ID = 1234
, the rank would be 2 and so on.
I tried the following among other things:
df['rank'] = df["cust_ID"].value_counts()
df.head(10)
But the rank column gets created as all NaN values
Any suggestions on how to approach this would be greatly appreciated!
you can do:
df['rank'] = df.groupby('cust_ID')['transaction_count'].rank(ascending=False)
Output:
cust_ID associate_ID transaction_count rank
0 1234 608 4 1.0
1 1234 785 1 2.0
2 4789 345 2 1.0
3 3456 268 5 1.0
4 3456 725 3 2.0
5 3456 795 1 3.0
Note that this gives not only the counts, but also the rank of the transaction, based on the transaction_count
value.