PANDAS : column with previous transactions between users

Question:

I have a dataframe with this structure

df = pd.DataFrame([
    (1000, 'Frank', 'Hugo'),
    (100, 'Peter', 'Frank'),
    (10000, 'Hugo', 'James'),
    (300, 'James', 'Jennifer'),
    (250, 'Frank', 'Hugo'),
    (550, 'Peter', 'Frank'),
    (10000, 'Frank', 'Hugo')
], columns=['Amount', 'Payer', 'Payee'])

And, basing upon the information contained in dataframe (approx 2M rows) I should create a column with the information on how many transactions the payer did previously to that same payee. The final goal should be to put a flag on every new transaction between two users.
The output I should get is this :

df = pd.DataFrame([
    (1000, 'Frank', 'Hugo', 0),
    (100, 'Peter', 'Frank', 0),
    (10000, 'Hugo', 'James', 0),
    (300, 'James', 'Jennifer', 0),
    (250, 'Frank', 'Hugo', 1),
    (550, 'Peter', 'Frank', 1),
    (10000, 'Frank', 'Hugo', 2)
], columns=['Amount', 'Payer', 'Payee', 'Previous_transactions'])

I really can’t figure it out, I just got stuck with

count = df['Payee'].groupby(df['Payer']).value_counts()

but I feel I am far away from the solution, since this only generates a list of a different length.

Asked By: MLonzo

||

Answers:

You can use cumcount:

df['Previous_transactions']=df.groupby(['Payer','Payee']).cumcount()

'''
|    |   Amount | Payer   | Payee    |   x |   Previous_transactions |
|---:|---------:|:--------|:---------|----:|------------------------:|
|  0 |     1000 | Frank   | Hugo     |   0 |                       0 |
|  1 |      100 | Peter   | Frank    |   0 |                       0 |
|  2 |    10000 | Hugo    | James    |   0 |                       0 |
|  3 |      300 | James   | Jennifer |   0 |                       0 |
|  4 |      250 | Frank   | Hugo     |   1 |                       1 |
|  5 |      550 | Peter   | Frank    |   1 |                       1 |
|  6 |    10000 | Frank   | Hugo     |   2 |                       2 |
'''
Answered By: Bushmaster