Calculate the number of days since last event for specific user

Question:

I have a dataset like this.

  1. I would like to calculate the mean for each user based on previous events for the user.
  2. Calculate the cumulative number of events for each user
  3. Calculate the number of days since the last event for a specific user.

from datetime import date
import pandas as pd

Sales = [1000, 1450, 1390, 1778, 1966, 1100, 700, 900]
User_ID = [50, 52, 50, 53,53, 52, 52, 52]
begin_date = '2022-10-16 10:12:00'

# initialize data of lists.
data = {'date':pd.date_range(begin_date, periods=len(Sales)),
        'User_ID': User_ID,
       'Sales' : Sales}
  
# Create DataFrame
df = pd.DataFrame(data)
  
# Print the output.
df

Input

enter image description here

Output

enter image description here

Asked By: AAA

||

Answers:

you can get cumulative average by expanding

g = df.groupby('User_ID')
g['Sales'].transform(lambda x: x.expanding().mean())

you can get time gap following code:

s = g['date'].transform(lambda x: x.sub(x.shift(1)))

s

0      NaT
1      NaT
2   2 days
3      NaT
4   1 days
5   4 days
6   1 days
7   1 days
Name: date, dtype: timedelta64[ns]

you can get cumcount

g.cumcount() + 1

make desired output

df.assign(mean= g['Sales'].transform(lambda x: x.expanding().mean()), 
          Time_Elapsed=s.div(pd.Timedelta('1day')).fillna(0).astype('int'), 
          No_of_event=g.cumcount() + 1)

result:

    date            User_ID Sales   mean    Time_Elapsed    No_of_event
0   2022-10-16 10:12:00 50  1000    1000.0  0               1
1   2022-10-17 10:12:00 52  1450    1450.0  0               1
2   2022-10-18 10:12:00 50  1390    1195.0  2               2
3   2022-10-19 10:12:00 53  1778    1778.0  0               1
4   2022-10-20 10:12:00 53  1966    1872.0  1               2
5   2022-10-21 10:12:00 52  1100    1275.0  4               2
6   2022-10-22 10:12:00 52  700     1083.3  1               3
7   2022-10-23 10:12:00 52  900     1037.5  1               4
Answered By: Panda Kim
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.