Python: resample on a rolling basis

Question:

I have a DataFrame as follows:

import pandas as pd

data = [[99330,12,122],
   [1123,1230,1287],
   [123,101,812739],
   [1143,12301230,252],
   [234,342,4546],
   [2445,3453,3457],
   [7897,8657,5675],
   [46,5675,453],
   [76,484,3735],
   [363,93,4568],
   [385,568,367],
   [458,846,4847],
   [574,45747,658468],
   [57457,46534,4675]]
df1 = pd.DataFrame(data, index=['2022-01-01', '2022-01-02', '2022-01-03', '2022-01-04',
                           '2022-01-05', '2022-01-06', '2022-01-07', '2022-01-08',
                           '2022-01-09', '2022-01-10', '2022-01-11', '2022-01-12',
                           '2022-01-13', '2022-01-14'], 
              columns=['col_A', 'col_B', 'col_C'])
df1.index = pd.to_datetime(df1.index)
df1.resample('1D').last().rolling(7).last()

The last line gives me the following error: AttributeError: 'Rolling' object has no attribute 'last'

What I want to do is resample the data on a rolling basis (for 7, 30, 90 days).

Is there a way to do this without using many loops?

Asked By: MathMan 99

||

Answers:

You are calling last as if rolling returned a DataFrame; it doesn’t. It returns a Rolling object, which is "incomplete" until you apply an aggregation to it (as you can see below).

A general tip: you can take whatever the previous step returned and call help on it.
In this case:

x = df1.resample('1D').last().rolling(7)
help(x)

which prints very extensive documentation, including the aggregation methods the object supports.
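As a shorter alternative to reading the full help text, the public methods of the rolling object can be listed with dir. This is only a sketch: the small DataFrame and the names df, roller, and methods are made up for illustration and are not part of the question's data.

```python
import pandas as pd

# Hypothetical toy data, just enough to build a rolling object.
index = pd.date_range("2022-01-01", periods=3, freq="D")
df = pd.DataFrame({"col_A": [1, 2, 3]}, index=index)

# rolling() returns a window object, not a DataFrame.
roller = df.rolling(2)
print(type(roller).__name__)

# Public attributes of the window object; the aggregations
# (mean, sum, max, ...) are among them.
methods = [name for name in dir(roller) if not name.startswith("_")]
print(methods)
```

Any aggregation you want to call has to appear in that list; if it does not, you get the AttributeError from the question.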

What’s missing from your question is a precise specification of what you want to compute over each window. Do you want a rolling mean?
If so, that is your hint: call .mean() on the rolling object.
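To make the "pick an aggregation" step concrete, here is a minimal sketch with an integer window of 7 rows. The toy DataFrame and the variable names are assumptions for illustration; swap in whichever aggregation matches what you actually want.

```python
import pandas as pd

# Hypothetical toy data: col_A holds 0..9 on ten consecutive days.
index = pd.date_range("2022-01-01", periods=10, freq="D")
df = pd.DataFrame({"col_A": range(10)}, index=index)

# Integer window: each result uses the current row and the 6 rows before it.
roller = df.rolling(7)

rolling_mean = roller.mean()
rolling_max = roller.max()
rolling_sum = roller.sum()

# With an integer window, rows without 7 observations yet are NaN.
print(rolling_mean)
```

The first six rows are NaN because an integer window needs a full 7 rows by default; passing min_periods=1 to rolling would relax that.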

More specifically, in this case it’s probably most helpful to use a timedelta rather than an integer count. That makes the code clearer and handles corner cases such as missing dates, because an integer window counts rows while a timedelta window counts elapsed time.

import datetime

rolled_df = df1.rolling(datetime.timedelta(days=7)).mean()
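For the 7, 30, and 90 day windows mentioned in the question, one possible approach is a dict comprehension over offset strings, which avoids writing the same call three times. This is a sketch under assumptions: the toy data, the choice of mean as the aggregation, and the names windows, rolled, and combined are all illustrative.

```python
import pandas as pd

# Hypothetical toy data on a daily DatetimeIndex (required for offset windows).
index = pd.date_range("2022-01-01", periods=14, freq="D")
df = pd.DataFrame({"col_A": range(14), "col_B": range(14, 28)}, index=index)

# Offset strings are an alternative to datetime.timedelta for time-based windows.
windows = ["7D", "30D", "90D"]

# One rolling mean per window size, keyed by the window string.
rolled = {w: df.rolling(w).mean() for w in windows}

# Optional: put all results side by side under a two-level column index.
combined = pd.concat(rolled, axis=1)
print(combined)
```

Time-based windows default to a minimum of one observation, so the early rows are averages over the data available so far rather than NaN.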
Answered By: Mikael Öhman