rolling window in pandas dataframe in reverse date order?
Question:
I have a dataframe that i apply a rolling()
window. However the dates are sorted newest to oldest, so the missing part (the window width) is at the top.
To avoid this i reverse sort by dates
and then apply the rolling()
method. However, this seems to be inefficient, so was wondering if there is a way to apply rolling from the bottom upwards?
Example dataframe:
Symbol Date Open High Low Close
0 UKX:IND 2022-09-01 7284.15 7284.15 7131.69 7148.50
1 UKX:IND 2022-08-31 7361.63 7378.44 7263.62 7284.15
2 UKX:IND 2022-08-30 7427.31 7486.40 7351.12 7361.63
3 UKX:IND 2022-08-26 7479.74 7530.65 7422.02 7427.31
4 UKX:IND 2022-08-25 7471.51 7535.70 7469.17 7479.74
5 UKX:IND 2022-08-24 7488.11 7488.12 7410.40 7471.51
6 UKX:IND 2022-08-23 7533.79 7533.79 7467.56 7488.11
7 UKX:IND 2022-08-22 7550.37 7550.41 7491.26 7533.79
8 UKX:IND 2022-08-19 7541.85 7578.85 7513.26 7550.37
9 UKX:IND 2022-08-18 7515.75 7541.89 7493.66 7541.85
This is the relevant part of the code:
df = df.sort_values(by='Date') # <-- do a reverse sort
df['ma'] = df['Close'].rolling(window=5).mean()
df = df.sort_values(by='Date', ascending=False) # <-- sort back again
What i have tried ?
- i try things like including a negative number:
rolling(window=-5)
- i try searching the docs: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.rolling.html
So far, my only solution is to reverse the dataframe by date (and then reverse it back).
Answers:
No need to sort twice, use indexing to temporarily reverse, which should be faster:
df = df.sort_values(by='Date', ascending=False)
df['ma'] = df['Close'][::-1].rolling(window=5).mean()[::-1]
print(df)
Or even:
df['ma'] = df['Close'][::-1].rolling(window=5).mean()
as pandas aligns the indices before assignement
output:
Symbol Date Open High Low Close ma
0 UKX:IND 2022-09-01 7284.15 7284.15 7131.69 7148.50 7340.266
1 UKX:IND 2022-08-31 7361.63 7378.44 7263.62 7284.15 7404.868
2 UKX:IND 2022-08-30 7427.31 7486.40 7351.12 7361.63 7445.660
3 UKX:IND 2022-08-26 7479.74 7530.65 7422.02 7427.31 7480.092
4 UKX:IND 2022-08-25 7471.51 7535.70 7469.17 7479.74 7504.704
5 UKX:IND 2022-08-24 7488.11 7488.12 7410.40 7471.51 7517.126
6 UKX:IND 2022-08-23 7533.79 7533.79 7467.56 7488.11 NaN
7 UKX:IND 2022-08-22 7550.37 7550.41 7491.26 7533.79 NaN
8 UKX:IND 2022-08-19 7541.85 7578.85 7513.26 7550.37 NaN
9 UKX:IND 2022-08-18 7515.75 7541.89 7493.66 7541.85 NaN
I have a dataframe that i apply a rolling()
window. However the dates are sorted newest to oldest, so the missing part (the window width) is at the top.
To avoid this i reverse sort by dates
and then apply the rolling()
method. However, this seems to be inefficient, so was wondering if there is a way to apply rolling from the bottom upwards?
Example dataframe:
Symbol Date Open High Low Close
0 UKX:IND 2022-09-01 7284.15 7284.15 7131.69 7148.50
1 UKX:IND 2022-08-31 7361.63 7378.44 7263.62 7284.15
2 UKX:IND 2022-08-30 7427.31 7486.40 7351.12 7361.63
3 UKX:IND 2022-08-26 7479.74 7530.65 7422.02 7427.31
4 UKX:IND 2022-08-25 7471.51 7535.70 7469.17 7479.74
5 UKX:IND 2022-08-24 7488.11 7488.12 7410.40 7471.51
6 UKX:IND 2022-08-23 7533.79 7533.79 7467.56 7488.11
7 UKX:IND 2022-08-22 7550.37 7550.41 7491.26 7533.79
8 UKX:IND 2022-08-19 7541.85 7578.85 7513.26 7550.37
9 UKX:IND 2022-08-18 7515.75 7541.89 7493.66 7541.85
This is the relevant part of the code:
df = df.sort_values(by='Date') # <-- do a reverse sort
df['ma'] = df['Close'].rolling(window=5).mean()
df = df.sort_values(by='Date', ascending=False) # <-- sort back again
What i have tried ?
- i try things like including a negative number:
rolling(window=-5)
- i try searching the docs: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.rolling.html
So far, my only solution is to reverse the dataframe by date (and then reverse it back).
No need to sort twice, use indexing to temporarily reverse, which should be faster:
df = df.sort_values(by='Date', ascending=False)
df['ma'] = df['Close'][::-1].rolling(window=5).mean()[::-1]
print(df)
Or even:
df['ma'] = df['Close'][::-1].rolling(window=5).mean()
as pandas aligns the indices before assignement
output:
Symbol Date Open High Low Close ma
0 UKX:IND 2022-09-01 7284.15 7284.15 7131.69 7148.50 7340.266
1 UKX:IND 2022-08-31 7361.63 7378.44 7263.62 7284.15 7404.868
2 UKX:IND 2022-08-30 7427.31 7486.40 7351.12 7361.63 7445.660
3 UKX:IND 2022-08-26 7479.74 7530.65 7422.02 7427.31 7480.092
4 UKX:IND 2022-08-25 7471.51 7535.70 7469.17 7479.74 7504.704
5 UKX:IND 2022-08-24 7488.11 7488.12 7410.40 7471.51 7517.126
6 UKX:IND 2022-08-23 7533.79 7533.79 7467.56 7488.11 NaN
7 UKX:IND 2022-08-22 7550.37 7550.41 7491.26 7533.79 NaN
8 UKX:IND 2022-08-19 7541.85 7578.85 7513.26 7550.37 NaN
9 UKX:IND 2022-08-18 7515.75 7541.89 7493.66 7541.85 NaN