How to calculate relative volume using pandas with faster way?

Question:

I am trying to implement the RVOL by the time of day technical indicator, which can be used as the indication of market strength.

The logic behind this is as follows:

If the current time is 2022/3/19 13:00, we look through the same moment (13:00) at the previous N days and average all the previous volumes at that moment to calculate Average_volume_previous.
Then, RVOL(t) is volume(t)/Average_volume_previous(t).

It is hard to use methods like rolling and apply to deal with this complex logic in the code I wrote.
However, the operation time of for loop is catastrophically long.

from datetime import datetime
import pandas as pd
import numpy as np

datetime_array = pd.date_range(datetime.strptime('2015-03-19 13:00:00', '%Y-%m-%d %H:%M:%S'), datetime.strptime("2022-03-19 13:00:00", '%Y-%m-%d %H:%M:%S'), freq='30min')
volume_array = pd.Series(np.random.uniform(1000, 10000, len(datetime_array)))
df = pd.DataFrame({'Date':datetime_array, 'Volume':volume_array})
df.set_index(['Date'], inplace=True)

output = []
for idx in range(len(df)):
    date = str(df.index[idx].hour)+':'+str(df.index[idx].minute)
    temp_date = df.iloc[:idx].between_time(date, date)
    output.append(temp_date.tail(day_len).mean().iloc[0])

output = np.array(output)

Practically, there might be missing data in the datetime array. So, it would be hard to use fixed length lookback period to solve this. Is there any way to make this code work faster?

Asked By: wanga10000

||

Answers:

I’m not sure I understand, however this is the solution as far as I understand.
I didn’t use date as index
df.set_index([‘Date’], inplace=True)

# Filter data to find instant 
rolling_day = 10
hour = df['Date'].dt.hour == 13
minute = df['Date'].dt.minute == 0
df_moment = df[ore&minuti].copy()

Calculation of moving averages

df_moment['rolling'] = df_moment.rolling(rolling_day).mean()

Calculation of Average_volume_previous(t)/volume(t)

for idx_s, idx_e in zip(df_moment['Volume'][::rolling_day], df_moment['rolling'][rolling_day::rolling_day]):
     print(f'{idx_s/idx_e}')

Output:

0.566379345408499
0.7229214799940626
0.6753586759429548
2.0588617812341354
0.7494803741982076
1.2132554086225438
Answered By: Cristian Festi