Generate daily time series date from monthly usage with python

Question:

I have about two years of monthly gas usage for a city and want to generate daily use concerning daily usage sum equal to monthly and keep time-series shape, but I don’t know how to do that.

Here is my data Link [1]

Asked By: Milad Firoozeh

||

Answers:

The following code sample demonstrates date and data interpolation using pandas.

The following steps are taken:

  • Using the provided dataset, read this into a DataFrame.
  • Calculate a cumulative sum of usage data.
  • Set the DataFrame’s index as the date, to facilitate date resampling.
  • Resample for dates to a daily frequency.
  • Calculate the daily usage.

Example code:

# Read the CSV and convert dates to a datetime object.
path = '~/Downloads/usage.csv'
df = pd.read_csv(path, 
                 header=0, 
                 names=['date', 'gas_usage'], 
                 converters={'date': pd.to_datetime})
# Calculate a cumulative sum to be interpolated.
df['gas_usage_c'] = df['gas_usage'].cumsum()
# Move the date to the index, for resampling.
df.set_index('date', inplace=True)

# Resample the data to a daily ('D') frequency.
df2 = df.resample('D').interpolate('time')
# Calculate the daily usage.
df2['daily_usage'] = df2['gas_usage_c'].diff()

Sample output of df2:

               gas_usage   gas_usage_c   daily_usage
date                                                
2016-03-20  3.989903e+07  3.989903e+07           NaN
2016-03-21  3.932781e+07  4.061487e+07  7.158445e+05
2016-03-22  3.875659e+07  4.133072e+07  7.158445e+05
                 ...           ...           ...
2018-02-18  4.899380e+07  7.967041e+08  1.598856e+06
2018-02-19  4.847973e+07  7.983029e+08  1.598856e+06
2018-02-20  4.796567e+07  7.999018e+08  1.598856e+06

[703 rows x 3 columns]

Visual confirmation

I’ve included two simple graphs to illustrate the dataset alignment and interpolation.

enter image description here

enter image description here

Plotting code:

For completeness, the rough plotting code is included below.

from plotly.offline import plot

plot({'data': [{'x': df.index, 
                'y': df['gas_usage'], 
                'type': 'bar'}], 
      'layout': {'title': 'Original',
                 'template': 'plotly_dark'}})
plot({'data': [{'x': df2.index, 
                'y': df2['daily_usage'], 
                'type': 'bar'}], 
      'layout': {'title': 'Interpolated',
                 'template': 'plotly_dark'}})
Answered By: S3DEV

Have yet tried fractional diference compiled in De Prado (2019) Machine leraning advances in finance?
One could fractionally diferenciate the mouthly basis into dailly one.kinda with d parameter assuming 1/30 .idk , just sharing a thought though…

Have a nice day ^ ^