Pandas monthly resample maintaining first date
Question:
When resampling a time series to a monthly series, pandas replaces the initial date of my time series with the start of the month. From:
2020-01-12 0.730439
2020-01-13 0.559328
...
2021-06-29 0.188461
2021-06-30 0.750668
To:
2020-01-01 8.613978
2020-02-01 14.614601
... ...
2021-05-01 11.936765
2021-06-01 13.758198
Instead of the desired result, where the first month is labeled with the first date of my time series:
2020-01-12 8.613978
2020-02-01 14.614601
... ...
2021-05-01 11.936765
2021-06-01 13.758198
Is there a way to perform a monthly resample without losing the initial date?
Currently I correct it afterwards. I am asking whether there is a way to do it on the fly. I have tried all of resample's parameters without achieving the desired result. I also had a look at pd.Grouper, but without success.
Thank you,
Gonxo
PS: Small script to replicate the issue.
import pandas as pd
from numpy.random import random
index = pd.date_range('20200112', '20210630')
df = pd.Series(random(len(index)), index=index)
df.resample('MS').sum()
Answers:
You can do a manual update:
s = df.resample('MS').sum()
s.index = [df.index.min()] + list(s.index[1:])
Output:
2020-01-12 7.345615
2020-02-01 15.873136
2020-03-01 14.083565
2020-04-01 17.547765
2020-05-01 15.321236
2020-06-01 11.787999
2020-07-01 16.619211
2020-08-01 17.292133
2020-09-01 16.866571
2020-10-01 17.772687
2020-11-01 13.371602
2020-12-01 17.037126
2021-01-01 15.907105
2021-02-01 13.887159
2021-03-01 13.660123
2021-04-01 16.534306
2021-05-01 15.055836
2021-06-01 15.818617
dtype: float64
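If you need this in several places, the manual update above can be wrapped in a small helper. This is only a sketch: the name resample_keep_first is made up for illustration, it assumes a Series with a sorted DatetimeIndex, and it hard-codes 'MS' and sum() as in the question.

```python
import numpy as np
import pandas as pd


def resample_keep_first(s: pd.Series) -> pd.Series:
    """Sum `s` per calendar month and label the first bin with the first date in `s`.

    Hypothetical helper, not part of pandas.
    """
    out = s.resample("MS").sum()
    if out.empty:
        return out
    # With 'MS' only the first label can fall before the data, so only that
    # label is swapped; the remaining month-start labels are kept unchanged.
    out.index = pd.DatetimeIndex([s.index.min()]).append(out.index[1:])
    return out


index = pd.date_range("2020-01-12", "2021-06-30")
series = pd.Series(np.random.default_rng(0).random(len(index)), index=index)

monthly = resample_keep_first(series)
print(monthly.head(3))
```

The relabeled index no longer has a regular monthly frequency, so anything downstream that relies on the index's freq attribute will not see 'MS' any more.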
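Try this:
s = df.resample('M').sum()
That is, change 'MS' to 'M' in your code (spelled 'ME' in pandas 2.2 and later). Note that this labels each month with its last day, so the first label becomes 2020-01-31 rather than 2020-01-12.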
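To check what the two frequencies do to the labels, here is a short comparison on data shaped like the question's. It is a sketch under one assumption of mine: offset objects (pd.offsets.MonthBegin and pd.offsets.MonthEnd) are passed instead of the string aliases, so the result does not depend on which spelling of the month-end alias your pandas version accepts.

```python
import numpy as np
import pandas as pd

index = pd.date_range("2020-01-12", "2021-06-30")
series = pd.Series(np.random.default_rng(0).random(len(index)), index=index)

# Month-start bins: each label is the first calendar day of the month.
by_month_start = series.resample(pd.offsets.MonthBegin()).sum()
# Month-end bins: each label is the last calendar day of the month.
by_month_end = series.resample(pd.offsets.MonthEnd()).sum()

print(by_month_start.index[0].date())  # 2020-01-01
print(by_month_end.index[0].date())    # 2020-01-31

# The monthly sums are the same; only the labels differ.
print(np.allclose(by_month_start.values, by_month_end.values))  # True
```

Neither frequency yields 2020-01-12 as the first label, so relabeling the first bin after resampling is still needed if that date has to appear in the index.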