DataFrame.resample does not include last row
Question:
I want to resample my yearly data to monthly frequency (upsampling) using the ffill method.
My data:
2020-01-01 1.248310e+06
2021-01-01 1.259511e+06
2022-01-01 1.276312e+06
2023-01-01 1.298714e+06
The output should be:
2020-01-01 1.248310e+06
2020-02-01 1.248310e+06
2020-03-01 1.248310e+06
.... ...
2023-10-01 1.298714e+06
2023-11-01 1.298714e+06
2023-12-01 1.298714e+06
Here is what I tried:
down_sampling = df.resample('MS').ffill()
I get something like:
2020-01-01 1.248310e+06
2020-02-01 1.248310e+06
2020-03-01 1.248310e+06
.... ...
2022-11-01 1.276312e+06
2022-12-01 1.276312e+06
2023-01-01 1.298714e+06
The problem is that 2023 has only one month in the output.
Can you suggest how to fix this?
Thank you.
Answers:
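You can append a dummy value at the last month you need and then resample:
import pandas as pd
index = pd.date_range('1/1/2020', periods=4, freq='YS')
series = pd.Series([1.248310e+06, 1.259511e+06, 1.276312e+06, 1.298714e+06], index=index)
# Dummy value at the last month so the resampled range extends through December 2023
series2 = pd.Series(1.298714e+06, pd.date_range('12/1/2023', periods=1))
# Series.append was removed in pandas 2.0; pd.concat is the replacement
series = pd.concat([series, series2])
down_sampling = series.resample('MS').ffill()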
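If you would rather not add a dummy row, a possible alternative is to build the target monthly index yourself and forward-fill onto it with reindex. This is only a sketch using the data from the question; the names monthly_index and monthly are made up for illustration, and it assumes the index is sorted and you want to fill through December of the last year.

```python
import pandas as pd

index = pd.date_range("2020-01-01", periods=4, freq="YS")
series = pd.Series([1.248310e+06, 1.259511e+06, 1.276312e+06, 1.298714e+06], index=index)

# Assumption: the output should run through December of the last year in the data
last_month = pd.Timestamp(year=series.index[-1].year, month=12, day=1)

# Month-start index covering the whole range (hypothetical helper name)
monthly_index = pd.date_range(start=series.index[0], end=last_month, freq="MS")

# Each month takes the most recent yearly value at or before it
monthly = series.reindex(monthly_index, method="ffill")
print(monthly.tail())
```

Because the end of the range is stated explicitly, nothing has to be trimmed afterwards.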
A hacky but pythonic solution:
pd.concat([df, df.iloc[[-1]].set_index(df.iloc[[-1]].index.shift(1, freq="D"))]).resample("H").ffill()[:-1]
It picks the last row as a DataFrame (df.iloc[[-1]]), shifts its index by one step (here 1 day: .index.shift(1, freq="D")), appends that copy to df, then resamples (.resample("H").ffill()) and removes the single dummy row at the end ([:-1]).
For the data in the question, the step would be one year and the resample frequency "MS".
I'd actually expect the closed parameter of resample to do the job.
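Spelled out for the yearly data in the question, the dummy-row trick might look like the sketch below. The column name "value" and the variable names last, dummy and monthly are assumptions made for the example; the question does not show them.

```python
import pandas as pd

# Yearly data from the question; the column name "value" is an assumption
df = pd.DataFrame(
    {"value": [1.248310e+06, 1.259511e+06, 1.276312e+06, 1.298714e+06]},
    index=pd.date_range("2020-01-01", periods=4, freq="YS"),
)

# Last row as a one-row DataFrame
last = df.iloc[[-1]]

# Copy of the last row, moved one year-start step forward (2024-01-01)
dummy = last.set_index(last.index.shift(1, freq="YS"))

# Resample with the dummy row attached, then drop the dummy row again
monthly = pd.concat([df, dummy]).resample("MS").ffill()[:-1]
print(monthly.tail())
```

The dummy row only exists to push the end of the resampled range past December 2023, which is why it is sliced off at the end.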