Replace NaN values in a time series DataFrame using custom interpolation logic

Question:

import numpy as np
import pandas as pd
df = pd.DataFrame({'date': ['3/10/2000', '3/11/2000', '3/12/2000','3/13/2000','3/14/2000','3/15/2000','3/16/2000','3/17/2000','3/18/2000'],
                   'value': [2,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,25]})

In this dataframe, I want to replace the NaN values using the following logic:
The two dates where the value column is not NaN are 8 days apart, i.e. 3/18/2000 - 3/10/2000 = 8 days, and the difference between the two known values is delta = 25 - 2 = 23.
For every day with a NaN value between those two dates, I want to fill in 2 + delta * (t / 8), where t is the number of days elapsed since 3/10/2000.
My desired outcome for the value column is:
[2,4.875,7.75,10.625,13.5,16.375,19.25,22.125,25]

Asked By: Parth Deven Gandhi

||

Answers:

You can convert the date to datetime, set it as the index, and interpolate with the 'index' method:

df['value'] = (df
 .assign(date=pd.to_datetime(df['date']))
 .set_index('date')['value']
 .interpolate('index')
 .values
)

output:

        date   value
0  3/10/2000   2.000
1  3/11/2000   4.875
2  3/12/2000   7.750
3  3/13/2000  10.625
4  3/14/2000  13.500
5  3/15/2000  16.375
6  3/16/2000  19.250
7  3/17/2000  22.125
8  3/18/2000  25.000
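
As a cross-check, the formula from the question can also be applied directly with NumPy. This is only a sketch, not part of the original answer: it assumes the dates are in month/day/year format, and the names `t`, `known`, and `filled` are introduced here for illustration. `np.interp` does linear interpolation between the known points, which for two known values matches 2 + delta * (t / 8).

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'date': ['3/10/2000', '3/11/2000', '3/12/2000', '3/13/2000', '3/14/2000',
             '3/15/2000', '3/16/2000', '3/17/2000', '3/18/2000'],
    'value': [2, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, 25],
})

# Assumption: dates are month/day/year.
dates = pd.to_datetime(df['date'], format='%m/%d/%Y')

# t = number of days elapsed since the first date (0, 1, ..., 8)
t = (dates - dates.iloc[0]).dt.days.to_numpy()

# Boolean mask of the rows that already have a value
known = df['value'].notna().to_numpy()

# Linear interpolation over elapsed days between the known points
filled = np.interp(t, t[known], df['value'].to_numpy()[known])

print(filled.tolist())
# [2.0, 4.875, 7.75, 10.625, 13.5, 16.375, 19.25, 22.125, 25.0]
```

Because the interpolation is driven by elapsed days rather than row position, it should give the same values as `interpolate('index')` even when the dates are not evenly spaced.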
Answered By: mozway