How to fill missing timestamp values with the mean value in a pandas DataFrame

Question:

I have a large data set; a small piece of it is pasted below. In my data the value at every 59th second is missing (here, 12:47:59). How can I add the missing row and fill the missing rpm value with the mean rpm value?

import pandas as pd

df = pd.DataFrame({
    'Time': ['12:47:56', '12:47:57', '12:47:58', '12:48:00', '12:48:01', '12:48:02', '12:48:03'],
    'rpm': [5.5, 7.0, 9.0, 12.0, 16.0, 19.0, 20.0]
})

Here is my expected output:

df = pd.DataFrame({
    'Time': ['12:47:56', '12:47:57', '12:47:58', '12:47:59', '12:48:00', '12:48:01', '12:48:02', '12:48:03'],
    'rpm': [5.5, 7.0, 9.0, 10.5, 12.0, 16.0, 19.0, 20.0]
})

Asked By: appu

Answers:

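Convert the Time column to a DatetimeIndex and add the missing rows with DataFrame.asfreq. Then rebuild the Time column from DatetimeIndex.time and fill the missing rpm values with Series.interpolate. Its default linear method gives the mean of the two neighbouring values when a single row is missing: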

# 's' is the one-second frequency alias (the uppercase 'S' is deprecated in recent pandas versions)
out = df.set_index(pd.to_datetime(df['Time'], format='%H:%M:%S')).asfreq('s')

# alternative with resample, e.g. aggregating with the first value
#out = df.set_index(pd.to_datetime(df['Time'], format='%H:%M:%S')).resample('s').first()

out['Time'] = out.index.time
out['rpm'] = out['rpm'].interpolate()

out = out.reset_index(drop=True)
print(out)
       Time   rpm
0  12:47:56   5.5
1  12:47:57   7.0
2  12:47:58   9.0
3  12:47:59  10.5
4  12:48:00  12.0
5  12:48:01  16.0
6  12:48:02  19.0
7  12:48:03  20.0
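
The steps above can also be wrapped in a small helper. This is only a minimal sketch under some assumptions: the function name fill_missing_seconds and its use_column_mean flag are hypothetical, and the Time column is rebuilt with strftime so it stays a string column like the input, rather than the datetime.time objects produced by DatetimeIndex.time. Because "mean value" can be read two ways, the flag switches between the mean of the neighbouring rows (linear interpolation, which matches the expected output in the question) and the mean of the whole rpm column.

```python
import pandas as pd


def fill_missing_seconds(df, time_col="Time", value_col="rpm", use_column_mean=False):
    """Hypothetical helper: add rows for missing seconds and fill the value column."""
    # Parse the HH:MM:SS strings into timestamps (pandas assigns a default date).
    stamps = pd.to_datetime(df[time_col], format="%H:%M:%S")
    # asfreq("s") reindexes onto a regular one-second grid; absent seconds become NaN rows.
    out = df.set_index(stamps).asfreq("s")
    # Rebuild the time column as strings so it matches the input format.
    out[time_col] = out.index.strftime("%H:%M:%S")
    if use_column_mean:
        # Assumption: "mean" is read as the mean of all observed values in the column.
        out[value_col] = out[value_col].fillna(out[value_col].mean())
    else:
        # Default linear interpolation: a single gap gets the mean of its two neighbours.
        out[value_col] = out[value_col].interpolate()
    return out.reset_index(drop=True)


df = pd.DataFrame({
    "Time": ["12:47:56", "12:47:57", "12:47:58", "12:48:00",
             "12:48:01", "12:48:02", "12:48:03"],
    "rpm": [5.5, 7.0, 9.0, 12.0, 16.0, 19.0, 20.0],
})

filled = fill_missing_seconds(df)                             # 12:47:59 gets (9.0 + 12.0) / 2 = 10.5
mean_filled = fill_missing_seconds(df, use_column_mean=True)  # 12:47:59 gets the column mean
print(filled)
print(mean_filled)
```

Note that interpolation equals the neighbour mean only for single-row gaps; for longer gaps it spaces the filled values evenly between the surrounding observations.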
Answered By: jezrael