Python: reduce precision pandas timestamp dataframe


Hello, I have the following dataframe:

df = 

       Record_ID       Time
        94704   2014-03-10 07:19:19.647342
        94705   2014-03-10 07:21:44.479363
        94706   2014-03-10 07:21:45.479581
        94707   2014-03-10 07:21:54.481588
        94708   2014-03-10 07:21:55.481804

Is it possible to have the following?

df1 = 

       Record_ID       Time
        94704   2014-03-10 07:19:19
        94705   2014-03-10 07:21:44
        94706   2014-03-10 07:21:45
        94707   2014-03-10 07:21:54
        94708   2014-03-10 07:21:55
Asked By: emax



If you really must remove the microsecond part of the datetime, you can use the Timestamp.replace method together with Series.apply to set the microsecond part to 0 for every value in the series. Example:

df['Time'] = df['Time'].apply(lambda x: x.replace(microsecond=0))

Demo –

In [25]: df
   Record_ID                       Time
0      94704 2014-03-10 07:19:19.647342
1      94705 2014-03-10 07:21:44.479363
2      94706 2014-03-10 07:21:45.479581
3      94707 2014-03-10 07:21:54.481588
4      94708 2014-03-10 07:21:55.481804

In [26]: type(df['Time'][0])
Out[26]: pandas.tslib.Timestamp

In [27]: df['Time'] = df['Time'].apply(lambda x: x.replace(microsecond=0))

In [28]: df
   Record_ID                Time
0      94704 2014-03-10 07:19:19
1      94705 2014-03-10 07:21:44
2      94706 2014-03-10 07:21:45
3      94707 2014-03-10 07:21:54
4      94708 2014-03-10 07:21:55
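
For anyone who wants to reproduce the demo without the IPython session, here is a minimal self-contained sketch. The DataFrame is rebuilt from the sample values in the question; everything else follows the one-liner above.

```python
import pandas as pd

# Rebuild the sample frame from the question.
df = pd.DataFrame({
    "Record_ID": [94704, 94705, 94706, 94707, 94708],
    "Time": pd.to_datetime([
        "2014-03-10 07:19:19.647342",
        "2014-03-10 07:21:44.479363",
        "2014-03-10 07:21:45.479581",
        "2014-03-10 07:21:54.481588",
        "2014-03-10 07:21:55.481804",
    ]),
})

# Timestamp.replace behaves like datetime.replace: it returns a new
# Timestamp with the given field overwritten, here microsecond -> 0.
df["Time"] = df["Time"].apply(lambda ts: ts.replace(microsecond=0))

print(df)
```

Note that apply calls the lambda once per row in Python, so on large frames a vectorized approach is likely to be faster.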
Answered By: Anand S Kumar

You could convert the underlying datetime64[ns] values to datetime64[s] values using astype:

In [11]: df['Time'] = df['Time'].astype('datetime64[s]')

In [12]: df
   Record_ID                Time
0      94704 2014-03-10 07:19:19
1      94705 2014-03-10 07:21:44
2      94706 2014-03-10 07:21:45
3      94707 2014-03-10 07:21:54
4      94708 2014-03-10 07:21:55

Note that in pandas versions before 2.0, Series and DataFrames store all datetime values as datetime64[ns], so these datetime64[s] values are automatically converted back to datetime64[ns]. The end result is still stored as datetime64[ns] values, but the call to astype removes the fractional part of the seconds. From pandas 2.0 onward, second resolution is supported directly, so the column keeps the datetime64[s] dtype.

If you wish to have a NumPy array of datetime64[s] values, you could use df['Time'].values.astype('datetime64[s]').
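
A small sketch of both conversions, assuming a two-row sample taken from the question. The names times, truncated and seconds_array are only illustrative. The dtype of the pandas result depends on the installed pandas version, so it is printed rather than assumed.

```python
import numpy as np
import pandas as pd

times = pd.Series(pd.to_datetime([
    "2014-03-10 07:19:19.647342",
    "2014-03-10 07:21:44.479363",
]))

# Cast inside pandas: the fractional seconds are dropped either way,
# but the resulting dtype is version dependent (see the note above).
truncated = times.astype("datetime64[s]")
print(truncated.dtype)
print(truncated)

# Cast the underlying NumPy array: this always yields datetime64[s].
seconds_array = times.to_numpy().astype("datetime64[s]")
print(seconds_array.dtype)
print(seconds_array)
```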

Answered By: unutbu

For pandas version 0.24.0 or later, you can simply set the freq parameter of the floor() function to get the precision you want (ceil() would round up to the next second, which does not match the output below):

df['Time'] = df.Time.dt.floor(freq='s')

In [28]: df
   Record_ID                Time
0      94704 2014-03-10 07:19:19
1      94705 2014-03-10 07:21:44
2      94706 2014-03-10 07:21:45
3      94707 2014-03-10 07:21:54
4      94708 2014-03-10 07:21:55
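
To see how the three rounding accessors differ, here is a short sketch on two of the sample timestamps. The variable names are only illustrative. Only floor drops the fractional seconds without changing the whole-second part.

```python
import pandas as pd

times = pd.Series(pd.to_datetime([
    "2014-03-10 07:19:19.647342",
    "2014-03-10 07:21:44.479363",
]))

floored = times.dt.floor("s")  # always down: 07:19:19, 07:21:44
ceiled = times.dt.ceil("s")    # always up:   07:19:20, 07:21:45
rounded = times.dt.round("s")  # to nearest:  07:19:20, 07:21:44

print(pd.DataFrame({"floor": floored, "ceil": ceiled, "round": rounded}))
```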
Answered By: eric R