Rounding milliseconds in Pandas datetime column
Question:
I have a pandas dataframe in which one column has datetime data. The format of the timestamps is like
2020-05-05 12:15:33.500000
I want to format/round the fractional seconds to the first decimal place. I am using the following code to format:
df['timestamp'].dt.strftime('%Y-%m-%d %H:%M:%S:%f')
Is there any modification I can do to round the milliseconds? Thank you in advance.
Answers:
You can’t do this directly with strftime. You can, however, post-process the result with slicing:
df['timestamp'].dt.strftime('%Y-%m-%d %H:%M:%S:%f').str[:-5]
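Note that slicing drops digits rather than rounding them. A minimal sketch, assuming a small made-up Series `s` (not from the question) and reusing the question's format string, suggests that the sliced strings match what you get by flooring to 100 ms first:

```python
import pandas as pd

# Hypothetical sample data, not taken from the question.
s = pd.Series(pd.to_datetime([
    '2022-08-18 15:30:00.440000',
    '2022-08-18 15:30:00.460000',
]))

fmt = '%Y-%m-%d %H:%M:%S:%f'  # same format string as in the question

# %f always yields 6 digits, so dropping the last 5 keeps exactly one.
sliced = s.dt.strftime(fmt).str[:-5]

# Flooring to 100 ms before formatting gives the same strings,
# which shows that the slice truncates (.46 becomes .4, not .5).
floored = s.dt.floor('100ms').dt.strftime(fmt).str[:-5]

print(sliced.tolist())
print(floored.tolist())
```

If truncation is acceptable, the slice alone is enough; if .46 should become .5, the values need to be rounded before formatting.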
Slight modification of @mozway’s answer:
>>> import datetime
>>> dt = datetime.datetime.now()
>>> dt
datetime.datetime(2022, 8, 18, 10, 42, 37, 714473)
>>> dt.strftime('%Y-%m-%d %H:%M:%S:%f')[:-5]
'2022-08-18 10:42:37:7'
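The slice above truncates. For a plain `datetime` object, one possible way to round instead is sketched below; `round_to_tenth` is a hypothetical helper written for this illustration (it is not part of the standard library), and it rounds halves up:

```python
from datetime import datetime, timedelta

def round_to_tenth(dt: datetime) -> datetime:
    """Round dt to the nearest 0.1 s (hypothetical helper, halves round up)."""
    step = 100_000  # microseconds in a tenth of a second
    rounded_us = (dt.microsecond + step // 2) // step * step
    # Adding a timedelta lets 950000 us and above carry into the next second.
    return dt.replace(microsecond=0) + timedelta(microseconds=rounded_us)

fmt = '%Y-%m-%d %H:%M:%S.%f'
dt = datetime(2022, 8, 18, 10, 42, 37, 960000)  # made-up value for illustration

truncated = dt.strftime(fmt)[:-5]
rounded = round_to_tenth(dt).strftime(fmt)[:-5]

print(truncated)  # 2022-08-18 10:42:37.9
print(rounded)    # 2022-08-18 10:42:38.0
```

Going through `timedelta` rather than `dt.replace(microsecond=...)` avoids a `ValueError` when the rounded value reaches a full second.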
If you are specifically interested in rounding your Series to the nearest tenth of a second (not merely truncating down), then round the Timestamps first:
df['timestamp'].dt.round('100ms') # still a Series of Timestamps
To get a Series of strings (with a controlled format) instead of Timestamps, apply one of the other answers to the result, e.g.:
df['timestamp'].dt.round('100ms').dt.strftime('%Y-%m-%d %H:%M:%S.%f').str[:-5]
Or, faster (between 5x for short Series and 1.6x for 100K values or more):
df['timestamp'].apply(lambda t: t.round('100ms').strftime(
    '%Y-%m-%d %H:%M:%S.%f')[:-5])
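Relative speed is likely to depend on the pandas version, the machine, and the Series length, so it may be worth measuring on your own data. A rough sketch of such a comparison, using synthetic timestamps and two hypothetical wrapper functions (`vectorized` and `per_element`, named here only for illustration):

```python
import timeit

import numpy as np
import pandas as pd

# Synthetic data (an assumption for illustration): n timestamps 40 ms apart,
# offset by 3 ms so that no value sits exactly halfway between two tenths.
n = 10_000
ts = pd.Series(pd.Timestamp('2022-08-18 15:30:00')
               + pd.to_timedelta(np.arange(n) * 40 + 3, unit='ms'))
fmt = '%Y-%m-%d %H:%M:%S.%f'

def vectorized(s):
    # Round and format through the .dt and .str accessors.
    return s.dt.round('100ms').dt.strftime(fmt).str[:-5]

def per_element(s):
    # Round and format each Timestamp individually.
    return s.apply(lambda t: t.round('100ms').strftime(fmt)[:-5])

# Both approaches should produce identical strings.
assert vectorized(ts).tolist() == per_element(ts).tolist()

for func in (vectorized, per_element):
    seconds = timeit.timeit(lambda: func(ts), number=3) / 3
    print(f'{func.__name__}: {seconds * 1000:.1f} ms per call')
```

Changing `n` shows how the ratio between the two moves with Series length.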
Example
import pandas as pd

df = pd.DataFrame({
    'timestamp': pd.to_datetime([
        '2022-08-18 15:30:00.440000',
        '2022-08-18 15:30:00.460000',
        '2022-08-18 15:30:00.500000',
    ])
})
>>> df
                timestamp
0 2022-08-18 15:30:00.440
1 2022-08-18 15:30:00.460
2 2022-08-18 15:30:00.500
>>> df['timestamp'].dt.round('100ms')
0   2022-08-18 15:30:00.400
1   2022-08-18 15:30:00.500
2   2022-08-18 15:30:00.500
Name: timestamp, dtype: datetime64[ns]
>>> df['timestamp'].dt.round('100ms').dt.strftime(
...     '%Y-%m-%d %H:%M:%S.%f').str[:-5]
0    2022-08-18 15:30:00.4
1    2022-08-18 15:30:00.5
2    2022-08-18 15:30:00.5
Name: timestamp, dtype: object
>>> df['timestamp'].apply(lambda t: t.round('100ms').strftime(
...     '%Y-%m-%d %H:%M:%S.%f')[:-5])
0    2022-08-18 15:30:00.4
1    2022-08-18 15:30:00.5
2    2022-08-18 15:30:00.5
Name: timestamp, dtype: object