Python3: Update date format

Question:

I have a tricky with date format with time series data. In my dataframe of over one hundred thousand rows I have a column datetime with date value but the format is %M:%S.%f.

Example:

datetime     
0  59:57.7 
1  00:09.7
2  00:21.8 

What I want in output is to convert this format to %m/%d/%Y %H:%M:%S.%f with 01/01/2023 00:59:57.7 as first date and then increment hours and day. It’s a time series data on few days.

Result:

datetime   ProcessTime
59:57.7     01/01/2023 00:59:57.7
00:09.7     01/01/2023 01:00:09.7
00:21.8     01/01/2023 01:00:21.8

I did this code to change the first date to try to have a referential and change the others.

import pandas as pd
from datetime import datetime

# Example dataframe
df = pd.DataFrame({'datetime': ['59:57.7', '00:09.7', '00:21.8']})

first_time_str = df['datetime'][0]
first_time_obj = datetime.strptime(first_time_str, '%M:%S.%f')
formatted_first_time = first_time_obj.replace(year=2023, month=1, day=1).strftime('%m/%d/%Y %H:%M:%S.%f')

df['datetime'][0] = formatted_first_time

Thanks for your help. Regards

Asked By: Laurent Cesaro

||

Answers:

The exact logic is unclear

assuming a cumulated time

You can convert to_timedelta (after adding the missing hours '00:'), then get the cumsum and add the reference date:

df['ProcessTime'] = (pd.to_timedelta('00:'+df['datetime']).cumsum()
                       .add(pd.Timestamp('2023-01-01 00:59:57.7'))
                       .dt.strftime('%m/%d/%Y %H:%M:%S.%f')
                    )

Output:

  datetime                 ProcessTime
0  59:57.7  01/01/2023 01:59:55.400000
1  00:09.7  01/01/2023 02:00:05.100000
2  00:21.8  01/01/2023 02:00:26.900000
assuming simple timedelta
df['ProcessTime'] = (pd.to_timedelta('00:'+df['datetime'])
                       .add(pd.Timestamp('2023-01-01'))
                       .dt.strftime('%m/%d/%Y %H:%M:%S.%f')
                    )

Output:

  datetime                 ProcessTime
0  59:57.7  01/01/2023 00:59:57.700000
1  00:09.7  01/01/2023 01:00:07.400000
2  00:21.8  01/01/2023 01:00:29.200000
assuming you want to add 1h when the timedelta gets smaller than the previous one
t = pd.to_timedelta('00:'+df['datetime'])
df['ProcessTime'] = (pd.to_timedelta(t.diff().lt('0').cumsum(), unit='h')
                      .add(t+pd.Timestamp('2023-01-01'))
                      .dt.strftime('%m/%d/%Y %H:%M:%S.%f')
                    )

Output:

  datetime                 ProcessTime
0  59:57.7  01/01/2023 00:59:57.700000
1  00:09.7  01/01/2023 01:00:09.700000
2  00:21.8  01/01/2023 01:00:21.800000
Answered By: mozway
Categories: questions Tags: , , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.