python strptime format with optional bits

Question:

Right now I have:

timestamp = datetime.strptime(date_string, '%Y-%m-%d %H:%M:%S.%f')

This works great unless I’m converting a string that doesn’t have the microseconds. How can I specify that the microseconds are optional (and should be considered 0 if they aren’t in the string)?

Asked By: Digant C Kasundra

Answers:

You could use a try/except block:

try:
    timestamp = datetime.strptime(date_string, '%Y-%m-%d %H:%M:%S.%f')
except ValueError:
    timestamp = datetime.strptime(date_string, '%Y-%m-%d %H:%M:%S')
Answered By: Alexander
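
The try/except idea above extends to any number of candidate formats. This is a minimal sketch, assuming the two formats from the question; the names FORMATS and parse_with_fallbacks are hypothetical, not from the original answer.

```python
from datetime import datetime

# Candidate formats, most specific first (an illustrative list, adjust to your data).
FORMATS = ('%Y-%m-%d %H:%M:%S.%f', '%Y-%m-%d %H:%M:%S')

def parse_with_fallbacks(date_string, formats=FORMATS):
    """Return the first successful strptime result, trying each format in order."""
    for fmt in formats:
        try:
            return datetime.strptime(date_string, fmt)
        except ValueError:
            continue  # this format did not match, try the next one
    raise ValueError(f'no known format matches {date_string!r}')
```

A string without a fractional part simply falls through to the second format, so its microsecond field is 0.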

What about just appending it if it doesn’t exist?

if '.' not in date_string:
    date_string = date_string + '.0'

timestamp = datetime.strptime(date_string, '%Y-%m-%d %H:%M:%S.%f')
Answered By: stevieb

I prefer using regex matches instead of try and except. This allows for many fallbacks of acceptable formats.

def parse_timestamp(date_string):
    # full timestamp with milliseconds
    match = re.match(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+Z", date_string)
    if match:
        return datetime.strptime(date_string, "%Y-%m-%dT%H:%M:%S.%fZ")

    # timestamp missing milliseconds
    match = re.match(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z", date_string)
    if match:
        return datetime.strptime(date_string, "%Y-%m-%dT%H:%M:%SZ")

    # timestamp missing milliseconds & seconds
    match = re.match(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}Z", date_string)
    if match:
        return datetime.strptime(date_string, "%Y-%m-%dT%H:%MZ")

    # unknown timestamp format
    return None

Don’t forget to import “re” as well as “datetime” for this method.
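
The same fallbacks can also be kept as data, so that adding a format means adding one row. This is only a sketch under the assumption that the inputs look like the "...Z" timestamps above; FALLBACKS and parse_iso_z are hypothetical names. It uses re.fullmatch so trailing garbage is rejected, and limits the fraction to six digits because %f accepts no more than that.

```python
import re
from datetime import datetime

# (regex, strptime format) pairs, tried in order; the list is illustrative.
FALLBACKS = [
    (r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{1,6}Z", "%Y-%m-%dT%H:%M:%S.%fZ"),
    (r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z", "%Y-%m-%dT%H:%M:%SZ"),
    (r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}Z", "%Y-%m-%dT%H:%MZ"),
]

def parse_iso_z(date_string):
    """Return a naive datetime for the first matching layout, or None."""
    for pattern, fmt in FALLBACKS:
        if re.fullmatch(pattern, date_string):
            return datetime.strptime(date_string, fmt)
    return None  # unknown timestamp format
```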

Answered By: fourfightingfoxes

datetime(*map(int, re.findall(r'\d+', date_string)))

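can parse both '%Y-%m-%d %H:%M:%S.%f' and '%Y-%m-%d %H:%M:%S'. The digits after the dot are passed straight through as microseconds, so the result only agrees with %f when the fraction has exactly six digits ('.5' would become 5 microseconds, not 500000). It is also too permissive if your input is not filtered.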

It is quick-and-dirty but sometimes strptime() is too slow. It can be used if you know that the input has the expected date format.

Answered By: jfs
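
If the fractional part can be shorter than six digits, the digit-extraction trick needs one extra step. A minimal sketch, assuming the input is already known to be in 'YYYY-MM-DD HH:MM:SS[.f]' form; fast_parse is a hypothetical name.

```python
import re
from datetime import datetime

def fast_parse(date_string):
    """Build a datetime from the digit runs, without calling strptime."""
    # Split off the optional fraction so it can be scaled separately.
    head, _, frac = date_string.partition('.')
    fields = [int(x) for x in re.findall(r'\d+', head)]
    # Right-pad to six digits so '.5' means 500000 microseconds, as %f would.
    micro = int(frac.ljust(6, '0')[:6]) if frac else 0
    return datetime(*fields, micro)
```

Like the one-liner, this performs no validation of separators, so it is only suitable for trusted input.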

For a similar problem in jq, I used the following to sort my list by time properly:

|split("Z")[0]|split(".")[0]|strptime("%Y-%m-%dT%H:%M:%S")|mktime

Answered By: DC Martin

Using one regular expression and a list comprehension:

import re

time_str = "12:34.567"
# time format is [HH:]MM:SS[.FFF]
sum([a*b for a,b in zip(map(lambda x: int(x) if x else 0, re.match(r"(?:(\d{2}):)?(\d{2}):(\d{2})(?:\.(\d{3}))?", time_str).groups()), [3600, 60, 1, 1/1000])])
# result ≈ 754.567 (seconds, as a float)
Answered By: milahu

I’m late to the party but I found if you don’t care about the optional bits this will lop off the .%f for you.

datestring.split('.')[0]
Answered By: user14608345
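
A minimal sketch of that truncation feeding strptime, assuming the 'YYYY-MM-DD HH:MM:SS' layout from the question; parse_whole_seconds is a hypothetical name. Note that this discards the fractional seconds entirely rather than treating them as optional.

```python
from datetime import datetime

def parse_whole_seconds(date_string):
    """Parse a timestamp, ignoring anything after the first '.'."""
    # split('.')[0] returns the whole string when there is no dot.
    return datetime.strptime(date_string.split('.')[0], '%Y-%m-%d %H:%M:%S')
```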

If you are using Pandas, you can also split the Series by format, parse each part separately, and concatenate the results. The index is aligned automatically.

import pandas as pd

# Every other row has a different format
df = pd.DataFrame({"datetime_string": ["21-06-08 14:36:09", "21-06-08 14:36:09.50", "21-06-08 14:36:10", "21-06-08 14:36:10.50"]})
df["datetime"] = pd.concat([
    pd.to_datetime(df["datetime_string"].iloc[1::2], format="%y-%m-%d %H:%M:%S.%f"),
    pd.to_datetime(df["datetime_string"].iloc[::2], format="%y-%m-%d %H:%M:%S"),
])

   datetime_string       datetime
0  21-06-08 14:36:09     2021-06-08 14:36:09
1  21-06-08 14:36:09.50  2021-06-08 14:36:09.500000
2  21-06-08 14:36:10     2021-06-08 14:36:10
3  21-06-08 14:36:10.50  2021-06-08 14:36:10.500000
Answered By: JulianWgs