Python regex to parse the datetime.datetime object from string

Question:

I have the following string:

"{'foo': datetime.datetime(2022, 5, 23, 0, 0, tzinfo=tzlocal()), 'bar': 'some data', 'foobar': datetime.datetime(2022, 8, 3, 13, 57, 41, tzinfo=<UTC>), 'barlist': ['hello', 'world']}"

I want to match every datetime.datetime(...) substring within this string and replace each one with just its numbers, in list form. This is the expected result:

"{'foo': [2022, 5, 23, 0, 0], 'bar': 'some data', 'foobar': [2022, 8, 3, 13, 57, 41], 'barlist': ['hello', 'world']}"

I have something like this:

DATETIME_PATTERN = r"datetime.datetime\(((\d+)(,\s*\d+)*), tzinfo=.*\)"
modified_input_str = re.sub(DATETIME_PATTERN, r"[\1]", input_str)

but it replaces a big chunk of data in between the matches. How can I modify the regex to accomplish what I want?

Conclusion:
I modified the current best answer so that it fits my particular use case better:

DATETIME_PATTERN = r"datetime.datetime\((\d+(?:,\s*\d+)*), tzinfo=(?:[^\s\d])*\)"

# The difference is that the text after 'tzinfo=' can be anything except whitespace or digits.
Asked By: avhhh


Answers:

You can use

datetime\.datetime\((\d+(?:,\s*\d+)*), tzinfo=(?:\(\)|[^()])*\)

Details:

  • datetime\.datetime\( – a datetime.datetime( string
  • (\d+(?:,\s*\d+)*) – Group 1: one or more digits and then zero or more repetitions of a comma + zero or more whitespaces and then one or more digits
  • , tzinfo= – a literal string
  • (?:\(\)|[^()])* – zero or more repetitions of a () string or any char other than ( and )
  • \) – a ) char.
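
As a quick check, here is a minimal sketch that applies the pattern to the string from the question with re.sub. The variable names (input_str, GREEDY_PATTERN, greedy_result) are only illustrative. The greedy variant is the question's original pattern, included to show why it fails: .* runs on to the last ) in the string, so everything between the two datetime calls gets swallowed.

```python
import re

# The sample string from the question.
input_str = (
    "{'foo': datetime.datetime(2022, 5, 23, 0, 0, tzinfo=tzlocal()), "
    "'bar': 'some data', "
    "'foobar': datetime.datetime(2022, 8, 3, 13, 57, 41, tzinfo=<UTC>), "
    "'barlist': ['hello', 'world']}"
)

# Original attempt: the greedy .* extends to the last ")" in the whole string.
GREEDY_PATTERN = r"datetime.datetime\(((\d+)(,\s*\d+)*), tzinfo=.*\)"
greedy_result = re.sub(GREEDY_PATTERN, r"[\1]", input_str)

# Pattern from this answer: after "tzinfo=" only "()" pairs or
# non-parenthesis characters are allowed, so each match stops at its own ")".
DATETIME_PATTERN = (
    r"datetime\.datetime\((\d+(?:,\s*\d+)*), tzinfo=(?:\(\)|[^()])*\)"
)
modified_input_str = re.sub(DATETIME_PATTERN, r"[\1]", input_str)

print(greedy_result)
print(modified_input_str)
```

Note that this only handles one level of empty parentheses after tzinfo= (such as tzlocal()); a tzinfo value with arguments inside the parentheses would need a different pattern.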


Answered By: Wiktor Stribiżew