ValueError: time data 'Tue 28 Feb 2023 11:27:38 AM CET' does not match format '%a %d %b %Y %I:%M:%S %p %Z'

Question:

I've run into a strange bug.

I have two identical servers, both on Ubuntu 22.04 and both running Python 3.10.6.

On the first server, my code runs fine:

Python 3.10.6 (main, Nov 14 2022, 16:10:14) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from datetime import datetime
>>> date_time_str = 'Tue 28 Feb 2023 11:27:38 AM CET'
>>> date_time_obj = datetime.strptime(date_time_str, '%a %d %b %Y %I:%M:%S %p %Z')
>>> print ("The type of the date is now",  type(date_time_obj))
The type of the date is now <class 'datetime.datetime'>
>>> print ("The date is", date_time_obj)
The date is 2023-02-28 11:27:38
>>>

On the second server, I do the same:

Python 3.10.6 (main, Nov 14 2022, 16:10:14) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from datetime import datetime
>>> date_time_str = 'Tue 28 Feb 2023 11:27:38 AM CET'
>>> date_time_obj = datetime.strptime(date_time_str, '%a %d %b %Y %I:%M:%S %p %Z')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.10/_strptime.py", line 568, in _strptime_datetime
    tt, fraction, gmtoff_fraction = _strptime(data_string, format)
  File "/usr/lib/python3.10/_strptime.py", line 349, in _strptime
    raise ValueError("time data %r does not match format %r" %
ValueError: time data 'Tue 28 Feb 2023 11:27:38 AM CET' does not match format '%a %d %b %Y %I:%M:%S %p %Z'
>>>

What could be causing this issue? It's clearly not down to the format, since the format is correct.

Asked By: user2433624


Answers:

The Python strptime/strftime documentation is a bit secretive about %Z: it does not parse arbitrary time zone abbreviations [1]. If you scroll down to the technical detail section, you can find:

  1. […]
    %Z […]
    strptime() only accepts certain values for %Z:

    • any value in time.tzname for your machine’s locale
    • the hard-coded values UTC and GMT

The first point explains why your attempt works on some systems but not on others.
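
To see which names %Z would accept on a given machine, you can inspect time.tzname directly. This is only a minimal sketch: the helper name accepted_z_names is made up for illustration, and it approximates what strptime() does internally rather than reproducing it exactly.

```python
import time

def accepted_z_names():
    """Approximate the set of names strptime() accepts for %Z on this machine."""
    names = {"UTC", "GMT"}  # the hard-coded values named in the documentation
    # time.tzname is a (standard, daylight) pair of names for the local time zone
    names.update(name for name in time.tzname if name)
    return names

print(time.tzname)
print(sorted(accepted_z_names()))
```

On a machine configured for UTC, this should list only GMT and UTC, so "CET" has nothing to match against.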


How to parse reliably

"CET" is an abbreviated time zone name. Many of those are ambiguous, so parsers tend to refuse them [2]. One way around this is to define which abbreviation maps to which IANA time zone name and pass that mapping to dateutil's parser:

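# import the submodules explicitly; a bare "import dateutil" does not
# load dateutil.parser and dateutil.tz in all versions
from dateutil import parser, tz # pip install python-dateutil

# map the ambiguous abbreviation to an IANA time zone
tzmapping = {"CET": tz.gettz("Europe/Berlin")}

print(parser.parse('Tue 28 Feb 2023 11:27:38 AM CET', tzinfos=tzmapping))
# 2023-02-28 11:27:38+01:00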

If you want more control over the parsing process, you can implement something similar yourself, for example:

from datetime import datetime
from zoneinfo import ZoneInfo # Python 3.9+ standard library

tzmapping = {"CET": ZoneInfo("Europe/Berlin")}

date_time_str = 'Tue 28 Feb 2023 11:27:38 AM CET'

# separate datetime part and timezone part:
dt, tz = date_time_str.rsplit(" ", maxsplit=1)

# now parse datetime part and set timezone.
date_time_obj = datetime.strptime(dt, '%a %d %b %Y %I:%M:%S %p').replace(tzinfo=tzmapping[tz])

print(date_time_obj)
# 2023-02-28 11:27:38+01:00

print(repr(date_time_obj))
# datetime.datetime(2023, 2, 28, 11, 27, 38, tzinfo=zoneinfo.ZoneInfo(key='Europe/Berlin'))
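
If the input can carry more than one abbreviation, the split-and-map approach could be wrapped in a small function that fails with a clear message for unmapped names. This is a sketch only: parse_with_abbreviation is a hypothetical helper, and it assumes the abbreviation is always the last space-separated token.

```python
from datetime import datetime, tzinfo

def parse_with_abbreviation(text: str, fmt: str, tzmapping: dict[str, tzinfo]) -> datetime:
    """Parse text whose last space-separated token is a time zone abbreviation."""
    # separate the datetime part from the trailing abbreviation
    dt_part, abbreviation = text.rsplit(" ", maxsplit=1)
    try:
        zone = tzmapping[abbreviation]
    except KeyError:
        raise ValueError(
            f"no time zone configured for abbreviation {abbreviation!r}"
        ) from None
    # fmt must describe only the date/time part, without %Z
    return datetime.strptime(dt_part, fmt).replace(tzinfo=zone)
```

It would be called with the same format and mapping as above, e.g. parse_with_abbreviation(date_time_str, '%a %d %b %Y %I:%M:%S %p', tzmapping).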

[1] In fact, %Z doesn’t parse anything in a strict sense; it just makes the parser ignore strings like "GMT" or "UTC". The resulting datetime object will still be naive!

[2] Besides, CET specifies a UTC offset, not a time zone in a geographical sense. For instance, "Europe/Berlin" and "Europe/Paris" both observe CET but are different time zones.
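
The point of the first footnote can be checked quickly. The snippet below is a small illustration with made-up input strings: %Z accepts "UTC" on any machine but leaves the result naive, whereas lowercase %z parses a numeric offset and gives an aware datetime.

```python
from datetime import datetime

# %Z accepts "UTC" but does not attach any time zone information
parsed_name = datetime.strptime("2023-02-28 11:27:38 UTC", "%Y-%m-%d %H:%M:%S %Z")
print(parsed_name.tzinfo)  # None: the result is naive

# %z (lowercase) parses a numeric UTC offset and yields an aware datetime
parsed_offset = datetime.strptime("2023-02-28 11:27:38 +0100", "%Y-%m-%d %H:%M:%S %z")
print(parsed_offset.utcoffset())  # 1:00:00
```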

Answered By: FObersteiner

@FObersteiner your remark that %Z only matches the time zone configured on the machine seems to be the reason.

The server that raises the ValueError:

server2:~$ timedatectl
               Local time: Tue 2023-02-28 15:06:10 UTC
           Universal time: Tue 2023-02-28 15:06:10 UTC
                 RTC time: Tue 2023-02-28 15:06:10
                Time zone: Etc/UTC (UTC, +0000)
System clock synchronized: yes
              NTP service: active
          RTC in local TZ: no

The server without the ValueError:

server1:~$ timedatectl
               Local time: Tue 2023-02-28 16:05:07 CET
           Universal time: Tue 2023-02-28 15:05:07 UTC
                 RTC time: Tue 2023-02-28 15:05:07
                Time zone: Europe/Amsterdam (CET, +0100)
System clock synchronized: yes
              NTP service: active

Since the code that was causing the issue is not mine, I changed the system time zone instead, which made the application run without errors.
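
If changing the system-wide time zone is not an option, setting the TZ environment variable for just that process might have the same effect, since on Unix time.tzname is derived from it. The following is a sketch under that assumption: parses_under_tz is a hypothetical helper, time.tzset() is Unix-only, and the POSIX-style TZ strings are used here only so the example does not depend on an installed tz database.

```python
import os
import time
from datetime import datetime

FORMAT = "%a %d %b %Y %I:%M:%S %p %Z"
SAMPLE = "Tue 28 Feb 2023 11:27:38 AM CET"

def parses_under_tz(tz_value: str) -> bool:
    """Return True if SAMPLE parses while the process TZ is set to tz_value."""
    previous = os.environ.get("TZ")
    os.environ["TZ"] = tz_value
    time.tzset()  # Unix only: re-read TZ so that time.tzname is updated
    try:
        datetime.strptime(SAMPLE, FORMAT)
        return True
    except ValueError:
        return False
    finally:
        # restore the original setting
        if previous is None:
            os.environ.pop("TZ", None)
        else:
            os.environ["TZ"] = previous
        time.tzset()

print(parses_under_tz("UTC0"))                         # False: "CET" is not in time.tzname
print(parses_under_tz("CET-1CEST,M3.5.0,M10.5.0/3"))   # True: time.tzname contains "CET"
```

In practice that would mean starting the application with something like TZ=Europe/Amsterdam in its environment, which should leave the rest of the system on UTC.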

Answered By: user2433624