5-digit-year in date export (52164-01-19 00:00:00+00)

Question:

I received a data export from an unspecified source system, which includes dates in the format of:

  1. 52164-01-19 00:00:00+00
  2. 52992-08-12 04:29:36+00
  3. 52838-10-19 04:08:32.999936+00
  4. 54022-03-12 17:20:36.999936+00

I was told that the error is caused by a faulty conversion of unix to datetime (seconds and milliseconds).

We came up with a possible approach to refactor the date in python, by separating the year into a "normal" year (2164) and convert the rest into milliseconds.

import time
import math

d0 = '52164-01-19 00:00:00+00'
d0_y = 52164
multiplier = 5

# avg gregorian year in seconds
avg_greg = (365.2425 * 24 * 3600)

d1_modulus = 52164 % (1000 * multiplier)
d1_rest = d0_y - d1_modulus

# replace orginal year with modulus 
d1_time = time.strptime(str(d1_modulus) + '-10-19 04:08:32', '%Y-%m-%d %H:%M:%S')

#convert to milliseconds and add d1_rest in "seconds"
bigtime = time.mktime(d1_time) + (avg_greg * d1_rest) #in milliseconds
biggertime = bigtime / 1000 # in seconds
finaltime = time.ctime(biggertime)
# finaltime = 'Thu Mar 12 07:34:41 2020'
print(finaltime)

This code can break for different dates and result in multiplier/modulus combinations to create out of range values for time.mktime.

Can someone suggest an alternative or better approach to this?

Thanks in advance
Gabe

Asked By: Gabe Faucyt

||

Answers:

Python datetime only supports between years 1 and 9999.
So I installed astropy and it works fine :

import datetime
import re

import astropy.time as astropy_time  # installed with PIP


faulty_data = "52164-01-19 00:00:00+00"
timeformat = re.compile(r"(?P<year>d{5})-(?P<month>d{2})-(?P<day>d{2}) (?P<hour>d{2}):(?P<minute>d{2}):(?P<second>d{2})(?P<sign_tz>[+- ])(?P<hour_tz>d{2})")

match = timeformat.fullmatch(faulty_data)
assert match
assert len(match.group("year")) == 5
assert match.group("hour_tz") == "00"
missing_thousand_years = int(match.group("year")[0])

time = astropy_time.Time({"year": int(match.group("year")),
                          "month": int(match.group("month")),
                          "day": int(match.group("day")),
                          "hour": int(match.group("hour")),
                          "minute": int(match.group("minute")),
                          "second": int(match.group("second"))
                          },
                         scale="utc")
print(time)
milliseconds = time.unix
print(milliseconds)
actual_datetime = datetime.datetime.fromtimestamp(milliseconds / 1000)
print(actual_datetime)
(52164, 1, 19, 0, 0, 0.)
1583971200000.0
2020-03-12 01:00:00

So it seems that the original date was 2020-03-12 01:00:00, which is close to what you got with your method.

NB: it raises two warnings, that you can silence

Answered By: Lenormju
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.