Polars Python not converting to Datetime format when parsing
Question:
Like the title says, I don’t know why it’s not converting to datetime.
import polars as pl

df = pl.read_csv(file, ignore_errors=True, parse_dates=True)
df.columns = list(map(lambda x: x.replace(' ', ''), df.columns))
df_new = df.select(pl.col(['Date_Convert','Region']))
print(df_new.head())
shape: (5, 2)
I tried doing it without parsing on load. Here’s the outcome:
df = df.with_columns(pl.col('Date_Convert').str.strptime(pl.Datetime, strict=False, fmt='%Y/%B/%d %H:%M'))
Answers:
The docs say:
try_parse_dates
Try to automatically parse dates. Most ISO8601-like time zone naive formats can be inferred, as well as a handful of others. If this does not succeed, the column remains of data type pl.Utf8. If use_pyarrow=True, dates will always be parsed.
So not every date format can be inferred automatically; some need an explicit format string.
You’re on the right track, though: first just read the CSV, then convert to datetime using strptime, passing the format explicitly.
I solved this by changing my machine’s region setting to one that uses the yyyy-mm-dd date format (Canada).
This basically changed the date format from:
mm-dd-yyyy
To:
yyyy-mm-dd
Thanks for the answers.