Parse datetime from CSV, assign timezone and convert to another timezone – Polars Python
Question:
I have a column of timestamps in a CSV file, like 2022-01-03 17:59:16.254. From external information, I know this time is in JST.
I am trying to parse this string into datetime, assign JST timezone (without changing the timestamp), and convert it to CET.
An attempt:
new = pl.scan_csv('test.csv').with_columns(
    [
        pl.col("timestamp").str.strptime(pl.Datetime, "%Y-%m-%d %H:%M:%S.%f", strict=True),
    ]
).select(
    [
        pl.col("timestamp").cast(pl.Date).alias("Date"),
        pl.col("timestamp").dt.with_time_zone("Asia/Tokyo").alias("WithTZ"),
        pl.col("timestamp").dt.with_time_zone("Asia/Tokyo").dt.cast_time_zone("Europe/Berlin").alias("WithCastTZ"),
        pl.all(),
    ]
)
new.fetch(10).write_csv("testOut.csv")
I was expecting the datetime part to stay unchanged in WithTZ. However, this is the first line of my output, and the cast to Europe/Berlin did not change the time either:
WithTZ |WithCastTZ |timestamp
2022-01-04 02:59:16.213 JST|2022-01-04 02:59:16.213 CET|2022-01-03T17:59:16.213000000
I think I am missing something obvious.
Answers:
The methods for dealing with time zones are:
dt.convert_time_zone: convert from one time zone to another;
dt.replace_time_zone: set/unset/change time zone.
So here, it sounds like you’re after the latter:
pl.col("timestamp").dt.replace_time_zone("Asia/Tokyo")
To then convert to Europe/Berlin:
pl.col("timestamp").dt.replace_time_zone("Asia/Tokyo").dt.convert_time_zone("Europe/Berlin")