Converting string to DateTime Polars
Question:
I have a Polars dataframe with a column of type str that holds the date and time in the format 2020-03-02T13:10:42.550. I want to convert this column to the pl.Datetime type.
After reading the post "Easily convert string column to pl.datetime in Polars", I came up with:
df = df.with_column(pl.col('EventTime').str.strptime(pl.Datetime, fmt="%Y-%m-%dT%H:%M:%f", strict=False))
However, the values in my column "EventTime" are all null.
Many thanks!
Answers:
You were close. You forgot the seconds component (%S) of your format specifier. The fractional seconds, including the leading dot, are then matched by %.f:
(
    df
    .with_column(
        pl.col('EventTime')
        .str.strptime(pl.Datetime,
                      fmt="%Y-%m-%dT%H:%M:%S%.f",
                      strict=False)
        .alias('parsed EventTime')
    )
)
shape: (1, 2)
┌─────────────────────────┬─────────────────────────┐
│ EventTime               ┆ parsed EventTime        │
│ ---                     ┆ ---                     │
│ str                     ┆ datetime[ns]            │
╞═════════════════════════╪═════════════════════════╡
│ 2020-03-02T13:10:42.550 ┆ 2020-03-02 13:10:42.550 │
└─────────────────────────┴─────────────────────────┘
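The effect of the missing seconds field can be reproduced with Python's standard library. This is only an analogy, not Polars code: datetime.strptime has its own directives (it writes the fraction as .%f, where the answer above uses %.f), and it raises an error where Polars with strict=False yields null. The helper name try_parse is made up for this sketch.

```python
from datetime import datetime

RAW = "2020-03-02T13:10:42.550"


def try_parse(text, fmt):
    """Return the parsed datetime, or None if the format does not match."""
    try:
        return datetime.strptime(text, fmt)
    except ValueError:
        return None


# Format from the question: with no %S, the "42" is consumed by %f
# and ".550" is left over, so parsing fails.
broken = try_parse(RAW, "%Y-%m-%dT%H:%M:%f")

# Format with both the seconds and the fractional seconds.
fixed = try_parse(RAW, "%Y-%m-%dT%H:%M:%S.%f")

print(broken)
print(fixed)
```

A format that skips a field present in the string shifts every later field, so the parse fails as a whole instead of producing a partly correct value.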
BTW, the format you are using is standard (ISO 8601), so you can eliminate the format specifier altogether and let Polars infer it.
(
    df
    .with_column(
        pl.col('EventTime')
        .str.strptime(pl.Datetime,
                      strict=False)
        .alias('parsed EventTime')
    )
)
shape: (1, 2)
┌─────────────────────────┬─────────────────────────┐
│ EventTime               ┆ parsed EventTime        │
│ ---                     ┆ ---                     │
│ str                     ┆ datetime[μs]            │
╞═════════════════════════╪═════════════════════════╡
│ 2020-03-02T13:10:42.550 ┆ 2020-03-02 13:10:42.550 │
└─────────────────────────┴─────────────────────────┘
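As a standard-library analogy for parsing without a format string (again, not Polars code), datetime.fromisoformat accepts the same timestamp as it stands:

```python
from datetime import datetime

# Timestamp taken from the question; no format string is supplied.
parsed = datetime.fromisoformat("2020-03-02T13:10:42.550")

print(parsed.isoformat(timespec="milliseconds"))
```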
Edit
And what if I would like to ignore the milliseconds? If I just leave out the "%.f", it can't parse the column properly.
We need to let Polars parse the date string according to its actual format, so the fractional part still has to be parsed.
That said, after parsing, we can use dt.truncate to throw away the fractional part.
(
    df
    .with_column(
        pl.col('EventTime')
        .str.strptime(pl.Datetime,
                      strict=False)
        .dt.truncate('1s')
        .alias('parsed EventTime')
    )
)
shape: (1, 2)
┌─────────────────────────┬─────────────────────┐
│ EventTime               ┆ parsed EventTime    │
│ ---                     ┆ ---                 │
│ str                     ┆ datetime[μs]        │
╞═════════════════════════╪═════════════════════╡
│ 2020-03-02T13:10:42.550 ┆ 2020-03-02 13:10:42 │
└─────────────────────────┴─────────────────────┘
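The same parse-then-truncate order can be sketched with the standard library (an analogy, not Polars code). Here replace(microsecond=0) plays the role of dt.truncate('1s'): it drops the sub-second part without rounding, so 42.550 becomes 42, not 43.

```python
from datetime import datetime

# Parse the full string first, fractional seconds included.
parsed = datetime.fromisoformat("2020-03-02T13:10:42.550")

# Then drop the sub-second part; this truncates and does not round.
truncated = parsed.replace(microsecond=0)

print(truncated)
```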