How to add a duration to datetime in Python polars

Question:

I want to add a duration in seconds to a date/time. My data looks like this:

import polars as pl

df = pl.DataFrame(
    {
        "dt": [
            "2022-12-14T00:00:00", "2022-12-14T00:00:00", "2022-12-14T00:00:00",
        ],
        "seconds": [
            1.0, 2.2, 2.4,
        ],
    }
)

df = df.with_column(pl.col("dt").str.strptime(pl.Datetime).cast(pl.Datetime))

My naive attempt was to convert the float column to a duration type so that I could add it to the datetime column (as I would do in pandas).

df = df.with_column(pl.col("seconds").cast(pl.Duration).alias("duration0"))

print(df.head())

┌─────────────────────┬─────────┬──────────────┐
│ dt                  ┆ seconds ┆ duration0    │
│ ---                 ┆ ---     ┆ ---          │
│ datetime[μs]        ┆ f64     ┆ duration[μs] │
╞═════════════════════╪═════════╪══════════════╡
│ 2022-12-14 00:00:00 ┆ 1.0     ┆ 0µs          │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 2022-12-14 00:00:00 ┆ 2.2     ┆ 0µs          │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 2022-12-14 00:00:00 ┆ 2.4     ┆ 0µs          │
└─────────────────────┴─────────┴──────────────┘

This gives the correct data type, but the values are all zero.

I also tried

df = df.with_column(
    pl.col("seconds")
    .apply(lambda x: pl.duration(nanoseconds=x * 1e9))
    .alias("duration1")
)
print(df.head())
shape: (3, 4)
┌─────────────────────┬─────────┬──────────────┬─────────────────────────────────────┐
│ dt                  ┆ seconds ┆ duration0    ┆ duration1                           │
│ ---                 ┆ ---     ┆ ---          ┆ ---                                 │
│ datetime[μs]        ┆ f64     ┆ duration[μs] ┆ object                              │
╞═════════════════════╪═════════╪══════════════╪═════════════════════════════════════╡
│ 2022-12-14 00:00:00 ┆ 1.0     ┆ 0µs          ┆ 0i64.duration([0i64, 1000000000f... │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 2022-12-14 00:00:00 ┆ 2.2     ┆ 0µs          ┆ 0i64.duration([0i64, 2200000000f... │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 2022-12-14 00:00:00 ┆ 2.4     ┆ 0µs          ┆ 0i64.duration([0i64, 2400000000f... │
└─────────────────────┴─────────┴──────────────┴─────────────────────────────────────┘

which gives an object-type column, which isn't helpful either. The documentation is rather sparse on this topic. Are there any better options?

Asked By: FObersteiner

Answers:

Update: The values showing as zero was a repr formatting issue that has since been fixed.

pl.duration() can be used in this way:

>>> df.with_column(
...    pl.col("dt").str.strptime(pl.Datetime)
...    + pl.duration(nanoseconds=pl.col("seconds") * 1e9)
... )
shape: (3, 2)
┌─────────────────────────┬─────────┐
│ dt                      ┆ seconds │
│ ---                     ┆ ---     │
│ datetime[μs]            ┆ f64     │
╞═════════════════════════╪═════════╡
│ 2022-12-14 00:00:01     ┆ 1.0     │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┤
│ 2022-12-14 00:00:02.200 ┆ 2.2     │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┤
│ 2022-12-14 00:00:02.400 ┆ 2.4     │
└─────────────────────────┴─────────┘
Answered By: jqurious

There's another option as well: since the datetime is represented internally as microseconds here, you can convert the seconds to microseconds and add them directly:

MICROSECONDS_PER_SECOND = 1e6

df = df.with_column(
    (df["dt"] + df["seconds"] * MICROSECONDS_PER_SECOND)
    .cast(pl.Datetime)
    .alias("dt_new")
)

print(df.head())
shape: (3, 3)
┌─────────────────────┬─────────┬─────────────────────────┐
│ dt                  ┆ seconds ┆ dt_new                  │
│ ---                 ┆ ---     ┆ ---                     │
│ datetime[μs]        ┆ f64     ┆ datetime[μs]            │
╞═════════════════════╪═════════╪═════════════════════════╡
│ 2022-12-14 00:00:00 ┆ 1.0     ┆ 2022-12-14 00:00:01     │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 2022-12-14 00:00:00 ┆ 2.2     ┆ 2022-12-14 00:00:02.200 │
├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
│ 2022-12-14 00:00:00 ┆ 2.4     ┆ 2022-12-14 00:00:02.400 │
└─────────────────────┴─────────┴─────────────────────────┘

Answered By: FObersteiner