How to select rows between a certain date range in python-polars?

Question:

If a DataFrame is constructed like the following using polars-python:

import polars as pl
from polars import col
from datetime import datetime

df = pl.DataFrame({
    "dates": ["2016-07-02", "2016-08-10", "2016-08-31", "2016-09-10"],
    "values": [1, 2, 3, 4],
})

How can I select the rows within a certain date range, e.g. between "2016-08-10" and "2016-08-31", so that the desired outcome is:

┌────────────┬────────┐
│ dates      ┆ values │
│ ---        ┆ ---    │
│ date       ┆ i64    │
╞════════════╪════════╡
│ 2016-08-10 ┆ 2      │
├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
│ 2016-08-31 ┆ 3      │
└────────────┴────────┘
Asked By: pythonic833

Answers:

First you need to convert the string values in the dates column to dates (pl.Date), then filter:

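# with_columns replaces the older with_column, which newer Polars releases no longer provide.
# The bounds sit one day outside the target range; none of the sample rows falls on
# 2016-08-09 or 2016-09-01, so the result is the same whether the bounds are included or not.

# eager
(df.with_columns(pl.col("dates").str.strptime(pl.Date))
 .filter(col("dates").is_between(datetime(2016, 8, 9), datetime(2016, 9, 1)))
)

# lazy
(df.lazy()
 .with_columns(pl.col("dates").str.strptime(pl.Date))
 .filter(col("dates").is_between(datetime(2016, 8, 9), datetime(2016, 9, 1)))
 .collect()
)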

Both result in the desired output:

┌────────────┬────────┐
│ dates      ┆ values │
│ ---        ┆ ---    │
│ date       ┆ i64    │
╞════════════╪════════╡
│ 2016-08-10 ┆ 2      │
├╌╌╌╌╌╌╌╌╌╌╌╌┼╌╌╌╌╌╌╌╌┤
│ 2016-08-31 ┆ 3      │
└────────────┴────────┘
Answered By: pythonic833

Use the is_between expression on your dates column (this assumes the column has already been parsed to pl.Date):

    df.filter(
        pl.col("dates").is_between(pl.date(2016, 8, 10), pl.date(2016, 8, 31)),
    )
Answered By: YouCanDo