Creating a date range in python-polars with the last days of the months?

Question:

How do I create a date range in Polars (Python API) with only the last days of the months?

This is the code I have:

pl.date_range(datetime(2022,5,5), datetime(2022,8,10), "1mo", name="dtrange")

The result is: '2022-05-05', '2022-06-05', '2022-07-05', '2022-08-05'

I would like to get: '2022-05-31', '2022-06-30', '2022-07-31'

I know this is possible with Pandas with:

pd.date_range(start=datetime(2022,5,5), end=datetime(2022,8,10), freq='M')

Asked By: Luca

||

Answers:

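I think you’d need to create the full range of days and keep only the days whose following day falls in a different month:

import polars as pl
from datetime import datetime

(pl.date_range(datetime(2022,5,5), datetime(2022,8,10), "1d", name="dtrange")
   .to_frame()
   .filter(
      # Keep a day when the next day is in a different month; != (rather than <) also catches 31 December.
      pl.col("dtrange").dt.month() != (pl.col("dtrange") + pl.duration(days=1)).dt.month()
   )
)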
shape: (3, 1)
┌─────────────────────┐
│ dtrange             │
│ ---                 │
│ datetime[μs]        │
╞═════════════════════╡
│ 2022-05-31 00:00:00 │
├─────────────────────┤
│ 2022-06-30 00:00:00 │
├─────────────────────┤
│ 2022-07-31 00:00:00 │
└─────────────────────┘
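
The filter-by-next-day idea can be checked without Polars. The following is a minimal plain-Python sketch using only the standard library, not the Polars code itself; `month_ends_by_filter` is a hypothetical helper name. It compares months with `!=` so that 31 December, where the month number wraps from 12 to 1, is also kept.

```python
from datetime import date, timedelta


def month_ends_by_filter(start: date, end: date) -> list[date]:
    """Return every day in [start, end] whose next day falls in a different month."""
    one_day = timedelta(days=1)
    n_days = (end - start).days + 1  # inclusive range; empty if end < start
    days = (start + i * one_day for i in range(n_days))
    # A day is a month end exactly when the following day has another month number.
    return [d for d in days if (d + one_day).month != d.month]


result = month_ends_by_filter(date(2022, 5, 5), date(2022, 8, 10))
print(result)
```

For the range in the question this should give the three dates 2022-05-31, 2022-06-30 and 2022-07-31. Note that it walks every day in the range, so it does more work than a month-stepping approach on long ranges.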
Answered By: jqurious

Another easy way is to use date_range with a 1-month interval to generate the first day of each month, then subtract one day with polars.duration.

For example:

import polars as pl
from datetime import date

(
    pl.DataFrame(
        {
            'date': pl.date_range(date(2000, 2, 1), date(2023, 1, 1), "1mo")
            + pl.duration(days=-1)
        }
    )
)
shape: (276, 1)
┌────────────┐
│ date       │
│ ---        │
│ date       │
╞════════════╡
│ 2000-01-31 │
│ 2000-02-29 │
│ 2000-03-31 │
│ 2000-04-30 │
│ ...        │
│ 2022-09-30 │
│ 2022-10-31 │
│ 2022-11-30 │
│ 2022-12-31 │
└────────────┘
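
The "first day of the next month minus one day" trick can also be sketched in plain Python, which makes the leap-year and year-rollover handling easy to verify. This is an illustrative standard-library sketch under stated assumptions, not the Polars implementation; `month_ends_by_offset` is a hypothetical helper name, and it assumes both bounds are inclusive.

```python
from datetime import date, timedelta


def month_ends_by_offset(start: date, end: date) -> list[date]:
    """Return the last day of each month that falls within [start, end]."""
    ends = []
    year, month = start.year, start.month
    while True:
        # Step to the first day of the following month, rolling the year over after December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
        # One day before the first of a month is the last day of the previous month.
        last_day = date(year, month, 1) - timedelta(days=1)
        if last_day > end:
            break
        ends.append(last_day)
    return ends


ends = month_ends_by_offset(date(2022, 5, 5), date(2022, 8, 10))
print(ends)
```

Because the subtraction is done on real calendar dates, February comes out as the 29th in leap years and the 28th otherwise, without any special-casing.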