performance of pandas custom business day offset

Question:

For a ton of dates, I need to compute the next business day, taking holidays into account.

Currently, I’m using something like the code below:

import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar

cal = USFederalHolidayCalendar()
# Builds a new CustomBusinessDay offset on every call
bday_offset = lambda n: pd.offsets.CustomBusinessDay(n, calendar=cal)

mydate = pd.to_datetime("12/24/2014")
%timeit with_holiday = mydate + bday_offset(1)
%timeit without_holiday = mydate + pd.offsets.BDay(1)

On my computer, the with_holiday line runs in ~12 milliseconds and the without_holiday line runs in ~15 microseconds.

Is there any way to make the bday_offset function faster?

Asked By: hahdawg

||

Answers:

I think the lambda is what slows this down: it builds a new CustomBusinessDay offset every time it is called. Consider this approach instead (taken more or less straight from the documentation):

from pandas.tseries.offsets import CustomBusinessDay
bday_us = CustomBusinessDay(calendar=USFederalHolidayCalendar())
mydate + bday_us

Out[13]: Timestamp('2014-12-26 00:00:00')

The first step (constructing the offset) is slow, but you only need to do it once. The second step (adding the offset to a date) is very fast.

%timeit bday_us = CustomBusinessDay(calendar=USFederalHolidayCalendar())
10 loops, best of 3: 66.5 ms per loop

%timeit mydate + bday_us
10000 loops, best of 3: 44 µs per loop
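
For many dates, the same idea can be sketched roughly as below: construct the offset once, outside any loop, and reuse that one object for every date. The sample dates and the names dates and next_bdays are made up for illustration; this is a minimal sketch, not a benchmark.

```python
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar
from pandas.tseries.offsets import CustomBusinessDay

# Construct the offset a single time; this is the expensive step.
bday_us = CustomBusinessDay(calendar=USFederalHolidayCalendar())

# Hypothetical sample input standing in for the real list of dates.
dates = pd.to_datetime(["2014-12-24", "2014-12-26", "2014-12-31"])

# Reuse the same offset object for each date instead of rebuilding it.
next_bdays = [d + bday_us for d in dates]
```

Here 2014-12-24 moves to 2014-12-26 because Christmas Day is skipped, which matches the Out[13] result above.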

For an apples-to-apples comparison, here are the other timings on my machine:

%timeit with_holiday = mydate + bday_offset(1)
10 loops, best of 3: 23.1 ms per loop

%timeit without_holiday = mydate + pd.offsets.BDay(1)
10000 loops, best of 3: 36.6 µs per loop
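
If the dates are already in an array, a different technique that may be worth trying is NumPy's np.busday_offset, which processes the whole array in one call. This is a swapped-in alternative rather than the CustomBusinessDay approach timed above, and it is untimed here. The sample dates and the 2014-2015 holiday window are assumptions for illustration; the window has to cover every date you pass in.

```python
import numpy as np
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar

# Hypothetical sample input standing in for the real dates.
dates = pd.to_datetime(["2014-12-24", "2014-12-26", "2014-12-31"])

# Observed US federal holidays over an assumed window covering the dates.
holidays = USFederalHolidayCalendar().holidays(
    start=pd.Timestamp("2014-01-01"), end=pd.Timestamp("2015-12-31")
)

# np.busday_offset works on day-resolution datetime64 values.
dates_d = dates.values.astype("datetime64[D]")
holidays_d = holidays.values.astype("datetime64[D]")

# roll="backward" plus an offset of 1 yields the first business day
# strictly after each date, including dates on weekends or holidays.
next_bdays = pd.to_datetime(
    np.busday_offset(dates_d, 1, roll="backward", holidays=holidays_d)
)
```

The default weekmask treats Monday through Friday as business days, so only the holiday list needs to be supplied.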
Answered By: JohnE