Convert a DataFrame with Periods ("from" and "to" date columns) to a Series

Question:

I have a DataFrame of school holidays, each with a "From" and a "To" date column. Is there a neat, short way to convert it to an "is_holiday" Series with one entry per day?

I have:

idx  From        To          Name
0    2017-12-25  2018-01-05  Xmas holiday
1    2018-02-12  2018-02-23  Sport holiday
2    2018-03-29  2018-04-02  Easter holiday

I want:

Date is_holiday
..
2017-12-24 False
2017-12-25 True
2017-12-26 True
..
2018-01-05 True
2018-01-06 False
..

and so on..

Example DataFrame for your convenience:

import pandas as pd
df = pd.DataFrame({
    "From": ["2017-12-25", "2018-02-12", "2018-03-29"],
    "To": ["2018-01-05","2018-02-23","2018-04-02"],
})
df.From = pd.to_datetime(df.From)
df.To = pd.to_datetime(df.To)
Asked By: tturbo


Answers:

This covers every date from the earliest From to the latest To, but you can adjust the interval as you wish:

import pandas as pd

df = pd.DataFrame({
    "From": ["2017-12-25", "2018-02-12", "2018-03-29"],
    "To": ["2018-01-05", "2018-02-23", "2018-04-02"],
})
df.From = pd.to_datetime(df.From)
df.To = pd.to_datetime(df.To)

# Collect every day of every holiday period (both endpoints included)
holidays = []
for ix, row in df.iterrows():
    holidays += pd.date_range(row.From, row.To).tolist()

# One row per day between the earliest From and the latest To
all_dates = pd.DataFrame({'dates': pd.date_range(df.From.min(), df.To.max())})
all_dates['is_holiday'] = False
all_dates.loc[all_dates.dates.isin(holidays), 'is_holiday'] = True
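
As an illustration of adjusting the interval, here is a minimal sketch that widens the window by one week on each side. The one-week padding is an arbitrary assumption, not something the question asks for, and `pad` is just a name picked for this example.

```python
import pandas as pd

df = pd.DataFrame({
    "From": pd.to_datetime(["2017-12-25", "2018-02-12", "2018-03-29"]),
    "To": pd.to_datetime(["2018-01-05", "2018-02-23", "2018-04-02"]),
})

# Collect every day of every holiday period (both endpoints included)
holidays = []
for start, end in zip(df["From"], df["To"]):
    holidays += pd.date_range(start, end).tolist()

# Assumed window: one week before the first holiday to one week after the last
pad = pd.Timedelta(days=7)
all_dates = pd.DataFrame(
    {"dates": pd.date_range(df["From"].min() - pad, df["To"].max() + pad)}
)

# isin() already yields a boolean column, so no separate False/True assignment is needed
all_dates["is_holiday"] = all_dates["dates"].isin(holidays)
```

Any other start and end can be passed to pd.date_range in the same way, for example the first and last day of a school year.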

EDIT, cleaner code:

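def holiday_dates(row):
    # All days of one holiday period, both endpoints included
    return pd.date_range(row.From, row.To).tolist()

# apply() gives one list of days per row; flatten them into a single list.
# The function gets its own name so it is not overwritten by the result.
holidays = [day for days in df.apply(holiday_dates, axis=1) for day in days]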
all_dates = pd.DataFrame({'dates':pd.date_range(df.From.min(),df.To.max())})
all_dates['is_holiday'] = False
all_dates.loc[all_dates.dates.isin(holidays),'is_holiday'] = True
Answered By: imburningbabe

This is the shortest solution I came up with in the end. It is based on @imburningbabe's first solution. Many thanks for the inspiration! I wouldn't have been able to do it without your answer.

df = pd.DataFrame({
    "From": ["2017-12-25", "2018-02-12", "2018-03-29"],
    "To": ["2018-01-05", "2018-02-23", "2018-04-02"],
})
df.From = pd.to_datetime(df.From)
df.To = pd.to_datetime(df.To)


all_dates = pd.DataFrame(index=pd.date_range(df.From.min(),df.To.max()))
all_dates['is_holiday'] = False

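# zip over the two date columns so extra columns (e.g. Name) do not break the unpacking
for from_, to in zip(df.From, df.To):
    # Label slicing on a DatetimeIndex includes both endpoints
    all_dates.loc[from_:to, 'is_holiday'] = True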
Answered By: tturbo
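
Since the question asks for a Series rather than a DataFrame, here is a minimal sketch of the same label-slicing idea wrapped in a function that returns one. The helper name `to_is_holiday` and its `pad_days` parameter are made up for this illustration. The padding only exists so that the days just outside the holidays show up as False.

```python
import pandas as pd

df = pd.DataFrame({
    "From": pd.to_datetime(["2017-12-25", "2018-02-12", "2018-03-29"]),
    "To": pd.to_datetime(["2018-01-05", "2018-02-23", "2018-04-02"]),
})

def to_is_holiday(periods, pad_days=1):
    """Hypothetical helper: daily boolean Series, True inside any From..To period."""
    pad = pd.Timedelta(days=pad_days)
    index = pd.date_range(
        periods["From"].min() - pad, periods["To"].max() + pad, name="Date"
    )
    is_holiday = pd.Series(False, index=index, name="is_holiday")
    for start, end in zip(periods["From"], periods["To"]):
        # Label slicing on a DatetimeIndex includes both endpoints
        is_holiday.loc[start:end] = True
    return is_holiday

is_holiday = to_is_holiday(df)
print(is_holiday.loc["2017-12-24":"2017-12-26"])
```

The loop runs once per holiday period, not once per day, so it stays cheap even for long date ranges.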