Creating day after long weekend flag in pandas

Question:

I want to create a new column in pandas data frame ‘day_after_long_weekend’ based on the conditions shown in the image.

Sat and Sun are default holidays

  • condition 1 – if Fri is a holiday (national/provincial) then Monday is the day after long weekend.
  • condition 2 – if Fri and following Monday is a holiday then Tuesday is the day after long weekend.
  • condition 3 – if Mon and Tuesday is a holiday then Wednesday will be 1, etc.

df = pd.DataFrame([[0, 1, 'Fri'], [0, 0, 'Sat'], [0, 0, 'Sun'], [0, 0, 'Mon'],[0, 0, 'Tue'],[0, 0, 'Wed'],[0, 0, 'Thu'],[0, 1, 'Fri'],[0, 0, 'Sat'],[0, 0, 'Sun'],[1, 0, 'Mon'],[0, 0, 'Tue']] , columns=['national_holiday', 'provincial_holiday','day_of week'])

How can I do this? Any help will be appreciated.

enter image description here

Asked By: User771

||

Answers:

I first added columns here for is_weekend and is_holiday like so:

df['is_weekend'] = df['day_of_week'].isin(['Saturday','Sunday'])
df['is_holiday'] = df['national_holiday'] | df['provincial_holiday'] | df['is_weekend'] # i set up the columns to be True/False instead of 1/0, so check that if you have an issue

So now we have df['is_holiday'] that is True/False for if work is cancelled that day.

Now we want to find days where: (1) it is not a holiday, (2) yesterday was a holiday, and (3) there were more than two holidays in a row. We can do that like this:

df['long_weekend'] = (df['holiday'] == False) & (df['holiday'].shift(1) == True) & ((df.groupby((df['holiday'] != df['holiday'].shift(1)).cumsum()).cumcount()+1).shift(1) > 2)

I did pull the logic for the cumulative count of previous days from this other SO question/answer: Pandas: conditional rolling count

Answered By: scotscotmcc

Use masks and a rolling sum:

# is a weekend?
m1 = df['day_of week'].isin(['Sat', 'Sun'])
# is a holiday?
m2 = df[['national_holiday', 'provincial_holiday']].eq(1).any(axis=1)

# weekend or holiday
m = (m1|m2)

# is there 3 weekend days or holidays in the last 3 days
# and today is a working day?
df['day_after_long_weekend'] = m.rolling(3).sum().shift().eq(3) & ~m

Output:

Code
 
In [19]:
m1 = df['day_of week'].isin(['Sat', 'Sun'])
​
m2 = df[['national_holiday', 'provincial_holiday']].eq(1).any(axis=1)
​
m = (m1|m2)
​
df['day_after_long_weekend'] = (m.rolling(3).sum().shift().eq(3) & ~m).astype(int)
​
print(df)
    national_holiday  provincial_holiday day_of week  day_after_long_weekend
0                  0                   1         Fri                       0
1                  0                   0         Sat                       0
2                  0                   0         Sun                       0
3                  0                   0         Mon                       1
4                  0                   0         Tue                       0
5                  0                   0         Wed                       0
6                  0                   0         Thu                       0
7                  0                   1         Fri                       0
8                  0                   0         Sat                       0
9                  0                   0         Sun                       0
10                 1                   0         Mon                       0
11                 0                   0         Tue                       1
Answered By: mozway
df.assign(col1=((df.iloc[:,:2]==1).any(axis=1))|(df.day_of_week.isin(['Sat','Sun'])))
    .assign(col1=lambda dd:np.where((dd.col1==False)&(dd.col1.shift(1)==True),1,0))

  national_holiday  provincial_holiday day_of_week  col1
0                  0                   1         Fri     0
1                  0                   0         Sat     0
2                  0                   0         Sun     0
3                  0                   0         Mon     1
4                  0                   0         Tue     0
5                  0                   0         Wed     0
6                  0                   0         Thu     0
7                  0                   1         Fri     0
8                  0                   0         Sat     0
9                  0                   0         Sun     0
10                 1                   0         Mon     0
11                 0                   0         Tue     1
Answered By: G.G
Categories: questions Tags: ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.