generate date features from timeseries

Question:

I have a dataframe that has columns like these:

Date          temp_data        holiday           day   

01.01.2000    10000              0                1
02.01.2000    0                  1                2
03.01.2000    2000               0                3
..
..
..
30.01.2000    200                0                30
31.01.2000     0                 1                31
01.02.2000     0                 1                 1
02.02.2000    2500               0                 2

holiday = 0 when there is data present – indicates a working day

holiday = 1 when there is no data present – indicated a non-working day

I am trying to extract two new columns
first_working_day_of_month
and
last_working_day_of_month

the dataframe should look like this

Date          temp_data        holiday           day     first_wd_of_month  last_wd_of_month

01.01.2000    10000              0                1             1                0
02.01.2000    0                  1                2             0                0
03.01.2000    2000               0                3             0                0
..
..
..
30.01.2000    200                0                30            0                1
31.01.2000     0                 1                31            0                0
01.02.2000     0                 1                 1            0                0
02.02.2000    2500               0                 2            1                0

Can anyone help me with this?

Asked By: Bella_18

||

Answers:

Solution

# Convert the column to datetime 
df['Date'] = pd.to_datetime(df['Date'], dayfirst=True)

# mask the dates when there is holdiday
w = df['Date'].mask(df['holiday'] == 1)

# group the working dates by monthly frequency
g = w.groupby(df['Date'].dt.to_period('M'))

# transform each group to get the first and last working day per month
# then compare with current date to identify whether the current
# day is the first or last working day
df['first_wd_of_month'] = df['Date'] == g.transform('first')
df['last_wd_of_month' ] = (df['Date'] == g.transform('last')) & ~df['first_wd_of_month']

Result

        Date  temp_data  holiday  day  first_wd_of_month  last_wd_of_month
0 2000-01-01      10000        0    1               True             False
1 2000-01-02          0        1    2              False             False
2 2000-01-03       2000        0    3              False             False
3 2000-01-30        200        0   30              False              True
...
4 2000-01-31          0        1   31              False             False
5 2000-02-01          0        1    1              False             False
6 2000-02-02       2500        0    2               True             False
Answered By: Shubham Sharma

try:

import datetime

df
    Date
0   01.01.2000
1   02.01.2000
2   03.01.2000
3   30.01.2000
4   31.01.2000
5   01.02.2000
6   02.02.2000

df['Date'] = pd.to_datetime(df['Date'])

def f(x):
    next_day = x + datetime.timedelta(days=1)
    if x.month != next_day.month:
        return 1
    return 0

def g(x):
    prev_day = x - datetime.timedelta(days=1)
    if x.month != prev_day.month:
        return 1
    return 0

df['is_last_day_of_month'] = df['Date'].map(f)
df['is_first_day_of_month'] = df['Date'].map(g)

    Date        is_last_day_of_month    is_first_day_of_month
0   2000-01-01  0                       1
1   2000-02-01  0                       1
2   2000-03-01  0                       1
3   2000-01-30  0                       0
4   2000-01-31  1                       0
5   2000-01-02  0                       0
6   2000-02-02  0                       0
Answered By: khaled koubaa