Marking valid days on time series

Question:

The following is my code:

x = ts.loc[::-1, "validday"].eq(0)
x = x.groupby(x.index.to_period('M'), sort=False).cumsum().head(35)
x.head(35)

Current Output:

    Date
    2022-11-14     1
    2022-11-13     1
    2022-11-12     1
    2022-11-11     2
    2022-11-10     3
    2022-11-09     4
    2022-11-08     5
    2022-11-07     6
    2022-11-06     6
    2022-11-05     7
    2022-11-04     7
    2022-11-03     8
       .           .
       .           .
       .           .
       .           .
    2019-09-14     .

The reason is to detect the last valid day, second last valid day and third last valid day based on the available dataset

Can someone please tell me, how can i achieve this goal?

Asked By: Bella_18

||

Answers:

First select rows with no holidays, so possible use counter by GroupBy.cumcount – for add missing holiday datetimes is use Series.reindex:

mask = ts.loc[::-1, "public.holiday"].eq(0)
x = mask.groupby(x.index.to_period('M'), sort=False).cumsum()
#print (x)

s1 = x[mask]

s = (s1.groupby(s1.index.to_period('M'), sort=False).cumcount()
       .reindex(x.index, fill_value=-1))
print (s)
Date
2022-11-14    0
2022-11-13   -1
2022-11-12   -1
2022-11-11    1
2022-11-10    2
2022-11-09    3
2022-11-08    4
2022-11-07    5
2022-11-06   -1
2022-11-05    6
2022-11-04   -1
2022-11-03    7
2022-11-02    8
2022-11-01    9
2022-10-31    0
2022-10-30   -1
2022-10-29    1
2022-10-28    2
2022-10-27    3
2022-10-26   -1
2022-10-25    4
2022-10-24   -1
dtype: int64

And then convert Series to one column DataFrame and compare by 0,1,2 with casting to integers for expected ouput:

out = x.to_frame().assign(**{'last Working day': s.eq(0).astype(int),
                              'last second working day': s.eq(1).astype(int),
                              'last third working day': s.eq(2).astype(int)})
print (out.head())
            public.holiday  last Working day  last second working day  
Date                                                                    
2022-11-14               1                 1                        0   
2022-11-13               1                 0                        0   
2022-11-12               1                 0                        0   
2022-11-11               2                 0                        1   
2022-11-10               3                 0                        0   

            last third working day  
Date                                
2022-11-14                       0  
2022-11-13                       0  
2022-11-12                       0  
2022-11-11                       0  
2022-11-10                       1 

print (out.tail(8))
            public.holiday  last Working day  last second working day  
Date                                                                    
2022-10-31               1                 1                        0   
2022-10-30               1                 0                        0   
2022-10-29               2                 0                        1   
2022-10-28               3                 0                        0   
2022-10-27               4                 0                        0   
2022-10-26               4                 0                        0   
2022-10-25               5                 0                        0   
2022-10-24               5                 0                        0   

            last third working day  
Date                                
2022-10-31                       0  
2022-10-30                       0  
2022-10-29                       0  
2022-10-28                       1  
2022-10-27                       0  
2022-10-26                       0  
2022-10-25                       0  
2022-10-24                       0  
Answered By: jezrael