Pandas. delete rows between two dates groupby person

Question:

I have two dataframes; available_df and delete_df.

<available_df>
    Person  start_day   end_day    available
    1      2012-07-13   2012-07-27     0
    1      2012-07-20   2012-08-03     0
    1      2012-07-27   2012-08-10     0
    2      2012-05-06   2012-05-20     0
    2      2012-05-13   2012-05-27     0
    2      2012-06-20   2012-07-03     0
    2      2012-06-27   2012-07-10     0
    2      2012-07-04   2012-07-11     0

<delete_df>
    Person  start_day     end_day   
    1      2012-05-18   2012-05-24  
    1      2012-07-13   2012-07-20  
    2      2012-05-18   2012-06-23  

<wanted_results>
    Person  start_day   end_day    available
    1      2012-07-27   2012-08-10     0
    2      2012-06-27   2012-07-10     0
    2      2012-07-04   2012-07-11     0

What I want to do is,
among available_df, I want to groupby person’s records and remain rows which are not included in delete_df periods.
If a person’s start_day and end_day from delete_df is included in available_df, then delete.

I tried to use ‘enumerate’ methods but I failed it..
Is there anyone who can help me?

Thank you.

Asked By: ymmy

||

Answers:

I would use pandas.merge_asof here:

out= (pd
   .merge_asof(available_df.sort_values('end_day'),
               delete_df.sort_values('start_day'),
               by='Person', left_on='end_day', right_on='start_day',
               suffixes=(None, '_'),
              )
  .loc[lambda d: d['start_day'].gt(d['end_day_'])]
  .drop(columns=['start_day_', 'end_day_'])
)

output:

   Person  start_day    end_day  available
3       2 2012-06-27 2012-07-10          0
4       2 2012-07-04 2012-07-11          0
7       1 2012-07-27 2012-08-10          0
Answered By: mozway
Categories: questions Tags: ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.