Function for ids based on sowing date-harvest date duration

Question:

I want to keep only the rows whose time falls between sowing_date and harvesting_date for each ID, because every ID has a different sowing date and harvest date.

ID time NDVI sowing_date harvesting_date
106 2020-03-01 0.307967 2020-04-21 2020-11-01
106 2020-03-02 0.299089 2020-04-21 2020-11-01
106 2020-03-03 0.290211 2020-04-21 2020-11-01

I tried groupby, but it doesn’t work properly, and I think this can only work through a function or a for loop. Any thoughts?

The expected outcome should be like below

ID time NDVI sowing_date harvesting_date
106 2020-04-21 0.307967 2020-04-21 2020-11-01
106 2020-04-22 0.299089 2020-04-21 2020-11-01

106 2020-11-01 0.290211 2020-04-21 2020-11-01

Asked By: Sak


Answers:

This is basically just a filter. A common way of filtering a dataframe is to build a boolean mask (one True/False value per row) and then index the dataframe with it. This looks something like:

mask = (df.time >= df.sowing_date) & (df.time <= df.harvesting_date)
filtered_df = df[mask]

You could also do that in one line, but this makes it easier to see what "mask" is doing, if you are interested. Because the comparison is made row by row, each ID is checked against its own sowing and harvest dates, so no groupby or loop is needed.

A word of caution though! Be sure those dates are datetime objects; dates often show up as strings, so you’d need to use something like pd.to_datetime() (or strptime()) to convert them.
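As a rough sketch of the conversion and the filter together, assuming the three date columns arrive as strings: the sample rows below are made up for illustration (including the second ID, 107, which is not in the question), and only the column names come from the original table.

```python
import pandas as pd

# Illustrative data only; values and the second ID (107) are assumptions.
df = pd.DataFrame({
    "ID": [106, 106, 106, 107, 107],
    "time": ["2020-03-01", "2020-04-21", "2020-11-01", "2020-05-01", "2020-12-15"],
    "NDVI": [0.307967, 0.299089, 0.290211, 0.41, 0.22],
    "sowing_date": ["2020-04-21"] * 3 + ["2020-04-25"] * 2,
    "harvesting_date": ["2020-11-01"] * 3 + ["2020-10-15"] * 2,
})

# Convert the string columns to datetime so comparisons are chronological.
for col in ["time", "sowing_date", "harvesting_date"]:
    df[col] = pd.to_datetime(df[col])

# Row-wise boolean mask: each row is compared with its own sowing/harvest dates.
mask = (df["time"] >= df["sowing_date"]) & (df["time"] <= df["harvesting_date"])
filtered_df = df[mask]

print(filtered_df)
```

Both bounds are inclusive here, so rows dated exactly on the sowing or harvest day are kept, which matches the expected outcome in the question.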

Hope this helps!

Answered By: Vincent Rupp