Is there a way to do last_valid_index() in a rolling window?

Question

last_valid_index() applies only to an entire DataFrame or Series, and rolling() does not support it. Is there a way to find the last valid index in a column of booleans within a rolling window?

For instance:

import pandas as pd

d = {'col': [True, False, True, True, False, False]}

df = pd.DataFrame(data=d)

The expected outcome for a rolling window of 3 is:

Asked By: j riv

Answer 1

Here is a workaround:

df['new'] = df.index
# Blank out the index where col is null, carry the last kept index forward, take the rolling max
df['new'].mask(df['col'].isnull()).ffill().rolling(3).max()

Updated after the comment, treating False as invalid:

df['new'] = df.index
df['new'] = df['new'].where(df.col).ffill().rolling(3).max()
0    NaN
1    NaN
2    2.0
3    3.0
4    3.0
5    3.0
Name: new, dtype: float64

Answered By: BENY

Answer 2

As I mentioned in a comment on the accepted answer, I think that solution has a bug. Much of the beginning of this post is taken word for word from that comment.

Change the example to:

d = {'col': [True, False, True, True, False, False, False]}

df = pd.DataFrame(data=d)

Then the last 3 entries make up the final rolling window of 3, and all of them are False. The accepted solution still returns index 3 for the last entry, whereas I assume it should be NaN (otherwise what's the point of having the rolling window at all, other than to set the first 2 observations to NaN?).
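To make the problem concrete, here is a minimal sketch that runs the accepted approach on the 7-row example. The variable names `idx` and `accepted` are only for illustration and are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'col': [True, False, True, True, False, False, False]})

# Index values as a Series, so they can be masked and rolled over.
idx = df.index.to_series()

# Accepted approach: keep the index where col is True, forward-fill the gaps,
# then take the maximum over each window of 3.
accepted = idx.where(df['col']).ffill().rolling(3).max()

# The last window covers rows 4, 5 and 6, which are all False,
# yet the forward-fill carries index 3 into it.
print(accepted.iloc[-1])  # 3.0
```

The forward-fill has no notion of the window boundary, so an index from outside the window leaks into the result.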

Here is my proposed fix:

import numpy as np

df['new'] = df.index
df['new'] = df['new'].where(df['col'], -1).rolling(3).max().replace(-1, np.nan)

Instead of replacing the index with NaN where df['col'] is False and then using ffill() to carry the previous index forward, this replaces those positions with -1. If the rolling maximum of a window is -1, every entry in that window has df['col'] equal to False, so the result for that window is replaced with np.nan.
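As a self-contained check, the fix can be run on the 7-row example like this. It is a sketch that assumes the default integer index (0, 1, 2, ...); the names `idx`, `marked` and `fixed` are only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col': [True, False, True, True, False, False, False]})

# Index values as a Series; rows where col is False get the sentinel -1.
idx = df.index.to_series()
marked = idx.where(df['col'], -1)  # 0, -1, 2, 3, -1, -1, -1

# Rolling max over 3 rows; a window whose max is still -1 held no True at all.
fixed = marked.rolling(3).max().replace(-1, np.nan)

print(fixed.tolist())  # [nan, nan, 2.0, 3.0, 3.0, 3.0, nan]
```

Note that the -1 sentinel only works when every real index value is greater than -1. For an index that can contain -1 or smaller values, or a non-numeric index, a different sentinel or positional numbering would be needed.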

Answered By: Adam Oppenheimer
