Is there a way to do last_valid_index() in a rolling window?

Question:

last_valid_index() only applies to the entire dataframe and rolling() does not allow last_valid_index(). Is there a way to find the last valid index in a column of booleans in a window?

For instance:

import pandas as pd

d = {'col': [True, False, True, True, False, False]}

df = pd.DataFrame(data=d)

The expected outcome for a rolling window of 3 is:

0    NaN
1    NaN
2    2.0
3    3.0
4    3.0
5    3.0
Asked By: j riv

Answers:

There is a workaround. If the invalid entries are NaN (replace youcol with the name of your column):

df['new'] = df.index
df['new'].mask(df.youcol.isnull()).ffill().rolling(3).max()

Following the comment, for a boolean column like the one in the question:

df['new'] = df.index
df['new'] = df['new'].where(df.col).ffill().rolling(3).max()
0    NaN
1    NaN
2    2.0
3    3.0
4    3.0
5    3.0
Name: new, dtype: float64
Answered By: BENY

As I mentioned in a comment here, I think the current accepted solution has a bug. A lot of the beginning of this post is taken word-for-word from my comment there.

Change the example to be

d = {'col': [True, False, True, True, False, False, False]}

df = pd.DataFrame(data=d)

Then the last 3 entries compose the entire rolling window of 3, and all are False. But the current accepted solution returns index 3 for the last entry, even though I am assuming it should be NaN (otherwise what’s the point of having the rolling window at all, other than to set the first 2 observations as NaN?).
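
To see where index 3 comes from, here is a minimal sketch that runs the accepted approach step by step on the modified example. The variable names (positions, masked, filled) are only for illustration and are not part of either answer.

```python
import pandas as pd

df = pd.DataFrame({'col': [True, False, True, True, False, False, False]})

# Row positions as a Series, equivalent to df['new'] = df.index
positions = pd.Series(df.index, index=df.index)

# Keep the position where col is True, NaN elsewhere
masked = positions.where(df['col'])   # 0, NaN, 2, 3, NaN, NaN, NaN

# ffill() carries the last True position forward without any window limit
filled = masked.ffill()               # 0, 0, 2, 3, 3, 3, 3

# The rolling max therefore still sees 3 in the final, all-False window
result = filled.rolling(3).max()
print(result.tolist())  # [nan, nan, 2.0, 3.0, 3.0, 3.0, 3.0]
```

The forward fill is what leaks index 3 into the last window: by the time rolling() runs, the information that rows 4 to 6 were False is gone.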

Here is my proposed fix:

import numpy as np

df['new'] = df.index
df['new'] = df['new'].where(df['col'], -1).rolling(3).max().replace(-1, np.nan)

Instead of replacing the indices where df['col'] is False with NaN and then using ffill() to carry the previous index forward, this replaces those indices with -1. The rolling maximum is then -1 only when every row in the window has df['col'] equal to False, and those results are replaced with np.nan at the end. This relies on every real index value being greater than -1, as with the default integer index.
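
As a sketch, the same idea can be wrapped in a small helper and checked against the modified example. The function name rolling_last_true_index is hypothetical, and the code assumes the default integer index (0, 1, 2, ...), so that -1 can never collide with a real index value.

```python
import numpy as np
import pandas as pd


def rolling_last_true_index(flags: pd.Series, window: int) -> pd.Series:
    """Index of the last True in each rolling window, NaN if there is none."""
    # Hypothetical helper; assumes a non-negative integer index.
    positions = pd.Series(flags.index, index=flags.index)
    # Mark rows where flags is False with the sentinel -1
    marked = positions.where(flags, -1)
    # A window max of -1 means every row in that window was False
    return marked.rolling(window).max().replace(-1, np.nan)


df = pd.DataFrame({'col': [True, False, True, True, False, False, False]})
df['new'] = rolling_last_true_index(df['col'], 3)
print(df['new'].tolist())  # [nan, nan, 2.0, 3.0, 3.0, 3.0, nan]
```

The first two rows are NaN because the window is not yet full, and the last row is NaN because rows 4 to 6 are all False.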

Answered By: Adam Oppenheimer