How to implement Nelson's third rule with pandas?

Question:

I am trying to implement Nelson’s rules using Pandas. One of them is giving me grief, specifically number 3:

[Image: Nelson rule 3, six (or more) points in a row continually increasing (or decreasing)]

Using some example data:

data = pd.DataFrame({"values":[1,2,3,4,5,6,7,5,6,5,3]})

    values
0        1
1        2
2        3
3        4
4        5
5        6
6        7
7        5
8        6
9        5
10       3

My first approach was to use a rolling window to check whether the values are increasing or decreasing, using diff()>0 to identify "hits" on the rule:

(data.diff()>0).rolling(6).sum()==6

This correctly identifies the end values (1=True, 0=False):

    values  desired
0        0        0
1        0        1
2        0        1
3        0        1
4        0        1
5        0        1
6        1        1
7        0        0
8        0        0
9        0        0
10       0        0

This misses the earlier points, which are also part of the run, because rolling only looks backward. Since the rule requires 6 points in a row, I essentially need to evaluate, for a given point, the 6 possible windows it can fall in, and mark it as true if it is part of any window in which the points are consecutively increasing or decreasing.

I can think of how to do this with some custom Python code using iterrows() or apply. I am, however, keen to keep this performant, so I want to limit myself to the pandas API.

How can this be achieved?

Asked By: Oliver Cohen


Answers:

With the following toy dataframe (an extended version of yours):

import pandas as pd


df = pd.DataFrame({"values": [1, 2, 3, 4, 5, 6, 7, 5, 6, 5, 3, 11, 12, 13, 14, 15, 16, 4, 3, 8, 9, 10, 2]})

Here is one way to do it with Pandas rolling and interpolate:

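# Count the increases within each trailing window of six differences
df["check"] = (df["values"].diff() > 0).rolling(6).sum()
# Keep only complete runs: 1 where all six differences are positive, NaN elsewhere
df["check"] = df["check"].mask(df["check"] < 6).mask(df["check"] >= 6, 1)

# Propagate each 1 back over the five preceding rows, then set the rest to 0
df = df.interpolate(limit_direction="backward", limit=5).fillna(0).astype(int)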
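
The marking step works because every value left in check is 1, so interpolating backward over at most five missing rows simply copies that 1 onto the five points before each hit. As a rough sketch of the same idea, and not part of the answer's code, a limited backward fill should give the same flags. The names values, hit and flag are illustrative assumptions:

```python
import pandas as pd

values = pd.Series(
    [1, 2, 3, 4, 5, 6, 7, 5, 6, 5, 3, 11, 12, 13, 14, 15, 16, 4, 3, 8, 9, 10, 2]
)

# True only at the last point of a run of six consecutive increases
hit = (values.diff() > 0).rolling(6).sum() == 6

# Turn non-hits into NaN, then copy each hit back onto the five preceding rows
flag = hit.astype(float).where(hit).bfill(limit=5).fillna(0).astype(int)

print(flag.tolist())
```

Working on a single Series and casting to int at the end keeps the flag column as integers rather than floats.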

Then:

print(df)
# Output
    values  check
0        1      0
1        2      1
2        3      1
3        4      1
4        5      1
5        6      1
6        7      1
7        5      0
8        6      0
9        5      0
10       3      0
11      11      1
12      12      1
13      13      1
14      14      1
15      15      1
16      16      1
17       4      0
18       3      0
19       8      0
20       9      0
21      10      0
22       2      0
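
Note that the code above only flags increasing runs, while rule 3 also covers decreasing ones. One possible way to handle both directions is sketched below, under the assumption that a run means six consecutive differences of the same sign, as in the rolling(6) check above. The helper name nelson_rule_3 and its n_steps parameter are hypothetical, not an existing pandas function:

```python
import numpy as np
import pandas as pd


def nelson_rule_3(values: pd.Series, n_steps: int = 6) -> pd.Series:
    """Flag points that belong to a run of n_steps same-sign differences.

    Hypothetical helper; n_steps=6 mirrors the rolling(6) used above.
    """
    # +1 for an increase, -1 for a decrease, 0 for no change, NaN on the first row
    direction = np.sign(values.diff())
    # The window sum reaches +/- n_steps only if every step has the same sign
    hit = direction.rolling(n_steps).sum().abs() == n_steps
    # Copy each hit back onto the n_steps - 1 preceding rows
    flagged = hit.astype(float).where(hit).bfill(limit=n_steps - 1)
    return flagged.fillna(0).astype(bool)


down = pd.Series([9, 8, 7, 6, 5, 4, 3, 4])
print(nelson_rule_3(down).astype(int).tolist())  # [0, 1, 1, 1, 1, 1, 1, 0]
```

This mirrors the question's desired output, where the first point of the run (index 0) is not flagged.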
Answered By: Laurent