Find the min/max in a pandas data column with NaNs in between

Question:

I have a pandas DataFrame with a column named "Outside Dead Band". The data is recorded at 32 Hz (32 data points per second).

I want to apply the following algorithm:

  1. Between two NaNs in the data column, check the duration of the run of non-NaN values.

    1. If the duration is more than 2 seconds:

      1. Take the max if the values between the NaNs are positive, and append it to a list named maneuvers.

      2. Take the min if the values between the NaNs are negative, and append it to a list named maneuvers.

    2. If the duration is less than 2 seconds:

      1. Take the max if the values between the NaNs are positive, and append it to a list named gusts.

      2. Take the min if the values between the NaNs are negative, and append it to a list named gusts.

Examples:

Example 1

Data Snippet

 NaN 
 NaN 
 NaN 
 NaN 
0.935829
 NaN 
 NaN 
0.9468344
 NaN 
0.9352744
 NaN 
0.9299145
 NaN 
0.9159902
 NaN 
0.9189067
0.9447504
 NaN 
 NaN 
0.9488161

Expected Outputs

gusts = [0.935829, 0.9468344, 0.9352744, 0.9299145, 0.9159902, 0.9447504, 0.9488161]

Example 2

Data Snippet

 NaN 
 NaN 
1.066175
1.108567
1.103931
1.098653
1.094846
1.062542
1.053064
 NaN 
 NaN
0.9460738
0.931207
0.9161806
0.9083371
0.9201323
0.9272887
0.9176005
0.9021356
0.9303108
0.9178913
0.8911541
0.8558757
0.8634101
0.828901
0.8187609
0.8117134
0.8005729
0.7740957
0.7548033
0.7564046
0.7697771
0.7818314
0.7997488
0.8270378
0.8616151
0.8802456
0.9116527
0.9257826
0.9388146
0.945994
0.9453149
0.9454532
0.9426287
0.928901
0.9325082
0.9312031
0.9289232
0.916741
0.9420649
0.9212928
0.922505
0.9238197
0.9236084
0.8717794
0.8492894
0.8158376
0.7905051
0.7699976
0.747136
0.7314162
0.7468339
0.7403114
0.7393804
0.7492437
0.7990298
0.818364
0.8724768
0.947295
0.9460738
0.931207
0.9161806
0.9083371
0.9201323
0.9272887
0.9176005
0.9021356
0.9303108
0.9178913
0.8911541
0.8558757
 NaN
 NaN 
 NaN 
1.055898
 NaN

Expected Outputs

gusts = [1.108567, 1.055898]
maneuvers = [0.947295]

Example 3

Data Snippet

 NaN 
 NaN 
-1.066175
-1.108567
-1.103931
-1.098653
-1.094846
-1.062542
-1.053064
 NaN 
 NaN
-0.9460738
-0.931207
-0.9161806
-0.9083371
-0.9201323
-0.9272887
-0.9176005
-0.9021356
-0.9303108
-0.9178913
-0.8911541
-0.8558757
-0.8634101
-0.828901
-0.8187609
-0.8117134
-0.8005729
-0.7740957
-0.7548033
-0.7564046
-0.7697771
-0.7818314
-0.7997488
-0.8270378
-0.8616151
-0.8802456
-0.9116527
-0.9257826
-0.9388146
-0.945994
-0.9453149
-0.9454532
-0.9426287
-0.928901
-0.9325082
-0.9312031
-0.9289232
-0.916741
-0.9420649
-0.9212928
-0.922505
-0.9238197
-0.9236084
-0.8717794
-0.8492894
-0.8158376
-0.7905051
-0.7699976
-0.747136
-0.7314162
-0.7468339
-0.7403114
-0.7393804
-0.7492437
-0.7990298
-0.818364
-0.8724768
-0.947295
-0.9460738
-0.931207
-0.9161806
-0.9083371
-0.9201323
-0.9272887
-0.9176005
-0.9021356
-0.9303108
-0.9178913
-0.8911541
-0.8558757
 NaN
 NaN 
 NaN 
-1.055898
 NaN

Expected Outputs

gusts = [-1.108567, -1.055898]
maneuvers = [-0.947295]

I tried to pull the column out into a list and use a loop with a series of if/else statements, but my logic seems to be incorrect. I would really appreciate some help doing this within the DataFrame itself, if possible.

norm_accel = flight["Outside Dead Band"].tolist()
gusts = []
maneuvers = []
while i <= (len(norm_accel)):
    if norm_accel[i] != numpy.nan:
        if norm_accel[i+1] == numpy.nan:
            gusts.append(norm_accel(i))
        else:
            j = i
            counter = 0
        while norm_accel[j] != numpy.nan:
            counter =+ 1
            j =+ 1
        if counter >= 64:
            maneuvers.append(max(norm_accel[i:j]))
        else:
            gusts.append(max(norm_accel[i:j]))
        i = j
    i = i + 1

I know that this doesn't account for the max/min condition; I am not sure how to incorporate that.

Asked By: Benny

Answers:

I would put this into a pandas dataframe, and use the occurrence of NaNs to create an id column, which you can then use to do a groupby and calculate the relevant statistics. Assuming data is a dataframe with the values in a val column, it could look like:

data["id"] = data["val"].isna().cumsum()
data = data.dropna()
grps = data.groupby("id").agg(
    counts=("val", "count"),
    min=("val", "min"),
    max=("val", "max"),
)
grps

Using your Example 2, this gives:

    counts       min       max
id                            
2        7  1.053064  1.108567
4       70  0.731416  0.947295
7        1  1.055898  1.055898
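
To see where ids such as 2, 4, and 7 come from, here is a minimal sketch on a shortened series. The values are made up for illustration, and `s`, `ids`, and `labelled` are hypothetical names. Each NaN increments the running count, so every non-NaN value carries the number of NaNs seen before it, and consecutive values share one id.

```python
import numpy as np
import pandas as pd

# Shortened, illustrative series: NaNs separate three runs of values.
s = pd.Series(
    [np.nan, np.nan, 1.0, 1.1, np.nan, np.nan, 0.9,
     np.nan, np.nan, np.nan, 1.05, np.nan]
)

# Running count of NaNs seen so far; it stays constant within a run of values.
ids = s.isna().cumsum()

# Keep only the non-NaN rows and pair each value with its run id.
labelled = pd.DataFrame({"val": s, "id": ids}).dropna()
print(labelled)
```

The ids are not consecutive because they count NaNs rather than runs, which does not matter for the groupby.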

You can then use simple rules to create your lists:

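import numpy as np

# Use the max for positive runs and the min for negative runs.
grps["val"] = np.where(grps["max"] > 0, grps["max"], grps["min"])
maneuvers = grps.loc[grps.counts >= 64, "val"].tolist()  # 64 samples = 2 s at 32 Hz
gusts = grps.loc[grps.counts < 64, "val"].tolist()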
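
Putting the steps together, a self-contained sketch could look like the following. The function name `classify_runs`, its parameters, and the synthetic test data are assumptions for illustration; the 32 Hz rate and 2-second cutoff come from the question. A run of exactly 2 seconds is treated as a maneuver here, which the question leaves unspecified.

```python
import numpy as np
import pandas as pd

SAMPLE_RATE_HZ = 32        # sampling rate stated in the question
MIN_MANEUVER_SECONDS = 2   # duration cutoff stated in the question


def classify_runs(values, sample_rate=SAMPLE_RATE_HZ, min_seconds=MIN_MANEUVER_SECONDS):
    """Split NaN-separated runs into (gusts, maneuvers) by run length."""
    data = pd.DataFrame({"val": values})
    # Running NaN count gives every run of values its own id.
    data["id"] = data["val"].isna().cumsum()
    data = data.dropna()
    grps = data.groupby("id").agg(
        counts=("val", "count"),
        min=("val", "min"),
        max=("val", "max"),
    )
    # Positive runs contribute their max, negative runs their min.
    grps["val"] = np.where(grps["max"] > 0, grps["max"], grps["min"])
    threshold = sample_rate * min_seconds
    maneuvers = grps.loc[grps["counts"] >= threshold, "val"].tolist()
    gusts = grps.loc[grps["counts"] < threshold, "val"].tolist()
    return gusts, maneuvers


# Synthetic check: a 3-sample negative run, then a 70-sample positive run.
short_run = [-0.5, -0.9, -0.7]
long_run = list(np.linspace(0.1, 0.8, 70))
gusts, maneuvers = classify_runs(
    [np.nan] + short_run + [np.nan, np.nan] + long_run + [np.nan]
)
print(gusts)      # [-0.9]
print(maneuvers)  # [0.8]
```

Note that a run containing both positive and negative values is reported by its max, since the rule only checks whether the max is above zero.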
Answered By: Robert Robison