How to reset a cumulative sum after it reaches a milestone and then continue it in a Pandas DataFrame in Python?

Question:

I have a DataFrame in which I would like to compute a running sum of the "Total_time" column that resets once it reaches a certain value. The sum needs to be computed separately for each "ac" (using groupby), and once it exceeds 185 it resets.

The DataFrame:

index ac flight_date Total_time
0 PR-AKA 2023-03-13 00:00:00 87
1 PR-AKA 2023-03-13 00:00:00 55
2 PR-AKA 2023-03-13 00:00:00 59
3 PR-AKA 2023-03-13 00:00:00 71
4 PR-AKA 2023-03-14 00:00:00 70
5 PR-AKA 2023-03-14 00:00:00 58
6 PR-AKA 2023-03-14 00:00:00 45
7 PR-AKA 2023-03-14 00:00:00 41
8 PR-AKA 2023-03-14 00:00:00 33
9 PR-AKA 2023-03-14 00:00:00 42
10 PR-AKA 2023-03-15 00:00:00 70
11 PR-AKB 2023-03-13 00:00:00 109
12 PR-AKB 2023-03-13 00:00:00 100
13 PR-AKB 2023-03-13 00:00:00 92
14 PR-AKB 2023-03-13 00:00:00 102
15 PR-AKB 2023-03-14 00:00:00 71
16 PR-AKB 2023-03-14 00:00:00 74
17 PR-AKB 2023-03-14 00:00:00 80
18 PR-AKB 2023-03-14 00:00:00 64
19 PR-AKB 2023-03-14 00:00:00 66
20 PR-AKB 2023-03-14 00:00:00 70
21 PR-AKB 2023-03-14 00:00:00 31
22 PR-AKB 2023-03-14 00:00:00 45
23 PR-AKB 2023-03-15 00:00:00 72
24 PR-AKB 2023-03-15 00:00:00 70
25 PR-AKB 2023-03-15 00:00:00 64
26 PR-AKB 2023-03-15 00:00:00 75
27 PR-AKB 2023-03-15 00:00:00 52
28 PR-AKB 2023-03-15 00:00:00 58
29 PR-AKB 2023-03-16 00:00:00 36
30 PR-AKB 2023-03-16 00:00:00 47
31 PR-AKB 2023-03-16 00:00:00 14
32 PR-AKC 2023-03-13 00:00:00 86
33 PR-AKC 2023-03-13 00:00:00 82
34 PR-AKC 2023-03-13 00:00:00 83

Expected result:

index ac flight_date Total_time Sum
0 PR-AKA 2023-03-13 00:00:00 87 87
1 PR-AKA 2023-03-13 00:00:00 55 142
2 PR-AKA 2023-03-13 00:00:00 59 201
3 PR-AKA 2023-03-13 00:00:00 71 71
4 PR-AKA 2023-03-14 00:00:00 70 141
5 PR-AKA 2023-03-14 00:00:00 58 199
6 PR-AKA 2023-03-14 00:00:00 45 45
7 PR-AKA 2023-03-14 00:00:00 41 86
8 PR-AKA 2023-03-14 00:00:00 33 119
9 PR-AKA 2023-03-14 00:00:00 42 161
10 PR-AKA 2023-03-15 00:00:00 70 231
11 PR-AKB 2023-03-13 00:00:00 109 109
12 PR-AKB 2023-03-13 00:00:00 100 209
13 PR-AKB 2023-03-13 00:00:00 92 92
14 PR-AKB 2023-03-13 00:00:00 102 194
15 PR-AKB 2023-03-14 00:00:00 71 71
16 PR-AKB 2023-03-14 00:00:00 74 145
17 PR-AKB 2023-03-14 00:00:00 80 225
18 PR-AKB 2023-03-14 00:00:00 64 64
19 PR-AKB 2023-03-14 00:00:00 66 130
20 PR-AKB 2023-03-14 00:00:00 70 200
21 PR-AKB 2023-03-14 00:00:00 31 31
22 PR-AKB 2023-03-14 00:00:00 45 76
23 PR-AKB 2023-03-15 00:00:00 72 148
24 PR-AKB 2023-03-15 00:00:00 70 218
25 PR-AKB 2023-03-15 00:00:00 64 64
26 PR-AKB 2023-03-15 00:00:00 75 139
27 PR-AKB 2023-03-15 00:00:00 52 191
28 PR-AKB 2023-03-15 00:00:00 58 58
29 PR-AKB 2023-03-16 00:00:00 36 94
30 PR-AKB 2023-03-16 00:00:00 47 141
31 PR-AKB 2023-03-16 00:00:00 14 155
32 PR-AKC 2023-03-13 00:00:00 86 86
33 PR-AKC 2023-03-13 00:00:00 82 168
34 PR-AKC 2023-03-13 00:00:00 83 251

….

The sum must be reset on the row after it exceeds 185.
In another question here I found code that almost does this, but instead of restarting the sum it outputs a 0, as shown below, and I don’t know how to fix it:

def my_accumulate(maxval):
    val = 0
    yield
    while True:
        if val < maxval:
            val += yield val
        else:
            yield val
            val = 0


def fn(x):
    a = my_accumulate(185)
    next(a)
    x["TIME_LIMIT"] = [a.send(v) for v in x["Total_time"]]
    return x


ATR = ATR.groupby('ac').apply(fn)
ATR

Result:

index ac flight_date Total_time TIME_LIMIT
0 PR-AKA 2023-03-13 00:00:00 87 0
1 PR-AKA 2023-03-13 00:00:00 55 55
2 PR-AKA 2023-03-13 00:00:00 59 114
3 PR-AKA 2023-03-13 00:00:00 71 185
4 PR-AKA 2023-03-14 00:00:00 70 0
5 PR-AKA 2023-03-14 00:00:00 58 58
6 PR-AKA 2023-03-14 00:00:00 45 103
7 PR-AKA 2023-03-14 00:00:00 41 144
8 PR-AKA 2023-03-14 00:00:00 33 177
9 PR-AKA 2023-03-14 00:00:00 42 219
10 PR-AKA 2023-03-15 00:00:00 70 0
11 PR-AKB 2023-03-13 00:00:00 109 0
12 PR-AKB 2023-03-13 00:00:00 100 100
13 PR-AKB 2023-03-13 00:00:00 92 192
14 PR-AKB 2023-03-13 00:00:00 102 0
15 PR-AKB 2023-03-14 00:00:00 71 71
16 PR-AKB 2023-03-14 00:00:00 74 145
17 PR-AKB 2023-03-14 00:00:00 80 225
18 PR-AKB 2023-03-14 00:00:00 64 0
19 PR-AKB 2023-03-14 00:00:00 66 66
20 PR-AKB 2023-03-14 00:00:00 70 136
21 PR-AKB 2023-03-14 00:00:00 31 167
22 PR-AKB 2023-03-14 00:00:00 45 212
23 PR-AKB 2023-03-15 00:00:00 72 0
24 PR-AKB 2023-03-15 00:00:00 70 70
25 PR-AKB 2023-03-15 00:00:00 64 134
26 PR-AKB 2023-03-15 00:00:00 75 209
27 PR-AKB 2023-03-15 00:00:00 52 0
28 PR-AKB 2023-03-15 00:00:00 58 58
29 PR-AKB 2023-03-16 00:00:00 36 94
30 PR-AKB 2023-03-16 00:00:00 47 141
31 PR-AKB 2023-03-16 00:00:00 14 155
32 PR-AKC 2023-03-13 00:00:00 86 0
33 PR-AKC 2023-03-13 00:00:00 82 82
34 PR-AKC 2023-03-13 00:00:00 83 165
Asked By: Hygor


Answers:

You could create your own NumPy ufunc to handle this, using np.frompyfunc:

import numpy as np
import pandas as pd

df = pd.read_clipboard() # Your df here

f = np.frompyfunc(lambda x, y: x + y if x <= 185 else y, nin=2, nout=1)  # reset only once the total is above 185
out = df.groupby("ac")["Total_time"].transform(f.accumulate).astype(int)

out:

0      87
1     142
2     201
3      71
4     141
5     199
6      45
7      86
8     119
9     161
10    231
11    109
12    209
13     92
14    194
15     71
16    145
17    225
18     64
19    130
20    200
21     31
22     76
23    148
24    218
25     64
26    139
27    191
28     58
29     94
30    141
31    155
32     86
33    168
34    251
Name: Total_time, dtype: int32
Answered By: Chrysophylaxs
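
For readers who want to see the accumulate step in isolation, here is a minimal sketch of the same idea using only the standard library. It is not part of the answer above: the names THRESHOLD and step are made up for illustration, and it assumes the total should reset only once it is strictly above 185, as the question states.

```python
from itertools import accumulate

THRESHOLD = 185  # limit taken from the question (assumed to be exclusive)

def step(running, value):
    # Keep adding while the running total has not exceeded the limit;
    # otherwise start a new run from the current value.
    return running + value if running <= THRESHOLD else value

# First six Total_time values of PR-AKA from the question
times = [87, 55, 59, 71, 70, 58]
result = list(accumulate(times, step))
print(result)  # [87, 142, 201, 71, 141, 199]
```

The two-argument function receives the previous running total and the next value, which is the same role the lambda plays in the ufunc version.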

You can change your generator/function to:

def my_accumulate(s, maxval=float('inf')):
    curr = 0
    for x in s:
        curr += x         # add the current value to the running total
        yield curr
        if curr > maxval: # if above threshold
            curr = 0      # reset the count

def fn(g):
    return list(my_accumulate(g, maxval=185))
        
ATR['TIME_LIMIT'] = ATR.groupby('ac')['Total_time'].transform(fn)

Output:

    index       ac           flight_date  Total_time  TIME_LIMIT
0       0  PR-AKA   2023-03-13 00:00:00           87          87
1       1  PR-AKA   2023-03-13 00:00:00           55         142
2       2  PR-AKA   2023-03-13 00:00:00           59         201
3       3  PR-AKA   2023-03-13 00:00:00           71          71
4       4  PR-AKA   2023-03-14 00:00:00           70         141
5       5  PR-AKA   2023-03-14 00:00:00           58         199
6       6  PR-AKA   2023-03-14 00:00:00           45          45
7       7  PR-AKA   2023-03-14 00:00:00           41          86
8       8  PR-AKA   2023-03-14 00:00:00           33         119
9       9  PR-AKA   2023-03-14 00:00:00           42         161
10     10  PR-AKA   2023-03-15 00:00:00           70         231
11     11  PR-AKB   2023-03-13 00:00:00          109         109
12     12  PR-AKB   2023-03-13 00:00:00          100         209
13     13  PR-AKB   2023-03-13 00:00:00           92          92
14     14  PR-AKB   2023-03-13 00:00:00          102         194
15     15  PR-AKB   2023-03-14 00:00:00           71          71
16     16  PR-AKB   2023-03-14 00:00:00           74         145
17     17  PR-AKB   2023-03-14 00:00:00           80         225
18     18  PR-AKB   2023-03-14 00:00:00           64          64
19     19  PR-AKB   2023-03-14 00:00:00           66         130
20     20  PR-AKB   2023-03-14 00:00:00           70         200
21     21  PR-AKB   2023-03-14 00:00:00           31          31
22     22  PR-AKB   2023-03-14 00:00:00           45          76
23     23  PR-AKB   2023-03-15 00:00:00           72         148
24     24  PR-AKB   2023-03-15 00:00:00           70         218
25     25  PR-AKB   2023-03-15 00:00:00           64          64
26     26  PR-AKB   2023-03-15 00:00:00           75         139
27     27  PR-AKB   2023-03-15 00:00:00           52         191
28     28  PR-AKB   2023-03-15 00:00:00           58          58
29     29  PR-AKB   2023-03-16 00:00:00           36          94
30     30  PR-AKB   2023-03-16 00:00:00           47         141
31     31  PR-AKB   2023-03-16 00:00:00           14         155
32     32  PR-AKC   2023-03-13 00:00:00           86          86
33     33  PR-AKC   2023-03-13 00:00:00           82         168
34     34  PR-AKC   2023-03-13 00:00:00           83         251
Answered By: mozway
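
To try the groupby approach without copying the whole table, something like the following self-contained sketch may help. It is a plain-loop variant of the idea rather than the exact code from the answer: the helper name cumsum_with_reset and the small sample frame (a few PR-AKA and PR-AKC rows from the question) are assumptions made for illustration.

```python
import pandas as pd

def cumsum_with_reset(values, limit):
    """Running total that restarts after it exceeds `limit` (hypothetical helper)."""
    out, running = [], 0
    for v in values:
        running += v
        out.append(running)
        if running > limit:   # strictly above the limit: next row starts over
            running = 0
    return out

# Small sample built from rows of the question's data
df = pd.DataFrame({
    "ac": ["PR-AKA"] * 4 + ["PR-AKC"] * 3,
    "Total_time": [87, 55, 59, 71, 86, 82, 83],
})

# Apply the helper per aircraft, keeping the original row alignment
df["TIME_LIMIT"] = df.groupby("ac")["Total_time"].transform(
    lambda s: pd.Series(cumsum_with_reset(s, limit=185), index=s.index)
)
print(df["TIME_LIMIT"].tolist())  # [87, 142, 201, 71, 86, 168, 251]
```

Wrapping the result in a Series with the group's index is a cautious choice here so the values line up with the original rows regardless of how the groups are ordered.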