Conducting Equations by Group in Python/Pandas

Question

I have the data below:

df = pd.DataFrame({'Group': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'C', 'C', 'C', 'C'], 
'W': [10, 0, 4, 0, 0, 8, 2, 0, 1, 4, 0, 0],
'X': [0, 1, 0, 0, 0, 0, 4, 2, 0, 0, 5, 0],
'Y': [0, 2, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0],
'Z': [0, 0, 0, 1, 0, 0, 0, 2, 0, 0, 0, 0]}, 
                  columns=['Group', 'W', 'X', 'Y', 'Z'])

I would like to perform this basic equation: df['W'] = (df['W'] - (df['X'] + df['Y'] + df['Z'])).cumsum().fillna(0) by ‘Group’

I have tried two different methods below they do not seem to be working.

Lambda Method:

df['W'] = (df.groupby('Group', as_index = False)
        .apply(lambda g: g['W'] - (g['X'] + g['Y'] + g['Z']).cumsum().fillna(0)))

Basic Group By:

df['W'] = df.groupby(['Group'])[('W' - ('X' + 'Y' + 'Z'))].cumsum().fillna(0)

Am I formatting them equation incorrectly? Any help would be appreciated.

Asked By: John Tayson

||

Source

Answer 1

IIUC, Try this:

df['W_new'] = df.eval('W_new = W - X + Y + Z').groupby('Group')['W_new'].cumsum()
print(df)

Output:

   Group   W  X  Y  Z  W_new
0      A  10  0  0  0     10
1      A   0  1  2  0     11
2      A   4  0  3  0     18
3      A   0  0  0  1     19
4      B   0  0  0  0      0
5      B   8  0  0  0      8
6      B   2  4  0  0      6
7      B   0  2  0  2      6
8      C   1  0  0  0      1
9      C   4  0  0  0      5
10     C   0  5  0  0      0
11     C   0  0  0  0      0

Well, I may have the equation wrong:

df['W_new'] = df.eval('W_new = W - (X + Y + Z)').groupby('Group')['W_new'].cumsum()
print(df)

Output:

   Group   W  X  Y  Z  W_new
0      A  10  0  0  0     10
1      A   0  1  2  0      7
2      A   4  0  3  0      8
3      A   0  0  0  1      7
4      B   0  0  0  0      0
5      B   8  0  0  0      8
6      B   2  4  0  0      6
7      B   0  2  0  2      2
8      C   1  0  0  0      1
9      C   4  0  0  0      5
10     C   0  5  0  0      0
11     C   0  0  0  0      0

Answered By: Scott Boston

Answer 2

Try running this in 2 steps. Here I’m creating a new column called ‘W2’ just so we can see what happens. Note the use of axis=1 argument in apply to make sure we’re operating on rows and not columns

df = pd.DataFrame({'Group': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'C', 'C', 'C', 'C'], 
'W': [10, 0, 4, 0, 0, 8, 2, 0, 1, 4, 0, 0],
'X': [0, 1, 0, 0, 0, 0, 4, 2, 0, 0, 5, 0],
'Y': [0, 2, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0],
'Z': [0, 0, 0, 1, 0, 0, 0, 2, 0, 0, 0, 0]}, 
                  columns=['Group', 'W', 'X', 'Y', 'Z'])

df['W2'] = df.apply(lambda g: g['W'] - (g['X'] + g['Y'] + g['Z']),axis=1).fillna(0)
df['W2'] = df.groupby('Group')['W2'].cumsum()

produces

df
   Group   W  X  Y  Z  W2
0      A  10  0  0  0  10
1      A   0  1  2  0   7
2      A   4  0  3  0   8
3      A   0  0  0  1   7
4      B   0  0  0  0   0
5      B   8  0  0  0   8
6      B   2  4  0  0   6
7      B   0  2  0  2   2
8      C   1  0  0  0   1
9      C   4  0  0  0   5
10     C   0  5  0  0   0
11     C   0  0  0  0   0

Answered By: Joe Carboni

Conducting Equations by Group in Python/Pandas

Question:

Answers: