Create column with positive and negative adding up index based on certain row in Pandas

Question:

I have the following dataframe (sample):

import pandas as pd

data = [['A', False, 2], ['A', True, 8], ['A', False, 25], ['A', False, 30], ['B', False, 4], ['B', False, 8], ['B', True, 2], ['B', True, 1], ['B', False, 3]]
df = pd.DataFrame(data = data, columns = ['group', 'indicator', 'val'])

  group  indicator  val
0     A      False    2
1     A       True    8
2     A      False   25
3     A      False   30
4     B      False    4
5     B      False    8
6     B       True    2
7     B       True    1
8     B      False    3

I would like to add a column called idx, which has values 0 for the rows with True and a negative adding up values before (backward filling) the 0 and a positive adding up values after (forward filling) the 0 per group. I started with using the following which does the job of adding a 0 for certain rows, but I don’t know how to do the cumulative filling part:

# add zero index to rows with True
df.loc[df['indicator'] == True, 'idx'] = 0
# cumulative filling per group
df.groupby('group')['idx'].fillna(1).cumsum()

The desired output should look like this:

data = [['A', False, 2, -1], ['A', True, 8, 0], ['A', False, 25, 1], ['A', False, 30, 2], ['B', False, 4, -2], ['B', False, 8, -1], ['B', True, 2, 0], ['B', True, 1, 0], ['B', False, 3, 1]]
df_desired = pd.DataFrame(data = data, columns = ['group', 'indicator', 'val', 'idx'])

  group  indicator  val  idx
0     A      False    2   -1
1     A       True    8    0
2     A      False   25    1
3     A      False   30    2
4     B      False    4   -2
5     B      False    8   -1
6     B       True    2    0
7     B       True    1    0
8     B      False    3    1

So I was wondering if anyone knows how to do the positive negative cumulative filling part per group using pandas?

Asked By: Quinten

||

Answers:

I made an extra column named sign and used .cumsum() and a reversed version of .cumsum() to create the column idx:

# create column sign
df.loc[df['indicator'] == True, 'sign'] = -1
df['sign'] = df.groupby(['group'])['sign'].bfill()

df.loc[df['indicator'] == True, 'sign'] = 1
df['sign'] = df.groupby(['group'])['sign'].ffill()

df.loc[df['indicator'] == True, 'sign'] = 0

# create column idx
df.loc[df.sign == 1, 'idx'] = df[df.sign == 1].groupby(['group']).sign.cumsum()
df.loc[df.sign == 0, 'idx'] = 0
df.loc[df.sign == -1, 'idx'] = df[df.sign == -1][::-1].groupby(['group']).sign.cumsum()[::-1]
# remove column sign and make column idx integer value
df['idx'] = df['idx'].astype(int)
df = df.drop(columns='sign')

Output:

  group  indicator  val  idx
0     A      False    2   -1
1     A       True    8    0
2     A      False   25    1
3     A      False   30    2
4     B      False    4   -2
5     B      False    8   -1
6     B       True    2    0
7     B       True    1    0
8     B      False    3    1
Answered By: T C Molenaar
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.