How to calculate the difference between two columns and flag rows based on a condition?
Question:
I have a dataframe:
  Group  Required  stock
0     A        10      5
1     A        10      8
2     A        10      7
3     B        13      6
4     B        13      5
5     C         8      4
6     C         8      5
7     C         8      8
8     D        16    NaN
Here, Required for groups A, B, C, D is [10, 13, 8, 16], and the stock available on each row is shown in the table above. I need to flag which rows should be moved and how much quantity should be moved from each row.
The output should be:
  Group  Required  stock  to_move flag
0     A        10    5.0      5.0  yes
1     A        10    8.0      5.0  yes
2     A        10    7.0      0.0   no
3     B        13    6.0      6.0  yes
4     B        13    5.0      5.0  yes
5     C         8    4.0      4.0  yes
6     C         8    5.0      4.0  yes
7     C         8    8.0      0.0   no
8     D        16    NaN      NaN   no
Answers:
You can just assign new columns in pandas:
>>> import pandas as pd
>>> df = pd.DataFrame({'Group': ['A', 'A', 'A', 'B', 'B', 'C', 'C', 'C', 'D']})
>>> df
  Group
0     A
1     A
2     A
3     B
4     B
5     C
6     C
7     C
8     D
>>> df['to_move'] = ['Yes']*2+['No']+['Yes']*4+['No']*2
>>> df
  Group to_move
0     A     Yes
1     A     Yes
2     A      No
3     B     Yes
4     B     Yes
5     C     Yes
6     C     Yes
7     C      No
8     D      No
Use:
import numpy as np

# cumulative sum of stock within each group
s = df.groupby('Group')['stock'].cumsum()
# overflow: running total minus Required (negative while the group is still short)
diff = df['Required'].rsub(s)
# a row contributes if the overflow is less than or equal to its own stock
m = diff.le(df['stock'])
# quantity moved: stock minus any positive overflow; 0 for rows that contribute nothing
df['to_move'] = df['stock'].sub(diff.where(diff.gt(0), 0)).where(m, 0)
# create the Flag column
df['Flag'] = np.where(m, 'Yes', 'No')
print(df)
  Group  Required  stock  to_move Flag
0     A        10    5.0      5.0  Yes
1     A        10    8.0      5.0  Yes
2     A        10    7.0      0.0   No
3     B        13    6.0      6.0  Yes
4     B        13    5.0      5.0  Yes
5     C         8    4.0      4.0  Yes
6     C         8    5.0      4.0  Yes
7     C         8    8.0      0.0   No
8     D        16    NaN      0.0   No
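To see why this works, it may help to trace the intermediate values for a single group. The sketch below repeats the same steps on group A only; the names `a` and `overflow` are introduced here purely for illustration.

```python
import pandas as pd

# Group A from the question: Required is 10 on every row
a = pd.DataFrame({'Group': ['A'] * 3,
                  'Required': [10, 10, 10],
                  'stock': [5.0, 8.0, 7.0]})

s = a.groupby('Group')['stock'].cumsum()        # running total: 5, 13, 20
diff = s - a['Required']                        # same as a['Required'].rsub(s): -5, 3, 10
m = diff.le(a['stock'])                         # True, True, False
overflow = diff.where(diff.gt(0), 0)            # 0, 3, 10
to_move = a['stock'].sub(overflow).where(m, 0)  # 5, 5, 0
print(to_move.tolist())  # [5.0, 5.0, 0.0]
```

One thing worth checking on your own data: if a row's stock exactly equals the overflow (for example stock 5, 5, 3 against Required 10), the mask is still True, so the row ends up with to_move 0 but Flag Yes.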
You can use groupby.cumsum with clip to compute the cumulative values to move without overflow, then groupby.diff to back-calculate the individual values:
# compute the cumsum per group
# clip it to not go over the required value
s = df.groupby('Group')['stock'].cumsum().clip(upper=df['Required'].values)
# back calculate the incremental values
df['to_move'] = s.groupby(df['Group']).diff().fillna(s)
# assign the flag if a strictly positive value was moved
df['flag'] = np.where(df['to_move'].gt(0), 'yes', 'no')
Output:
  Group  Required  stock  to_move flag
0     A        10    5.0      5.0  yes
1     A        10    8.0      5.0  yes
2     A        10    7.0      0.0   no
3     B        13    6.0      6.0  yes
4     B        13    5.0      5.0  yes
5     C         8    4.0      4.0  yes
6     C         8    5.0      4.0  yes
7     C         8    8.0      0.0   no
8     D        16    NaN      NaN   no
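For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the frame from the question and wraps the cumsum/clip/diff steps in a helper. The function name `allocate_moves` is made up for this example and is not part of pandas, and the sketch assumes the rows are already in the order in which stock should be consumed.

```python
import numpy as np
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'Group': list('AAABBCCCD'),
    'Required': [10, 10, 10, 13, 13, 8, 8, 8, 16],
    'stock': [5, 8, 7, 6, 5, 4, 5, 8, np.nan],
})

def allocate_moves(frame):
    """Hypothetical helper: return a copy with 'to_move' and 'flag' columns."""
    out = frame.copy()
    # running stock per group, capped at the row's Required value
    capped = (out.groupby('Group')['stock'].cumsum()
                 .clip(upper=out['Required'].to_numpy()))
    # per-row increment of the capped total; the first row of each group
    # has no predecessor, so fall back to the capped total itself
    out['to_move'] = capped.groupby(out['Group']).diff().fillna(capped)
    # flag rows that move a strictly positive quantity (NaN compares as False)
    out['flag'] = np.where(out['to_move'].gt(0), 'yes', 'no')
    return out

result = allocate_moves(df)
print(result)
```

Working on a copy keeps the original frame untouched, which makes it easier to compare the two approaches side by side. Note that a missing stock value stays NaN in to_move here, whereas the mask-based answer turns it into 0.0.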