Casting a value based on a trigger in pandas
Question:
I would like to create a new column every time I get 1 in the ‘Signal’ column that will cast the corresponding value from the ‘Value’ column (please see the expected output below).
Initial data:
Index
Value
Signal
0
3
0
1
8
0
2
8
0
3
7
1
4
9
0
5
10
0
6
14
1
7
10
0
8
10
0
9
4
1
10
10
0
11
10
0
Expected Output:
Index
Value
Signal
New_Col_1
New_Col_2
New_Col_3
0
3
0
0
0
0
1
8
0
0
0
0
2
8
0
0
0
0
3
7
1
7
0
0
4
9
0
7
0
0
5
10
0
7
0
0
6
14
1
7
14
0
7
10
0
7
14
0
8
10
0
7
14
0
9
4
1
7
14
4
10
10
0
7
14
4
11
10
0
7
14
4
What would be a way to do it?
Answers:
df['new_col']='new_col_'+df['Signal'].cumsum().astype(str)
df['new_col'] = df['new_col'].mask(df['Signal']==0, '0')
df.pivot(index=['Index','Signal', 'Value'], columns='new_col', values='Value').reset_index().ffill().fillna(0).drop(columns='0').astype(int)
new_col Index Signal Value new_col_1 new_col_2 new_col_3
0 0 0 3 0 0 0
1 1 0 8 0 0 0
2 2 0 8 0 0 0
3 3 1 7 7 0 0
4 4 0 9 7 0 0
5 5 0 10 7 0 0
6 6 1 14 7 14 0
7 7 0 10 7 14 0
8 8 0 10 7 14 0
9 9 1 4 7 14 4
10 10 0 10 7 14 4
11 11 0 10 7 14 4
You can use a pivot:
out = df.join(df
# keep only the values where signal is 1
# and get Signal's cumsum
.assign(val=df['Value'].where(df['Signal'].eq(1)),
col=df['Signal'].cumsum()
)
# pivot cumsumed Signal to columns
.pivot(index='Index', columns='col', values='val')
# ensure column 0 is absent (using loc to avoid KeyError)
.loc[:, 1:]
# forward fill the values
.ffill()
# rename columns
.add_prefix('New_Col_')
)
output:
Index Value Signal New_Col_1 New_Col_2 New_Col_3
0 0 3 0 NaN NaN NaN
1 1 8 0 NaN NaN NaN
2 2 8 0 NaN NaN NaN
3 3 7 1 7.0 NaN NaN
4 4 9 0 7.0 NaN NaN
5 5 10 0 7.0 NaN NaN
6 6 14 1 7.0 14.0 NaN
7 7 10 0 7.0 14.0 NaN
8 8 10 0 7.0 14.0 NaN
9 9 4 1 7.0 14.0 4.0
10 10 10 0 7.0 14.0 4.0
11 11 10 0 7.0 14.0 4.0
I would like to create a new column every time I get 1 in the ‘Signal’ column that will cast the corresponding value from the ‘Value’ column (please see the expected output below).
Initial data:
Index | Value | Signal |
---|---|---|
0 | 3 | 0 |
1 | 8 | 0 |
2 | 8 | 0 |
3 | 7 | 1 |
4 | 9 | 0 |
5 | 10 | 0 |
6 | 14 | 1 |
7 | 10 | 0 |
8 | 10 | 0 |
9 | 4 | 1 |
10 | 10 | 0 |
11 | 10 | 0 |
Expected Output:
Index | Value | Signal | New_Col_1 | New_Col_2 | New_Col_3 |
---|---|---|---|---|---|
0 | 3 | 0 | 0 | 0 | 0 |
1 | 8 | 0 | 0 | 0 | 0 |
2 | 8 | 0 | 0 | 0 | 0 |
3 | 7 | 1 | 7 | 0 | 0 |
4 | 9 | 0 | 7 | 0 | 0 |
5 | 10 | 0 | 7 | 0 | 0 |
6 | 14 | 1 | 7 | 14 | 0 |
7 | 10 | 0 | 7 | 14 | 0 |
8 | 10 | 0 | 7 | 14 | 0 |
9 | 4 | 1 | 7 | 14 | 4 |
10 | 10 | 0 | 7 | 14 | 4 |
11 | 10 | 0 | 7 | 14 | 4 |
What would be a way to do it?
df['new_col']='new_col_'+df['Signal'].cumsum().astype(str)
df['new_col'] = df['new_col'].mask(df['Signal']==0, '0')
df.pivot(index=['Index','Signal', 'Value'], columns='new_col', values='Value').reset_index().ffill().fillna(0).drop(columns='0').astype(int)
new_col Index Signal Value new_col_1 new_col_2 new_col_3
0 0 0 3 0 0 0
1 1 0 8 0 0 0
2 2 0 8 0 0 0
3 3 1 7 7 0 0
4 4 0 9 7 0 0
5 5 0 10 7 0 0
6 6 1 14 7 14 0
7 7 0 10 7 14 0
8 8 0 10 7 14 0
9 9 1 4 7 14 4
10 10 0 10 7 14 4
11 11 0 10 7 14 4
You can use a pivot:
out = df.join(df
# keep only the values where signal is 1
# and get Signal's cumsum
.assign(val=df['Value'].where(df['Signal'].eq(1)),
col=df['Signal'].cumsum()
)
# pivot cumsumed Signal to columns
.pivot(index='Index', columns='col', values='val')
# ensure column 0 is absent (using loc to avoid KeyError)
.loc[:, 1:]
# forward fill the values
.ffill()
# rename columns
.add_prefix('New_Col_')
)
output:
Index Value Signal New_Col_1 New_Col_2 New_Col_3
0 0 3 0 NaN NaN NaN
1 1 8 0 NaN NaN NaN
2 2 8 0 NaN NaN NaN
3 3 7 1 7.0 NaN NaN
4 4 9 0 7.0 NaN NaN
5 5 10 0 7.0 NaN NaN
6 6 14 1 7.0 14.0 NaN
7 7 10 0 7.0 14.0 NaN
8 8 10 0 7.0 14.0 NaN
9 9 4 1 7.0 14.0 4.0
10 10 10 0 7.0 14.0 4.0
11 11 10 0 7.0 14.0 4.0