Using np.where with a pandas column. How do you fill the column with the previous value until the condition is met again?
Question:
I have this code:
df['cross'] = np.where((df['mean_close_spread'].shift(1) < 0) & (df['mean_close_spread'] > 0), 'cross', 'none')
df['cross_price'] = np.where((df['cross'] == 'cross'), df['close'], 'none')
The above code gives me the below dataframe
close | cross | cross_price |
---|---|---|
0.3434 | none | none |
0.3435 | none | none |
0.3433 | none | none |
0.3434 | cross | 0.3434 |
0.3433 | none | none |
0.3432 | none | none |
0.3431 | cross | 0.3431 |
0.4330 | none | none |
Instead of using the "none" string in the cross_price column when there is no cross in the cross column, I would like to carry forward the price from the last cross until the next cross happens.
Here is an example of what I want:
close | cross | cross_price |
---|---|---|
0.3434 | none | none |
0.3435 | none | none |
0.3433 | none | none |
0.3434 | cross | 0.3434 |
0.3433 | none | 0.3434 |
0.3432 | none | 0.3434 |
0.3431 | cross | 0.3431 |
0.4330 | none | 0.3431 |
Answers:
Try this one:
df['cross'] = np.where((df['mean_close_spread'].shift(1) < 0) & (df['mean_close_spread'] > 0), 'cross', 'none')
df['cross_price'] = np.where((df['cross'] == 'cross'), df['close'], None) # not 'none'
df['cross_price'] = df['cross_price'].ffill().fillna('none')
Use Series.where to set NaN on the rows that do not match cross, so the missing values can then be forward filled:
df['cross_price'] = df['close'].where(df['cross'] == 'cross').ffill()
print (df)
close cross cross_price
0 0.3434 none NaN
1 0.3435 none NaN
2 0.3433 none NaN
3 0.3434 cross 0.3434
4 0.3433 none 0.3434
5 0.3432 none 0.3434
6 0.3431 cross 0.3431
7 0.4330 none 0.3431
Replacing the remaining NaNs with none afterwards is not recommended, because the column would then mix numeric and string values.
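To illustrate the mixed-type concern, here is a small sketch. The two Series are hypothetical stand-ins for cross_price, not taken from the question's real data: one keeps NaN, the other uses the string none.

```python
import pandas as pd

# Hypothetical stand-ins for the cross_price column.
numeric = pd.Series([float("nan"), 0.3434, 0.3434, 0.3431])
mixed = pd.Series(["none", 0.3434, 0.3434, 0.3431])

print(numeric.dtype)  # float64
print(mixed.dtype)    # object

# Numeric reductions skip NaN by default, so this works directly.
print(numeric.max())

# The mixed column has to be converted back before any arithmetic;
# errors="coerce" turns the "none" strings into NaN again.
recovered = pd.to_numeric(mixed, errors="coerce")
print(recovered.dtype)  # float64
```

If the string is only needed for display, one option is to keep the column numeric and substitute the text only when printing or exporting.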
Alternative without the cross column:
m = (df['mean_close_spread'].shift(1) < 0) & (df['mean_close_spread'] > 0)
df['cross_price'] = df['close'].where(m).ffill()
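For anyone who wants to run the mask version end to end, here is a minimal sketch. The close prices are the ones from the question, but the mean_close_spread values are made up (the question does not show them) and chosen only so that crosses land on rows 3 and 6.

```python
import pandas as pd

# close comes from the question; mean_close_spread is invented for illustration.
df = pd.DataFrame({
    "close": [0.3434, 0.3435, 0.3433, 0.3434, 0.3433, 0.3432, 0.3431, 0.4330],
    "mean_close_spread": [-0.1, -0.2, -0.1, 0.1, 0.2, -0.1, 0.1, 0.2],
})

# True where the spread goes from negative on the previous row to positive.
# The first row compares against NaN, which evaluates to False.
m = (df["mean_close_spread"].shift(1) < 0) & (df["mean_close_spread"] > 0)

# Keep close only on cross rows (NaN elsewhere), then carry it forward.
df["cross_price"] = df["close"].where(m).ffill()

print(df[["close", "cross_price"]])
```

Rows before the first cross stay NaN because there is nothing to carry forward yet, which matches the printed output in the answer above.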