Creating a new column in a DataFrame based on multiple conditions using values from another column in Python
Question:
I would like to create a new column based on conditions from the Spread column.
The logic is:
if Spread is <= 4, then New_Column = 4
if Spread is > 4 and <= 8, then New_Column = 8
if Spread is > 8 and <= 12, then New_Column = 12
if Spread is > 12, then New_Column = 16 (16 is the cutoff)
I have tried using .where, but I have not had any luck.
I would like the final output to look like below
Spread | New_Column |
---|---|
1 | 4 |
2 | 4 |
5 | 8 |
6 | 8 |
9 | 12 |
11 | 12 |
13 | 16 |
Answers:
Perhaps you can use pd.cut:
# Bins are right-inclusive: (0, 4], (4, 8], (8, 12], (12, inf); bin codes 0..3 are scaled to 4..16
df['New_Column_2'] = (pd.cut(df['Spread'], [0, 4, 8, 12, np.inf]).cat.codes + 1) * 4
print(df)
Prints:
   Spread  New_Column  New_Column_2
0       1           4             4
1       2           4             4
2       5           8             8
3       6           8             8
4       9          12            12
5      11          12            12
6      13          16            16
7      24          16            16
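A variation on the same idea, as a minimal sketch: pd.cut can also take the target values directly through its labels argument, which avoids the arithmetic on the category codes. The lower edge of -np.inf is my own assumption here, so that a Spread of 0 or below still maps to 4; with a lower edge of 0, such a value falls outside every bin.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Spread": [1, 2, 5, 6, 9, 11, 13, 24]})

# Bins are right-inclusive by default: (-inf, 4], (4, 8], (8, 12], (12, inf).
bins = [-np.inf, 4, 8, 12, np.inf]
labels = [4, 8, 12, 16]

# pd.cut returns a categorical; cast to int to get a plain numeric column.
# (The cast assumes Spread has no missing values.)
df["New_Column"] = pd.cut(df["Spread"], bins=bins, labels=labels).astype(int)
print(df)
```

If Spread can contain missing values, leaving the result as a categorical (dropping the astype call) may be the safer choice.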
You can use apply and create a function:
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'Spread': [1, 2, 5, 6, 9, 11, 13]
})

def my_func(x):  # Your function with your conditions
    if x <= 4:
        return 4
    elif x > 4 and x <= 8:
        return 8
    elif x > 8 and x <= 12:
        return 12
    elif x > 12:
        return 16
    else:
        return np.nan  # In case none of your criteria is met

df['New_Column'] = df['Spread'].apply(my_func)
df is now:
   Spread  New_Column
0       1           4
1       2           4
2       5           8
3       6           8
4       9          12
5      11          12
6      13          16
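If the row-by-row apply turns out to be slow on a large frame, the same conditions can probably be written in vectorized form. The sketch below uses np.select, which is a different technique from the apply answer above, not a rewrite of it; the column names are taken from the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Spread": [1, 2, 5, 6, 9, 11, 13]})

# Conditions are checked in order and the first match wins,
# so each later condition only needs its upper bound.
conditions = [
    df["Spread"] <= 4,
    df["Spread"] <= 8,
    df["Spread"] <= 12,
]
choices = [4, 8, 12]

# Anything above 12 falls through to the default of 16.
df["New_Column"] = np.select(conditions, choices, default=16)
print(df)
```

One difference to keep in mind: a missing Spread value fails every comparison, so it would also fall through to the default of 16 here, whereas my_func returns NaN in that case.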
The equal spacing of your conditions allows for:
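df = pd.DataFrame({'Spread': [1, 2, 5, 6, 9, 11, 13, 20]})
# Ceiling division, -(-x // 4), keeps the upper bounds inclusive (4 -> 4, 8 -> 8); the result is clamped to [4, 16]
df['NewCol'] = df['Spread'].apply(lambda x: min(max(4 * -(-x // 4), 4), 16))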
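With an arithmetic shortcut like this, the values that sit exactly on a boundary (4, 8, 12) are worth checking, because floor division and ceiling division disagree there and the question wants those values to stay in the lower band. A small sketch for comparing the two; by_floor and by_ceil are hypothetical helper names, not part of any answer:

```python
def by_floor(x):
    # Floor division: 4 // 4 == 1, so 4 is pushed up into the next band.
    return min(4 * (1 + x // 4), 16)

def by_ceil(x):
    # Ceiling division via -(-x // 4): 4 stays in the first band.
    # max(..., 4) keeps values of 0 or below at 4, as "Spread <= 4" requires.
    return min(max(4 * -(-x // 4), 4), 16)

for x in [3, 4, 5, 8, 12, 13, 20]:
    print(x, by_floor(x), by_ceil(x))
```

The two agree everywhere except on exact multiples of 4 up to 12, where only the ceiling version matches the conditions in the question.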