Create Column with ELIF in Pandas
Question:
Question
I am having trouble figuring out how to create new DataFrame column based on the values in two other columns. I need to use if/elif/else logic. But all of the documentation and examples I have found only show if/else logic. Here is a sample of what I am trying to do:
Code
df['combo'] = 'mobile' if (df['mobile'] == 'mobile') elif (df['tablet'] =='tablet') 'tablet' else 'other')
I am open to using where() also. Just having trouble finding the right syntax.
Answers:
In cases where you have multiple branching statements it’s best to create a function that accepts a row and then apply it along the axis=1
. This is usually much faster then iteration through rows.
def func(row):
if row['mobile'] == 'mobile':
return 'mobile'
elif row['tablet'] =='tablet':
return 'tablet'
else:
return 'other'
df['combo'] = df.apply(func, axis=1)
I tried the following and the result was much faster. Hope it’s helpful for others.
df['combo'] = 'other'
df.loc[df['mobile'] == 'mobile', 'combo'] = 'mobile'
df.loc[df['tablet'] == 'tablet', 'combo'] = 'tablet'
ELIF
logic can be implemented with np.select
or nested np.where
:
import numpy as np
df['combo'] = np.select([df.mobile == 'mobile', df.tablet == 'tablet'],
['mobile', 'tablet'],
default='other')
# or
df['combo'] = np.where(df.mobile == 'mobile', 'mobile',
np.where(df.tablet == 'tablet', 'tablet', 'other'))
Sample Data + Output:
mobile tablet combo
0 mobile bar mobile
1 foo tablet tablet
2 foo nan other
3 mobile tablet mobile
4 mobile nan mobile
5 foo tablet tablet
6 mobile bar mobile
7 mobile tablet mobile
8 mobile bar mobile
9 mobile nan mobile
Adding to np.where solution :
df['col1']= np.where(df['col'] < 3, 1,np.where( (df['col'] >3 )& (df['col'] <5),2,3))
Overall Logic is :
np.where(Condition, 'true block','false block').
With each true/false block can in turn again be nested.
Also, Notice the &
for ANDing! (not 'and')
Adding to apply lambda solution :
df['combo'] = df.apply(lambda x: 'mobile' if x['mobile'] == 'mobile' else
('tablet' if x[''mobile']== 'tablet' else 'other'), axis=1)
Structure :
df['column_name'] = df.apply(lambda x: 'value if true 1' if x['column_name_check_1'] == 'condition_1' else
('value if true 2' if x['column_name_check_2'] == 'condition_2' else
('value if true 3' if x['column_name_check_3'] == 'condition_3' else 'default_value')),axis=1)
Note :
axis 1 for format value as column / series
Question
I am having trouble figuring out how to create new DataFrame column based on the values in two other columns. I need to use if/elif/else logic. But all of the documentation and examples I have found only show if/else logic. Here is a sample of what I am trying to do:
Code
df['combo'] = 'mobile' if (df['mobile'] == 'mobile') elif (df['tablet'] =='tablet') 'tablet' else 'other')
I am open to using where() also. Just having trouble finding the right syntax.
In cases where you have multiple branching statements it’s best to create a function that accepts a row and then apply it along the axis=1
. This is usually much faster then iteration through rows.
def func(row):
if row['mobile'] == 'mobile':
return 'mobile'
elif row['tablet'] =='tablet':
return 'tablet'
else:
return 'other'
df['combo'] = df.apply(func, axis=1)
I tried the following and the result was much faster. Hope it’s helpful for others.
df['combo'] = 'other'
df.loc[df['mobile'] == 'mobile', 'combo'] = 'mobile'
df.loc[df['tablet'] == 'tablet', 'combo'] = 'tablet'
ELIF
logic can be implemented with np.select
or nested np.where
:
import numpy as np
df['combo'] = np.select([df.mobile == 'mobile', df.tablet == 'tablet'],
['mobile', 'tablet'],
default='other')
# or
df['combo'] = np.where(df.mobile == 'mobile', 'mobile',
np.where(df.tablet == 'tablet', 'tablet', 'other'))
Sample Data + Output:
mobile tablet combo
0 mobile bar mobile
1 foo tablet tablet
2 foo nan other
3 mobile tablet mobile
4 mobile nan mobile
5 foo tablet tablet
6 mobile bar mobile
7 mobile tablet mobile
8 mobile bar mobile
9 mobile nan mobile
Adding to np.where solution :
df['col1']= np.where(df['col'] < 3, 1,np.where( (df['col'] >3 )& (df['col'] <5),2,3))
Overall Logic is :
np.where(Condition, 'true block','false block').
With each true/false block can in turn again be nested.
Also, Notice the &
for ANDing! (not 'and')
Adding to apply lambda solution :
df['combo'] = df.apply(lambda x: 'mobile' if x['mobile'] == 'mobile' else
('tablet' if x[''mobile']== 'tablet' else 'other'), axis=1)
Structure :
df['column_name'] = df.apply(lambda x: 'value if true 1' if x['column_name_check_1'] == 'condition_1' else
('value if true 2' if x['column_name_check_2'] == 'condition_2' else
('value if true 3' if x['column_name_check_3'] == 'condition_3' else 'default_value')),axis=1)
Note :
axis 1 for format value as column / series