How to create a column as a function of two other columns?

Question:

I have a dataframe with two columns. I want to create a third column such that, if Col1 is null, then Col3 = Col2, else Col3 = Col1 * 2

I have tried:

    def myf(col1, col2):
        if pd.isnull(col1):
            return col2
        else:
            return col1 * 2

    df['col3'] = df.apply(lambda x: myf(df['col1'], df['col2']), axis=1)

but I get the error "The truth value of a Series is ambiguous".

How can I fix this? My tiny, used-to-SQL brain still struggles to understand how pandas works; maybe I’m very dumb, maybe pandas’ documentation is very poor, maybe both 🙂

I understand that apply works on a row / column basis of a DataFrame, applymap works element-wise on a DataFrame, and map works element-wise on a Series, and I understand the error arises because pd.isnull returns a T/F array.

However, I’m not sure how I’d use applymap or map in a case like this, where two other columns are my input.
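
The cause of the error can be reproduced in isolation. The following is a minimal sketch, assuming a small sample frame (the data is illustrative and not from the original post): passing a whole column to pd.isnull gives one boolean per row, and an `if` statement cannot reduce that to a single True/False.

```python
import numpy as np
import pandas as pd

# Illustrative sample data (an assumption, not the asker's real frame).
df = pd.DataFrame({
    "col1": [1, 2, np.nan, 3, np.nan],
    "col2": [2, np.nan, 3, 2, np.nan],
})

# pd.isnull on a whole column returns a boolean Series, one value per row.
mask = pd.isnull(df["col1"])
print(mask.tolist())  # [False, False, True, False, True]

# `if` needs exactly one truth value, so a multi-element Series raises ValueError.
try:
    if mask:
        pass
    raised = False
except ValueError as err:
    raised = True
    print(err)
```

This is what happens inside myf when the lambda passes df['col1'] rather than a single value from the current row.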

Answers:

You need to change df to x inside the lambda, so that the function receives the scalar values of the current row instead of whole Series:

    df['col3'] = df.apply(lambda x: myf(x['col1'], x['col2']), axis=1)
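
As a runnable sketch of the row-wise fix, assuming illustrative sample data (the values are not from the original question), the function from the question works unchanged once it is handed scalars:

```python
import numpy as np
import pandas as pd

def myf(col1, col2):
    # col1 and col2 are scalars here, so pd.isnull returns a single bool.
    if pd.isnull(col1):
        return col2
    return col1 * 2

# Illustrative sample data (an assumption).
df = pd.DataFrame({
    "col1": [1, 2, np.nan, 3, np.nan],
    "col2": [2, np.nan, 3, 2, np.nan],
})

# With axis=1 the lambda is called once per row; `row` is that row as a Series,
# so row["col1"] and row["col2"] are single values.
df["col3"] = df.apply(lambda row: myf(row["col1"], row["col2"]), axis=1)
print(df)
```

The last row stays NaN because col1 is null and the fallback value in col2 is null as well.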

Another, faster solution is to use the vectorized combine_first or Series.where:

    df['col3'] = df['col1'].mul(2).combine_first(df['col2'])

    df['col3'] = df['col2'].where(df['col1'].isnull(), df['col1'] * 2)
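
As a quick check that is not part of the original answer, the sketch below compares the two vectorized forms with the row-wise apply on illustrative sample data (the data and the variable names via_combine, via_where and via_apply are assumptions made for this example):

```python
import numpy as np
import pandas as pd

# Illustrative sample data (an assumption).
df = pd.DataFrame({
    "col1": [1, 2, np.nan, 3, np.nan],
    "col2": [2, np.nan, 3, 2, np.nan],
})

# Double col1, then fill the gaps left by null col1 values from col2.
via_combine = df["col1"].mul(2).combine_first(df["col2"])

# Keep col2 where col1 is null; everywhere else use col1 * 2.
via_where = df["col2"].where(df["col1"].isnull(), df["col1"] * 2)

# Row-wise reference implementation of the same rule.
via_apply = df.apply(
    lambda row: row["col2"] if pd.isnull(row["col1"]) else row["col1"] * 2,
    axis=1,
)

# assert_series_equal treats NaN in matching positions as equal;
# check_names=False ignores the differing Series names.
pd.testing.assert_series_equal(via_combine, via_where, check_names=False)
pd.testing.assert_series_equal(via_combine, via_apply, check_names=False)
```

All three agree on this data. The vectorized forms avoid calling a Python function once per row, which is why they tend to be faster on large frames.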
Answered By: jezrael

You can use fillna:

    df.col1.mul(2).fillna(df.col2)

For example:

    import numpy as np
    import pandas as pd

    # pd.np was removed in pandas 2.0, so NaN comes from numpy directly
    df = pd.DataFrame({
        'col1': [1, 2, np.nan, 3, np.nan],
        'col2': [2, np.nan, 3, 2, np.nan]
    })

    df['col3'] = df.col1.mul(2).fillna(df.col2)
    df
    #    col1  col2  col3
    # 0   1.0   2.0   2.0
    # 1   2.0   NaN   4.0
    # 2   NaN   3.0   3.0
    # 3   3.0   2.0   6.0
    # 4   NaN   NaN   NaN
Answered By: Psidom