In a pandas DataFrame, apply a function while skipping NaN

Question:

Confianza
2.0
4.0
7.0
NaN

Expected Output:

Confianza
Baja
Media
Alta
NaN

In a pandas DataFrame, I want to apply this function to a column but skip NaN values:

def condiciones(df5):
    if ( df5['Confianza'] > 4 ):
        return "Alta"
    elif (df5['Confianza'] == 4 ):
        return "Media"
    else:
        return "Baja"


df5['Confianza']= df5.apply(condiciones, axis= 1)

The actual problem is that I don't want to drop the NaN rows. I tried the branch below, but it returns this error when applied:
"The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()."

elif ( df5['Confianza'] != notnull):
      return NaN
Asked By: SERGIO

Answers:

import numpy as np
import pandas as pd
df = pd.DataFrame({"Confianza": [2.0, 4.0, 7.0, None]})

def condiciones(row):
    # With axis=1 each call receives a single row, so row["Confianza"] is a scalar.
    # pd.isna covers both None and NaN; missing values are returned unchanged.
    if pd.isna(row["Confianza"]):
        return np.nan
    elif row["Confianza"] > 4:
        return "Alta"
    elif row["Confianza"] == 4:
        return "Media"
    else:
        return "Baja"


df['Confianza'] = df.apply(condiciones, axis=1)
print(df)
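
If the function only needs the one column, a possible alternative (a sketch, not part of the original answer) is Series.map with na_action="ignore", which leaves NaN entries in place without calling the function on them. The helper name etiqueta and the variable resultado are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Confianza": [2.0, 4.0, 7.0, np.nan]})


def etiqueta(valor):
    # Receives one non-missing scalar at a time.
    if valor > 4:
        return "Alta"
    elif valor == 4:
        return "Media"
    return "Baja"


# na_action="ignore" propagates NaN instead of passing it to etiqueta.
resultado = df["Confianza"].map(etiqueta, na_action="ignore")
print(resultado.tolist())
```

Assigning resultado back to df["Confianza"] would overwrite the column in the same way as the apply call above.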
Answered By: Lowin Li

Here is one way to do it using np.select:

import numpy as np

df['Confianza'] = np.select(
    [df['Confianza'].notna() & (df['Confianza'] > 4.0),
     df['Confianza'].notna() & (df['Confianza'] == 4.0),
     df['Confianza'].notna() & (df['Confianza'] < 4.0)],
    ['Alta', 'Media', 'Baja'],
    df['Confianza'])
df
Confianza
0   Baja
1   Media
2   Alta
3   NaN
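
Passing the float column as the default mixes strings and floats in one array, and how NumPy handles that mix may depend on the installed version. A minimal variant, assuming None is an acceptable missing marker, passes default=None so the result stays an object array. The column name Nivel and the variable names are hypothetical.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Confianza": [2.0, 4.0, 7.0, np.nan]})
valores = df["Confianza"]

# Comparisons with NaN are False, so the NaN row matches no condition
# and falls through to the default.
condiciones = [
    (valores > 4).to_numpy(),
    (valores == 4).to_numpy(),
    (valores < 4).to_numpy(),
]
etiquetas = ["Alta", "Media", "Baja"]

# Assumption: None is an acceptable missing marker; pandas reports it
# as missing in isna().
df["Nivel"] = np.select(condiciones, etiquetas, default=None)
print(df["Nivel"].isna().tolist())
```

Writing to a separate column also keeps the original numbers available for checking the labels.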
Answered By: Naveed

First of all, when ‘axis’ is set to 1, the function is applied to each row:

1 or ‘columns’: apply function to each row.
https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.apply.html?highlight=apply#pandas.DataFrame.apply

So your mistake is that the function refers to the whole Series from the DataFrame ‘df5’ instead of the single value in the current row.
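
To illustrate the difference between a whole column and a single row value, here is a small sketch with made-up sample data mirroring the question. An if on a whole column raises the error quoted in the question, while the value a row-wise function sees is a single scalar. The names mensaje and es_escalar are illustrative.

```python
import numpy as np
import pandas as pd

df5 = pd.DataFrame({"Confianza": [2.0, 4.0, 7.0, np.nan]})

# Comparing the whole column yields a boolean Series, which `if`
# cannot reduce to a single True or False.
mensaje = ""
try:
    if df5["Confianza"] > 4:
        pass
except ValueError as exc:
    mensaje = str(exc)
print(mensaje)

# With axis=1 each call receives one row, and row["Confianza"] is a scalar.
es_escalar = df5.apply(lambda row: np.isscalar(row["Confianza"]), axis=1)
print(es_escalar.all())
```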

You can actually get the same result without ‘apply’ by using ‘loc’ instead.
https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.loc.html?highlight=loc#pandas.DataFrame.loc

df5.loc[df5["Confianza"] > 4, 'Confianza2'] = "Alta"
df5.loc[df5["Confianza"] == 4, 'Confianza2'] = "Media"
df5.loc[df5["Confianza"].notna() & (df5["Confianza"] < 4), 'Confianza2'] = "Baja"
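
A self-contained run of the same idea, using sample data that mirrors the question. The NaN row matches none of the three masks, because comparisons with NaN evaluate to False, so it is left as NaN in the new column. For brevity this sketch leaves out the notna() guard.

```python
import numpy as np
import pandas as pd

df5 = pd.DataFrame({"Confianza": [2.0, 4.0, 7.0, np.nan]})

# Each assignment only touches the rows where the boolean mask is True.
# The first one creates the new column, filled with NaN elsewhere.
df5.loc[df5["Confianza"] > 4, "Confianza2"] = "Alta"
df5.loc[df5["Confianza"] == 4, "Confianza2"] = "Media"
df5.loc[df5["Confianza"] < 4, "Confianza2"] = "Baja"

print(df5)
```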
Answered By: Raibek