Pandas Dataframe – replace NaN with 0 if column value condition

Question:

I have searched all around the internet and tried many methods before making this post. I have a dataframe where I want to:

  • Replace NaN values in TGT_COLUMN_SCALE with 0 if TGT_COLUMN_DATA_TYPE is equal to NUMERIC.

my dataframe

Kindly help me out with this issue.

I tried this code but it’s not working:

df["TGT_COLUMN_SCALE"] = np.where(df["TGT_COLUMN_DATA_TYPE"] == "NUMERIC", 'NaN', 0)
Asked By: Avi Thour


Answers:

mask = df['TGT_COLUMN_DATA_TYPE'] == "NUMERIC"
df.loc[mask, 'TGT_COLUMN_SCALE'] = df.loc[mask, 'TGT_COLUMN_SCALE'].fillna(0)
Answered By: Lorenzo Bonetti

Sample:

import numpy as np
import pandas as pd

df = pd.DataFrame({
    "TGT_COLUMN_DATA_TYPE": ["DATE", "NUMERIC", "STRING", "NUMERIC"],
    "TGT_COLUMN_SCALE": [np.nan, np.nan, 4.0, 5.0]  # np.nan; the np.NaN alias was removed in NumPy 2.0
})

Replace:

df.loc[(df.TGT_COLUMN_DATA_TYPE == "NUMERIC") & (df.TGT_COLUMN_SCALE.isnull()), "TGT_COLUMN_SCALE"] = 0

Result:

  TGT_COLUMN_DATA_TYPE  TGT_COLUMN_SCALE
0                 DATE               NaN
1              NUMERIC               0.0
2               STRING               4.0
3              NUMERIC               5.0
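
For comparison, the np.where attempt from the question can be made to work as well. As posted, it writes the string 'NaN' into every NUMERIC row and 0 into every other row, so the existing scale values are lost. A minimal sketch, assuming the same sample frame as above; the name `mask` is only illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "TGT_COLUMN_DATA_TYPE": ["DATE", "NUMERIC", "STRING", "NUMERIC"],
    "TGT_COLUMN_SCALE": [np.nan, np.nan, 4.0, 5.0],
})

# True only for NUMERIC rows whose scale is missing
mask = (df["TGT_COLUMN_DATA_TYPE"] == "NUMERIC") & df["TGT_COLUMN_SCALE"].isna()

# Put 0 where the mask is True, keep the existing value everywhere else
df["TGT_COLUMN_SCALE"] = np.where(mask, 0, df["TGT_COLUMN_SCALE"])
```

The third argument is what the original attempt was missing: it passes the current column through unchanged for rows that do not match.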
Answered By: srinath

You just need to use loc to select the rows and then use fillna to replace the values:

df.loc[df.TGT_COLUMN_DATA_TYPE == "NUMERIC",
       "TGT_COLUMN_SCALE"] = df.loc[df.TGT_COLUMN_DATA_TYPE == "NUMERIC", "TGT_COLUMN_SCALE"].fillna(0)

Full code

import numpy as np
import pandas as pd

TGT_COLUMN_DATA_TYPE = ('DATE', 'TIMESTAMP', 'NUMERIC', 'NUMERIC')
TGT_COLUMN_SCALE = (np.nan, np.nan, np.nan, np.nan)
df = pd.DataFrame(list(zip(TGT_COLUMN_DATA_TYPE, TGT_COLUMN_SCALE)),
                  columns=['TGT_COLUMN_DATA_TYPE', 'TGT_COLUMN_SCALE'])
# Fill NaN with 0 only in the rows whose data type is NUMERIC
df.loc[df.TGT_COLUMN_DATA_TYPE == "NUMERIC",
       "TGT_COLUMN_SCALE"] = df.loc[df.TGT_COLUMN_DATA_TYPE == "NUMERIC", "TGT_COLUMN_SCALE"].fillna(0)
Answered By: Jock
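
The conditional fill can also be written without index assignment, using Series.mask, which replaces values wherever a boolean condition is True. A minimal sketch, assuming a small made-up frame with the question's column names; `needs_zero` is an illustrative name, not something from the answers:

```python
import numpy as np
import pandas as pd

# Made-up sample data using the column names from the question
df = pd.DataFrame({
    "TGT_COLUMN_DATA_TYPE": ["DATE", "TIMESTAMP", "NUMERIC", "NUMERIC"],
    "TGT_COLUMN_SCALE": [np.nan, np.nan, np.nan, 2.0],
})

# Rows that are NUMERIC and have no scale yet
needs_zero = df["TGT_COLUMN_DATA_TYPE"].eq("NUMERIC") & df["TGT_COLUMN_SCALE"].isna()

# mask() swaps in 0 where the condition holds and leaves other rows untouched
df["TGT_COLUMN_SCALE"] = df["TGT_COLUMN_SCALE"].mask(needs_zero, 0)
```

Because mask returns a new Series instead of modifying a selection in place, this form should sidestep chained-assignment warnings.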