Python Pandas: replace zero with NaN in multiple columns
Question:
I have a list of persons' attributes loaded into a pandas DataFrame df2. For cleanup, I want to replace the value zero (0 or '0') with np.nan.
df2.dtypes
ID object
Name object
Weight float64
Height float64
BootSize object
SuitSize object
Type object
dtype: object
Working code to set the value zero to np.nan:
df2.loc[df2['Weight'] == 0,'Weight'] = np.nan
df2.loc[df2['Height'] == 0,'Height'] = np.nan
df2.loc[df2['BootSize'] == '0','BootSize'] = np.nan
df2.loc[df2['SuitSize'] == '0','SuitSize'] = np.nan
I believe this can be done in a similar but shorter way:
df2[["Weight","Height","BootSize","SuitSize"]].astype(str).replace('0',np.nan)
However, the above does not work. The zeros remain in df2. How can I tackle this?
Answers:
I think you need replace with a dict, assigning the result back to the selected columns:
cols = ["Weight","Height","BootSize","SuitSize","Type"]
df2[cols] = df2[cols].replace({'0':np.nan, 0:np.nan})
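For illustration, here is a minimal, self-contained sketch on a made-up frame with the same column types as in the question (the sample values are assumptions, not the asker's data). It also suggests why the attempt in the question had no effect: astype and replace return new objects, so the result has to be assigned back, and astype(str) turns the float 0.0 into '0.0', which does not match '0'.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data mirroring the dtypes from the question
df2 = pd.DataFrame({
    "ID": ["a1", "a2", "a3"],
    "Weight": [70.5, 0.0, 82.0],
    "Height": [0.0, 180.0, 175.0],
    "BootSize": ["42", "0", "44"],
    "SuitSize": ["0", "52", "50"],
})

cols = ["Weight", "Height", "BootSize", "SuitSize"]

# The attempt from the question: the result is computed but thrown away,
# so df2 itself is left unchanged
df2[cols].astype(str).replace("0", np.nan)
zeros_left = int((df2["Weight"] == 0).sum())

# Assigning the result back performs the replacement for both 0 and '0'
df2[cols] = df2[cols].replace({"0": np.nan, 0: np.nan})
nan_counts = df2[cols].isna().sum().to_dict()
print(df2)
```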
For individual numeric columns, replace can also be applied column by column:
data['amount'] = data['amount'].replace(0, np.nan)
data['duration'] = data['duration'].replace(0, np.nan)
You could use the replace method, passing the values you want to replace as a list in the first parameter and the replacement value as the second parameter:
cols = ["Weight","Height","BootSize","SuitSize","Type"]
df2[cols] = df2[cols].replace(['0', 0], np.nan)
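Try a nested dict, which maps each column to its own replacements (replace returns a new DataFrame, so assign the result back):
df2 = df2.replace(to_replace={
    'Weight': {0: np.nan},
    'Height': {0: np.nan},
    'BootSize': {'0': np.nan},
    'SuitSize': {'0': np.nan},
})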
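A small sketch of the per-column form may help show what it buys you: columns not named in the dict are left alone, even if they contain '0'. The frame and the untouched "Type" values below are made up for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; "Type" holds a '0' that should be kept
df2 = pd.DataFrame({
    "Weight": [0.0, 70.0],
    "BootSize": ["0", "42"],
    "Type": ["0", "A"],
})

# Outer keys are column names, inner dicts map old value -> new value
cleaned = df2.replace({
    "Weight": {0: np.nan},
    "BootSize": {"0": np.nan},
})
print(cleaned)
```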
Another alternative is mask, which sets every cell to NaN where the condition is True:
cols = ["Weight","Height","BootSize","SuitSize","Type"]
df2[cols] = df2[cols].mask(df2[cols].eq(0) | df2[cols].eq('0'))
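To make the mask variant easier to follow, the sketch below builds the boolean condition as a separate step on a tiny made-up frame (the values and the name is_zero are assumptions for illustration).

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one float column, one string column
df2 = pd.DataFrame({
    "Weight": [0.0, 70.0],
    "SuitSize": ["0", "52"],
})
cols = ["Weight", "SuitSize"]

# True wherever a cell equals the number 0 or the string '0'
is_zero = df2[cols].eq(0) | df2[cols].eq("0")

# mask replaces the True cells with NaN and keeps everything else
df2[cols] = df2[cols].mask(is_zero)
print(df2)
```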
In column "age", replace zero with blanks:
df['age'] = df['age'].replace(['0', 0], '')
Replace zero with NaN for a single column:
df['age'] = df['age'].replace(0, np.nan)
Replace zero with NaN for multiple columns:
cols = ["Glucose", "BloodPressure", "SkinThickness", "Insulin", "BMI"]
df[cols] = df[cols].replace(['0', 0], np.nan)
Replace zero with NaN for the whole dataframe:
df.replace(0, np.nan, inplace=True)
If you just want to replace the zeros in the whole dataframe, you can replace them directly without specifying any columns:
df = df.replace({0:pd.NA})
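One thing worth checking with a whole-frame replace is that the number 0 and the string '0' are different values, so replacing only 0 leaves '0' strings in place. A minimal sketch on made-up data, using np.nan as the replacement rather than pd.NA:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: numeric zeros and string zeros side by side
df = pd.DataFrame({"age": [0, 31], "code": ["0", "x"]})

# Replaces only the numeric 0; the string '0' does not match
numeric_only = df.replace(0, np.nan)

# Passing both forms in a list covers numeric and string zeros
both = df.replace(["0", 0], np.nan)
print(numeric_only)
print(both)
```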