How can I fill null values with a mean using Pandas?

Question:

I'm having a hard time understanding why the apply function isn’t working here. I’m trying to fill the null values for SalePrice with the mean sale price of their corresponding quality rating (OverallQual).

I expected the function to iterate through each row and return the mean SalePrice for the corresponding OverallQual where SalePrice is null, and otherwise return the original SalePrice.

sale_price_by_qual = df.groupby('OverallQual').mean()['SalePrice']

def fill_sales_price(SalePrice, OverallQual):
   if np.isnan(SalePrice):
      return sale_price_by_qual[SalePrice]
   else:
      return SalePrice

df['SalePrice'] = df.apply(lambda x: fill_sales_price(x['SalePrice'], x['OverallQual']), axis=1)
  

KeyError: nan

Asked By: rileylivingston


Answers:

Could you save the mean value in a variable and then call .fillna()?

x = df['SalePrice'].mean()  # your mean value

df['SalePrice'] = df['SalePrice'].fillna(x)
Answered By: codingrainha
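
Note that a single overall mean ignores OverallQual. If the fill value should depend on the quality rating, fillna also accepts a Series aligned on the index, so one possible sketch is to map each row's OverallQual to its group mean first. The toy DataFrame and the name fill_values are made up for illustration; only the column names and sale_price_by_qual come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; only the column names come from the question.
df = pd.DataFrame({
    "OverallQual": [5, 5, 5, 7, 7, 7],
    "SalePrice": [100.0, 200.0, np.nan, 300.0, np.nan, 500.0],
})

# Mean SalePrice per quality rating (NaN values are skipped by default).
sale_price_by_qual = df.groupby("OverallQual")["SalePrice"].mean()

# Per-row fill values: look up each row's rating in the group means.
fill_values = df["OverallQual"].map(sale_price_by_qual)

# fillna with a Series only touches the missing positions, aligned by index.
df["SalePrice"] = df["SalePrice"].fillna(fill_values)
print(df)
```

Here the missing price for rating 5 becomes 150.0 and the one for rating 7 becomes 400.0, while the existing prices are left unchanged.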

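Try this. The lookup has to use OverallQual as the key; in your version it uses SalePrice, which is NaN in that branch, hence the KeyError: nan.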

def fill_sales_price(SalePrice, OverallQual):
  if np.isnan(SalePrice):
     return sale_price_by_qual[OverallQual]
  else:
     return SalePrice

df['SalePrice'] = df.apply(lambda x: fill_sales_price(x['SalePrice'], x['OverallQual']), axis=1)
Answered By: CyberPhoenix
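
A row-wise apply can be slow on larger DataFrames. As an alternative sketch (not part of the answers above), groupby(...).transform('mean') returns the group mean for every row in the original row order, so it can be passed straight to fillna. The toy data and the name group_mean are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; only the column names come from the question.
df = pd.DataFrame({
    "OverallQual": [4, 4, 6, 6, 6],
    "SalePrice": [80.0, np.nan, 150.0, 250.0, np.nan],
})

# transform('mean') broadcasts each group's mean back to its rows,
# so the result has the same index and length as df.
group_mean = df.groupby("OverallQual")["SalePrice"].transform("mean")

# Fill only the missing prices with the mean of the matching rating.
df["SalePrice"] = df["SalePrice"].fillna(group_mean)
print(df)
```

If a quality rating has no known SalePrice at all, its group mean is itself NaN, so those rows stay unfilled and need a separate fallback.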