Pandas dataframe describe gives nan value for mean & std for float data type

Question:

I have a dataframe with 8M rows. When I use df.describe(), np.mean(df[col]), or df[col].mean(), I get nan as output.

However, np.mean(df[col].values) works and returns the mean value.
There are no nan values in that column; I checked using df[col].isna().sum() and df[col].isnull().sum().

I am not sure how to reproduce the bug with a smaller example.

Update:

>>> df.head()
    col1        col2
0   2.289062    290
1   2.289062    290
2   2.289062    290
3   2.289062    290
4   2.289062    290

>>> df['col1'].dtype
dtype('float16')

Is there a way to debug or resolve this error?

>>> pd.__version__
'1.3.4'
Asked By: Python coder


Answers:

I think you need to convert the column to float64 or float32, because a typical processor has no hardware support for float16:

df[col].astype('float64').mean()
df[col].astype('float32').mean()
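
A minimal sketch of why upcasting may help, using made-up data that resembles the question (the series `s`, the row count `n_rows`, and the other variable names are assumptions for illustration). float16 cannot represent finite values above 65504, so a plausible explanation, not confirmed here, is that intermediate quantities such as the running sum or the row count leave that range when the aggregation stays in float16. Converting to float64 first keeps them in range:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's column: identical float16 values.
n_rows = 100_000
s = pd.Series(np.full(n_rows, 2.289062, dtype=np.float16), name="col1")

# Largest finite value float16 can hold; anything larger becomes inf.
f16_max = float(np.finfo(np.float16).max)

# Compute the sum in float64 to see how large it really is.
true_sum = float(s.astype("float64").sum())

# Both the row count and the sum exceed the float16 range.
print("float16 max:", f16_max)
print("row count exceeds float16 max:", n_rows > f16_max)
print("sum exceeds float16 max:", true_sum > f16_max)

# Upcasting before aggregating keeps every intermediate value finite.
mean64 = s.astype("float64").mean()
print("mean after upcasting:", mean64)
```

If memory is a concern with 8M rows, the conversion can be applied to one column at a time rather than to the whole dataframe.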
Answered By: jezrael