Get mean value avoiding nan using numpy in python

Question:

How can I calculate the mean value of an array (A) while ignoring nan entries?

import numpy as np
A = np.array([5, np.nan, np.nan, np.nan, np.nan, 10])
M = np.mean(A[A != np.nan])  # does not work: M is still nan

Any idea?
Asked By: 2964502


Answers:

Use numpy.isnan:

>>> import numpy as np 
>>> A = np.array([5, np.nan, np.nan, np.nan, np.nan, 10])
>>> np.isnan(A)
array([False,  True,  True,  True,  True, False], dtype=bool)
>>> ~np.isnan(A)
array([ True, False, False, False, False,  True], dtype=bool)
>>> A[~np.isnan(A)]
array([  5.,  10.])
>>> A[~np.isnan(A)].mean()
7.5

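This is needed because nan never compares equal to anything, not even to itself, so a mask built with == or != cannot pick out the nan entries: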

>>> np.nan == np.nan
False
>>> np.nan != np.nan
True
>>> np.isnan(np.nan)
True
Answered By: falsetru
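
To tie this back to the question, here is a minimal sketch of why a mask built with `!=` keeps every element while the `np.isnan` mask drops the nan entries. The variable names (`bad_mask`, `good_mask`, and so on) are only illustrative and do not come from the original posts.

```python
import numpy as np

A = np.array([5, np.nan, np.nan, np.nan, np.nan, 10])

# Every element compares "not equal" to nan, including nan itself,
# so this mask is all True and filters nothing out.
bad_mask = A != np.nan
bad_mean = A[bad_mask].mean()    # the nan entries are still included

# np.isnan marks the nan entries; ~ inverts the mask to keep the rest.
good_mask = ~np.isnan(A)
good_mean = A[good_mask].mean()  # mean of 5 and 10

print(bad_mask.all(), np.isnan(bad_mean), good_mean)
```

Because the first mask selects the whole array, the mean is computed over the nan entries as well and comes out as nan, which is the behaviour described in the question.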

Another possibility is the following:

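import numpy
# Note: scipy.stats.nanmean was later deprecated and removed from SciPy,
# so this import only works with an old SciPy release.
from scipy.stats import nanmean  # nanmedian exists too, if you need it
A = numpy.array([5, numpy.nan, numpy.nan, numpy.nan, numpy.nan, 10])
print(nanmean(A))  # gives 7.5 as expected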

I think this looks more elegant (and more readable) than the other solution already given.

Edit: @Jaime reports that this functionality is also available directly in NumPy as of version 1.8, so there is no need to import scipy.stats if you have that version or newer:

import numpy
A = numpy.array([5, numpy.nan, numpy.nan, numpy.nan, numpy.nan, 10])
print(numpy.nanmean(A))  # gives 7.5

The first solution also works for people who do not have NumPy 1.8 or newer (like me).

Answered By: usethedeathstar
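
As a small extension of the `numpy.nanmean` approach, the sketch below shows it applied to a 2-D array with the `axis` argument, which the masking approach does not handle as directly because boolean indexing flattens the result. The array `B` and the result names are made up for illustration, and NumPy 1.8 or newer is assumed.

```python
import numpy as np

# Hypothetical 2-D data with one missing value in each row.
B = np.array([[1.0, np.nan, 3.0],
              [4.0, 5.0, np.nan]])

overall = np.nanmean(B)             # mean of 1, 3, 4, 5
per_column = np.nanmean(B, axis=0)  # nan entries skipped column by column
per_row = np.nanmean(B, axis=1)     # nan entries skipped row by row

print(overall)
print(per_column)
print(per_row)
```

Each mean is taken over the non-nan entries only, so the second column's mean is simply 5.0 because that is its only valid value.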