Remove NaN value from a set
Question:
Is it possible to easily remove NaN values from a Python set object? Given that NaN values do not equal anything (and float('nan') is float('nan') is also False), you can end up with many NaN values in a set.
>>> a = set( (float('nan'), float('nan'), 'a') )
>>> a
{nan, nan, 'a'}
The best I can come up with is to define a function like math.isnan, but one that is tolerant of non-float types, like:
import math

def my_isnan(x):
    # math.isnan raises TypeError for non-numeric values such as strings
    try:
        return math.isnan(x)
    except TypeError:
        return False
Then you can use a set comprehension like this:
>>> {x for x in a if not my_isnan(x)}
{'a'}
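As a possible variant of the helper above, the check can be restricted to float instances instead of catching exceptions. This is only a sketch: the name is_float_nan is hypothetical, and it deliberately ignores NaN-like values of other types (for example decimal.Decimal('NaN')). One side effect is that very large integers, which math.isnan cannot convert to float, pass through untouched.

```python
import math

def is_float_nan(x):
    # Hypothetical helper: only float instances are ever tested with math.isnan,
    # so strings, None, and huge ints never reach it.
    return isinstance(x, float) and math.isnan(x)

a = {float('nan'), float('nan'), 'a', 10**400}
cleaned = {x for x in a if not is_float_nan(x)}
```

Here cleaned keeps 'a' and the large integer and drops both NaN objects.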
Answers:
In practice, you could look at the fact that nan != nan as a feature, not a bug:
>>> a = {float('nan'), float('nan'), 'a'}
>>> a
{nan, nan, 'a'}
>>> {x for x in a if x==x}
{'a'}
On the positive side, no need for a helper function. On the negative side, if you have a non-nan object which is also not equal to itself, you’ll remove that too.
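To illustrate the downside mentioned above, here is a small sketch with a made-up class, NeverEqual, whose instances are never equal to anything, including themselves. The class is purely hypothetical and exists only to show that the x == x filter drops such objects along with NaN.

```python
class NeverEqual:
    """Hypothetical class: instances compare unequal to everything."""

    def __eq__(self, other):
        return False

    def __hash__(self):
        # Defining __eq__ disables the default hash, so restore one
        # to keep instances usable as set members.
        return id(self)

odd = NeverEqual()
a = {float('nan'), 'a', odd}

# The self-equality filter removes NaN, but also the NeverEqual instance.
kept = {x for x in a if x == x}
```

After this runs, kept contains only 'a', even though odd is not a NaN.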
You can also use filter:
In[75]: a = set((float('nan'), float('nan'), 'a'))
In[76]: set(filter(lambda x: x == x , a))
Out[76]: {'a'}
Use pd.notna() from pandas, e.g.:
In [219]: import pandas as pd
In [220]: a = set((float('nan'), float('nan'), 'a'))
In [221]: a = {x for x in a if pd.notna(x)}
In [222]: a
Out[222]: {'a'}
We can simply use the .remove() method, provided the NaN in the set is the same object as np.nan (this assumes import numpy as np):
In[1]: a = set([np.nan, "A"])
In[2]: a
Out: {'A', nan}
In[3]: a.remove(np.nan)
In[4]: a
Out: {'A'}
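A caveat on the .remove() approach: it appears to work because set lookup checks object identity before equality, and np.nan is one shared object. The sketch below uses plain float('nan') objects to show the difference; the variable names are only illustrative.

```python
nan = float('nan')

# Same NaN object in the set and in the call: identity matches, so it is removed.
a = {nan, 'A'}
a.remove(nan)

# A different NaN object: no identity match and nan != nan, so KeyError is raised.
b = {float('nan'), 'A'}
try:
    b.remove(float('nan'))
    removed = True
except KeyError:
    removed = False
```

So .remove() is only reliable when every NaN in the set is the very object you pass in; for NaN values created independently, the comprehension-based answers are safer.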