filter None value from python list returns: TypeError: boolean value of NA is ambiguous
Question:
I used to filter out None values from a Python (3.9.5) list using the built-in filter function.
Now, after upgrading several dependencies (pandas 1.3.1, numpy 1.23.5, etc.), I get the following:
import pandas as pd
x = pd.array(['This is', 'some text', None, 'data.'], dtype="string")
x = list(x.unique())
list(filter(None, x))
This raises: TypeError: boolean value of NA is ambiguous
I’d appreciate an explanation of what changed and how to solve it, please.
Answers:
filter expects a function (or None) and an iterable.
You are passing None, so filter tests the truthiness of each element itself instead of calling a predicate.
Try this:
list(filter(lambda value: not pd.isna(value), x))
Don’t rely on using None as the function if your data comes from pandas.
That is a shortcut for removing falsy values when your iterable contains plain Python values, as pointed out by @buran below.
However, once your data has been through a pandas array with dtype="string", the None values have been converted into pd.NA. bool(pd.NA) raises a TypeError instead of returning False, so filter(None, ...) fails rather than removing them.
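To make the difference concrete, here is a minimal sketch that contrasts the two cases. The variable names (plain, truthy_only, arr, values, without_missing) are made up for illustration, and pd.notna is used as the predicate in place of the lambda above; it should behave the same way for this data.

```python
import pandas as pd

# Plain Python values: filter(None, ...) drops every falsy element,
# not only None (the empty string and 0 are removed as well).
plain = ['This is', '', None, 'data.', 0]
truthy_only = list(filter(None, plain))
print(truthy_only)

# Values taken from a pandas "string" array: the missing entry is no longer None.
arr = pd.array(['This is', 'some text', None, 'data.'], dtype="string")
values = list(arr.unique())

# pd.notna returns False for missing values, so it is a safe predicate here.
without_missing = list(filter(pd.notna, values))
print(without_missing)
```

Note that filter(None, ...) and a pd.notna predicate are not equivalent on plain data: the first also removes empty strings and zeros, the second only removes missing values.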
This code removes the missing values with dropna() before the array is converted to a list, so only the available values remain.
import pandas as pd
x = pd.array(['This is', 'some text', None, 'data.'], dtype="string")
x = list(x.dropna().unique())
print(x)
Output:
['This is', 'some text', 'data.']
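If the data lives in a pandas Series rather than a bare array, the same idea should carry over, since Series also provides dropna() and unique(). This is a sketch under that assumption; the names s and result are illustrative only.

```python
import pandas as pd

# Same data as above, but held in a Series (an assumption for this sketch).
s = pd.Series(['This is', 'some text', None, 'data.', 'data.'], dtype="string")

# Drop the missing entries first, then de-duplicate, then convert to a list.
result = list(s.dropna().unique())
print(result)
```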
This error can also be reproduced with just this:
>>> bool(x[2])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pandas/_libs/missing.pyx", line 446, in pandas._libs.missing.NAType.__bool__
TypeError: boolean value of NA is ambiguous
Looking at the source code, this seems to be intentional: NA is a "missing value indicator", so it isn’t really True or False and its boolean value is ambiguous.
def __bool__(self):
    raise TypeError("boolean value of NA is ambiguous")
So basically you can’t pass it to anything that calls the object’s __bool__ method, such as bool(), an if statement, or filter(None, ...).
The easiest way to solve this is the one given by @NIKUNJ PATEL:
>>> x = list(x.dropna().unique())
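When the values are already in a plain list and dropna() is no longer available, one way to avoid the truth test is an identity check against pd.NA. The following is only a sketch; raised, message, values and kept are hypothetical names, and the identity check assumes the missing entries really are pd.NA (use pd.isna if they might be None or NaN instead).

```python
import pandas as pd

# bool(pd.NA) raises instead of returning False.
try:
    bool(pd.NA)
    raised = False
    message = ""
except TypeError as exc:
    raised = True
    message = str(exc)
print(raised, message)

# An identity check never calls __bool__ on pd.NA, so it is safe in a condition.
values = ['This is', pd.NA, 'data.']
kept = [v for v in values if v is not pd.NA]
print(kept)
```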