Pandas dataframe "ValueError: The truth value of a Series is ambiguous" when using .apply
Question:
I have a dataframe with 3884 rows × 4458 columns, filled with numbers. I’m trying to equate numbers greater than 1 to 1. I have tried:
df.apply(lambda x: 1 if x >= 1 else 0)
I also tried writing a function, but I get this error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
I checked previous topics and saw many questions about this, but I still don’t understand.
Answers:
This happens because x >= 1 returns a Series of True/False values, one per element, since x inside your lambda is a Series representing a whole column. An if statement needs a single True or False, so pandas raises the error.
You could reduce it to a single value with (x >= 1).all() or (x >= 1).any(), but that won’t suit your needs.
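To make the point above concrete, here is a minimal sketch that reproduces the error outside of .apply. The small frame is purely illustrative (it is not the asker's data), and the variable names are made up for the example.

```python
import pandas as pd

# Illustrative data only; the real frame in the question is much larger.
df = pd.DataFrame({"a": [-1, 0, 1, 2], "b": [0, 0, 5, -2]})

# DataFrame.apply hands each column to the function as a Series.
col = df["a"]
mask = col >= 1            # element-wise comparison gives a boolean Series
print(mask.tolist())

# `if` needs exactly one True/False, so a multi-element Series raises.
try:
    if mask:
        pass
except ValueError as exc:
    error_message = str(exc)
    print(error_message)

# any()/all() collapse the Series to one bool. That runs, but it answers
# "is any/every value in the column >= 1?", not a per-value question.
any_ge_1 = bool(mask.any())
all_ge_1 = bool(mask.all())
print(any_ge_1, all_ge_1)
```

So any() and all() silence the error, but they give one answer per column rather than one per cell.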
Instead, you can use the following to transform each value in the dataframe:
df.apply(lambda x : [1 if e >= 1 else 0 for e in x])
pandas.DataFrame.apply applies a function along an axis of the DataFrame. So x in your lambda is a Series, and the result of comparing it cannot be fed into if.
I’m trying to equate numbers greater than 1 to 1.
You can use pandas.DataFrame.applymap, which applies a function to a DataFrame elementwise:
df.applymap(lambda x: 1 if x >= 1 else 0)
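One caveat, as far as I know: newer pandas releases (2.1 and later) deprecate applymap in favour of DataFrame.map, which does the same elementwise job. The sketch below picks whichever method is available; the helper name to_flag and the sample frame are just for illustration.

```python
import pandas as pd

# Illustrative data only.
df = pd.DataFrame({"a": [-1, 0, 1, 2], "b": [0, 0, 5, -2]})

def to_flag(value):
    """Return 1 for values >= 1, otherwise 0 (called once per cell)."""
    return 1 if value >= 1 else 0

# Prefer DataFrame.map where it exists; fall back to applymap otherwise.
elementwise = df.map if hasattr(pd.DataFrame, "map") else df.applymap
result = elementwise(to_flag)
print(result)
```

Either way the function runs once per cell in Python, so on a frame with millions of cells it will be noticeably slower than a vectorized comparison.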
Ynjxsjmh’s answer is good, but in this situation, you don’t actually need to use .apply() in the first place. Pandas and NumPy have more powerful tools for doing this sort of thing. Here are some examples.
Firstly, some example data:
>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({'a': [-1, 0, 1, 2], 'b': [0, 0, 5, -2]})
>>> df
   a  b
0 -1  0
1  0  0
2  1  5
3  2 -2
If, as you say, you want to cap numbers at 1, you could use:
- With .clip():
  >>> df.clip(upper=1)
     a  b
  0 -1  0
  1  0  0
  2  1  1
  3  1 -2
- With .mask():
  >>> df.mask(df>=1, 1)
     a  b
  0 -1  0
  1  0  0
  2  1  1
  3  1 -2
Or if, as your code suggests, you also want to set numbers less than 1 to 0, you could use:
- A comparison on the whole dataframe, then convert the bools to int:
  >>> df.ge(1).astype(int)
     a  b
  0  0  0
  1  0  0
  2  1  1
  3  1  0
  Docs: .ge()
- With np.where():
  >>> df[:] = np.where(df>=1, 1, 0)
  >>> df
     a  b
  0  0  0
  1  0  0
  2  1  1
  3  1  0
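If you want to convince yourself that these vectorized options agree with each other, a quick check along the following lines should do. The random integer frame is an assumption standing in for the real data, and the variable names are only for this sketch.

```python
import numpy as np
import pandas as pd

# Random integers in [-3, 5] as a stand-in for the real data (assumption).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(-3, 6, size=(200, 50)))

# Two ways to turn the frame into 0/1 flags.
via_ge = df.ge(1).astype(int)
via_where = np.where(df >= 1, 1, 0)      # plain NumPy array, same shape

# Two ways to cap values at 1 while leaving smaller values untouched.
capped_clip = df.clip(upper=1)
capped_mask = df.mask(df >= 1, 1)

same_flags = np.array_equal(via_ge.to_numpy(), via_where)
same_caps = np.array_equal(capped_clip.to_numpy(), capped_mask.to_numpy())
print(same_flags, same_caps)
```

Note that the two groups answer different questions: capping leaves negative values and zeros as they are, while the flag versions replace everything below 1 with 0.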