lambda function on a numpy array. What's wrong with this piece of code?
Question:
What’s wrong with this code?
import numpy as np
A = np.array([[-0.5, 0.2, 0.0],
[4.2, 3.14, -2.7]])
asign = lambda t: 0 if t<0 else 1
asign(A)
print(A)
Expected output:
[[0. 1. 0.]
 [1. 1. 0.]]
Actual result:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
Answers:
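This worked for me (note that clip only limits the values to the range [0, 1], so values already inside that range, such as 0.2, stay unchanged rather than becoming 1):
A = A.clip(min=0, max=1)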
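Well, the lambda on its own will not go through the whole array. For that you need a higher-order function, in this case map.
import numpy as np
A = np.array([[-0.5, 0.2, 0.0],
              [4.2, 3.14, -2.7]])
asign = lambda t: 0 if t < 0 else 1
# map over the flattened array so the lambda receives scalars, not whole rows
A = np.array(list(map(asign, A.ravel()))).reshape(A.shape)
map iterates over the first axis of an array, so on a 2-D array it would hand the lambda whole rows and raise the same ValueError. Flattening first makes it pass every element through the function.
I wrapped map in a list because it returns an object of type map, which you can convert that way, and then rebuilt the array in its original shape.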
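If you want to keep the scalar lambda, np.vectorize is another way to apply it to every element. The following is only a sketch of that idea, not part of the answer above; the name signs is made up for illustration, and np.vectorize is still a Python-level loop, so it is a convenience rather than a speed-up.

```python
import numpy as np

A = np.array([[-0.5, 0.2, 0.0],
              [4.2, 3.14, -2.7]])
asign = lambda t: 0 if t < 0 else 1

# np.vectorize calls the scalar function once per element and keeps the shape
signs = np.vectorize(asign)(A)
print(signs)
```

Note that under this lambda 0.0 maps to 1, because 0.0 < 0 is False.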
You can use the lambda, but the numpy datatypes allow you to do many “matlab-type” operations (for those who are used to that):
python:
a = np.array([1, 2, 3, 4, 5])
((a > 1) & (a < 3)).astype(int)
# array([0, 1, 0, 0, 0])
octave/matlab:
a = [1,2,3,4,5];
a>1 & a<3
% ans =
%
% 0 1 0 0 0
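Applied to the array from the question, the same comparison style might look like the sketch below. The names signs and same are only for illustration, and the rule used is the one the lambda encodes (0 for negative values, 1 otherwise).

```python
import numpy as np

A = np.array([[-0.5, 0.2, 0.0],
              [4.2, 3.14, -2.7]])

# The comparison is element-wise and yields a boolean array; cast it to int
signs = (A >= 0).astype(int)

# np.where expresses the same "0 if t < 0 else 1" rule directly
same = np.where(A < 0, 0, 1)

print(signs)
```

Both forms return a new array and leave A itself unchanged.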
Not an answer, sorry, but more data. I also can’t find the explanation for the fact that some functions/lambdas act over the array, while the others treat it as a whole.
See this testing:
import numpy as np
string_arr = ["a", "bb", "ccc", "dddd"]
ndstr_arr = np.array(string_arr)
l1 = lambda x: x == "ccc"
l2 = lambda x: len(x) > 2
print("\nDirect lambda over array:")
print(l1(string_arr)) # fails
print(l2(string_arr)) # fails
print(l1(ndstr_arr)) # WORKS
print(l2(ndstr_arr)) # fails
print("\nList(map(lambda over array)): ")
print(list(map(l1, string_arr))) # WORKS
print(list(map(l2, string_arr))) # WORKS
print(list(map(l1, ndstr_arr))) # WORKS
print(list(map(l2, ndstr_arr))) # WORKS
for this result:
Direct lambda over array:
False
True
[False False True False]
True
List(map(lambda over array)):
[False, False, True, False]
[False, False, True, True]
[False, False, True, False]
[False, False, True, True]
Both lambdas are boolean functions, but for some reason the first one, x == ..., gets mapped over the array (and note: only over the ndarray; the regular list, string_arr, is never mapped), whereas len(x) > 2 acts on the array as a single object.
What’s the difference between these lambdas?
(Also note that list(map) is not a real substitute since it doesn’t return an ndarray, so we have to use it to build a new ndarray or use vectorize or some other method… that’s not really the point here though)
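For the len case specifically, one of those "other methods" could be np.char.str_len, which computes string lengths element-wise and returns an ndarray. This is just a sketch, and the variable names are illustrative.

```python
import numpy as np

ndstr_arr = np.array(["a", "bb", "ccc", "dddd"])

# Built-in len() only reports the size of the first axis
whole = len(ndstr_arr)

# np.char.str_len returns the length of each string as an integer array
lengths = np.char.str_len(ndstr_arr)
longer_than_two = lengths > 2

print(whole)
print(longer_than_two)
```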
Answering my own follow-on question which sort of answers the original.
To recap / generalize:
The OP is asking "Why is the lambda applied to the numpy array as a whole object when I expected it to be applied element-wise?"
My follow-up asks "Why are some lambdas applied as a whole while others are applied element-wise?"
The TL;DR answer is that the lambda always treats the numpy array as a whole object (a regular argument to a regular function), but the operator used inside the body of the lambda (or function, or wherever) may be overridden by the numpy ndarray to work element-wise, and the == operator is such an operator.
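This also accounts for the ValueError in the original question. A small sketch, with a 1-D array and the names mask, raised and scalar_result chosen only for illustration: the < inside the lambda is element-wise and yields a boolean array, and the conditional expression then needs a single truth value from that array.

```python
import numpy as np

A = np.array([-0.5, 0.2, 0.0])
asign = lambda t: 0 if t < 0 else 1

# "<" is overridden by ndarray, so this is an array of booleans
mask = A < 0

# "0 if mask else 1" asks for bool(mask), which is ambiguous for several elements
try:
    asign(A)
    raised = False
except ValueError:
    raised = True

# With a single element the lambda behaves as intended
scalar_result = asign(A[0])

print(mask, raised, scalar_result)
```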
In my example it’s the == operator. I tracked down the override for this and unfortunately the official numpy documentation of the override is not much help:
numpy.ndarray.__eq__ method
ndarray.__eq__(value, /) Return self==value.
(fwiw, I know this is the documentation for == because the equivalence of operators to special method names is defined in this python reference)
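A quick way to see that correspondence, as a sketch with illustrative variable names: writing == on an ndarray and calling its __eq__ method directly give the same element-wise result, while len is an ordinary built-in that simply returns the size of the first axis.

```python
import numpy as np

x = np.array(["a", "bb", "ccc", "dddd"])

via_operator = x == "ccc"       # operator syntax
via_method = x.__eq__("ccc")    # the special method behind the operator
size = len(x)                   # one number for the whole array

print(via_operator)
print(size)
```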
The correct answer required more digging. I found it in the numpy documentation of the numpy.equal function:
numpy.equal
numpy.equal(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'equal'>
Return (x1 == x2) element-wise.
The == operator is applied element-wise!
Hence in my first lambda, lambda x: x == "ccc", x does indeed hold the entire ndarray, but the == is applied to each element, returning a new ndarray.
Again the numpy.equal doc makes this clear:
Returns: out: ndarray or scalar
Output array, element-wise comparison
of x1 and x2. Typically of type bool, unless dtype=object is passed.
This is a scalar if both x1 and x2 are scalars.
x1 and x2 are the args, so that we’re comparing x1 == x2. In our case, x1 is x, the full ndarray and not a scalar, so the result is an ndarray.
You may be wondering how it treats "ccc" (which is assigned to the x2 parameter). The doc says:
Parameters
x1, x2 array_like
Input arrays. If x1.shape != x2.shape,
they must be broadcastable to a common shape (which becomes the shape
of the output).
So x2 (our "ccc") is supposed to be array_like, but numpy will, if it can, broadcast it to the same shape as x1. And it will, because it can, as is documented in Numpy Broadcasting:
The simplest broadcasting example occurs when an array and a scalar
value are combined in an operation…
The result is equivalent to the previous example where b was an array.
We can think of the scalar b being stretched during the arithmetic
operation into an array with the same shape as a.
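The "stretching" described in that quote can be checked with a small numeric sketch. The arrays and names here are illustrative, and np.full is used only to build the stretched version explicitly; numpy does not need to materialise it.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = 2.0

# Comparing against the scalar...
with_scalar = np.equal(a, b)

# ...matches comparing against an array of the same shape filled with b
stretched = np.full(a.shape, b)
with_array = np.equal(a, stretched)

print(with_scalar)
```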
QED.