NumPy: apply vector-valued function to mesh grid
Question:
I am trying to do something like the following in NumPy:
import numpy as np
def f(x):
    return x[0] + x[1]
X1 = np.array([0, 1, 2])
X2 = np.array([0, 1, 2])
X = np.meshgrid(X1, X2)
result = np.vectorize(f)(X)
with the expected result being array([[0, 1, 2], [1, 2, 3], [2, 3, 4]]), but it returns the following error:
2
3 def f(x):
----> 4 return x[0] + x[1]
5
6 X1 = np.array([0, 1, 2])
IndexError: invalid index to scalar variable
This is because it tries to apply f to all 18 scalar elements of the mesh grid, whereas I want it applied to 9 pairs of 2 scalars. What is the correct way to do this?
Note: I am aware this code will work if I do not vectorize f, but vectorizing is important because f can be any function, e.g. it could contain an if statement, which raises a ValueError when f is applied to whole arrays without vectorizing.
Answers:
If you insist on using numpy.vectorize, you need to pass a signature when wrapping the function.
import numpy as np
def f(x):
    return x[0] + x[1]
    # Or
    # return np.add.reduce(x, axis=0)
X1 = np.array([0, 1, 2])
X2 = np.array([0, 1, 2])
X = np.meshgrid(X1, X2)
# np.asarray(X).shape -> (2, 3, 3)
# shape of the desired result is (3, 3)
f_vec = np.vectorize(f, signature='(n,m,m)->(m,m)')
result = f_vec(X)
print(result)
Output:
[[0 1 2]
 [1 2 3]
 [2 3 4]]
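Note that in the signature '(n,m,m)->(m,m)' every axis is a core dimension, so f is called exactly once on the whole (2, 3, 3) array rather than once per pair. A minimal sketch of a per-pair alternative, assuming the pair axis is moved to the end with np.stack and the signature '(n)->()' is used (the names pairs and f_pairwise are illustrative, not from the answer above):

```python
import numpy as np

def f(x):
    # x is a length-2 vector holding one (x1, x2) pair
    return x[0] + x[1]

X1 = np.array([0, 1, 2])
X2 = np.array([0, 1, 2])
X = np.meshgrid(X1, X2)

# Stack the two grids along a new last axis: shape (3, 3, 2)
pairs = np.stack(X, axis=-1)

# '(n)->()' maps each length-n vector to a scalar and loops over the leading axes
f_pairwise = np.vectorize(f, signature='(n)->()')
result = f_pairwise(pairs)
print(result)
```

Here f really is called 9 times, once per pair, so a scalar if inside f would work, at the cost of a Python-level loop.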
For the function you mentioned in the comments:
f = lambda x: x[0] + x[1] if x[0] > 0 else 0
You can use np.where:
def f(x):
    # the condition must test x[0], matching the lambda above
    return np.where(x[0] > 0, x[0] + x[1], 0)
# np.where(some_condition, value_if_true, value_if_false)
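As a quick check of the np.where approach on the mesh grid from the question, here is a minimal sketch with the condition applied to x[0], as in the lambda. Keep in mind that np.where evaluates both branches for every element and only then selects between them.

```python
import numpy as np

def f(x):
    # Elementwise form of: x[0] + x[1] if x[0] > 0 else 0
    return np.where(x[0] > 0, x[0] + x[1], 0)

X = np.meshgrid(np.array([0, 1, 2]), np.array([0, 1, 2]))
result = f(X)
print(result)
```

The first column of the result is 0 because x[0] is 0 there; the other entries are the plain sums.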
NumPy was designed with vectorization in mind; unless you have some crazy edge case, there’s almost always a way to take advantage of NumPy’s broadcasting and vectorization. I strongly recommend seeking out vectorized solutions before giving up and resorting to for loops.
If you are too lazy, or ignorant, to do "proper" vectorization, you can use np.vectorize. But you need to take the time to really read its docs. It isn’t magic. It can be useful, especially if you need to take advantage of broadcasting and the function, for some reason or other, only accepts scalars.
Rewriting your function to work with scalar inputs (though it also works fine with arrays, in this case):
In [91]: def foo(x,y): return x+y
...: f = np.vectorize(foo)
With scalar inputs:
In [92]: f(1,2)
Out[92]: array(3)
With 2 arrays (a (2,1) and (3,)), returning a (2,3):
In [93]: f(np.array([1,2])[:,None], np.arange(1,4))
Out[93]:
array([[2, 3, 4],
[3, 4, 5]])
Same thing with meshgrid:
In [94]: I,J = np.meshgrid(np.array([1,2]), np.arange(1,4),indexing='ij')
In [95]: I
Out[95]:
array([[1, 1, 1],
[2, 2, 2]])
In [96]: J
Out[96]:
array([[1, 2, 3],
[1, 2, 3]])
In [97]: f(I,J)
Out[97]:
array([[2, 3, 4],
[3, 4, 5]])
Or sparse meshgrid arrays, which broadcast against each other like the inputs in [93]:
In [98]: I,J = np.meshgrid(np.array([1,2]), np.arange(1,4),indexing='ij', sparse=True)
In [99]: I,J
Out[99]:
(array([[1],
[2]]),
array([[1, 2, 3]]))
But in a true vectorized sense, you can just add the 2 arrays:
In [100]: I+J
Out[100]:
array([[2, 3, 4],
[3, 4, 5]])
The first paragraph of the np.vectorize docs (my emphasis):
Define a vectorized function which takes a nested sequence of objects or
numpy arrays as inputs and returns a single numpy array or a tuple of numpy
arrays. The vectorized function evaluates pyfunc over successive tuples
of the input arrays like the python map function, except it uses the
broadcasting rules of numpy.
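To make the "map with broadcasting" description concrete, here is a rough sketch that reproduces the np.vectorize result by hand, using np.broadcast to pair up the elements. The names bc and by_hand are illustrative only, and this is not how np.vectorize is implemented internally.

```python
import numpy as np

def foo(x, y):
    return x + y

a = np.array([1, 2])[:, None]   # shape (2, 1)
b = np.arange(1, 4)             # shape (3,)

# np.broadcast yields (x, y) scalar pairs following the broadcasting rules
bc = np.broadcast(a, b)
by_hand = np.array([foo(x, y) for x, y in bc]).reshape(bc.shape)

vectorized = np.vectorize(foo)(a, b)
print(np.array_equal(by_hand, vectorized))
```

Both versions call foo once per broadcast element in Python, which is why np.vectorize offers convenience rather than speed.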
edit
Starting with a function that expects a 2-element tuple, we could add a wrapper that splits it into two arguments, and apply vectorize to that:
In [103]: def foo1(x): return x[0]+x[1]
...: def foo2(x,y): return foo1((x,y))
...: f = np.vectorize(foo2)
In [104]: f(1,2)
Out[104]: array(3)
X is a 2-element sequence of 2d arrays:
In [105]: X = np.meshgrid(np.array([1,2]), np.arange(1,4),indexing='ij')
In [106]: X
Out[106]:
[array([[1, 1, 1],
[2, 2, 2]]),
array([[1, 2, 3],
[1, 2, 3]])]
which can be passed to f as:
In [107]: f(X[0],X[1])
Out[107]:
array([[2, 3, 4],
[3, 4, 5]])
But there’s no need to slow things down with that iteration. Just pass the tuple to foo1:
In [108]: foo1(X)
Out[108]:
array([[2, 3, 4],
[3, 4, 5]])
With f = lambda x: x[0] + x[1] if x[0] > 0 else 0 you get the ‘ambiguity’ ValueError because if only works with scalars; the truth value of a multi-element array is ambiguous. But there are plenty of faster NumPy ways of replacing such an if step.
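One such replacement, sketched here under the assumption that the two mesh grid arrays are unpacked into a and b (illustrative names), is boolean-mask assignment. Unlike a scalar if, it handles the whole array at once, and it only computes the sum where the condition holds.

```python
import numpy as np

X = np.meshgrid(np.array([0, 1, 2]), np.array([0, 1, 2]))
a, b = X

# Start from the "else" value, then fill in the entries where the condition is true
mask = a > 0
result = np.zeros_like(a)
result[mask] = a[mask] + b[mask]
print(result)
```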
ChatGPT to the rescue! As it turns out, the better option here is np.apply_along_axis. The following code solved the problem:
import numpy as np
def f(x):
    return x[0] + x[1]
X1 = np.array([0, 1, 2])
X2 = np.array([0, 1, 2])
X = np.meshgrid(X1, X2)
result = np.apply_along_axis(f, 0, X)
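Because np.apply_along_axis passes f one 1-d slice at a time, a scalar if inside the function works too. Below is a small sketch using the conditional function from the comments, here given the hypothetical name f_cond. Note that this still loops in Python, one call per pair, so it is a convenience rather than a speed-up.

```python
import numpy as np

def f_cond(x):
    # x is a 1-d slice of length 2: one (x1, x2) pair
    return x[0] + x[1] if x[0] > 0 else 0

X = np.meshgrid(np.array([0, 1, 2]), np.array([0, 1, 2]))

# Axis 0 of the stacked (2, 3, 3) array indexes the two grids
result = np.apply_along_axis(f_cond, 0, np.asarray(X))
print(result)
```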