Parameters to numpy's fromfunction

Question:

I haven’t grokked the key concepts in numpy yet.

I would like to create a 3-dimensional array and populate each cell with the result of a function call – i.e. the function would be called many times with different indices and return different values.

Note: Since writing this question, the documentation has been updated to be clearer.

I could create it with zeros (or empty), and then overwrite every value with a for loop, but it seems cleaner to populate it directly from the function.

fromfunction sounds perfect. Reading the documentation it sounds like the function gets called once per cell.

But when I actually try it…

from numpy import *

def sum_of_indices(x, y, z):
    # What type are X, Y and Z ? Expect int or duck-type equivalent.
    # Getting 3 individual arrays
    print "Value of X is:"
    print x

    print "Type of X is:", type(x)
    return x + y + z

a = fromfunction(sum_of_indices, (2, 2, 2))

I expect to get something like:

Value of X is:
0
Type of X is: int
Value of X is:
1
Type of X is: int

repeated 4 times.

I get:

Value of X is:
[[[ 0.  0.]
  [ 0.  0.]]

 [[ 1.  1.]
  [ 1.  1.]]]
[[[ 0.  0.]
  [ 1.  1.]]

 [[ 0.  0.]
  [ 1.  1.]]]
[[[ 0.  1.]
  [ 0.  1.]]

 [[ 0.  1.]
  [ 0.  1.]]]
Type of X is: <type 'numpy.ndarray'>

The function is only called once, and seems to return the entire array as result.

What is the correct way to populate an array based on multiple calls to a function of the indices?

Asked By: Oddthinking

||

Answers:

I think you are misunderstanding what fromfunction does.

From numpy source code.

def fromfunction(function, shape, **kwargs):
    dtype = kwargs.pop('dtype', float)
    args = indices(shape, dtype=dtype)
    return function(*args,**kwargs)

Where indices is fairly equivalent to meshgrid where each variable is np.arange(x).

>>> side = np.arange(2)
>>> side
array([0, 1])
>>> x,y,z = np.meshgrid(side,side,side)
>>> x
array([[[0, 0],
        [1, 1]],

       [[0, 0],
        [1, 1]]])
>>> x+y+z #Result of your code.
array([[[0, 1],
        [1, 2]],

       [[1, 2],
        [2, 3]]])
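As a rough check of the "fairly equivalent" claim, the sketch below compares np.indices with np.meshgrid. It assumes indexing="ij" is the matching meshgrid mode; the variable names are only illustrative.

```python
import numpy as np

shape = (2, 2, 2)

# The index grids that fromfunction hands to the callback (dtype=int for readability).
ix, iy, iz = np.indices(shape, dtype=int)

# The same grids built with meshgrid, using matrix-style ("ij") indexing.
axes = [np.arange(n) for n in shape]
mx, my, mz = np.meshgrid(*axes, indexing="ij")

print(np.array_equal(ix, mx), np.array_equal(iy, my), np.array_equal(iz, mz))
print(ix[:, 0, 0])  # ix varies along axis 0
```

With meshgrid's default indexing="xy", the first two axes are swapped, which is why x in the session above varies along axis 1. The sum x+y+z is symmetric in its arguments, so the result is the same either way.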
Answered By: Daniel

Does this give you an incorrect result? a should be as expected (and is when I tested it) and seems like a fine way to do what you want.

>>> a
array([[[ 0.,  1.],    # 0+0+0, 0+0+1
        [ 1.,  2.]],   # 0+1+0, 0+1+1

       [[ 1.,  2.],    # 1+0+0, 1+0+1
        [ 2.,  3.]]])  # 1+1+0, 1+1+1

Since fromfunction works on array indices for input, you can see that it only needs to be called once. The documentation does not make this clear, but you can see that the function is being called on arrays of indices in the source code (from numeric.py):

def fromfunction(function, shape, **kwargs):
    . . .
    args = indices(shape, dtype=dtype)
    return function(*args,**kwargs)
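One way to confirm the single call is to record each invocation. This is only a small sketch; the calls list is a hypothetical helper added for illustration.

```python
import numpy as np

calls = []  # hypothetical log of the argument type seen on each invocation

def sum_of_indices(x, y, z):
    calls.append(type(x))
    return x + y + z

a = np.fromfunction(sum_of_indices, (2, 2, 2))

print(len(calls))   # number of times the function was invoked
print(calls[0])     # type of the first argument
print(a.shape)
```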

sum_of_indices is called on array inputs where each array holds the index values for that dimension.

array([[[ 0.,  0.],
        [ 0.,  0.]],

       [[ 1.,  1.],
        [ 1.,  1.]]])

+

array([[[ 0.,  0.],
        [ 1.,  1.]],

       [[ 0.,  0.],
        [ 1.,  1.]]])

+

array([[[ 0.,  1.],
        [ 0.,  1.]],

       [[ 0.,  1.],
        [ 0.,  1.]]])

=

array([[[ 0.,  1.],
        [ 1.,  2.]],

       [[ 1.,  2.],
        [ 2.,  3.]]])
Answered By: A.E. Drew

I obviously didn’t make myself clear. I am getting responses that fromfunction actually works as my test code demonstrates, which I already knew because my test code demonstrated it.

The answer I was looking for seems to be in two parts:


The fromfunction documentation is misleading. The function is called once and populates the entire array at once.

Note: Since writing this question, the documentation has been updated to be clearer.

In particular, this line in the documentation was incorrect (or at the very minimum, misleading)

For example, if shape were (2, 2), then the parameters in turn be (0, 0), (0, 1), (1, 0), (1, 1).

No. If shape (i.e. from context, the second parameter to fromfunction) were (2,2), the parameters would be (not ‘in turn’, but in the only call):

(array([[ 0.,  0.], [ 1.,  1.]]), array([[ 0.,  1.], [ 0.,  1.]]))

The documentation has been updated, and currently reads more accurately:

The function is called with N parameters, where N is the rank of shape. Each parameter represents the coordinates of the array varying along a specific axis. For example, if shape were (2, 2), then the parameters would be array([[0, 0], [1, 1]]) and array([[0, 1], [0, 1]])

(My simple example, derived from the examples in the manual, may have been misleading, because + can operate on arrays as well as indices. This ambiguity is another reason why the documentation is unclear. I want to ultimately use a function that isn’t array based, but is cell-based – e.g. each value might be fetched from a URL or database based on the indices, or even input from the user.)


Returning to the problem – which is how can I populate an array from a function that is called once per element, the answer appears to be:

You cannot do this in a functional style.

You can do it in an imperative/iterative style – i.e. writing nested for-loops, and managing the index lengths yourself.

You could also do it as an iterator, but the iterator still needs to track its own indices.
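A minimal sketch of the iterative approach, assuming np.ndindex is an acceptable way to let NumPy track the indices for you. Here cell_value is a hypothetical stand-in for whatever per-cell lookup (URL, database, user input) is needed.

```python
import numpy as np

def cell_value(i, j, k):
    # Hypothetical per-cell function: receives plain Python ints.
    return i * 100 + j * 10 + k

shape = (2, 3, 4)
a = np.empty(shape, dtype=int)

# np.ndindex yields every index tuple of the given shape, one at a time.
for idx in np.ndindex(shape):
    a[idx] = cell_value(*idx)

print(a[1, 2, 3])
```

This still calls the function once per element in a Python-level loop, so it is no faster than hand-written nested for-loops; it only removes the index bookkeeping.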

Answered By: Oddthinking

The documentation is very misleading in that respect. It’s just as you note: instead of performing f(0,0), f(0,1), f(1,0), f(1,1), numpy performs

f([[0., 0.], [1., 1.]], [[0., 1.], [0., 1.]])

Using ndarrays rather than the promised integer coordinates is quite frustrating when you try to use something like lambda i: l[i], where l is another array or list (though really, there are probably better ways to do this in numpy).

The numpy vectorize function fixes this. Where you have

m = fromfunction(f, shape)

Try using

g = vectorize(f)
m = fromfunction(g, shape)
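As an illustration of the lookup case mentioned above, here is a sketch under a few assumptions: table is a hypothetical nested list, dtype=int is passed so the indices are integers, and otypes is given explicitly so that np.vectorize does not need a trial call to infer the output type.

```python
import numpy as np

table = [[10, 11, 12], [20, 21, 22]]  # hypothetical data to look up

def cell(i, j):
    # i and j arrive as scalars here, so ordinary list indexing works.
    return table[i][j]

g = np.vectorize(cell, otypes=[int])
m = np.fromfunction(g, (2, 3), dtype=int)

print(m)
```

np.vectorize is a convenience wrapper around a Python-level loop, so it provides the per-element calling convention rather than any speed-up.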
Answered By: Chris Jones

Here’s my take on your problem:

As mentioned by Chris Jones the core of the solution is to use np.vectorize.

import numpy as np

# Define your function just like you would
def sum_indices(x, y, z):
    return x + y + z

# Then wrap it with np.vectorize so that it is called once per element
f = sum_indices
fv = np.vectorize(f)

If you now do np.fromfunction(fv, (3, 3, 3)) you get this:

array([[[0., 1., 2.],
        [1., 2., 3.],
        [2., 3., 4.]],

       [[1., 2., 3.],
        [2., 3., 4.],
        [3., 4., 5.]],

       [[2., 3., 4.],
        [3., 4., 5.],
        [4., 5., 6.]]])

Is this what you wanted?

Answered By: pfabri

I think it is a little confusing that most examples of fromfunction use square arrays.

Perhaps looking at a non-square array could be helpful?

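import numpy as np

def f(x, y):
    print(f'x=\n{x}')
    print(f'y=\n{y}')
    return x + y

# dtype=int so the index arrays print as integers; the default is float
z = np.fromfunction(f, (4, 3), dtype=int)
print(f'z=\n{z}')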

Results in:

x=
[[0 0 0]
 [1 1 1]
 [2 2 2]
 [3 3 3]]
y=
[[0 1 2]
 [0 1 2]
 [0 1 2]
 [0 1 2]]
z=
[[0 1 2]
 [1 2 3]
 [2 3 4]
 [3 4 5]]
Answered By: Tony H

If you set the dtype parameter to int, the index arrays and the result are integers instead of floats (the function is still called once, with whole arrays):

a = fromfunction(sum_of_indices, (2, 2, 2), dtype=int)
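A small sketch of the difference, reusing the function from the question without its print statements. The names a_float and a_int are only illustrative.

```python
import numpy as np

def sum_of_indices(x, y, z):
    return x + y + z

a_float = np.fromfunction(sum_of_indices, (2, 2, 2))            # default dtype=float
a_int = np.fromfunction(sum_of_indices, (2, 2, 2), dtype=int)   # integer indices

print(a_float.dtype)
print(a_int.dtype)
print(np.array_equal(a_float, a_int))  # same values, different dtype
```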

https://numpy.org/doc/stable/reference/generated/numpy.fromfunction.html
