apply numpy.histogram to multidimensional array

Question:

I want to apply numpy.histogram() to a multi-dimensional array along an axis.

Say, for example, I have a 2D array and I want to apply histogram() along axis=1.

Code:

import numpy

array = numpy.array([[0.6, 0.7, -0.3, 1.0, -0.8], [0.2, -1.0, -0.5, 0.5, 0.8], 
                    [0.25, 0.3, -0.1, -0.8, 1.0]])
bins = [-1.0, -0.5, 0, 0.5, 1.0, 1.0]
hist, bin_edges = numpy.histogram(array, bins)
print(hist)

Output:

[3 3 3 4 2]

Expected Output:

[[1 1 0 2 1],
 [1 1 1 2 0],
 [1 1 2 0 1]]

How can I get my expected output?

I tried to use the solution suggested in this post, but it doesn’t get me to the expected output.

Asked By: Wasi Ahmad

Answers:

For n-d cases, you can do this with np.histogram2d just by making a dummy x-axis (i):

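import numpy as np

def vec_hist(a, bins):
    n = int(np.prod(a.shape[:-1]))            # number of 1-D slices along the last axis
    i = np.repeat(np.arange(n), a.shape[-1])  # dummy x-coordinate: slice index of every value
    H, xedges, yedges = np.histogram2d(i, a.ravel(), (n, bins))
    return H.reshape(a.shape[:-1] + (-1,)), xedges, yedges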

Output

vec_hist(array, bins)
Out[453]: 
(array([[ 1.,  1.,  0.,  2.,  1.],
        [ 1.,  1.,  1.,  2.,  0.],
        [ 1.,  1.,  2.,  0.,  1.]]),
 array([ 0.        ,  0.66666667,  1.33333333,  2.        ]),
 array([-1.       , -0.5      ,  0.       ,  0.5      ,  0.9999999,  1.       ]))

For histograms over an arbitrary axis, you’ll probably need to create i using np.meshgrid and np.ravel_multi_index and then use that to reshape the resulting histogram.
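One way to handle an arbitrary axis without building the index by hand is to move that axis to the end and then reuse the dummy-index trick. The sketch below is an illustration only: it uses np.moveaxis rather than the np.meshgrid route mentioned above, and hist_along_axis is a made-up helper name, not a NumPy function.

```python
import numpy as np

def hist_along_axis(a, bins, axis=-1):
    """Hypothetical helper: histogram `a` along `axis` with shared bin edges."""
    a = np.asarray(a)
    moved = np.moveaxis(a, axis, -1)            # bring the target axis to the end
    lead_shape = moved.shape[:-1]               # shape of all remaining axes
    flat = moved.reshape(-1, moved.shape[-1])   # one row per 1-D slice
    n = flat.shape[0]
    i = np.repeat(np.arange(n), flat.shape[1])  # slice index of every value
    # Explicit edges at half-integers so each slice index gets its own bin
    row_edges = np.arange(n + 1) - 0.5
    H, _, _ = np.histogram2d(i, flat.ravel(), bins=(row_edges, bins))
    return H.reshape(lead_shape + (-1,)).astype(int)

array = np.array([[0.6, 0.7, -0.3, 1.0, -0.8],
                  [0.2, -1.0, -0.5, 0.5, 0.8],
                  [0.25, 0.3, -0.1, -0.8, 1.0]])
bins = [-1.0, -0.5, 0, 0.5, 1.0, 1.0]
print(hist_along_axis(array, bins, axis=1))
# [[1 1 0 2 1]
#  [1 1 1 2 0]
#  [1 1 2 0 1]]
```

The histogrammed axis is replaced by a trailing axis of length len(bins) - 1, so axis=0 on the same array gives one row of counts per column.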

Answered By: Daniel F

NumPy has changed since then, and the answer proposed by Daniel F no longer works as originally written.
Here is a simplified version for 2D arrays:

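import numpy as np

def vec_hist(a, bins):
    # Row index of every element, in the same order as a.flatten()
    i = np.tile(np.arange(a.shape[0]), (a.shape[1], 1)).T.flatten()
    # One bin per row on the dummy axis; `bins` on the data axis
    H, _, _ = np.histogram2d(i, a.flatten(), (a.shape[0], bins))
    return H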

So basically, we first build a list of coordinates "i" that holds, for every element, the index of the row it belongs to.
Then we flatten it, along with the data array "a", before sending both to np.histogram2d. The bins argument of histogram2d is a pair: the number of bins for the dummy dimension (one per row) and the bin edges for the dimension we actually want to histogram.
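To make the index construction concrete, here is a small sketch of what "i" looks like for a 3x5 input. The zero-filled array is placeholder data (only its shape matters), and the np.repeat line is an equivalent alternative shown for comparison, not part of the function above.

```python
import numpy as np

a = np.zeros((3, 5))  # placeholder data; only the shape matters here

# Row index of every element, built the same way as in vec_hist
i_tile = np.tile(np.arange(a.shape[0]), (a.shape[1], 1)).T.flatten()

# Equivalent construction with np.repeat
i_repeat = np.repeat(np.arange(a.shape[0]), a.shape[1])

print(i_tile)    # [0 0 0 0 0 1 1 1 1 1 2 2 2 2 2]
print(i_repeat)  # [0 0 0 0 0 1 1 1 1 1 2 2 2 2 2]
```

Each value in a.flatten() is paired with the entry of "i" at the same position, so histogram2d counts it in the bin of its own row.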

Code:

array = np.array([[0.6, 0.7, -0.3, 1.0, -0.8], 
                  [0.2, -1.0, -0.5, 0.5, 0.8], 
                  [0.25, 0.3, -0.1, -0.8, 1.0]])
bins = [-1.0, -0.5, 0, 0.5, 1.0, 1.0]
vec_hist(array, bins)

Output

array([[1., 1., 0., 2., 1.],
       [1., 1., 1., 2., 0.],
       [1., 1., 2., 0., 1.]])
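As a sanity check, the same counts can be reproduced with a plain loop that calls np.histogram once per row. This is only a minimal sketch for small 2D inputs, and loop_hist is a made-up name used for illustration.

```python
import numpy as np

def loop_hist(a, bins):
    # Hypothetical reference helper: one np.histogram call per row
    return np.stack([np.histogram(row, bins=bins)[0] for row in a])

array = np.array([[0.6, 0.7, -0.3, 1.0, -0.8],
                  [0.2, -1.0, -0.5, 0.5, 0.8],
                  [0.25, 0.3, -0.1, -0.8, 1.0]])
bins = [-1.0, -0.5, 0, 0.5, 1.0, 1.0]
print(loop_hist(array, bins))
# [[1 1 0 2 1]
#  [1 1 1 2 0]
#  [1 1 2 0 1]]
```

The loop returns integer counts, whereas np.histogram2d returns floats, so compare the values rather than the dtypes.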

Answered By: Cunningham