Is there a vectorized way to find maxes within labeled areas in NumPy?

Question:

I have a 2D array representing tree heights, where 0 is the ground. I have another array that’s always the same size showing segmented and labeled trees, where a 0 label means ground, and a positive integer value represents a unique tree. Here are some slices of the data:

heights = array([[37.5 , 41.82, 42.18, 42.18, 42.18, 39.23, 40.68, 40.71, 40.71,
        40.19, 35.03, 41.41, 41.41, 41.41, 40.77, 32.23, 32.23, 32.23,
        31.45, 25.6 , 25.63, 30.12, 30.78, 30.78, 30.92],
       [37.5 , 37.5 , 41.82, 42.18, 41.78, 41.78, 40.68, 40.68, 40.68,
        40.19, 41.04, 41.41, 41.41, 41.41, 41.03, 32.23, 32.23, 32.23,
        31.25, 25.6 , 25.6 , 30.12, 30.12, 21.08, 30.88],
       [37.5 , 37.5 , 34.61, 41.78, 41.78, 25.6 , 39.14, 40.68, 38.79,
        38.79, 41.04, 41.04, 41.8 , 41.8 , 41.8 , 24.66, 24.66, 31.25,
        25.63, 26.24, 26.2 , 25.2 , 24.93, 21.03, 21.03],
       [34.53, 34.61, 34.61, 35.23, 35.23, 25.32, 25.32, 33.17, 33.17,
        38.86, 39.4 , 40.31, 41.8 , 41.8 , 41.8 , 41.17, 25.37, 26.77,
        27.32, 27.39, 27.39, 26.96, 25.2 , 28.68, 28.68],
       [34.53, 34.52, 36.5 , 36.58, 36.67, 36.67, 25.15, 33.17, 38.65,
        38.86, 39.4 , 39.53, 40.78, 41.17, 41.17,  0.  , 26.77, 27.09,
        27.39, 27.6 , 27.6 , 28.  , 28.16, 28.68, 28.68],
       [32.22, 36.45, 37.1 , 37.28, 37.28, 38.07, 30.98, 31.12, 38.65,
        38.65, 39.12, 39.4 , 40.78, 40.78,  0.  ,  0.  , 27.41, 27.72,
        27.72, 28.49, 28.49, 28.16, 28.34, 28.87, 28.68],
       [36.45, 37.1 , 37.1 , 37.28, 38.23, 38.23, 38.23, 33.61, 32.31,
        38.65, 38.65, 38.62, 39.01, 33.75, 34.65, 34.65, 27.41, 27.72,
        27.72, 28.49, 28.49, 28.49, 28.87, 30.31, 30.31],
       [35.71, 36.45, 37.1 , 30.96, 38.23, 38.23, 38.23, 33.61, 33.28,
        33.42, 33.5 , 33.5 , 33.51, 34.07, 34.65, 34.65, 27.36, 27.83,
        27.83, 28.49, 28.49, 28.43, 28.87, 31.82, 31.68],
       [14.44,  0.  ,  0.  ,  0.  , 21.41, 32.98, 33.61, 33.61, 34.27,
        34.8 , 34.8 , 33.5 , 33.4 , 34.07, 34.65, 34.65,  0.  , 27.83,
        27.83, 28.7 , 29.18, 29.18, 31.82, 31.82, 31.98],
       [13.46,  0.  ,  0.  , 21.41, 21.73, 31.36, 33.33, 33.33, 34.89,
        34.99, 34.99, 32.72, 33.4 , 33.8 , 33.8 ,  0.  ,  0.  ,  0.  ,
        28.7 , 28.7 , 29.64, 29.64, 31.82, 31.82, 35.82],
       [13.46,  0.  ,  0.  ,  0.  , 21.73, 31.36, 31.46, 35.81, 36.33,
        36.33, 36.33, 32.72, 33.37, 33.71, 33.71,  0.  ,  0.  ,  0.  ,
        28.7 , 29.64, 29.64, 29.77, 29.77, 29.77, 35.95],
       [ 0.  ,  0.  ,  0.  ,  0.  ,  0.  , 24.07, 31.57, 35.9 , 36.33,
        36.33, 36.33, 21.97, 32.72, 33.37, 33.37,  0.  ,  0.  ,  0.  ,
        28.36, 29.04, 29.64, 29.77, 29.77, 29.77, 35.95],
       [ 0.  ,  0.  ,  0.  ,  0.  , 22.09, 24.07, 23.92, 31.57, 35.9 ,
        36.33,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
        28.38, 29.53, 28.96, 28.96, 28.69, 29.19, 35.49],
       [ 0.  ,  0.  ,  0.  ,  0.  , 22.09, 22.09, 22.09,  0.  ,  0.  ,
         0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
        29.53, 29.53, 29.82, 28.96, 28.73, 29.19, 29.19],
       [ 0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
         0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
        29.53, 30.12, 30.12, 29.82, 28.73,  0.  , 28.89],
       [ 0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
         0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
         0.  , 30.12, 30.12, 30.12, 28.94,  0.  ,  0.  ],
       [ 0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
         0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
         0.  , 30.12, 30.12, 29.82,  0.  ,  0.  ,  0.  ],
       [ 0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
         0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
         0.  , 28.65, 28.65,  0.  ,  0.  ,  0.  ,  0.  ],
       [ 0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
         0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
         0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ],
       [ 0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
         0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
         0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ],
       [ 0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
         0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
         0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ],
       [ 0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
         0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
         0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ],
       [ 0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
         0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
         0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ],
       [ 0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
         0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
         0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ],
       [ 0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
         0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
         0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ]], dtype=float32)
labeled_trees = array([[33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 37, 37, 37, 37, 37, 37,
        37, 37, 37, 37, 39, 39, 39, 39, 39],
       [33, 33, 33, 33, 33, 33, 33, 33, 33, 37, 37, 37, 37, 37, 37, 37,
        37, 37, 37, 37, 39, 39, 39, 39, 39],
       [33, 33, 33, 33, 33, 33, 33, 33, 33, 37, 37, 37, 37, 37, 37, 37,
        37, 37, 37, 39, 39, 39, 39, 39, 39],
       [33, 33, 33, 33, 33, 33, 33, 33, 37, 37, 37, 37, 37, 37, 37, 37,
        37, 37, 39, 39, 39, 39, 39, 39, 39],
       [33, 33, 33, 33, 33, 33, 33, 37, 37, 37, 37, 37, 37, 37, 37,  0,
        39, 39, 39, 39, 39, 39, 39, 39, 39],
       [33, 33, 33, 33, 33, 33, 33, 37, 37, 37, 37, 37, 37, 37,  0,  0,
        39, 39, 39, 39, 39, 39, 39, 39, 39],
       [33, 33, 33, 33, 33, 33, 33, 33, 37, 37, 37, 37, 37, 37, 37, 37,
        37, 39, 39, 39, 39, 39, 39, 39, 39],
       [33, 33, 33, 33, 33, 33, 33, 33, 33, 37, 37, 37, 37, 37, 37, 37,
        37, 39, 39, 39, 39, 39, 39, 39, 39],
       [33,  0,  0,  0, 33, 33, 33, 33, 33, 33, 33, 33, 37, 37, 37, 37,
         0, 39, 39, 39, 39, 39, 39, 39, 39],
       [33,  0,  0, 33, 33, 33, 33, 33, 33, 33, 33, 33, 37, 37, 37,  0,
         0,  0, 39, 39, 39, 39, 39, 39, 39],
       [33,  0,  0,  0, 33, 33, 33, 33, 33, 33, 33, 33, 37, 37, 37,  0,
         0,  0, 39, 39, 39, 39, 39, 39, 39],
       [ 0,  0,  0,  0,  0, 33, 33, 33, 33, 33, 33, 33, 37, 37, 37,  0,
         0,  0, 39, 39, 39, 39, 39, 39, 39],
       [ 0,  0,  0,  0, 33, 33, 33, 33, 33, 33,  0,  0,  0,  0,  0,  0,
         0,  0, 39, 39, 39, 39, 39, 39, 39],
       [ 0,  0,  0,  0, 33, 33, 33,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0, 39, 39, 39, 39, 39, 39, 39],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0, 39, 39, 39, 39, 39,  0, 39],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0, 39, 39, 39, 39,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0, 39, 39, 39,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0, 39, 39,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0]], dtype=int32)

I’d like to find the max height within each labeled region. I have done this successfully with a for loop, but it’s slow.

max_heights = {}
for label in list(np.unique(labeled_trees))[1:]:
    tree_height = np.amax(heights[labeled_trees == label])
    max_heights[str(label)] = tree_height

# max_heights = {'33': 42.18, '37': 41.8, '39': 35.95}

Is there a faster/vectorized/more efficient way of finding the max values within labeled regions of a numpy array? The ideal output would be a boolean array where the location of each max is True.

[EDIT]

The maximum_position function from scipy.ndimage is promising, but it looks like it only returns the first location where the pixel equals the local max. I need every location within a labeled region that equals its max.

Asked By: jesnes


Answers:

This returns a boolean mask in the form you asked for, but it marks only the first maximum in each region and it is not especially fast. On my machine, it takes about 60 µs:

def max_mask(labeled_trees, heights):
    cmp = labeled_trees.reshape(1, -1) != np.unique(labeled_trees)[1:, None]
    indices = np.ma.masked_array(np.broadcast_to(heights.ravel(), cmp.shape), cmp).argmax(-1)
    ret = np.zeros(heights.size, bool)
    ret[indices] = True
    return ret.reshape(heights.shape)

Some explanations:

  1. The first step uses broadcasting to compare every value of np.unique(labeled_trees)[1:] with labeled_trees.ravel(). The result is a 2D array of shape (np.unique(labeled_trees)[1:].size, labeled_trees.size). The equivalent code is:
cmp = np.array([labeled_trees.ravel() != elem for elem in np.unique(labeled_trees)[1:]])
  2. The second step flattens heights, broadcasts it to the shape of cmp, and wraps it in np.ma.masked_array with cmp as the mask. Calling argmax on the masked array then gives, for each row, the position of the maximum among the unmasked values. The equivalent code is:
indices = np.array([np.ma.masked_array(heights, mask).argmax() for mask in cmp])
  3. The remaining steps are simple. We now have the flat position of the maximum for each label, so we create a bool array of the same size, set those positions to True, then reshape and return it.
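
The three steps above can be traced on a toy input. This is only an illustrative sketch: the arrays labels and values are made-up data, not the asker's, and it assumes the smallest label is the ground label 0.

```python
import numpy as np

# Hypothetical toy data: two trees (labels 1 and 2) plus ground (label 0).
labels = np.array([[1, 1, 2],
                   [0, 2, 2]])
values = np.array([[3.0, 5.0, 4.0],
                   [0.0, 9.0, 1.0]])

# Step 1: one row per non-ground label; True marks pixels that do NOT belong to it.
tree_ids = np.unique(labels)[1:]          # drops the smallest label, assumed to be 0
cmp = labels.reshape(1, -1) != tree_ids[:, None]

# Step 2: mask out the foreign pixels, then take the per-row argmax.
tiled = np.broadcast_to(values.ravel(), cmp.shape)
indices = np.ma.masked_array(tiled, cmp).argmax(-1)

# Step 3: scatter the flat indices into a boolean mask and restore the 2D shape.
mask = np.zeros(values.size, bool)
mask[indices] = True
mask = mask.reshape(values.shape)
```

Here cmp has shape (2, 6), indices is [1, 4] (the flat positions of 5.0 and 9.0), and mask is True only at those two pixels.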

Test:

>>> print(max_mask(labeled_trees, heights).astype(int))
[[0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]]
>>> heights[max_mask(labeled_trees, heights)]
array([42.18, 41.8 , 35.95])

Better-performing version: based on how masked_array.argmax is implemented, the masked entries can be replaced with -np.inf directly, which is faster. On my machine, it only needs about 22 μs:

def max_mask(labeled_trees, heights):
    cmp = labeled_trees.reshape(1, -1) != np.unique(labeled_trees)[1:, None]
    indices = np.where(cmp, -np.inf, heights.ravel()).argmax(-1)
    ret = np.zeros(heights.size, bool)
    ret[indices] = True
    return ret.reshape(heights.shape)

In order to avoid the possible copy caused by labeled_trees.reshape, it can be changed to the following form:

def max_mask(labeled_trees, heights):
    cmp = labeled_trees[None] != np.unique(labeled_trees)[1:, None, None]
    indices = np.where(cmp, -np.inf, heights).reshape(-1, heights.size).argmax(-1)
    ret = np.zeros(heights.size, bool)
    ret[indices] = True
    return ret.reshape(heights.shape)
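
Whether reshape copies depends on the memory layout of its input. The following sketch uses a made-up transposed array (an assumption for illustration, not the asker's data) to show a case where reshape(1, -1) has to copy while indexing with None does not:

```python
import numpy as np

# Hypothetical non-contiguous input: the transpose of a C-ordered array.
base = np.arange(12, dtype=np.int32).reshape(3, 4)
transposed = base.T

flat_row = transposed.reshape(1, -1)   # has to copy to lay the data out in C order
new_axis = transposed[None]            # only adds a length-1 axis, so it is a view

copied = not np.shares_memory(transposed, flat_row)
viewed = np.shares_memory(transposed, new_axis)

# For a C-contiguous array, reshape can return a view as well.
contiguous_is_view = np.shares_memory(base, base.reshape(1, -1))
```

All three flags come out True, so the rewrite only matters when labeled_trees is not contiguous, for example when it is a transposed or strided slice of a larger raster.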

Version that matches the ideal output more closely: I noticed that you asked for "every location within a labeled region that equals its max", so I updated the answer again. It takes about 34 μs on my machine:

def max_mask(labeled_trees, heights):
    cmp = labeled_trees[None] != np.unique(labeled_trees)[1:, None, None]
    masked = np.where(cmp, -np.inf, heights).reshape(-1, heights.size)
    return (masked.max(-1, keepdims=True) == masked).any(0).reshape(heights.shape)

Test:

>>> print(max_mask(labeled_trees, heights).astype(int))
[[0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]]
>>> heights[max_mask(labeled_trees, heights)]
array([42.18, 42.18, 42.18, 42.18, 41.8 , 41.8 , 41.8 , 41.8 , 41.8 ,
       41.8 , 35.95, 35.95])

Supplement: the version above is vectorized as much as possible, but it requires a large amount of memory (especially when np.unique(labeled_trees) contains many labels). I tested it with random data and found that it slows down severely because of memory pressure. Therefore, here is a solution that loops over the labels instead. It requires very little extra memory:

def max_mask_loop(labeled_trees, heights):
    ret = np.zeros(heights.shape, bool)
    for val in np.unique(labeled_trees)[1:]:
        masked = np.where(labeled_trees != val, -np.inf, heights)
        ret |= masked.max() == masked
    return ret

Comparison:

>>> from timeit import timeit
>>> heights = np.random.rand(2500, 2500)
>>> labeled_trees = np.random.randint(0, 300, heights.shape)
>>> timeit(lambda: max_mask(labeled_trees, heights), number=1)
37.30420290003531
>>> timeit(lambda: max_mask_loop(labeled_trees, heights), number=1)
9.9376986999996
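
A back-of-the-envelope estimate helps explain the gap. The helper below is a hypothetical sketch (broadcast_bytes is not part of NumPy) that counts only the two largest temporaries of the broadcast version, assuming float64 heights as produced by np.random.rand:

```python
import numpy as np

def broadcast_bytes(n_labels, n_pixels, dtype=np.float64):
    """Rough size in bytes of the two big temporaries in the broadcast version."""
    cmp_bytes = n_labels * n_pixels * np.dtype(bool).itemsize      # boolean comparison array
    masked_bytes = n_labels * n_pixels * np.dtype(dtype).itemsize  # result of np.where
    return cmp_bytes, masked_bytes

# 2500 x 2500 pixels with labels drawn from 0..299, i.e. up to 299 non-ground labels.
cmp_bytes, masked_bytes = broadcast_bytes(299, 2500 * 2500)
print(f"cmp:    {cmp_bytes / 1e9:.2f} GB")
print(f"masked: {masked_bytes / 1e9:.2f} GB")
```

That is roughly 1.9 GB for the boolean array and 15 GB for the float array, whereas the loop version only ever holds temporaries the size of a single image.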
Answered By: Mechanic Pig

Below is a pure NumPy implementation using reduceat:

## Step 1 Flatten Array
height_1d = heights[labeled_trees>0].reshape(1,-1)[0]
labeled_trees_1d = labeled_trees[labeled_trees>0].reshape(1,-1)[0]

## Step 2: Sort arrays while maintaining their relationship
srt_indicies = labeled_trees_1d.argsort()
sorted_heights = height_1d[srt_indicies]
sorted_labeled_trees = labeled_trees_1d[srt_indicies]

## Extract indices where maximum need to be found
_, idx = np.unique(sorted_labeled_trees, return_index=True)

## Use maximum.reduceat to find the per-label maxima and a dict comprehension for the final output
{str(key):value for  key, value  in zip(list(sorted_labeled_trees[idx]) ,list(np.maximum.reduceat(sorted_heights, idx))) if key > 0}

Answered By: Abhishek

Here is a much simpler use of np.maximum.reduceat:

idx = labeled_trees.argsort(None)
sorted_labeled_trees = labeled_trees.ravel()[idx]
sorted_heights = heights.ravel()[idx]
bins = np.flatnonzero(np.diff(sorted_labeled_trees) != 0) + 1
max_heights = np.maximum.reduceat(sorted_heights, bins)
max_trees = sorted_labeled_trees[bins]
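
For readers unfamiliar with reduceat, here is a small illustration of what bins and the reduction produce. The arrays are made-up toy data that are already sorted by label; the names sorted_labels and sorted_values are placeholders.

```python
import numpy as np

# Hypothetical data, already sorted by label: ground (0), then trees 1 and 2.
sorted_labels = np.array([0, 0, 1, 1, 1, 2])
sorted_values = np.array([0.0, 0.0, 3.0, 7.0, 5.0, 4.0])

# Start index of every run of equal labels except the first run.
bins = np.flatnonzero(np.diff(sorted_labels) != 0) + 1   # array([2, 5])

# Maximum over sorted_values[2:5] and over sorted_values[5:].
run_max = np.maximum.reduceat(sorted_values, bins)        # array([7., 4.])
run_labels = sorted_labels[bins]                          # array([1, 2])
```

Because bins never contains index 0, the first run is skipped. That conveniently drops the ground label when label 0 is present, but it would drop the first tree if the array contained no ground pixels at all.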

If you insist on a dictionary, you can make one with zip:

result = dict(zip(max_trees, max_heights))

If you want a mask of the positions where the maxima occur and the number of trees is relatively small, you can compute the mask more-or-less directly using broadcasting:

peak_mask = ((max_trees == labeled_trees[..., None]) & (max_heights == heights[..., None])).any(-1)

If the number of trees is not small, you will be better off using a loop over the labels:

peak_mask = np.zeros(labeled_trees.shape, bool)
for t, h in zip(max_trees, max_heights):
    peak_mask |= (labeled_trees == t) & (heights == h)
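
Another possibility that avoids both the sort and the per-label loop is a scatter-max into a lookup table with np.maximum.at. This is a sketch under stated assumptions, not a benchmarked recommendation: peak_mask_lookup is a hypothetical helper, labels are assumed to be non-negative integers with 0 meaning ground, and the largest label is assumed small enough to allocate a table of that length. ufunc.at has historically been slow, so it is worth timing on real data.

```python
import numpy as np

def peak_mask_lookup(labeled_trees, heights):
    """Mark every pixel equal to the maximum of its own labeled region.

    Assumes labels are non-negative integers, 0 means ground, and the
    largest label is small enough to allocate a table of that length.
    """
    flat_labels = labeled_trees.ravel()
    # Table indexed by label; -inf marks labels that never occur.
    table = np.full(flat_labels.max() + 1, -np.inf)
    # Unbuffered scatter-max: table[l] = max(table[l], h) for every pixel.
    np.maximum.at(table, flat_labels, heights.ravel())
    # Look each pixel's region maximum back up and compare, excluding ground.
    return (labeled_trees > 0) & (heights == table[labeled_trees])

# Toy check with made-up data: both tied maxima of each tree are marked.
toy_labels = np.array([[1, 1, 2],
                       [0, 2, 2]])
toy_heights = np.array([[5.0, 5.0, 4.0],
                        [0.0, 9.0, 9.0]])
toy_mask = peak_mask_lookup(toy_labels, toy_heights)
```

The memory cost is one table of length max label + 1 plus a few image-sized temporaries, independent of how many trees there are.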
Answered By: Mad Physicist