Generating bounding boxes from heatmap data

Question:

I have the heatmap data for a vehicle detection project I’m working on but I’m at a loss for where to go next. I want to draw a generalized bounding box around the ‘hottest’ portions of the image. My first thought was to just draw a box over all portions that overlap but something’s telling me there is a more accurate way to do this. Any help would be appreciated! Unfortunately my reputation prevents me from posting images. Here’s how I’m creating the heatmap:

import numpy as np

# Positive prediction window coordinate structure: ((x1, y1), (x2, y2))
def create_heatmap(bounding_boxes_list):
    # Create a black image the same size as the input data
    heatmap = np.zeros(shape=(375, 1242))
    # Traverse the list of bounding box locations in test image
    for bounding_box in bounding_boxes_list:
        (x1, y1), (x2, y2) = bounding_box
        # Add heat to every pixel inside the window (rows are y, columns are x)
        heatmap[y1:y2, x1:x2] += 1

    return heatmap

Here’s the link to the heatmap I have

Here’s a general idea of what I had in mind

Asked By: brokenfulcrum


Answers:

Applying Otsu’s threshold and then running contour detection on the resulting binary image should do it. Using a screenshot of the heatmap without the axis lines:

import cv2

# Grayscale then Otsu's threshold
image = cv2.imread('1.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]

# Find contours
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    x,y,w,h = cv2.boundingRect(c)
    cv2.rectangle(image, (x, y), (x + w, y + h), (36,255,12), 2)

cv2.imshow('thresh', thresh)
cv2.imshow('image', image)
cv2.waitKey()
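
When the raw heatmap array from the question is available, the thresholding step can be applied to it directly instead of to a screenshot. Otsu's method picks the cutoff that maximizes the between-class variance of the two resulting groups. The following is a minimal NumPy-only sketch of that idea; `otsu_threshold` is a hypothetical helper written for illustration (it is not an OpenCV function), and the bin count of 256 is an assumption.

```python
import numpy as np

def otsu_threshold(values, bins=256):
    """Return a cutoff t such that `values >= t` is the 'hot' class (illustrative helper)."""
    hist, edges = np.histogram(np.asarray(values, dtype=float), bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2

    w0 = np.cumsum(hist)            # count of pixels in bins 0..k (cool class)
    w1 = w0[-1] - w0                # count of pixels in bins k+1.. (hot class)
    s0 = np.cumsum(hist * centers)  # running sum of intensities for the cool class
    valid = (w0 > 0) & (w1 > 0)     # both classes must be non-empty
    if not valid.any():
        raise ValueError("constant input has no meaningful threshold")

    m0 = s0[valid] / w0[valid]               # mean of the cool class
    m1 = (s0[-1] - s0[valid]) / w1[valid]    # mean of the hot class
    between = w0[valid] * w1[valid] * (m0 - m1) ** 2  # between-class variance (unnormalized)

    k = np.flatnonzero(valid)[np.argmax(between)]
    return edges[k + 1]             # lower edge of the first bin in the hot class

# Example: a synthetic heatmap with one hot rectangle
heatmap = np.zeros((10, 10))
heatmap[2:5, 3:7] = 4
hot_mask = heatmap >= otsu_threshold(heatmap)
```

The resulting boolean mask can be converted with `hot_mask.astype(np.uint8) * 255` and passed to `cv2.findContours` in place of `thresh` above.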
Answered By: nathancy

Another approach is to threshold the heatmap, find connected components, and draw a box around each connected component.

An optional additional step is to downsample before finding connected components. This makes it faster and joins nearby components (resulting in fewer bounding boxes).

Here’s some TensorFlow code I used to do that:

import tensorflow as tf
import tensorflow_addons as tfa
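from dataclasses import dataclass
from typing import NewType, Tuple

# Note: ITensorImageBoxer (the base class) and tf_max_downsample (a max-pooling helper) are not shown in this answer.
TensorHeatmap = NewType('TensorHeatmap', tf.Tensor)  # A (height, width) heatmap (assumed definition; not shown in the original)
TensorIndexVector = NewType('TensorIndexVector', tf.Tensor)  # A vector of indices
TensorLTRBBoxes = NewType('TensorLTRBBoxes', tf.Tensor)  # An array of boxes, specified by (Left, Top, Right, Bottom) pixel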


@dataclass
class ConnectedComponentSegmenter(ITensorImageBoxer):
    """ Creates bounding boxes from heatmap """
    pre_pool_factor: int = 16  # We use downsampling to both reduce compute time and link nearby components
    times_mean_thresh: float = 100  # Heat must be this much times the mean to qualify
    pad: int = 10  # Pad the box by this many pixels (may cause box edge to fall outside image)

    def find_bounding_boxes(self, heatmap: TensorHeatmap) -> Tuple[TensorIndexVector, TensorLTRBBoxes]:
        """ Generate bounding boxes (represented as (box_ids, box_coords)) from heatmap """
        salient_mask = tf.cast(heatmap / tf.reduce_mean(heatmap) > self.times_mean_thresh, tf.int32)
        salient_mask_shrunk = tf_max_downsample(salient_mask, factor=self.pre_pool_factor)
        component_label_image_shrunk = tfa.image.connected_components(salient_mask_shrunk)
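        # Upsample the labels back to full resolution, then keep only pixels that passed the threshold
        component_label_image = tf.image.resize(
            component_label_image_shrunk[:, :, None],
            size=(heatmap.shape[0], heatmap.shape[1]),
            method=tf.image.ResizeMethod.NEAREST_NEIGHBOR,
        )[:, :, 0] * salient_mask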
        nonzero_ij = tf.where(component_label_image)
        component_indices = tf.gather_nd(component_label_image, nonzero_ij)

        num_segments = tf.reduce_max(component_label_image_shrunk) + 1
        box_lefts = tf.math.unsorted_segment_min(nonzero_ij[:, 1], component_indices, num_segments=num_segments)
        box_tops = tf.math.unsorted_segment_min(nonzero_ij[:, 0], component_indices, num_segments=num_segments)
        box_rights = tf.math.unsorted_segment_max(nonzero_ij[:, 1], component_indices, num_segments=num_segments)
        box_bottoms = tf.math.unsorted_segment_max(nonzero_ij[:, 0], component_indices, num_segments=num_segments)
        boxes = tf.concat([box_lefts[1:, None], box_tops[1:, None], box_rights[1:, None], box_bottoms[1:, None]], axis=1)
        return tf.range(len(boxes)), boxes
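
The threshold, downsample, label, and box steps described above can also be sketched without TensorFlow. The version below is a rough NumPy illustration, not the code from this answer: the function names, the fixed `min_heat` cutoff (instead of a multiple of the mean), and the use of 4-connectivity are all assumptions made for the example.

```python
from collections import deque
import numpy as np

def max_downsample(mask, factor):
    """Block-wise max pooling; pads with False so the shape divides evenly."""
    h, w = mask.shape
    ph, pw = -h % factor, -w % factor
    padded = np.pad(mask, ((0, ph), (0, pw)))
    blocks = padded.reshape((h + ph) // factor, factor, (w + pw) // factor, factor)
    return blocks.max(axis=(1, 3))

def label_components(mask):
    """Label 4-connected components by breadth-first flood fill (0 = background)."""
    labels = np.zeros(mask.shape, dtype=np.int32)
    count = 0
    for start in zip(*np.nonzero(mask)):
        if labels[start]:
            continue  # already reached from an earlier seed
        count += 1
        labels[start] = count
        queue = deque([start])
        while queue:
            i, j = queue.popleft()
            for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
                inside = 0 <= ni < mask.shape[0] and 0 <= nj < mask.shape[1]
                if inside and mask[ni, nj] and not labels[ni, nj]:
                    labels[ni, nj] = count
                    queue.append((ni, nj))
    return labels, count

def heatmap_to_boxes(heatmap, min_heat=2, factor=1):
    """Return (left, top, right, bottom) boxes with inclusive pixel indices."""
    mask = heatmap >= min_heat
    small_labels, count = label_components(max_downsample(mask, factor))
    # Upsample labels to full resolution and keep only pixels that passed the threshold
    full = np.repeat(np.repeat(small_labels, factor, axis=0), factor, axis=1)
    full = full[:mask.shape[0], :mask.shape[1]] * mask
    boxes = []
    for k in range(1, count + 1):
        rows, cols = np.nonzero(full == k)
        boxes.append((int(cols.min()), int(rows.min()), int(cols.max()), int(rows.max())))
    return boxes
```

With `factor=1` every separate hot region gets its own box; a larger `factor` merges regions that fall into adjacent pooled cells, which is the "joins nearby components" effect mentioned above. The pure-Python flood fill is slow on large images, so this is meant to show the logic rather than to be used as-is.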

Answered By: Peter

Another approach that is not based on connected components or contour-finding is to "square off" each segment in a downward-rightward direction and find the resulting corners.

Code in this answer: https://stackoverflow.com/a/73298520/851699

Answered By: Peter