Given a list of rectangles, how to find all rectangles that are fully contained within other ones?

Question:

I’ve got a computer vision algorithm that puts bounding boxes around detected objects. The list of bounding boxes looks like this:

bounding_boxes = [[x, y, w, h], [x2, y2, w2, h2], ...]

Here x and y are the coordinates of the top-left corner, and w and h are the width and height of the box. However, I’m not interested in boxes that are fully contained within any other, larger box. What’s an efficient method to detect those?

Asked By: megashigger


Answers:

As you confirmed in the comments under the question, you need to identify and remove the boxes that are contained in a single other box. If a box is contained in the union of several other boxes, but no single box contains it, then it should not be removed (e.g. for boxes = [[0, 0, 2, 4], [1, 1, 3, 3], [2, 0, 4, 4]], the second box is contained in the union of the first and the third, but it should not be removed).
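To make the "single other box" rule concrete, here is a minimal sketch of a containment test. It assumes the question's [x, y, w, h] format and treats boxes that share an edge as contained; the function name contains is only illustrative.

```python
def contains(outer, inner):
    """Return True if `inner` lies fully inside `outer`.

    Both boxes are [x, y, w, h]; touching edges count as contained
    (an assumption of this sketch).
    """
    ox, oy, ow, oh = outer
    ix, iy, iw, ih = inner
    return (ox <= ix and oy <= iy
            and ix + iw <= ox + ow
            and iy + ih <= oy + oh)


boxes = [[0, 0, 2, 4], [1, 1, 3, 3], [2, 0, 4, 4]]
# The second box is covered by the union of the other two,
# but neither of them contains it on its own.
single = [contains(b, boxes[1]) for b in (boxes[0], boxes[2])]
```

Here single comes out as [False, False], so the second box is kept.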


The naive (brute-force) algorithm for this task is very simple. Here is the pseudocode:

for i in [0, 1, ..., n-1]:
    for j in [i+1, i+2, ..., n-1]:
        check whether box[i] is contained in box[j], and vice versa.

The complexity of this algorithm is O(n^2), since every pair of boxes is checked once. It is very easy to implement and is recommended if the number of boxes is small (around 100-500, or even 1000 if you don’t need real-time performance for video processing).
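The brute-force idea could look roughly like this in Python. It is a sketch that assumes [x, y, w, h] boxes and returns indices rather than the boxes themselves; the name find_inner_boxes and the choice to keep the first of several identical boxes are assumptions, not part of the question.

```python
def find_inner_boxes(boxes):
    """Return sorted indices of boxes contained in some other single box."""

    def contains(outer, inner):
        ox, oy, ow, oh = outer
        ix, iy, iw, ih = inner
        return (ox <= ix and oy <= iy
                and ix + iw <= ox + ow
                and iy + ih <= oy + oh)

    inner = set()
    n = len(boxes)
    for i in range(n):
        for j in range(i + 1, n):
            # For identical boxes only the later one is marked,
            # so one copy always survives.
            if contains(boxes[i], boxes[j]):
                inner.add(j)
            elif contains(boxes[j], boxes[i]):
                inner.add(i)
    return sorted(inner)


boxes = [[0, 0, 10, 10], [1, 1, 2, 2], [20, 20, 5, 5], [21, 21, 1, 1]]
kept = [b for k, b in enumerate(boxes) if k not in set(find_inner_boxes(boxes))]
```

For the sample list above, kept holds only the two outer boxes.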


The fast algorithm runs in O(n log n) time on average, which I believe is also the theoretical lower bound for this problem. Formally, the required algorithm takes the following input and returns the following output:

Input: boxes[] - Array of n Rectangles, Tuples of (x1, y1, x2, y2), where
                 (x1, y1) are the coordinates of the bottom-left corner and
                 (x2, y2) are the coordinates of the top-right corner.
Output: inner_boxes[] - Array of Rectangles that should be removed.
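Note that this input format differs from the [x, y, w, h] lists in the question. A small conversion, sketched here under the assumption that (x1, y1) is simply the corner with the smaller coordinates (in image coordinates that is the top-left corner, which does not affect the containment test), might be:

```python
def to_corners(box):
    """Convert [x, y, w, h] to (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2."""
    x, y, w, h = box
    return (x, y, x + w, y + h)


bounding_boxes = [[0, 0, 2, 4], [1, 1, 3, 3]]
corner_boxes = [to_corners(b) for b in bounding_boxes]
```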

The pseudocode for the fast algorithm:

1) Allocate an Array events[] with the length 2*n, the elements of which are 
   Tuples (y, corresponding_box_index, event). 

2) For i in [0, 1, ..., n-1]:
     events[2 * i    ] = Tuple(boxes[i].y1, i, 'push')
     events[2 * i + 1] = Tuple(boxes[i].y2, i, 'pop')

3) Sort events[] in ascending order of y coordinate (from smaller to larger).
   If there are equal y coordinates, then:
   - Tuples with a 'pop' event are smaller than Tuples with a 'push' event.
   - If two Tuples have the same event, they are sorted in ascending order of
     the width of their corresponding boxes.

4) Create a Map cross_section_map[] that maps a Key x to a Value Tuple
   (corresponding_box_index, type), where type can be either 'left' or 'right'.
   Make sure that the 'insert' and 'erase' operations of this data structure
   have complexity O(log n), that it is iterable with the elements visited in
   key-ascending order, and that you can search for a key in O(log n) time.

5) For step in [0, 1, ..., 2*n-1]:
     If events[step].event is 'push':
       - Let i = events[step].corresponding_box_index
       - Insert a map boxes[i].x1 -> (i, 'left') to cross_section_map[]
       - Insert a map boxes[i].x2 -> (i, 'right') to cross_section_map[]
       - Search for a 'right'-typed key with an x value no less than boxes[i].x2
       - Iterate from that key until you find a key that corresponds to a box
         containing boxes[i], or one whose box has an x1 coordinate larger
         than the x1 coordinate of the newly added box. In the first case, add
         boxes[i] to inner_boxes[].
     If events[step].event is 'pop':
       - Let i = events[step].corresponding_box_index
       - Erase the elements with the keys boxes[i].x1 and boxes[i].x2

Now, the tricky part is step (4) of this algorithm. It may seem hard to implement such a data structure. However, the C++ standard library provides a ready-made implementation, std::map. The search operations that work in O(log n) are std::map::lower_bound and std::map::upper_bound.

This algorithm has an average complexity of O(n log n) and a worst-case complexity of O(n^2). If the number of boxes and their sizes are relatively small compared to the image size, the complexity is close to O(n).
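For readers working in Python, the sweep could be sketched roughly as follows. This is not a literal transcription of the pseudocode: Python's standard library has no ordered map like std::map, so a sorted list handled with bisect stands in for it (its insertions and deletions are O(n), so the stated bounds do not carry over). The sketch also stores only right edges, checks every active box whose right edge is at or beyond the new one instead of stopping early, and pushes larger boxes first on ties. All of these are assumptions of the sketch, as is the name find_inner_boxes_sweep.

```python
from bisect import bisect_left, insort


def find_inner_boxes_sweep(boxes):
    """Return sorted indices of boxes contained in some other single box.

    boxes: list of (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
    """
    events = []
    for i, (x1, y1, x2, y2) in enumerate(boxes):
        w, h = x2 - x1, y2 - y1
        # kind 0 = 'pop', 1 = 'push', so pops sort first at equal y.
        events.append((y2, 0, w, h, i))
        # Assumption of this sketch: on ties, larger boxes are pushed first
        # so that a container sharing y1 is already active.
        events.append((y1, 1, -w, -h, i))
    events.sort()

    active = []  # sorted (x2, index) pairs of boxes crossing the sweep line
    inner = set()
    for _, kind, _, _, i in events:
        x1, y1, x2, y2 = boxes[i]
        if kind == 1:  # push
            # First active box whose right edge is at or beyond x2
            # (the role played by lower_bound in C++).
            start = bisect_left(active, (x2, -1))
            for _, j in active[start:]:
                bx1, _, _, by2 = boxes[j]
                # Active boxes already satisfy y1_j <= y1 and x2_j >= x2.
                if bx1 <= x1 and by2 >= y2:
                    inner.add(i)
                    break
            insort(active, (x2, i))
        else:  # pop
            del active[bisect_left(active, (x2, i))]
    return sorted(inner)


boxes = [(0, 0, 10, 10), (1, 1, 3, 3), (20, 20, 25, 25),
         (21, 21, 22, 22), (0, 0, 5, 5)]
inner_indices = find_inner_boxes_sweep(boxes)
```

With identical boxes, this version keeps the one with the lowest index and reports the rest.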

Answered By: hav4ik