How to apply Connected Component Analysis in left-to-right order in OpenCV

Question:

I am using connected component analysis to recognize characters in an image. For that I am using the
cv2.connectedComponentsWithStats() function. It finds the characters, but not in any particular order.

num_labels, labels, stats, centroids = cv2.connectedComponentsWithStats(img, 8, cv2.CV_32S)

After getting the component dimensions I preview each component, but the order looks random.
How can I get the components in the same order as they appear in the original image?

Actual output order: [image]

Expected character order: [image]

Asked By: Andrew Kaleem


Answers:

As @Cris Luengo mentioned, the labeling runs along image rows, left to right, then top to bottom. Taller characters have their topmost pixels on an earlier row, so they are encountered and labeled first. You need to reorder the components based on their coordinates.
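To make the scan-order effect concrete, here is a toy pure-Python labeler. It is only a sketch of the behavior described above, not OpenCV's actual algorithm; the function name `label_in_raster_order` and the 4x4 grid are made up for illustration. The tall blob sits to the right of the short one, yet it receives the first label because the scan reaches its top pixel first.

```python
from collections import deque


def label_in_raster_order(grid):
    """Label 8-connected foreground blobs in the order a row-by-row scan first meets them.

    Toy model only: this is not how OpenCV implements labeling.
    """
    rows, cols = len(grid), len(grid[0])
    labels = [[0] * cols for _ in range(rows)]  # 0 means background / unlabeled
    next_label = 1
    for r in range(rows):            # top to bottom
        for c in range(cols):        # left to right within a row
            if grid[r][c] and labels[r][c] == 0:
                # First pixel of a new blob: flood-fill the whole blob with next_label.
                labels[r][c] = next_label
                queue = deque([(r, c)])
                while queue:
                    cr, cc = queue.popleft()
                    for dr in (-1, 0, 1):
                        for dc in (-1, 0, 1):
                            nr, nc = cr + dr, cc + dc
                            if (0 <= nr < rows and 0 <= nc < cols
                                    and grid[nr][nc] and labels[nr][nc] == 0):
                                labels[nr][nc] = next_label
                                queue.append((nr, nc))
                next_label += 1
    return labels


# Short blob on the left (column 0), tall blob on the right (column 3).
toy = [
    [0, 0, 0, 1],
    [0, 0, 0, 1],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
]
toy_labels = label_in_raster_order(toy)
print(toy_labels[0][3], toy_labels[2][0])  # prints: 1 2
```

The right-hand blob gets label 1 and the left-hand blob gets label 2, which is why label order alone cannot be used as reading order.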

For example, the code below takes a sample image of the text ‘hello’, preprocesses it, and extracts the connected components.

# import the necessary packages
import cv2
from google.colab.patches import cv2_imshow

img = cv2.imread('img.png')
img_bw=cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
cv2_imshow(img_bw)

[image: grayscale input (img_bw)]

# apply an inverted binary threshold using Otsu's method
thresh = cv2.threshold(img_bw, 0, 255,cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]
cv2_imshow(thresh)

[image: thresholded result (thresh)]

# getting connected components
numlabels, labels, stats, centroids = cv2.connectedComponentsWithStats(thresh, 8, cv2.CV_32S)

# using the returned stats, crop each character out of its component mask
identified_character_components = []
for i in range(1, numlabels):  # start at 1: label 0 is the background

  # unpack the bounding box and area of component i
  x = stats[i, cv2.CC_STAT_LEFT]
  y = stats[i, cv2.CC_STAT_TOP]
  w = stats[i, cv2.CC_STAT_WIDTH]
  h = stats[i, cv2.CC_STAT_HEIGHT]
  a = stats[i, cv2.CC_STAT_AREA]

  component_mask = (labels == i).astype("uint8") * 255
  box_image = component_mask[y:y+h, x:x+w]
  # store the x coordinate with the crop so the crops can be sorted left to right later
  identified_character_components.append((x, box_image))
  cv2_imshow(box_image)
  print("")

[image: identified components]

As you can see, the components are shown as ‘l l h e o’, because the labeling runs along image rows, left to right, then top to bottom, so the taller characters are encountered first. To reorder the characters, use identified_character_components, which stores each character's x coordinate together with its pixels.

# return the first element of a tuple (the x coordinate)
def takeFirstElm(ele):
  return ele[0]


# sort the components by the first element of each tuple (the x coordinate)
def reorder_first_index(components):
  return sorted(components, key=takeFirstElm)

ordered_elements = reorder_first_index(identified_character_components)

# drop the x coordinate, keeping only the character images
ordered_character_components = []
for element in ordered_elements:
  ordered_character_components.append(element[1])


# show the characters in left-to-right order
for character in ordered_character_components:
  cv2_imshow(character)
  print("")

[image: ordered output]

Now ordered_elements holds the (x, image) pairs sorted by x coordinate, and ordered_character_components holds the character images in left-to-right order.
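If only the label order is needed, the same idea can be applied to the stats rows directly. The sketch below is pure Python and assumes stats is laid out like the array returned by cv2.connectedComponentsWithStats, with the left x coordinate in column 0 (cv2.CC_STAT_LEFT) and the background in row 0. The helper name `labels_left_to_right` and the sample rows are hypothetical, chosen to mimic the ‘l l h e o’ case above.

```python
def labels_left_to_right(stats):
    """Return foreground label indices ordered by the left edge of their bounding box.

    `stats` has one row per label, with the left x coordinate in column 0
    (the position of cv2.CC_STAT_LEFT). Row 0 is the background and is skipped.
    Works on a list of rows as well as on a NumPy array.
    """
    return sorted(range(1, len(stats)), key=lambda i: stats[i][0])


# Made-up rows: [left, top, width, height, area]
example_stats = [
    [0, 0, 100, 40, 3000],  # label 0: background
    [30, 5, 8, 30, 120],    # label 1: first 'l'
    [45, 5, 8, 30, 110],    # label 2: second 'l'
    [5, 5, 12, 30, 200],    # label 3: 'h'
    [20, 12, 10, 23, 150],  # label 4: 'e'
    [60, 12, 12, 23, 160],  # label 5: 'o'
]
order = labels_left_to_right(example_stats)
print(order)  # prints: [3, 4, 1, 2, 5]
```

Each index in order can then be used to crop the character as in the loop above. Note that sorting by x alone only suits a single line of text; multi-line input would also need the y coordinate.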

Answered By: Nathindu Himansha