How to detect figures in a newspaper image in Python?

Question:

I have a computer vision project in Python that separates the text from the figures in an image (for example, a scanned newspaper page).

What is the best way to detect those figures in the page, in Python?

Example image: Paper.

I haven’t tried anything yet; I have no idea where to start.

Asked By: Hamid Khellaf

Answers:

I would get started with the OpenCV module in Python, as it has a lot of really useful tools for image recognition. I’ll link it here:

https://pypi.org/project/opencv-python/

https://github.com/opencv

Go to the first link to download the package, and check the GitHub link if you need help or run into any issues.

Answered By: NeonSilver2

You can use an image segmentation approach. Apply a connected-component labelling algorithm to the binarized page so that every piece of text and every image is detected as a component. Components whose area exceeds a chosen threshold can then be treated as the images in the paper. OpenCV’s cv2.connectedComponentsWithStats function returns the components together with the area of each one.

Hope this helps.

Answered By: Archie
import cv2
import numpy as np

# Read the image
image = cv2.imread('paper-news.png')

# Convert to grayscale
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

# Blur the image
blurred = cv2.GaussianBlur(gray, (5, 5), 0)

# Detect edges with the Canny detector
canny = cv2.Canny(blurred, 30, 150)

# Find contours in the image
contours, hierarchy = cv2.findContours(canny.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

# Iterate over the contours
for contour in contours:
    # Get the rectangle bounding the contour
    x, y, w, h = cv2.boundingRect(contour)
    # Draw the rectangle
    cv2.rectangle(image, (x, y), (x + w, y + h), (36, 255, 12), 2)

# Show the image
cv2.imshow('Image with Figures Detected', image)
cv2.waitKey(0)

This should help you get started.

  1. A simple way would be to first detect the text regions using this resource:

     Detect text region in image using Opencv

  2. Then apply white-background thresholding and blob detection to the remaining region to find the images, using this resource:

     Detecting and counting blobs/connected objects with opencv

Answered By: bharathrajad

I found the layout-parser Python toolkit, which should be very helpful for your project.

Layout Parser is a unified toolkit for Deep Learning Based Document Image Analysis.

With the help of Deep Learning, layoutparser supports the analysis of very complex documents and the processing of the hierarchical structure in their layouts.

Check this complete notebook example on detecting newspaper layouts (separating image and text regions in the newspaper image).

It’s recommended to use a Jupyter notebook on Linux or macOS because layout-parser isn’t supported on Windows. Alternatively, you can use Google Colab, which is what I used to run the toolkit directly.

Requirements for installing the toolkit

pip install layoutparser # Install the base layoutparser library
pip install "layoutparser[layoutmodels]" # Install DL layout model toolkit
pip install "layoutparser[ocr]" # Install OCR toolkit

Then install the detectron2 model backend dependencies

pip install layoutparser torchvision && pip install "git+https://github.com/facebookresearch/[email protected]#egg=detectron2"    

Running the toolkit on a newspaper image

import layoutparser as lp
import cv2

# Convert the image from BGR (cv2 default loading style)
# to RGB
image = cv2.imread("test.jpg")
image = image[..., ::-1] 

# Load the deep layout model from the layoutparser API
# For all the supported models, please check the Model
# Zoo Page: https://layout-parser.readthedocs.io/en/latest/notes/modelzoo.html
model = lp.models.Detectron2LayoutModel('lp://PrimaLayout/mask_rcnn_R_50_FPN_3x/config',
                                        extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.7],
                                        label_map={1:"TextRegion", 2:"ImageRegion", 3:"TableRegion", 4:"MathsRegion", 5:"SeparatorRegion", 6:"OtherRegion"})
    
# Detect the layout of the input image
layout = model.detect(image)
   
# Show the detected layout of the input image
lp.draw_box(image, layout, box_width=3)
    

newspaper layouts detection

In the result image, text regions are outlined in orange boxes and image regions (figures) in white boxes. It’s an amazing deep learning toolkit for image recognition.
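
To go from the drawn boxes to the figures themselves, the detected blocks can be filtered by label. The sketch below is only an illustration under stated assumptions: `figure_boxes` is a hypothetical helper, and it relies on each block exposing `.type` and `.coordinates` (as `(x_1, y_1, x_2, y_2)`), which is how I understand layoutparser’s layout blocks to behave. The `SimpleNamespace` objects stand in for real detections so the snippet runs without a model:

```python
from types import SimpleNamespace


def figure_boxes(layout, label="ImageRegion"):
    """Return integer (x1, y1, x2, y2) boxes for blocks whose type equals label."""
    boxes = []
    for block in layout:
        if block.type == label:
            x1, y1, x2, y2 = (int(round(c)) for c in block.coordinates)
            boxes.append((x1, y1, x2, y2))
    return boxes


# Stand-in detections; with the real toolkit this would be model.detect(image).
fake_layout = [
    SimpleNamespace(type="TextRegion", coordinates=(10.2, 5.0, 300.7, 120.4)),
    SimpleNamespace(type="ImageRegion", coordinates=(12.6, 130.1, 298.4, 400.9)),
]

for x1, y1, x2, y2 in figure_boxes(fake_layout):
    # With a NumPy image, image[y1:y2, x1:x2] would crop the figure.
    print((x1, y1, x2, y2))
```

The label string must match one of the values in the `label_map` passed to the model, so `"ImageRegion"` is the one to use with the PrimaLayout configuration shown earlier.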

Answered By: Oghli