Why does Tesseract not recognize the text on the licence plate while easyocr does?

Question:

I am working on automatic licence plate recognition. I could crop the plate from the initial image, but Tesseract does not recognize the text on this plate while EasyOCR does. What is the reason? Thanks in advance for the answers. I used the following code to extract the plate from the car image and recognize the text.

import cv2 as cv
import pytesseract
import imutils
import numpy as np
import easyocr
# Point pytesseract at the local Tesseract installation (Windows path)
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'

# Load the car image, show it, and prepare an RGB copy and a grayscale version
img3 = cv.imread("4.png")
cv.imshow("Car", img3)
img3 = cv.cvtColor(img3, cv.COLOR_BGR2RGB)
gray = cv.cvtColor(img3, cv.COLOR_RGB2GRAY)

# Edge-preserving noise reduction, then Canny edge detection
bfilter_img3 = cv.bilateralFilter(img3, 11, 17, 17)
edged_img3 = cv.Canny(bfilter_img3, 30, 200)

# Find contours on the edge map and keep the ten largest by area
keypoints = cv.findContours(edged_img3.copy(), cv.RETR_TREE, cv.CHAIN_APPROX_SIMPLE)
contours = imutils.grab_contours(keypoints)
contours = sorted(contours, key=cv.contourArea, reverse=True)[:10]

# Take the first contour that approximates to a quadrilateral as the plate
location = None
for contour in contours:
    approx = cv.approxPolyDP(contour, 10, True)
    if len(approx) == 4:
        location = approx
        break

# Build a mask of the plate region and black out everything else
mask = np.zeros(gray.shape, np.uint8)
new_img = cv.drawContours(mask, [location], 0, 255, -1)
new_img = cv.bitwise_and(img3, img3, mask=mask)

print(location)
cv.imshow("Plate", new_img)


# Crop the plate from the grayscale image using the mask extents
(x, y) = np.where(mask == 255)
(x1, y1) = (np.min(x), np.min(y))
(x2, y2) = (np.max(x), np.max(y))
cropped_img = gray[x1:x2+1, y1:y2+1]

# Binarize and downscale the crop before passing it to Tesseract
ret, cropped_img = cv.threshold(cropped_img, 127, 255, cv.THRESH_BINARY)
cv.imshow("Plate3", cropped_img)
cropped_img = cv.resize(cropped_img, None, fx=2/3, fy=2/3, interpolation=cv.INTER_AREA)
# cropped_img is the plate image shown in the question
text = pytesseract.image_to_string(cropped_img)
print("Text by tesseract: ", text)

""""
reader=easyocr.Reader(['en'])
text2=reader.readtext(cropped_img)
print(text2)
"""
k = cv.waitKey(0)
Asked By: Mustafa Sonkal


Answers:

I’m kind of curious why you used bilateralFilter, Canny, findContours, etc. Did you look at the result of each step?

Anyway, if you set the page segmentation mode (--psm) to 6, which is:

Assume a single uniform block of text.

The result will be:

34 DUA34

Code:

import cv2
import pytesseract

# Load the image
img = cv2.imread("vHQ5q.jpg")

# Convert to grayscale
gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# OCR
print(pytesseract.image_to_string(gry, config="--psm 6"))

# Display
cv2.imshow("", gry)
cv2.waitKey(0)

You should get familiar with the page segmentation methods.
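For instance, here is a minimal sketch (not from the original answer) that loops over a few page segmentation modes and restricts the character set with tessedit_char_whitelist. "plate.png" is a hypothetical filename for the cropped plate, and whitelist support depends on your Tesseract version and OCR engine mode:

import cv2
import pytesseract

# Hypothetical filename for the cropped plate image from the question
plate = cv2.imread("plate.png", cv2.IMREAD_GRAYSCALE)

# 6 = single uniform block, 7 = single text line, 11 = sparse text
for psm in (6, 7, 11):
    config = (f"--psm {psm} "
              "-c tessedit_char_whitelist=ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789")
    text = pytesseract.image_to_string(plate, config=config)
    print(psm, repr(text.strip()))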

I got this result using pytesseract version 0.3.7.

Answered By: Ahx

You can try PaddleOCR; they publish a dedicated license plate recognition model and also provide a training guide for license plate recognition.

link: https://github.com/PaddlePaddle/PaddleOCR/blob/dygraph/applications/%E8%BD%BB%E9%87%8F%E7%BA%A7%E8%BD%A6%E7%89%8C%E8%AF%86%E5%88%AB.md
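If you just want to try PaddleOCR's general-purpose Python API before training the dedicated plate model, a minimal sketch (not from the original answer) looks roughly like this; "plate.png" is a hypothetical filename and the exact result layout varies between PaddleOCR versions:

from paddleocr import PaddleOCR

# Downloads the English detection/recognition models on first use
ocr = PaddleOCR(lang='en')

# Returns detected text boxes with (text, confidence) pairs per image
result = ocr.ocr("plate.png")

for line in result[0]:
    box, (text, confidence) = line
    print(text, confidence)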

Answered By: zhoujun