How to remove noise around numbers using OpenCV

Question

I’m trying to use Tesseract-OCR to get the readings on below images but having issues getting consistent results with the spotted background. I have below configuration on my pytesseract

CONFIG = f"—psm 6 -c tessedit_char_whitelist=01234567890ABCDEFGHIJKLMNOPQRSTUVWXYZÅÄabcdefghijklmnopqrstuvwxyzåäö.,-"

I have also tried below image pre-processing with some good results, but still not perfect results

blur = cv2.blur(img,(4,4))
(T, threshInv) = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)

What I want is to consistently be able to identify the numbers and the decimal separator. What image pre-processing could help in getting consistent results on images as below?

Asked By: MisterButter

||

Source

Answer 1

You may find a solution using a slightly more complex approach by filtering in the frequency domain instead of the spatial domain. The thresholds might require some tweaking depending on how tesseract performs with the output images.

Implementation:

import cv2
import numpy as np
from matplotlib import pyplot as plt

img = cv2.imread('C:\Test\number.jpg', cv2.IMREAD_GRAYSCALE)

# Perform 2D FFT
f = np.fft.fft2(img)
fshift = np.fft.fftshift(f)
magnitude_spectrum = 20*np.log(np.abs(fshift))

# Squash all of the frequency magnitudes above a threshold
for idx, x in np.ndenumerate(magnitude_spectrum):
    if x > 195:
        fshift[idx] = 0

# Inverse FFT back into the real-spatial-domain
f_ishift = np.fft.ifftshift(fshift)
img_back = np.fft.ifft2(f_ishift)
img_back = np.real(img_back)
img_back = cv2.normalize(img_back, None, alpha=0, beta=255, norm_type=cv2.NORM_MINMAX, dtype=cv2.CV_32F)
out_img = np.copy(img)

# Use the inverted FFT image to keep only the black values below a threshold
for idx, x in np.ndenumerate(img_back):
    if x < 100:
        out_img[idx] = 0
    else:
        out_img[idx] = 255

plt.subplot(131),plt.imshow(img, cmap = 'gray')
plt.title('Input Image'), plt.xticks([]), plt.yticks([])
plt.subplot(132),plt.imshow(img_back, cmap = 'gray')
plt.title('Reversed FFT'), plt.xticks([]), plt.yticks([])
plt.subplot(133),plt.imshow(out_img, cmap = 'gray')
plt.title('Output'), plt.xticks([]), plt.yticks([])
plt.show()

Output:

Median Blur Implementation:

import cv2
import numpy as np

img = cv2.imread('C:\Test\number.jpg', cv2.IMREAD_GRAYSCALE)
blur = cv2.medianBlur(img, 3)

for idx, x in np.ndenumerate(blur):
    if x < 20:
        blur[idx] = 0

cv2.imshow("Test", blur)
cv2.waitKey()

Output:

Final Edit:

So using Eumel’s solution and combining this bit of code on the bottom of it yields a 100% successful result:

img[pat_thresh_1==1] = 255
img[pat_thresh_15==1] = 255
img[pat_thresh_2==1] = 255
img[pat_thresh_25==1] = 255
img[pat_thresh_3==1] = 255
img[pat_thresh_35==1] = 255
img[pat_thresh_4==1] = 255

# Eumel's code above this line

img = cv2.erode(img, np.ones((3,3)))

cv2.imwrite("out.png", img)
cv2.imshow("Test", img)

print(pytesseract.image_to_string(Image.open("out.png"), lang='eng', config='--psm 10 --oem 3 -c tessedit_char_whitelist=0123456789.,'))

Output Image Examples:

Whitelisting the tesseract characters appears to help quite a bit as well to prevent false identification.

Answered By: Abstract

Answer 2

That was a challenge but i think i have an interesting approach: Pattern-matching

If you zoom in, you realize that the pattern in the back only has 4 possible dots, a single full pixle, a double full pixel and a double pixel with a medium left or right. So what i did was grab these 4 patterns from the image with 17.160.000,00 and got to work. Save these to load again, i just grabbed them on the fly

img = cv2.imread('C:/Users/***/17.jpg', cv2.IMREAD_GRAYSCALE)

pattern_1 = img[2:5,1:5]
pattern_2 = img[6:9,5:9]
pattern_3 = img[6:9,11:15]
pattern_4 = img[9:12,22:26]

# just to show it carries over to other pics ;)
img = cv2.imread('C:/Users/****/6.jpg', cv2.IMREAD_GRAYSCALE)

Actual Pattern Matching

Next we match all the patterns and threshold to find all occurrences, i used 0.7 but you can play around with it a little. These patterns take off some pixels on the side and only match a sigle pixel on the left so we pad twice (one with an extra) to hit both for the first 3 patterns. The last one is the single pixel so it doesnt need it

res_1 = cv2.matchTemplate(img,pattern_1,cv2.TM_CCOEFF_NORMED )
thresh_1 = cv2.threshold(res_1,0.7,1,cv2.THRESH_BINARY)[1].astype(np.uint8)
pat_thresh_1 = np.pad(thresh_1,((1,1),(1,2)),'constant')
pat_thresh_15 = np.pad(thresh_1,((1,1),(2,1)), 'constant')
res_2 = cv2.matchTemplate(img,pattern_2,cv2.TM_CCOEFF_NORMED )
thresh_2 = cv2.threshold(res_2,0.7,1,cv2.THRESH_BINARY)[1].astype(np.uint8)
pat_thresh_2 = np.pad(thresh_2,((1,1),(1,2)),'constant')
pat_thresh_25 = np.pad(thresh_2,((1,1),(2,1)), 'constant')
res_3 = cv2.matchTemplate(img,pattern_3,cv2.TM_CCOEFF_NORMED )
thresh_3 = cv2.threshold(res_3,0.7,1,cv2.THRESH_BINARY)[1].astype(np.uint8)
pat_thresh_3 = np.pad(thresh_3,((1,1),(1,2)),'constant')
pat_thresh_35 = np.pad(thresh_3,((1,1),(2,1)), 'constant')
res_4 = cv2.matchTemplate(img,pattern_4,cv2.TM_CCOEFF_NORMED )
thresh_4 = cv2.threshold(res_4,0.7,1,cv2.THRESH_BINARY)[1].astype(np.uint8)
pat_thresh_4 = np.pad(thresh_4,((1,1),(1,2)),'constant')

Editing the Image

Now the only thing left to do is remove all the matches from the image. Since we have a mostly white backround we just set them to 255 to blend in.

img[pat_thresh_1==1] = 255
img[pat_thresh_15==1] = 255
img[pat_thresh_2==1] = 255
img[pat_thresh_25==1] = 255
img[pat_thresh_3==1] = 255
img[pat_thresh_35==1] = 255
img[pat_thresh_4==1] = 255

Output

Edit:

Take a look at Abstracts answer as well for refining this output and tesseract finetuning

Answered By: Eumel

How to remove noise around numbers using OpenCV

Question:

Answers: