Real-time object detection lag

Question:

I'm trying to capture the position of a license plate in a webcam feed using YOLOv4-tiny and then pass the result to EasyOCR to extract the characters. The detection works well in real time, but when I apply the OCR the webcam stream becomes really laggy. Is there any way I can improve this code to make it less laggy?

My YOLOv4 detection code:

#detection
while 1:
    #_, pre_img = cap.read()
    #pre_img= cv2.resize(pre_img, (640, 480))
    _, img = cap.read()
    #img = cv2.flip(pre_img,1)
    hight, width, _ = img.shape
    blob = cv2.dnn.blobFromImage(img, 1 / 255, (416, 416), (0, 0, 0), swapRB=True, crop=False)

    net.setInput(blob)

    output_layers_name = net.getUnconnectedOutLayersNames()

    layerOutputs = net.forward(output_layers_name)

    boxes = []
    confidences = []
    class_ids = []

    for output in layerOutputs:
        for detection in output:
            score = detection[5:]
            class_id = np.argmax(score)
            confidence = score[class_id]
            if confidence > 0.7:
                center_x = int(detection[0] * width)
                center_y = int(detection[1] * hight)
                w = int(detection[2] * width)
                h = int(detection[3] * hight)
                x = int(center_x - w / 2)
                y = int(center_y - h / 2)
                boxes.append([x, y, w, h])
                confidences.append((float(confidence)))
                class_ids.append(class_id)

    indexes = cv2.dnn.NMSBoxes(boxes, confidences, .5, .4)

    boxes = []
    confidences = []
    class_ids = []

    for output in layerOutputs:
        for detection in output:
            score = detection[5:]
            class_id = np.argmax(score)
            confidence = score[class_id]
            if confidence > 0.5:
                center_x = int(detection[0] * width)
                center_y = int(detection[1] * hight)
                w = int(detection[2] * width)
                h = int(detection[3] * hight)

                x = int(center_x - w / 2)
                y = int(center_y - h / 2)

                boxes.append([x, y, w, h])
                confidences.append((float(confidence)))
                class_ids.append(class_id)

    indexes = cv2.dnn.NMSBoxes(boxes, confidences, .8, .4)
    font = cv2.FONT_HERSHEY_PLAIN
    colors = np.random.uniform(0, 255, size=(len(boxes), 3))
    if len(indexes) > 0:
        for i in indexes.flatten():
            x, y, w, h = boxes[i]
            label = str(classes[class_ids[i]])
            confidence = str(round(confidences[i], 2))
            color = colors[i]
            cv2.rectangle(img, (x, y), (x + w, y + h), color, 2)
           # detection= cv2.rectangle(img, (x, y), (x + w, y + h), color, 2)
            detected_image = img[y:y+h, x:x+w]
            cv2.putText(img, label + " " + confidence, (x, y + 400), font, 2, color, 2)
            #print(detected_image)
            cv2.imshow('detection',detected_image)

            cv2.imwrite('lp5.jpg',detected_image)
            cropped_image = cv2.imread('lp5.jpg')
            cv2.waitKey(5000)
            print("system is waiting")
            result = OCR(cropped_image)
            print(result)

My EasyOCR function:

def OCR(cropped_image):
    reader = easyocr.Reader(['en'], gpu=False)  # what the reader expects from the image
    result = reader.readtext(cropped_image)
    text = ''
    for result in result:
        text += result[1] + ' '

    spliced = (remove(text))
    return spliced
Asked By: EREN YEETGAR


Answers:

You are essentially saying "the while loop must be fast", and of course the OCR() call is a bit slow. Ok, good.

Don't call OCR() from within the loop. Rather, enqueue a request and let another thread / process / host worry about the OCR computation, while the loop quickly continues upon its merry way.

You could use a threaded Queue, or a subprocess, or blast it over to RabbitMQ or Kafka. The simplest approach would be to simply overwrite /tmp/cropped_image.png within the loop, and have another process notice such updates and (slowly) call OCR(), appending the results to a log file.

There might be a couple of updates to the image file while a single OCR call is in progress, and that's fine. The two are decoupled from one another, each progressing at its own pace. The downside of a queue would be OCR sometimes falling behind; you actually want to shed load by skipping some (redundant) cropped images.
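
Here is a minimal sketch of the threaded-Queue variant (not the asker's exact code), assuming the detection loop already produces a detected_image crop as in the question. The maxsize=1 queue is one way to shed load: while the worker is busy, newer crops are simply skipped.

import queue
import threading

import easyocr

ocr_jobs = queue.Queue(maxsize=1)   # holds at most one pending crop

def ocr_worker():
    reader = easyocr.Reader(['en'], gpu=False)   # build the reader once, in the worker
    while True:
        crop = ocr_jobs.get()                    # blocks until a crop arrives
        results = reader.readtext(crop)
        text = ' '.join(r[1] for r in results)
        print(text)                              # or append to a log file
        ocr_jobs.task_done()

threading.Thread(target=ocr_worker, daemon=True).start()

# inside the detection loop, instead of calling OCR() directly:
#     try:
#         ocr_jobs.put_nowait(detected_image.copy())   # hand off and keep going
#     except queue.Full:
#         pass                                         # worker still busy, skip this crop

Creating the Reader inside the worker keeps the slow model load off the capture thread.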


The two are racing, and that's fine. But take care to do things in atomic fashion: you wouldn't want to OCR an image that starts with one frame and ends with part of a subsequent frame. Write to a temp file and, after close(), use os.rename() to atomically make those pixels available under the name that the OCR daemon will read from. Once it has a file descriptor open for read, it will have no problem reading to EOF without interference; the kernel takes care of that for us.
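
A minimal sketch of that handoff, assuming the file-based variant above; the publish_crop/ocr_daemon names and the /tmp paths are illustrative, and OCR() is the question's own helper.

import os
import time

import cv2

def publish_crop(crop, final_path='/tmp/cropped_image.png'):
    # called from the detection loop: write the whole file, then swap it into place
    tmp_path = final_path + '.tmp.png'
    cv2.imwrite(tmp_path, crop)       # write (and close) the complete image first
    os.rename(tmp_path, final_path)   # atomic on the same filesystem; os.replace() also overwrites on Windows

def ocr_daemon(final_path='/tmp/cropped_image.png'):
    # run in a separate process: re-OCR only when a newer image appears
    last_mtime = 0.0
    while True:
        try:
            mtime = os.path.getmtime(final_path)
        except FileNotFoundError:
            mtime = 0.0
        if mtime > last_mtime:
            last_mtime = mtime
            print(OCR(cv2.imread(final_path)))   # the question's OCR() helper
        time.sleep(0.2)                          # poll a few times per second

Because the rename is atomic, the daemon never opens a half-written file.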

Answered By: J_H

There are several points.

  1. cv2.waitKey(5000) in your loop blocks for up to 5 seconds on every detection (it returns early only if a key is pressed), so remove it if you are not debugging.

  2. You are saving each detected region to a JPEG file and loading it again. Do not do that; just pass the OpenCV image (a NumPy array) directly into the OCR module.

  3. EasyOCR is a DNN model based on ResNet, but you are not using a GPU (gpu=False). You should use a GPU. See this benchmark by Liao.

  4. You are creating many EasyOCR Reader instances inside the loop (one per OCR() call). Create only one instance before the loop and reuse it inside the loop. I think this is the most important bottleneck; a sketch of all four changes follows this list.
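
A sketch folding those four changes together (the question's remove() post-processing is omitted, and gpu=True assumes a CUDA-capable GPU with a matching PyTorch build):

import easyocr

# 4. build the Reader once, outside the loop, and reuse it
# 3. gpu=True only helps if CUDA is actually available
reader = easyocr.Reader(['en'], gpu=True)

def OCR(cropped_image):
    # 2. cropped_image is the NumPy array sliced from the frame;
    #    no cv2.imwrite()/cv2.imread() round trip is needed
    results = reader.readtext(cropped_image)
    return ' '.join(r[1] for r in results)

# inside the detection loop:
#     detected_image = img[y:y+h, x:x+w]
#     print(OCR(detected_image))
#     # 1. drop cv2.waitKey(5000); keep only the usual cv2.waitKey(1)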

Answered By: relent95