Python3 Key error when Key exists in dictionary of image files

Question

I am using Python and os to create a dictionary of key/value pairs for the files in a directory, and OpenCV with Tesseract (pytesseract) to preprocess the images and extract and print their text.
End goal: create a for loop that takes each image in the directory, appends the filename as a string to the path grocery_cve_project, processes each image, and extracts the text so it can be read in the console.

import os
print('os imported')
    
# import packages
from PIL import Image
import pytesseract
import cv2
    
print('packages imported')
    
### Part 1: store image names in dictionary
    
dir_name = ".\grocery_cve_project"
# This is where we get our array
# of file names and store in results
result = os.listdir(dir_name)
    
key_index_store = {}
for i, e in enumerate(result):
    key_index_store[i] = e
    #print(i, e)
    
#print("Our key value store is: ")
#print(key_index_store)
    
#  The types of file names we care about.
photo_extensions = [".jpg", ".png"]

# declare the tesseract executable path
pytesseract.pytesseract.tesseract_cmd = 'C:\Program Files\Tesseract-OCR\tesseract.exe'

### Part 2: image processing

for e in key_index_store[e]:
    image_to_ocr = cv2.imread('grocery_cve_project_\%s' % 'e')
    print(image_to_ocr)
        
    # convert to gray
    preprocessed_img = cv2.cvtColor(image_to_ocr, cv2.COLOR_BGR2GRAY)
   
    # step 2: do binary and Otsu thresholding
    preprocessed_img = cv2.threshold(preprocessed_img, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
    
    # step 3: Median Blur to remove noise in image
    preprocessed_img = cv2.medianBlur(preprocessed_img, 3)
    
    '''Step 4: SAVE AND LOAD IMAGE AS PIL image'''
    
    # step 1: Save the processed image to convert to PIL image
    for i in key_index_store[i]:
        cv2.imwrite(("tempdir\temp_img_%s.jpg" % 'i'), preprocessed_img)
        # step 2: load the image as a PIL/Pillow image
        preprocessed__pil_img = Image.open('temp_img.jpg')
    
    # step 1: do OCR of image using Tesseract
    text_extracted = pytesseract.image_to_string(preprocessed__pil_img)
    #Step 2: print the text
    print(text_extracted)

(Grocery_env) D:\Documents\Python\Multiple file array>"1. grocery tesseract.py"
    os imported
    packages imported
    Traceback (most recent call last):
      File "D:\Documents\Python\Multiple file array\1. grocery tesseract.py", line 44, in <module>
        for e in key_index_store[e]:
    KeyError: 'file_99.png'

My research indicates that this error comes up when a key does not exist in the dictionary. However, if I run the code commented out in line 21, print(i, e), it prints the key/value pairs for all the files in the directory, and ‘file_99’ does exist at index 236, and physically in the given directory.
The directory for the image files is in the same folder as the source code.

Asked By: BassBarPat

Answer 1

In the first part you populate the dictionary with numerical indexes as keys:

key_index_store = {}
for i, e in enumerate(result):
    key_index_store[i] = e

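This is a bit redundant, as the entries in result are already indexed by number.
Then, in the second part, you iterate over key_index_store[e]. That is most likely the error: the dictionary's keys are the integers from enumerate, while e still holds the last filename left over from the first loop, so the lookup raises KeyError. Just remove the [e]. Note that iterating over the dictionary itself yields its integer keys, so loop over key_index_store.values() (or over result directly) to get the filenames.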
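
To see why the lookup fails even though the file exists, here is a minimal sketch. The filenames are made up and stand in for whatever os.listdir returns on your machine; only the shape of the dictionary matters.

```python
# Hypothetical stand-in for os.listdir(dir_name); the real names will differ.
result = ["file_1.png", "file_2.jpg", "file_99.png"]

key_index_store = {}
for i, e in enumerate(result):
    key_index_store[i] = e

# After the loop, e still holds the last filename, while the keys are 0, 1, 2.
# 'file_99.png' is a value in the dictionary, not a key, so this lookup fails.
try:
    key_index_store[e]
    lookup_error = None
except KeyError as err:
    lookup_error = err

# Iterating over the values visits every filename instead.
filenames = list(key_index_store.values())
```

Here lookup_error ends up holding the same kind of KeyError as in the traceback, and filenames is the full list of names.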

Answered By: Florin C.

Answer 2

If I understood your code properly, I think you might be slightly confused about how to extract key/value pairs from dictionaries. But in this case the dict isn’t even necessary.

You could write this all in a single loop:

for idx, filename in enumerate(result):
    image_to_ocr = cv2.imread(os.path.join(dir_name, filename))
    # ... your image processing code ...
    out_filename = os.path.join("tempdir", f"temp_img_{idx}.jpg")
    cv2.imwrite(out_filename, preprocessed_img)
    preprocessed_pil_img = Image.open(out_filename)
    # ... the rest ...
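
If you do want to keep the dictionary, .items() gives you each key together with its value. The sketch below also puts the photo_extensions list from your question to use, so that non-image files in the directory are skipped. The is_photo helper and the sample filenames are my own inventions for illustration, not something from your code.

```python
import os

photo_extensions = [".jpg", ".png"]

def is_photo(filename):
    """Hypothetical helper: True if the file extension is in photo_extensions."""
    return os.path.splitext(filename)[1].lower() in photo_extensions

# Made-up directory listing, keyed by index as in the question.
key_index_store = {0: "file_1.png", 1: "notes.txt", 2: "file_2.JPG"}

# .items() yields (key, value) pairs; keep only the image files.
photos = [(idx, name) for idx, name in key_index_store.items() if is_photo(name)]
```

The same is_photo check can go at the top of the single loop above, followed by continue, to skip anything that is not an image.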

Answered By: Iguananaut
