Python 3 KeyError when key exists in dictionary of image files

Question:

  • Using Python and os to create a dictionary of key/value pairs for the files in a directory, OpenCV to preprocess the images, and Tesseract (via pytesseract) to extract and print the text.

  • End goal: create a for loop that takes each image in the directory, appends the filename as a string to the grocery_cve_project path, processes each image, and extracts the text so it can be read in the console.

import os
print('os imported')
    
# import packages
from PIL import Image
import pytesseract
import cv2
    
print('packages imported')
    
### Part 1: store image names in dictionary
    
dir_name = ".\grocery_cve_project"
# This is where we get our array
# of file names and store in results
result = os.listdir(dir_name)
    
key_index_store = {}
for i, e in enumerate(result):
    key_index_store[i] = e
    #print(i, e)
    
#print("Our key value store is: ")
#print(key_index_store)
    
#  The types of file names we care about.
photo_extensions = [".jpg", ".png"]
# declare the tesseract executable path
pytesseract.pytesseract.tesseract_cmd = 'C:\Program Files\Tesseract-OCR\tesseract.exe'

### Part 2: image processing

for e in key_index_store[e]:
    image_to_ocr = cv2.imread('grocery_cve_project_\%s' % 'e')
    print(image_to_ocr)
        
    # convert to gray
    preprocessed_img = cv2.cvtColor(image_to_ocr, cv2.COLOR_BGR2GRAY)
   
    # step 2: do binary and Otsu thresholding
    preprocessed_img = cv2.threshold(preprocessed_img, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
    
    # step 3: Median Blur to remove noise in image
    preprocessed_img = cv2.medianBlur(preprocessed_img, 3)
    
    '''Step 4: SAVE AND LOAD IMAGE AS PIL image'''
    
    # step 1: Save the processed image to convert to PIL image
    for i in key_index_store[i]:
        cv2.imwrite(("tempdir\temp_img_%s.jpg" % 'i'), preprocessed_img)
        # step 2: load the image as a PIL/Pillow image
        preprocessed__pil_img = Image.open('temp_img.jpg')
    
    # step 1: do OCR of image using Tesseract
    text_extracted = pytesseract.image_to_string(preprocessed__pil_img)
    #Step 2: print the text
    print(text_extracted)

(Grocery_env) D:\Documents\Python\Multiple file array>"1. grocery tesseract.py"
    os imported
    packages imported
    Traceback (most recent call last):
      File "D:\Documents\Python\Multiple file array\1. grocery tesseract.py", line 44, in <module>
        for e in key_index_store[e]:
    KeyError: 'file_99.png'

  • Research indicates that this error comes up when an item does not exist in the dictionary. However, if I run the code commented out in line 21, print(i, e), it prints the key/value pairs for all the files in the directory, and ‘file_99’ does exist at index 236, and physically in the given directory.

  • The directory for the image files is in the same folder as the source code.

Asked By: BassBarPat


Answers:

In the first part you populate the dictionary with numerical indexes:

key_index_store = {}
for i, e in enumerate(result):
    key_index_store[i] = e

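This is a bit redundant, as your results are already indexed by number.
Then, in the second part, you iterate over key_index_store[e]. That is most likely the error: after the first loop, e still holds the last file name, but the dictionary's keys are integers, so the lookup raises KeyError. Just remove the [e]. Note that iterating over the dictionary itself yields its integer keys; iterate over key_index_store.values() (or over result directly) to get the file names.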
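
A minimal sketch of why the lookup fails, using made-up file names in place of the real os.listdir output (the names leftover, lookup_failed and keys_seen are only for illustration):

```python
# Made-up file names standing in for os.listdir(dir_name).
result = ["file_1.png", "file_2.jpg", "file_99.png"]

key_index_store = {}
for i, e in enumerate(result):
    key_index_store[i] = e  # keys are the integers 0, 1, 2

# The loop variable survives the loop and still holds the last file name.
leftover = e

# That name is stored as a value, not as a key, so indexing with it fails.
try:
    key_index_store[leftover]
    lookup_failed = False
except KeyError:
    lookup_failed = True

# Iterating over the dictionary itself yields its integer keys.
keys_seen = [k for k in key_index_store]
```

So the file can exist on disk and appear in the printed pairs while still being absent as a key.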

Answered By: Florin C.

If I understood your code properly, I think you might be slightly confused about how to extract key/value pairs from dictionaries. But in this case the dict isn’t even necessary.
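
For reference, if the dictionary were kept, its pairs could be read with .items(). This is only a sketch with made-up file names, not a required part of the fix:

```python
# Made-up entries in the same shape as the question's key_index_store.
key_index_store = {0: "file_1.png", 1: "file_2.jpg"}

pairs = []
# .items() yields (key, value) tuples, unpacked here into idx and filename.
for idx, filename in key_index_store.items():
    pairs.append(f"{idx}: {filename}")
```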

You could write this all in a single loop:

for idx, filename in enumerate(result):
    image_to_ocr = cv2.imread(os.path.join(dir_name, filename))
    # ... your image processing code ...
    out_filename = os.path.join("tempdir", f"temp_img_{idx}.jpg")
    cv2.imwrite(out_filename, preprocessed_img)
    preprocessed_pil_img = Image.open(out_filename)
    # ... the rest ...
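
The question defines photo_extensions but never uses it, so the loop above would also try to read non-image files. A hedged sketch of one way to filter first; list_image_paths is a hypothetical helper name, and sorting the listing is an assumption made here for predictable order:

```python
import os

# Extensions taken from the question's photo_extensions list.
PHOTO_EXTENSIONS = (".jpg", ".png")

def list_image_paths(dir_name, extensions=PHOTO_EXTENSIONS):
    """Return sorted full paths of files in dir_name with a matching extension."""
    paths = []
    for filename in sorted(os.listdir(dir_name)):
        # splitext returns (root, ext); compare the extension case-insensitively.
        ext = os.path.splitext(filename)[1].lower()
        if ext in extensions:
            paths.append(os.path.join(dir_name, filename))
    return paths
```

Each returned path could then be passed straight to cv2.imread. Building paths with os.path.join also avoids hand-written backslashes such as "tempdir\temp_img", where Python reads "\t" as a tab character.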

Answered By: Iguananaut