How to change annotation features for Vision OCR?

Question:

I’m trying to extract text from a local image with Python and the Vision API, following the Cloud Vision API guide “Detect text in images”.

This is the function to extract text:

def detect_text(path):
    """Detects text in the file."""
    from google.cloud import vision
    import io
    client = vision.ImageAnnotatorClient()

    with io.open(path, 'rb') as image_file:
        content = image_file.read()

    image = vision.Image(content=content)

    response = client.text_detection(image=image)
    texts = response.text_annotations

It works, but I’d like to specify the use of features like TEXT_DETECTION instead of the default DOCUMENT_TEXT_DETECTION feature, as well as specify language hints. How would I do that? The text_detection function doesn’t seem to take such parameters.

Asked By: Tom Tolland


Answers:

The following article explains it; scroll down to the ‘Creating the Application’ section.

You need to add a request object to your code:

request = {
   "image": {
      "source": {
         "image_uri": "IMAGE_URL"
      }
   },
   "features": [
      {
         "type": "TEXT_DETECTION"
      }
   ],
   "image_context": {
      "language_hints": ["en-t-i0-handwrit"]
   }
}

Then pass it to annotate_image:

response = client.annotate_image(request)
Answered By: BeeFriedman
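
Since the question is about a local image rather than a URL, the same request can probably be built from the file's bytes. The following is only a sketch: build_request and detect_text_local are hypothetical helper names, and it assumes the 2.x google-cloud-vision client, where the feature type field is spelled type_ and the enum lives at vision.Feature.Type. Calling detect_text_local needs valid credentials and network access.

```python
from pathlib import Path


def build_request(content, feature_type, language_hints=None):
    """Build a dict request for annotate_image from raw image bytes.

    feature_type is expected to be a vision.Feature.Type member; it is passed
    in so that this helper does not have to import the client library.
    """
    request = {
        "image": {"content": content},
        # Assumption: the 2.x client names this field `type_`.
        "features": [{"type_": feature_type}],
    }
    if language_hints:
        request["image_context"] = {"language_hints": list(language_hints)}
    return request


def detect_text_local(path, language_hints=None):
    """Run TEXT_DETECTION on a local file and return its text annotations."""
    from google.cloud import vision  # imported lazily, as in the question

    client = vision.ImageAnnotatorClient()
    request = build_request(
        Path(path).read_bytes(),
        vision.Feature.Type.TEXT_DETECTION,
        language_hints,
    )
    response = client.annotate_image(request)
    if response.error.message:
        raise RuntimeError(response.error.message)
    return response.text_annotations
```

Swapping vision.Feature.Type.TEXT_DETECTION for vision.Feature.Type.DOCUMENT_TEXT_DETECTION in detect_text_local should select the other OCR feature with the same request shape.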

Alternatively, you can specify language hints by passing an image_context argument:

response = client.text_detection(image=image,
                                 image_context={"language_hints": ["en"]})
Answered By: Nestor Ceniza Jr
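
The call above could be wrapped in a small function that also checks the response for an error. This is a sketch under assumptions: detect_text_with_hints is a hypothetical name, and the client and image are passed in so that the function itself does not import the library.

```python
def detect_text_with_hints(client, image, language_hints):
    """Call text_detection with language hints and return the detected strings."""
    response = client.text_detection(
        image=image,
        image_context={"language_hints": list(language_hints)},
    )
    # The API reports per-image failures in response.error instead of raising.
    if response.error.message:
        raise RuntimeError(response.error.message)
    return [annotation.description for annotation in response.text_annotations]


# Usage with the objects from the question (requires credentials):
#   client = vision.ImageAnnotatorClient()
#   image = vision.Image(content=content)
#   detect_text_with_hints(client, image, ["en"])
```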