Convert PNG/JPG to a Word file using Python

Question:

I need to convert many jpg/png files to docx files and then to pdf. My main goal is to get the text in each image into a pdf file; if I need to edit any text manually, I can do that in Word and save it again as the corresponding pdf file.

I’ve tried using an API, but it failed because the extracted text did not match the image correctly.

My image files contain only text and nothing else.

I already have docx to pdf conversion code in Python.

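from docx2pdf import convert

# Renamed from `input` to avoid shadowing the built-in function
input_path = 'INPUT_FILE_NAME.docx'
output_path = 'OUTPUT_FILE_NAME.pdf'

convert(input_path)               # writes INPUT_FILE_NAME.pdf next to the .docx
convert(input_path, output_path)  # writes to an explicit output path
convert("Output")                 # a folder path converts every .docx inside it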

Please suggest how to convert a png/jpg file to docx. Thanks.

EDIT:

I’ve got this code running and uploaded it to my GitHub repo.

Asked By: Nitin Kumar

Answers:

Do you want to extract the text from the PNG into a text file?

Answered By: Masud Al Nahid
from PIL import Image
from pytesseract import pytesseract

# Define path to tesseract.exe (Windows; adjust to your install location)
path_to_tesseract = r'C:\Program Files\Tesseract-OCR\tesseract.exe'

# Define path to image
path_to_image = 'texttoimage.png'

# Point tesseract_cmd to tesseract.exe
pytesseract.tesseract_cmd = path_to_tesseract

# Open image with PIL
img = Image.open(path_to_image)

# Extract text from image
text = pytesseract.image_to_string(img)

print(text)
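
To get from the extracted text to a docx file, one option is the python-docx package. The following is only a rough sketch under a few assumptions: python-docx is installed, Tesseract is already configured as above, and blank lines in the OCR output mark paragraph breaks. The names split_paragraphs, image_to_docx and convert_folder are made up for this example.

```python
from pathlib import Path


def split_paragraphs(ocr_text):
    """Split raw OCR output into paragraphs on blank lines, re-joining wrapped lines."""
    paragraphs = []
    current = []
    for line in ocr_text.splitlines():
        stripped = line.strip()
        if stripped:
            current.append(stripped)
        elif current:
            # A blank line closes the paragraph collected so far
            paragraphs.append(" ".join(current))
            current = []
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs


def image_to_docx(image_path, docx_path):
    """OCR one image and save the recognised text as a .docx file."""
    # Third-party imports are local so split_paragraphs works without them
    from PIL import Image
    import pytesseract
    from docx import Document  # provided by the python-docx package

    text = pytesseract.image_to_string(Image.open(image_path))
    document = Document()
    for paragraph in split_paragraphs(text):
        document.add_paragraph(paragraph)
    document.save(str(docx_path))


def convert_folder(folder):
    """Write a .docx next to every .png/.jpg/.jpeg found in `folder`."""
    for image_path in sorted(Path(folder).iterdir()):
        if image_path.suffix.lower() in {".png", ".jpg", ".jpeg"}:
            image_to_docx(image_path, image_path.with_suffix(".docx"))
```

Calling convert_folder on a folder of images (for example a hypothetical "scans" folder) should leave one docx per image, which can be corrected by hand in Word and then passed to the docx2pdf code from the question. OCR output does not preserve the original layout, so expect to fix line breaks and spacing manually.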
Answered By: Masud Al Nahid