pdf-extraction

Is there any way to find text by its dimensions using PyMuPDF?

Question: This program gets the dimensions of a given text:

    import fitz

    doc = fitz.open("")
    for page in doc:
        print(page.search_for("Bank Account"))

I want to do the reverse: find text using its dimensions. Asked By: chintan bhimani || Source Answers: import …

Total answers: 2
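As a rough illustration of the reverse lookup asked about above, the sketch below selects words by their bounding boxes. It assumes the word tuples have the layout PyMuPDF returns from `page.get_text("words")`, that is `(x0, y0, x1, y1, text, ...)`. The helper names `words_in_rect` and `text_in_rect` are hypothetical and are not part of PyMuPDF; this is not taken from the answers to the question.

```python
from typing import Iterable, List, Sequence, Tuple

# (x0, y0, x1, y1) in PDF points, the same order PyMuPDF uses for rectangles
Rect = Tuple[float, float, float, float]


def words_in_rect(words: Iterable[Sequence], rect: Rect) -> List[str]:
    """Return the words whose bounding box lies fully inside rect.

    Each item in `words` must start with (x0, y0, x1, y1, text), which is
    the tuple layout of page.get_text("words") in PyMuPDF.
    """
    rx0, ry0, rx1, ry1 = rect
    return [
        w[4]
        for w in words
        if w[0] >= rx0 and w[1] >= ry0 and w[2] <= rx1 and w[3] <= ry1
    ]


def text_in_rect(pdf_path: str, page_number: int, rect: Rect) -> str:
    """Hypothetical wrapper: join the words found inside rect on one page."""
    import fitz  # PyMuPDF, imported lazily so words_in_rect has no dependency

    with fitz.open(pdf_path) as doc:
        page = doc[page_number]
        return " ".join(words_in_rect(page.get_text("words"), rect))
```

PyMuPDF also offers `page.get_textbox(rect)` and `page.get_text("text", clip=rect)`, which may be enough when no custom filtering is needed.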

How to improve Hindi text extraction?

Question: I am trying to extract Hindi text from a PDF. I tried all the methods to extract text from the PDF, but none of them worked. There are explanations of why it doesn't work, but no actual answers. So, I decided to convert the PDF to an image, and …

Total answers: 3

How to extract text from pdf in Python 3.7

Question: I am trying to extract text from a PDF file using Python. My main goal is to create a program that reads a bank statement and extracts its text to update an Excel file, so that I can easily record monthly spending. Right now I …

Total answers: 10

How to check if PDF is scanned image or contains text

Question: I have a large number of files; some of them are scanned images saved as PDF and some are full or partial text PDFs. Is there a way to check these files to ensure that we only process files which are scanned images and not …

Total answers: 12
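One possible heuristic for the scanned-versus-text check, sketched under stated assumptions: a page counts as a text page when it yields at least `min_chars` non-whitespace characters of extractable text. The threshold of 20 and the names `classify_pages` and `classify_pdf` are hypothetical choices for illustration, not something taken from the answers.

```python
from typing import List


def classify_pages(page_texts: List[str], min_chars: int = 20) -> str:
    """Classify a document as "scanned", "text", or "partial".

    page_texts holds the extracted text of each page. A page counts as a
    text page when it has at least min_chars non-whitespace characters.
    A document with no pages is reported as "scanned".
    """
    has_text = [len("".join(t.split())) >= min_chars for t in page_texts]
    if not any(has_text):
        return "scanned"
    if all(has_text):
        return "text"
    return "partial"


def classify_pdf(pdf_path: str, min_chars: int = 20) -> str:
    """Hypothetical wrapper that reads the page texts with PyMuPDF."""
    import fitz  # PyMuPDF, imported lazily so classify_pages has no dependency

    with fitz.open(pdf_path) as doc:
        return classify_pages([page.get_text() for page in doc], min_chars)
```

A scanned PDF that already carries an OCR text layer would be reported as "text" by this check, so it only separates files with and without extractable text.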