pypdf

Extract annotations by layer from a PDF in Python

Question: I have a PDF with annotations (markups) stored in different layers. Each layer has a specific name. I need to extract the annotations together with their layer names. In particular, I’m interested only in the location of each annotation (that is, its bounding box) …

Total answers: 2

Issue with loading online pdf in python notebook using langchain PyPDFLoader

Question: I am trying to load an online PDF with the Python langchain library from: http://datasheet.octopart.com/CL05B683KO5NNNC-Samsung-Electro-Mechanics-datasheet-136482222.pdf This is the code that I’m running locally: loader = PyPDFLoader(datasheet_path) pages = loader.load_and_split() I am getting the following error: PermissionError Traceback (most recent call last) Cell In[4], line …

Total answers: 2

How to extract text using PyPDF2 without the verbose output

Question: I want to copy the contents from a PDF into a text file. I am able to extract the text using the following code: from PyPDF2 import PdfReader infile = open("input.pdf", 'rb') reader = PdfReader(infile) for i in reader.pages: text = i.extract_text() … However, …

Total answers: 1

How to extract print scaling factor from a PDF file in Python?

Question: I would like to extract the PDF scaling factor programmatically using Python. Specifically, I want to extract the scale factor that appears in the "Fit" under "Print Sizing & Handling" when printing the PDF file. For example, if the "Fit" dropdown shows …

Total answers: 2

How to replace a word in pdf with Python

Question: I want to replace a word in a PDF, but when I try to do that I always get the same PDF back. Here is my code block. Currently I am using PyPDF2, but if there is any suggestion I can switch. What is the …

Total answers: 1

How can PyPDF2 read the correct size of a PDF page

Question: I tried to get the width and height of a PDF page with PyPDF2 using w, h = page.mediaBox.getWidth(), page.mediaBox.getHeight() print(w, h) # showing 595 842 However, the PDF is actually 842 x 595 (11.7 x 8.27 inch). Asked By: Mas Zero Answers: A PDF …

Total answers: 1
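
The answer excerpt above is truncated, so the following is only a sketch of one common cause of swapped dimensions, not necessarily the one that answer gives. It assumes the page carries a rotation of 90 or 270 degrees (the PDF /Rotate entry), in which case the MediaBox still reports the unrotated size while viewers show width and height swapped. The helper name displayed_size is hypothetical and uses plain numbers rather than a PyPDF2 page object.

```python
def displayed_size(width, height, rotate=0):
    """Return (width, height) as a viewer would show the page.

    width and height are the unrotated MediaBox dimensions in points;
    rotate is the page rotation in degrees (assumed to be a multiple of 90).
    """
    if rotate % 90 != 0:
        raise ValueError("rotate must be a multiple of 90")
    # A quarter turn (90 or 270 degrees) swaps the two dimensions.
    if rotate % 180 == 90:
        return height, width
    return width, height


# The numbers from the question: MediaBox 595 x 842, assumed rotation of 90.
print(displayed_size(595, 842, rotate=90))
```

In recent pypdf releases the inputs would typically come from page.mediabox.width, page.mediabox.height and page.rotation, but that is worth checking against the installed version.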

Create a partial pdf from bytes in python

Question: I have a PDF file somewhere. This PDF is being sent to the destination in chunks of equal size (apart from the last chunk). Let’s say this PDF file is being read like this in Python: with open(filename, 'rb') as file: chunk = file.read(3000) while …

Total answers: 1
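
The chunked transfer described in the question can be illustrated with a small generator. This is a minimal sketch, not the asker's code (which is truncated above): the name iter_chunks is hypothetical, and an in-memory byte string stands in for a real PDF file. It only shows how fixed-size chunks with a shorter final chunk arise; it does not address how to build a partial PDF from them.

```python
import io


def iter_chunks(stream, chunk_size=3000):
    """Yield successive chunks of at most chunk_size bytes from a binary stream."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            # read() returns b"" at end of stream.
            break
        yield chunk


# Stand-in for a real file opened with open(filename, "rb").
data = b"%PDF-1.4\n" + b"x" * 7000
chunks = list(iter_chunks(io.BytesIO(data)))

# Every chunk except possibly the last has exactly chunk_size bytes.
print([len(c) for c in chunks])
```

Joining the chunks in order reproduces the original bytes, so the receiver can reassemble the full file once every chunk has arrived.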

how to edit/modify text in PDF

Question: I am working on my final-year project, a website where a user can come and read a PDF. I am adding some features, such as converting currency to the user's country currency. I am using Flask and PyMuPDF for my project and I don’t know …

Total answers: 1