How to replace a word in a PDF with Python
Question:
I want to replace a word in a PDF, but when I try, I always get the same PDF back. Here is my code block. Currentyle I am using PyPDF2, but I can switch if there is a better suggestion. What is missing in my code?
with open(file_path, 'rb') as file:
    pdf_reader = PdfFileReader(file)
    # Encrypt the word in the PDF content
    encrypted_word = self.cipher.encrypt(word_to_encrypt_bytes)
    encrypted_word_b64 = base64.b64encode(encrypted_word)
    # Write the encrypted PDF content to a new PDF file
    pdf_writer = PdfFileWriter()
    for i in range(pdf_reader.getNumPages()):
        page = pdf_reader.getPage(i)
        page_content = page.extractText()
        page_content_b = page_content.encode('utf-8')
        page_content_b = page_content_b.replace(word_to_encrypt.encode(), encrypted_word_b64)
        page_content = page_content_b.decode('utf-8')
        pdf_writer.addPage(page)
    output_path = os.path.join(file_dir, file_name_without_ext + '_encryptedm' + ext)
    with open(output_path, 'wb') as output_file:
        pdf_writer.write(output_file)
I want to replace a word in my PDF.
Answers:
Forgive me for suggesting a solution with PyMuPDF. Suppose the example page text contains the misspelling "Currentyle" (as in the question) and we want to correct it to "Currently". We can use this snippet:
In [1]: import fitz  # PyMuPDF
...
In [9]: doc = fitz.open("test.pdf")
In [10]: page = doc[0]
In [11]: words = page.get_text("words")  # extract variant by single words
...
In [13]: for word in words:
    ...:     if word[4] == "Currentyle":
    ...:         page.add_redact_annot(word[:4], text="Currently")
    ...:
In [14]: page.apply_redactions()
Out[14]: True
In [15]: doc.ez_save("text-replaced.pdf")
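As for why the original loop changes nothing: extractText() returns a copy of the page text as a Python string, so editing that string never touches the page object that is later passed to addPage(). The session above can also be wrapped into a small script that handles every page. This is only a sketch under a few assumptions: the names find_word_rects and replace_word are made up for illustration, matching is exact per extracted word (so "Currentyle," with attached punctuation would not match), and the replacement text is drawn with PyMuPDF's default redaction font rather than the original one.

```python
def find_word_rects(words, target):
    """Return the (x0, y0, x1, y1) boxes of all entries whose text equals target.

    `words` is expected in the shape produced by page.get_text("words"):
    tuples of (x0, y0, x1, y1, text, block_no, line_no, word_no).
    """
    return [w[:4] for w in words if w[4] == target]


def replace_word(src_path, dst_path, old, new):
    """Replace every exact occurrence of `old` with `new` and save a copy.

    Hypothetical helper, not part of PyMuPDF. Returns the number of
    occurrences that were replaced.
    """
    import fitz  # PyMuPDF; imported here so find_word_rects works without it

    doc = fitz.open(src_path)
    total = 0
    for page in doc:
        rects = find_word_rects(page.get_text("words"), old)
        for rect in rects:
            # Mark the old word for removal and attach the replacement text.
            page.add_redact_annot(rect, text=new)
        if rects:
            # Actually remove the old text and draw the replacement.
            page.apply_redactions()
        total += len(rects)
    doc.ez_save(dst_path)
    doc.close()
    return total
```

A call such as replace_word("test.pdf", "text-replaced.pdf", "Currentyle", "Currently") would then mirror the interactive session, with the file names taken from the example above.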