How to replace a word in pdf with Python

Question:

i want to replace a word in a pdf but when i try to do that it always returns me same pdf. Here is my code block. Currentyle i am using pypdf2 but if is there any suggestion i can switch it. What is the missing part at my code?

  with open(file_path, 'rb') as file:
        pdf_reader = PdfFileReader(file)
        # Encrypt the word in the PDF content
        encrypted_word = self.cipher.encrypt(word_to_encrypt_bytes)
        encrypted_word_b64 = base64.b64encode(encrypted_word)
        # Write the encrypted PDF content to a new PDF file
        pdf_writer = PdfFileWriter()
        for i in range(pdf_reader.getNumPages()):
            page = pdf_reader.getPage(i)
            page_content = page.extractText()
            page_content_b = page_content.encode('utf-8')
            page_content_b = page_content_b.replace(word_to_encrypt.encode(), encrypted_word_b64)
            page_content = page_content_b.decode('utf-8')
            pdf_writer.addPage(page)

        output_path = os.path.join(file_dir, file_name_without_ext + '_encryptedm' + ext)
        with open(output_path, 'wb') as output_file:
            pdf_writer.write(output_file)

I want to place a word in my pdf.

Asked By: dsDjango

||

Answers:

Forgive me when suggesting a solutiuon with PyMuPDF. Example page text:enter image description here

Suppose we want to correct the misspelling, we can use this snippet:

In [1]: import fitz  # PyMuPDF
...
In [9]: doc=fitz.open("test.pdf")
In [10]: page=doc[0]
In [11]: words=page.get_text("words")  # extract variant by single words
...
In [13]: for word in words:
    ...:     if word[4] == "Currentyle":
    ...:         page.add_redact_annot(word[:4],text="Currently")
    ...:
In [14]: page.apply_redactions()
Out[14]: True
In [15]: doc.ez_save("text-replaced.pdf")

Gives us this:
enter image description here

Answered By: Jorj McKie
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.