Creating and then modifying a PDF file in Python

Question:

I am writing some code that merges some PDFs from their file paths and then writes some text on each page of the merged document. My problem is this: I can do both things separately – merge PDFs and write text to a PDF – I just can't seem to do it all in one go.

My code is below. The PDFs are merged together from their file paths, which are stored in an Excel workbook. They are then saved as a single PDF with a file name obtained from the workbook (this will change depending on which PDFs are merged, so it needs to be dynamic), and I then attempt to write text (a question number) to this merged PDF.

I keep getting the error "cannot save with zero pages" and I am not sure why, since I can save the merged file, and I can write the desired text to any other PDF with the function I made if I pass the document's file path into it. Any ideas on how I can merge these PDFs into a single file, then edit it with the inserted text and save it with the chosen file name from the Excel doc? Hopefully you get what I mean!

import fitz  # PyMuPDF
import xlwings as xw
from pypdf import PdfMerger

def insert_qu_numbers(document):
    qu_numbers = fitz.open(document)
    counter = 0
    for page in qu_numbers:
        page.clean_contents()
        counter += 1
        text = f"Question {counter}"
        text_length = fitz.get_text_length(text, fontname= "times-roman")
        print(text_length)
        rect_x0 = 70
        rect_y0 = 50
        rect_x1 = rect_x0 + text_length + 35
        rect_y1 = rect_y0 + 40

        rect = fitz.Rect(rect_x0, rect_y0, rect_x1, rect_y1)
        page.insert_textbox(rect, text, fontsize = 16, fontname = "times-roman", align = 0)
    qu_numbers.write()

# opens the workbook and gets the file paths.
wbxl = xw.Book('demo.xlsm')
get_links = wbxl.sheets['Sheet1'].range('C2:C5').value

# gets rid of any blank cells in range and makes a list of all the file paths called filenames
filenames = []
for file in get_links:
    if file is not None:
        filenames.append(file)

# gets each file path from filename list and adds it to merged pdf where it will be merged
merged_pdf = PdfMerger()
for i in range(len(filenames)):
    merged_pdf.append(filenames[i], 'rb')

# merges separate file paths into one pdf and names it the output name in the given cell
output_name = wbxl.sheets['Sheet1'].range('C7').value
final = merged_pdf.write(output_name + ".pdf")

insert_qu_numbers(final)
Asked By: Jon Percival


Answers:

You can use PyMuPDF for merging and modification as well:

import fitz  # PyMuPDF

# filelist = list of files to merge
doc = fitz.open()  # the output to receive the merged PDFs
for file in filelist:
    src = fitz.open(file)
    doc.insert_pdf(src)  # append input file
    src.close()

for page in doc:  # now iterate through the pages of the result
    page.insert_text(...)  # insert text or whatever was on your mind (arguments elided)
doc.ez_save("output.pdf")
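
As for the error itself: most likely `merged_pdf.write(...)` does not return a file path (as far as I can tell it returns None), so `fitz.open(final)` creates a new, empty document, and saving that fails with "cannot save with zero pages". Note also that `Document.write()` in PyMuPDF returns the PDF as bytes rather than saving to disk. Below is a minimal sketch of the whole task done in PyMuPDF only, reusing the rectangle and font settings from the question. The function names `clean_paths` and `merge_and_number` are hypothetical, and passing `fontsize=16` to `get_text_length` is my own adjustment so the measured width matches the inserted text.

```python
def clean_paths(cells):
    """Drop empty (None) cells from a column of file paths read from a worksheet."""
    return [cell for cell in cells if cell is not None]


def merge_and_number(paths, output_path):
    """Merge the PDFs in `paths`, write 'Question N' on each page, save once.

    Returns the number of pages in the saved document.
    """
    # PyMuPDF is imported here so clean_paths stays usable without it.
    import fitz

    doc = fitz.open()  # new, empty PDF that receives the merged pages
    for path in paths:
        with fitz.open(path) as src:
            doc.insert_pdf(src)  # append every page of the input file

    for number, page in enumerate(doc, start=1):
        text = f"Question {number}"
        # Measure the text at the same font and size used for insertion.
        width = fitz.get_text_length(text, fontname="times-roman", fontsize=16)
        rect = fitz.Rect(70, 50, 70 + width + 35, 50 + 40)
        page.insert_textbox(rect, text, fontsize=16,
                            fontname="times-roman", align=0)

    # Save exactly once, after all edits, under the requested name.
    doc.save(output_path, garbage=3, deflate=True)
    page_count = doc.page_count
    doc.close()
    return page_count
```

With the workbook from the question, this could be called as `merge_and_number(clean_paths(get_links), output_name + ".pdf")`, so nothing needs to be written to disk and reopened between merging and numbering.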
Answered By: Jorj McKie