PyPDF2 PdfFileWriter: splitting a PDF appends pages to the previous file rather than saving each page as its own file

Question:

I have a dictionary with about 30 key/value pairs, each mapping a name to a page number. I am looping through the dictionary, pulling the page with that number out of a PDF, and trying to save it as its own file.

It does most of what I want, but rather than saving each file with only its own page, it keeps the pages from the previous iterations, adds the new page, and saves the result under the new name.

How do I get each page that I loop through saved as its own file rather than appended to the previous one?

reader = PdfFileReader(infile)
writer = PdfFileWriter()

for x, y in page_list.items():
    with open(x+'.pdf', 'wb') as outfile:
        writer.addPage(reader.getPage(y-1))
        writer.write(outfile)

Asked By: George M

Answers:

You should re-instantiate the writer inside the loop, as shown below:

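from PyPDF2 import PdfFileReader, PdfFileWriter

reader = PdfFileReader(infile)

for x, y in page_list.items():
    # Create a fresh, empty writer for every output file
    writer = PdfFileWriter()
    # Page numbers in page_list are 1-based; getPage() is 0-based
    writer.addPage(reader.getPage(y - 1))
    with open(x + '.pdf', 'wb') as outfile:
        writer.write(outfile)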

Keeping the writer instantiated outside the loop means it retains every page added so far, so each output file contains all the earlier pages plus the new one.
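
As a minimal sketch of the same fix, the loop can be wrapped in a function that builds a fresh writer for every entry. The name `split_pages` and the `make_writer` and `out_dir` parameters are illustrative assumptions, not part of the original code. The sketch also uses the newer `pages` / `add_page` names, since `PdfFileReader`, `PdfFileWriter`, `getPage` and `addPage` were deprecated and then removed in PyPDF2 3.0.

```python
from pathlib import Path


def split_pages(reader, page_list, make_writer, out_dir="."):
    """Write one single-page PDF per {name: 1-based page number} entry.

    reader      -- object exposing a `pages` sequence (e.g. a PdfReader)
    make_writer -- zero-argument callable returning a new, empty writer
                   (e.g. the PdfWriter class itself)
    out_dir     -- hypothetical parameter: directory for the output files
    """
    written = []
    for name, page_number in page_list.items():
        # A new writer per iteration starts empty, so pages added in
        # earlier iterations cannot carry over into this file.
        writer = make_writer()
        writer.add_page(reader.pages[page_number - 1])
        target = Path(out_dir) / f"{name}.pdf"
        with open(target, "wb") as outfile:
            writer.write(outfile)
        written.append(target)
    return written
```

Assuming pypdf (or PyPDF2 2.x or later) is installed, a call might look like `split_pages(PdfReader(infile), page_list, PdfWriter)` after importing `PdfReader` and `PdfWriter` from the library. Passing the class instead of an instance is what guarantees a new writer on each iteration.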

Alternatively, a faster approach can be to use pdftk, which splits every page of the input into its own file: pdftk input.pdf burst
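
If pdftk is installed, the burst step could be driven from the same Python script. This is only a sketch under assumptions: `burst_command`, `burst` and `page_file` are hypothetical helper names, and the `pg_%04d.pdf` pattern mirrors pdftk's default output naming. Because burst writes every page of the input, the files for the pages listed in the dictionary still have to be picked out and renamed afterwards.

```python
import subprocess

DEFAULT_PATTERN = "pg_%04d.pdf"  # pdftk's default naming for burst output


def burst_command(input_pdf, pattern=DEFAULT_PATTERN):
    """Build the pdftk argument list that splits a PDF into one file per page."""
    return ["pdftk", input_pdf, "burst", "output", pattern]


def burst(input_pdf, pattern=DEFAULT_PATTERN):
    """Run pdftk; raises CalledProcessError if pdftk exits with an error."""
    subprocess.run(burst_command(input_pdf, pattern), check=True)


def page_file(page_number, pattern=DEFAULT_PATTERN):
    """Return the file name burst produces for a 1-based page number."""
    return pattern % page_number
```

With these helpers, the page for a dictionary entry `(x, y)` would be found at `page_file(y)` after `burst(infile)` has run, and could then be renamed to `x + '.pdf'`.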

Answered By: SIDDHARTH SAHANI