How can I set the PDF version with PyPDF2?

Question:

I’m using PyPDF2 1.4 and Python 2.7:

How can I change the PDF version of a file?

What I tried

my_input_filename.pdf is PDF version 1.5, but my_output_filename.pdf comes out as a 1.3 PDF. I want to keep version 1.5 in the output:

from PyPDF2 import PdfFileWriter, PdfFileReader
from PyPDF2.generic import NameObject, createStringObject

input_filename = 'my_input_filename.pdf'
output_filename = 'my_output_filename.pdf'

# Read input PDF file
inputPDF = PdfFileReader(open(input_filename, 'rb'))
info = inputPDF.documentInfo

# Create output PDF
outputPDF = PdfFileWriter()
# Get the info dictionary of the output PDF
infoDict = outputPDF._info.getObject()
# Update output PDF metadata with input PDF metadata
for key in info:
    infoDict.update({NameObject(key): createStringObject(info[key])})

# Copy every page from the input into the single output document
for i in xrange(inputPDF.numPages):
    outputPDF.addPage(inputPDF.getPage(i))

with open(output_filename, 'wb') as outputStream:
    outputPDF.write(outputStream)

Answers:

Current versions of PyPDF2 can only produce files with a PDF 1.3 header. From the official source code:

class PdfFileWriter(object):

    """
    This class supports writing PDF files out, given pages produced by another
    class (typically :class:`PdfFileReader<PdfFileReader>`).
    """
    def __init__(self):
        self._header = b_("%PDF-1.3")
        ...

Whether that is legal is questionable, considering that it lets you feed in content that needs a version later than 1.3.

If you just want to fix the version string in the header (I don’t know what consequences that would have, so I assume you know more about the PDF standard than I do!), you can try:

from PyPDF2.utils import b_
...
# replace() returns a new object, so the result has to be assigned back
outputPDF._header = outputPDF._header.replace(b_("PDF-1.3"), b_("PDF-1.5"))

or something along those lines.
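If touching the private `_header` attribute feels too fragile, the same header fix can be sketched on the bytes PyPDF2 has already written. This is only a minimal sketch: `set_header_version` is a hypothetical helper, not part of PyPDF2, and it assumes the data starts with an 8-byte header of the form `%PDF-x.y`. Because the old and new headers have the same length, the byte offsets stored in the cross-reference table should stay valid.

```python
import re

# Matches a header such as b"%PDF-1.3" at the very start of the data.
_HEADER_RE = re.compile(br"%PDF-[0-9]\.[0-9]")


def set_header_version(pdf_bytes, version):
    """Return pdf_bytes with the header version replaced (hypothetical helper).

    version is a string such as "1.5".
    """
    if not re.match(r"[0-9]\.[0-9]\Z", version):
        raise ValueError("version must look like '1.5'")
    if not _HEADER_RE.match(pdf_bytes):
        raise ValueError("data does not start with a %PDF-x.y header")
    new_header = b"%PDF-" + version.encode("ascii")
    # Old and new headers are both 8 bytes long, so nothing after them moves.
    return new_header + pdf_bytes[len(new_header):]
```

To use it with the code from the question, one option is to call `outputPDF.write()` on an `io.BytesIO()` buffer, pass `buffer.getvalue()` through `set_header_version(..., "1.5")`, and write the result to the output file. This only changes the declared version; it does not check that the content is consistent with it.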

Answered By: Marcus Müller

Going to add to Marcus’ answer above:

There’s currently nothing stopping you from specifying the version in the metadata with the standard PyPDF2 addMetadata function (I can’t speak for the situation when Marcus wrote his post). The example below uses PdfFileMerger, as I’ve recently been doing some cleanup of PDF metadata on existing files, but PdfFileWriter has the same function:

from PyPDF2 import PdfFileMerger

# Define file input/output, and metadata containing version string.
# Using separate input/output files, since it's worth keeping a copy of the originals!
fileIn = 'foo.pdf'
fileOut = 'bar.pdf'
metadata = {
    u'/Version': 'PDF-1.5'
}

# Set up PDF file merger, copy existing file contents into merger object.
merger = PdfFileMerger()

with open( fileIn, 'rb') as fh_in:
    merger.append(fh_in)

# Append metadata to PDF content in merger.
merger.addMetadata(metadata)

# Write new PDF file with appended metadata to output
# CAUTION: This will overwrite any existing files without prompt!
with open( fileOut, 'wb' ) as fh_out:
    merger.write(fh_out)

Answered By: Rohaq

If I open the output of a file I ran through PyPDF2, it shows version 1.7 (Acrobat 8.x); changing 1.3 to 1.5 or anything else doesn’t make a difference.
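
One way to see what the file itself declares, independent of what a viewer reports, is to read the header line directly. The sketch below uses hypothetical helper names and only the standard library. It looks at the header alone; a /Version entry in the document catalog (allowed from PDF 1.4 onward) can override the header, and this sketch does not inspect it.

```python
import re


def parse_header_version(first_bytes):
    """Return the version from a header like b"%PDF-1.3" as text, or None."""
    match = re.match(br"%PDF-([0-9]\.[0-9])", first_bytes)
    return match.group(1).decode("ascii") if match else None


def pdf_header_version(path):
    """Read the start of the file at path and return its header version."""
    with open(path, "rb") as fh:
        return parse_header_version(fh.read(16))
```

Calling `pdf_header_version('my_output_filename.pdf')` before and after a header change shows whether the change actually reached the file.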

Answered By: Audio DiWHY