How to get PDF file metadata 'Page Size' using Python?

Question:

I am trying to use the PyPDF2 module in Python 3, but I can’t find a way to display the ‘Page Size’ property.
I would like to know what the dimensions of the sheet of paper were before it was scanned to a PDF file.

Something like this:

import PyPDF2
pdf = PyPDF2.PdfFileReader("sample.pdf")
print(pdf.getNumPages())

But I’m looking for a different function than, for example, getNumPages()…

The command below prints some metadata, but not the page size:

pdf_info=pdf.getDocumentInfo()
print(pdf_info)
Asked By: Mirek


Answers:

This code should help you:

import PyPDF2
pdf = PyPDF2.PdfFileReader("a.pdf")
p = pdf.getPage(0)  # pages are zero-indexed; 0 is the first page

w_in_user_space_units = p.mediaBox.getWidth()
h_in_user_space_units = p.mediaBox.getHeight()

# 1 user space unit is 1/72 inch by default
# 1/72 inch = 25.4 / 72 mm ~ 0.3528 millimeters

w = float(p.mediaBox.getWidth()) * 25.4 / 72   # width in mm
h = float(p.mediaBox.getHeight()) * 25.4 / 72  # height in mm
Answered By: Perf

When trying @Perf’s answer in 2023 (PyPDF2 3.0.1), a lot of deprecation warnings and errors surface. Here’s a more up-to-date version:

import PyPDF2

pdf = PyPDF2.PdfReader("a.pdf")
page = pdf.pages[0]  # pages are zero-indexed; 0 is the first page

width_in_user_space_units = page.mediabox.width
height_in_user_space_units = page.mediabox.height

# 1 user space unit is 1/72 inch by default, and 1 inch is 25.4 mm
width_in_mm = float(width_in_user_space_units) * 25.4 / 72
height_in_mm = float(height_in_user_space_units) * 25.4 / 72
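
Since the question is about the original sheet of paper, it may help to map the media box dimensions to a named paper size. The following is only a sketch: `guess_paper_size`, `PAPER_SIZES_MM`, and the 2 mm tolerance are hypothetical names and values chosen for illustration, not part of PyPDF2. It also assumes the default unit of 1/72 inch and that the scanner stored the real sheet size in the media box, which is not guaranteed.

```python
MM_PER_POINT = 25.4 / 72  # 1 inch = 25.4 mm; 1 default user space unit = 1/72 inch

# Nominal sizes in millimeters as (width, height) in portrait orientation.
# Hypothetical lookup table; extend it with whatever sizes you expect.
PAPER_SIZES_MM = {
    "A3": (297.0, 420.0),
    "A4": (210.0, 297.0),
    "A5": (148.0, 210.0),
    "Letter": (215.9, 279.4),
    "Legal": (215.9, 355.6),
}


def guess_paper_size(width_pt, height_pt, tolerance_mm=2.0):
    """Return (name, orientation) for the first matching size, or None.

    width_pt and height_pt are the media box width and height in
    default user space units (points).
    """
    width_mm = float(width_pt) * MM_PER_POINT
    height_mm = float(height_pt) * MM_PER_POINT
    # Compare short side to short side so landscape pages also match.
    short_side, long_side = sorted((width_mm, height_mm))
    orientation = "portrait" if width_mm <= height_mm else "landscape"
    for name, (w, h) in PAPER_SIZES_MM.items():
        if abs(short_side - w) <= tolerance_mm and abs(long_side - h) <= tolerance_mm:
            return name, orientation
    return None  # no known size within the tolerance


print(guess_paper_size(595.276, 841.89))  # A4, portrait
print(guess_paper_size(792, 612))         # Letter, landscape
```

With the code above, this could be called as `guess_paper_size(page.mediabox.width, page.mediabox.height)`. Scanned pages are often a few millimeters off the nominal size, so the tolerance may need adjusting.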

Answered By: creimers