How can PyPDF2 read the correct size of a PDF page?

Question:

I tried to get the width and height of a PDF page with PyPDF2 using:

w, h = page.mediaBox.getWidth(), page.mediaBox.getHeight()
print(w, h) # showing 595 842

However, the PDF is actually 842 x 595 points (11.7 x 8.27 inches) when viewed.

Asked By: Mas Zero


Answers:

A PDF page can have a "viewing rotation" which causes the PDF viewer to rotate the page before showing it.

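The viewing rotation for this page is evidently 90 or 270 degrees, since the displayed dimensions are swapped relative to the media box. Use PyPDF2 to read the page's rotation (stored in the page's /Rotate entry), and swap the width and height if the rotation is 90 or 270.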
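A minimal sketch of the swap described above. The helper name displayed_size is hypothetical (it is not part of PyPDF2), and the function works on plain numbers so that it does not depend on any particular PyPDF2 version. The commented usage lines assume the legacy API from the question and that the rotation is set directly on the page dictionary rather than inherited from a parent node.

```python
def displayed_size(width, height, rotation=0):
    """Return (width, height) as a viewer would show the page.

    rotation is the page's /Rotate value in degrees; the PDF format
    requires it to be a multiple of 90. None is treated as 0.
    """
    rotation = int(rotation or 0) % 360  # normalise, e.g. -90 -> 270
    if rotation % 90 != 0:
        raise ValueError("rotation must be a multiple of 90 degrees")
    if rotation in (90, 270):
        # The page is shown on its side, so the two sides swap.
        return height, width
    return width, height


# Possible usage with the legacy API from the question (assumption:
# /Rotate is present on the page itself, not inherited):
#   w, h = page.mediaBox.getWidth(), page.mediaBox.getHeight()
#   rotation = page.get('/Rotate', 0)
#   print(displayed_size(w, h, rotation))
```

With the values from the question, displayed_size(595, 842, 90) gives (842, 595), which matches what the viewer shows.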

You can normalise the pages to remove the viewing rotations whilst leaving the PDF visually intact with:

cpdf in.pdf -upright -o out.pdf

Answered By: johnwhitington