Downloading PDF files from a PHP server using Python

Question:

I am trying to download PDFs (very rarely, a few may be Word files) hosted on a PHP server. The files appear to be numbered sequentially from 1 to 14000 on the server, and each one can be downloaded from the link http://ppmoe.dot.ca.gov/des/oe/awards/bidsum/dl.php?id=X, where X is a number in the range [1, 14000]. I am using the following code for X = 200, which I then plan to run in a loop over all values in [1, 14000] to save every file in a specific folder:

import requests

url = "http://ppmoe.dot.ca.gov/des/oe/awards/bidsum/dl.php?id=200"

s = requests.Session()
response = s.get(url)

with open("file200.pdf", "w") as f:
    f.write(response.content)
    f.close()

But it’s returning the following error:

Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
TypeError: write() argument must be str, not bytes

I’m unsure whether these files can be downloaded using Python, and PHP is unfamiliar to me. Thanks!

Asked By: Pepa

||

Answers:

You need to add b to the mode argument of open() (that is, "wb" instead of "w") so the data is written to the file as binary; response.content contains bytes, not a string:

# "wb" opens the file for binary writing; the with block closes it automatically
with open("file200.pdf", "wb") as f:
    f.write(response.content)
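
The question also mentions looping over every id and the occasional Word file, so here is a minimal sketch of how that loop might look. It is not tested against the real server. The helper names (guess_filename, download_all, main) and the "downloads" folder are made up for illustration, and it assumes the server may send a Content-Disposition header with the original filename; if it does not, the code falls back to file<ID>.pdf. It also assumes a missing id comes back as a non-200 status or an empty body.

```python
import os
import re

# Download endpoint taken from the question.
BASE_URL = "http://ppmoe.dot.ca.gov/des/oe/awards/bidsum/dl.php"


def guess_filename(file_id, headers):
    """Pick a filename from Content-Disposition, else fall back to file<ID>.pdf."""
    # Assumption: the server may (or may not) send this header.
    disposition = headers.get("Content-Disposition", "")
    match = re.search(r'filename="?([^";]+)"?', disposition)
    if match:
        # Keep only the base name so the header cannot point outside out_dir,
        # and prefix the id so two files with the same name do not collide.
        return f"{file_id}_{os.path.basename(match.group(1).strip())}"
    return f"file{file_id}.pdf"


def download_all(session, ids, out_dir="downloads"):
    """Fetch each id with the given session and save the raw bytes to out_dir."""
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    for file_id in ids:
        response = session.get(BASE_URL, params={"id": file_id}, timeout=30)
        # Assumption: a missing id yields a non-200 status or an empty body.
        if response.status_code != 200 or not response.content:
            continue
        path = os.path.join(out_dir, guess_filename(file_id, response.headers))
        # "wb" because response.content is bytes, as explained above.
        with open(path, "wb") as f:
            f.write(response.content)
        saved.append(path)
    return saved


def main():
    import requests  # third-party; the same library used in the question

    with requests.Session() as session:
        download_all(session, range(1, 14001))
```

Calling main() would start the full run over ids 1 to 14000. It may be worth trying a small range first, for example download_all(session, range(200, 205)), to check what the server actually returns.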
Answered By: Xiddoc