Authenticate Scrapy HTTP Proxy

Question:

I can set an HTTP proxy using request.meta['proxy'], but how do I authenticate with the proxy?

Specifying the username and password in the proxy URL does not work:

request.meta['proxy'] = 'http://user:pass@proxy_host:2222'

From looking around, I may have to send request.headers['Proxy-Authorization'], but what format should its value take?

Asked By: Lionel

Answers:

The username and password are joined as "username:password", Base64-encoded, and sent in the Proxy-Authorization header after the word "Basic".

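import base64
from random import choice

# Pick a proxy from the file; each entry has the form user:pass@ip:port
proxy_string = choice(self._get_proxies_from_file('proxies.txt'))
# Split on the last '@' so a password containing '@' stays intact
user_pass, host_port = proxy_string.rsplit('@', 1)
request.meta['proxy'] = "http://%s" % host_port

# Set up basic authentication for the proxy.
# base64.encodestring was removed in Python 3.9 and appended a newline, so use b64encode.
encoded = base64.b64encode(user_pass.encode('utf-8')).decode('ascii')
request.headers['Proxy-Authorization'] = 'Basic ' + encoded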
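
For trying the encoding step outside of Scrapy, here is a minimal standalone sketch. The helper name split_proxy_entry and the sample entry are made up for illustration; it only assumes entries shaped like user:pass@ip:port, as in the snippet above.

```python
import base64


def split_proxy_entry(entry, scheme="http"):
    """Split 'user:pass@ip:port' into a proxy URL and a Proxy-Authorization value."""
    # Split on the last '@' so an '@' inside the password is preserved.
    user_pass, host_port = entry.rsplit("@", 1)
    # b64encode works on bytes and returns bytes, so encode first and decode after.
    token = base64.b64encode(user_pass.encode("utf-8")).decode("ascii")
    return f"{scheme}://{host_port}", "Basic " + token


# Hypothetical sample entry, not a real proxy.
proxy_url, auth_value = split_proxy_entry("user:pass@10.0.0.1:2222")
print(proxy_url)   # http://10.0.0.1:2222
print(auth_value)  # Basic dXNlcjpwYXNz
```

The two returned values are what would go into request.meta['proxy'] and request.headers['Proxy-Authorization'] respectively.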
Answered By: Lionel

The w3lib module has a very convenient function for this use case.

from w3lib.http import basic_auth_header

request.meta["proxy"] = "http://192.168.1.1:8050"
request.headers["Proxy-Authorization"] = basic_auth_header(proxy_user, proxy_pass)
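
To see roughly what value ends up in the header, here is a standard-library sketch. It is an approximation for illustration, not w3lib's actual implementation; the function names and the sample credentials are hypothetical, and the ISO-8859-1 default is an assumption about how the credentials are encoded.

```python
import base64


def approx_basic_auth_header(username, password, encoding="ISO-8859-1"):
    """Approximate a Basic auth header value as bytes (illustrative only)."""
    credentials = f"{username}:{password}".encode(encoding)
    return b"Basic " + base64.b64encode(credentials)


def decode_basic_auth_header(value, encoding="ISO-8859-1"):
    """Reverse the encoding, which is handy when checking what was actually sent."""
    token = value.split(b" ", 1)[1]
    username, _, password = base64.b64decode(token).decode(encoding).partition(":")
    return username, password


header = approx_basic_auth_header("someuser", "somepass")
print(header)                            # b'Basic c29tZXVzZXI6c29tZXBhc3M='
print(decode_basic_auth_header(header))  # ('someuser', 'somepass')
```

Base64 is an encoding, not encryption, so anyone who can read the header can recover the credentials.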

This is also mentioned in a blog article by Zyte (the maintainers of Scrapy).

Answered By: itsmartinhi