How to make python Requests work via SOCKS proxy
Question:
I’m using the great Requests library in my Python script:
import requests
r = requests.get("http://example.com")
print(r.text)
I would like to use a SOCKS proxy, how can I do that? Requests seems to only support HTTP proxies.
Answers:
The modern way:
pip install -U requests[socks]
then
import requests
resp = requests.get('http://go.to',
proxies=dict(http='socks5://user:pass@host:port',
https='socks5://user:pass@host:port'))
# SOCKS5 proxy for HTTP/HTTPS
proxiesDict = {
'http' : "socks5://1.2.3.4:1080",
'https' : "socks5://1.2.3.4:1080"
}
# SOCKS4 proxy for HTTP/HTTPS
proxiesDict = {
'http' : "socks4://1.2.3.4:1080",
'https' : "socks4://1.2.3.4:1080"
}
# HTTP proxy for HTTP/HTTPS
proxiesDict = {
'http' : "1.2.3.4:1080",
'https' : "1.2.3.4:1080"
}
You need install pysocks , my version is 1.0 and the code works for me:
import socket
import socks
import requests
ip='localhost' # change your proxy's ip
port = 0000 # change your proxy's port
socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, ip, port)
socket.socket = socks.socksocket
url = u'http://ajax.googleapis.com/ajax/services/search/images?v=1.0&q=inurl%E8%A2%8B'
print(requests.get(url).text)
I installed pysocks and monkey patched create_connection in urllib3, like this:
import socks
import socket
socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS4, "127.0.0.1", 1080)
def create_connection(address, timeout=socket._GLOBAL_DEFAULT_TIMEOUT,
source_address=None, socket_options=None):
"""Connect to *address* and return the socket object.
Convenience function. Connect to *address* (a 2-tuple ``(host,
port)``) and return the socket object. Passing the optional
*timeout* parameter will set the timeout on the socket instance
before attempting to connect. If no *timeout* is supplied, the
global default timeout setting returned by :func:`getdefaulttimeout`
is used. If *source_address* is set it must be a tuple of (host, port)
for the socket to bind as a source address before making the connection.
An host of '' or port 0 tells the OS to use the default.
"""
host, port = address
if host.startswith('['):
host = host.strip('[]')
err = None
for res in socket.getaddrinfo(host, port, 0, socket.SOCK_STREAM):
af, socktype, proto, canonname, sa = res
sock = None
try:
sock = socks.socksocket(af, socktype, proto)
# If provided, set socket level options before connecting.
# This is the only addition urllib3 makes to this function.
urllib3.util.connection._set_socket_options(sock, socket_options)
if timeout is not socket._GLOBAL_DEFAULT_TIMEOUT:
sock.settimeout(timeout)
if source_address:
sock.bind(source_address)
sock.connect(sa)
return sock
except socket.error as e:
err = e
if sock is not None:
sock.close()
sock = None
if err is not None:
raise err
raise socket.error("getaddrinfo returns an empty list")
# monkeypatch
urllib3.util.connection.create_connection = create_connection
As soon as python requests
will be merged with SOCKS5
pull request it will do as simple as using proxies
dictionary:
Update: PR was already merged.
#proxy
# SOCKS5 proxy for HTTP/HTTPS
proxies = {
'http' : "socks5://myproxy:9191",
'https' : "socks5://myproxy:9191"
}
#headers
headers = {
}
url='http://example.com/'
res = requests.get(url, headers=headers, proxies=proxies)
Another options, in case that you cannot wait request
to be ready, when you cannot use requesocks
– like on GoogleAppEngine due to the lack of pwd
built-in module, is to use PySocks that was mentioned above:
- Grab the
socks.py
file from the repo and put a copy in your root folder;
- Add
import socks
and import socket
At this point configure and bind the socket before using with urllib2
– in the following example:
import urllib2
import socket
import socks
socks.set_default_proxy(socks.SOCKS5, "myprivateproxy.example",port=9050)
socket.socket = socks.socksocket
res=urllib2.urlopen(url).read()
In case someone has tried all of these older answers, and is still running into problems like:
requests.exceptions.ConnectionError:
SOCKSHTTPConnectionPool(host='myhost', port=80):
Max retries exceeded with url: /my/path
(Caused by NewConnectionError('<requests.packages.urllib3.contrib.socks.SOCKSConnection object at 0x106812bd0>:
Failed to establish a new connection:
[Errno 8] nodename nor servname provided, or not known',))
It may be because, by default, requests
is configured to resolve DNS queries on the local side of the connection.
Try changing your proxy URL from socks5://proxyhost:1234
to socks5h://proxyhost:1234
. Note the extra h
(it stands for hostname resolution).
The PySocks package module default is to do remote resolution, and I’m not sure why requests made their integration this obscurely divergent, but here we are.
I could do this on Linux.
$ pip3 install --user 'requests[socks]'
$ https_proxy=socks5://<hostname or ip>:<port> python3 -c
> 'import requests;print(requests.get("https://httpbin.org/ip").text)'
You can just run your script with https_proxy
environment variable.
- Install socks support if it necessary.
pip install PySocks
pip install pysocks5
- Setup environment variable
export https_proxy=socks5://<hostname or ip>:<port>
- Run your script. This example makes request using proxy and shows IP-address:
echo Your real IP
python -c 'import requests;print(requests.get("http://ipinfo.io/ip").text)'
echo IP with socks-proxy
python -c 'import requests;print(requests.get("https://ipinfo.io/ip").text)'
I’m using the great Requests library in my Python script:
import requests
r = requests.get("http://example.com")
print(r.text)
I would like to use a SOCKS proxy, how can I do that? Requests seems to only support HTTP proxies.
The modern way:
pip install -U requests[socks]
then
import requests
resp = requests.get('http://go.to',
proxies=dict(http='socks5://user:pass@host:port',
https='socks5://user:pass@host:port'))
# SOCKS5 proxy for HTTP/HTTPS
proxiesDict = {
'http' : "socks5://1.2.3.4:1080",
'https' : "socks5://1.2.3.4:1080"
}
# SOCKS4 proxy for HTTP/HTTPS
proxiesDict = {
'http' : "socks4://1.2.3.4:1080",
'https' : "socks4://1.2.3.4:1080"
}
# HTTP proxy for HTTP/HTTPS
proxiesDict = {
'http' : "1.2.3.4:1080",
'https' : "1.2.3.4:1080"
}
You need install pysocks , my version is 1.0 and the code works for me:
import socket
import socks
import requests
ip='localhost' # change your proxy's ip
port = 0000 # change your proxy's port
socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5, ip, port)
socket.socket = socks.socksocket
url = u'http://ajax.googleapis.com/ajax/services/search/images?v=1.0&q=inurl%E8%A2%8B'
print(requests.get(url).text)
I installed pysocks and monkey patched create_connection in urllib3, like this:
import socks
import socket
socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS4, "127.0.0.1", 1080)
def create_connection(address, timeout=socket._GLOBAL_DEFAULT_TIMEOUT,
source_address=None, socket_options=None):
"""Connect to *address* and return the socket object.
Convenience function. Connect to *address* (a 2-tuple ``(host,
port)``) and return the socket object. Passing the optional
*timeout* parameter will set the timeout on the socket instance
before attempting to connect. If no *timeout* is supplied, the
global default timeout setting returned by :func:`getdefaulttimeout`
is used. If *source_address* is set it must be a tuple of (host, port)
for the socket to bind as a source address before making the connection.
An host of '' or port 0 tells the OS to use the default.
"""
host, port = address
if host.startswith('['):
host = host.strip('[]')
err = None
for res in socket.getaddrinfo(host, port, 0, socket.SOCK_STREAM):
af, socktype, proto, canonname, sa = res
sock = None
try:
sock = socks.socksocket(af, socktype, proto)
# If provided, set socket level options before connecting.
# This is the only addition urllib3 makes to this function.
urllib3.util.connection._set_socket_options(sock, socket_options)
if timeout is not socket._GLOBAL_DEFAULT_TIMEOUT:
sock.settimeout(timeout)
if source_address:
sock.bind(source_address)
sock.connect(sa)
return sock
except socket.error as e:
err = e
if sock is not None:
sock.close()
sock = None
if err is not None:
raise err
raise socket.error("getaddrinfo returns an empty list")
# monkeypatch
urllib3.util.connection.create_connection = create_connection
As soon as python requests
will be merged with SOCKS5
pull request it will do as simple as using proxies
dictionary:
Update: PR was already merged.
#proxy
# SOCKS5 proxy for HTTP/HTTPS
proxies = {
'http' : "socks5://myproxy:9191",
'https' : "socks5://myproxy:9191"
}
#headers
headers = {
}
url='http://example.com/'
res = requests.get(url, headers=headers, proxies=proxies)
Another options, in case that you cannot wait request
to be ready, when you cannot use requesocks
– like on GoogleAppEngine due to the lack of pwd
built-in module, is to use PySocks that was mentioned above:
- Grab the
socks.py
file from the repo and put a copy in your root folder; - Add
import socks
andimport socket
At this point configure and bind the socket before using with urllib2
– in the following example:
import urllib2
import socket
import socks
socks.set_default_proxy(socks.SOCKS5, "myprivateproxy.example",port=9050)
socket.socket = socks.socksocket
res=urllib2.urlopen(url).read()
In case someone has tried all of these older answers, and is still running into problems like:
requests.exceptions.ConnectionError:
SOCKSHTTPConnectionPool(host='myhost', port=80):
Max retries exceeded with url: /my/path
(Caused by NewConnectionError('<requests.packages.urllib3.contrib.socks.SOCKSConnection object at 0x106812bd0>:
Failed to establish a new connection:
[Errno 8] nodename nor servname provided, or not known',))
It may be because, by default, requests
is configured to resolve DNS queries on the local side of the connection.
Try changing your proxy URL from socks5://proxyhost:1234
to socks5h://proxyhost:1234
. Note the extra h
(it stands for hostname resolution).
The PySocks package module default is to do remote resolution, and I’m not sure why requests made their integration this obscurely divergent, but here we are.
I could do this on Linux.
$ pip3 install --user 'requests[socks]'
$ https_proxy=socks5://<hostname or ip>:<port> python3 -c
> 'import requests;print(requests.get("https://httpbin.org/ip").text)'
You can just run your script with https_proxy
environment variable.
- Install socks support if it necessary.
pip install PySocks
pip install pysocks5
- Setup environment variable
export https_proxy=socks5://<hostname or ip>:<port>
- Run your script. This example makes request using proxy and shows IP-address:
echo Your real IP
python -c 'import requests;print(requests.get("http://ipinfo.io/ip").text)'
echo IP with socks-proxy
python -c 'import requests;print(requests.get("https://ipinfo.io/ip").text)'