Timeout within session while sending requests

Question:

I’m trying to learn how I can use timeout within a session while sending requests. The way I’ve tried below can fetch the content of a webpage but I’m not sure this is the right way as I could not find the usage of timeout in this documentation.

import requests

link = "https://stackoverflow.com/questions/tagged/web-scraping"

with requests.Session() as s:
    r = s.get(link, timeout=5)
    print(r.text)

How can I use timeout within session?

Asked By: SMTH

||

Answers:

According to the Quickstart section of the documentation:

You can tell Requests to stop waiting for a response after a given
number of seconds with the timeout parameter. Nearly all production code should use this parameter in nearly all requests.

requests.get('https://github.com/', timeout=0.001)

Or, according to the Advanced Usage section of the documentation, you can set two values (the connect and read timeouts):

The timeout value will be applied to both the connect and the read
timeouts. Specify a tuple if you would like to set the values
separately:

r = requests.get('https://github.com', timeout=(3.05, 27))

Making Session Wide Timeout

I searched throughout the documentation, and it seems it is not possible to set the timeout parameter session-wide.

However, there is a GitHub issue (Consider making Timeout option required or have a default) that offers a workaround: a custom HTTPAdapter, which you can use like this:

import requests
from requests.adapters import HTTPAdapter

class TimeoutHTTPAdapter(HTTPAdapter):
    def __init__(self, *args, **kwargs):
        # Remove our custom "timeout" option before HTTPAdapter sees it.
        self.timeout = kwargs.pop("timeout", None)
        super().__init__(*args, **kwargs)

    def send(self, request, **kwargs):
        # Apply the adapter-wide default only when the call gave no timeout.
        if kwargs.get("timeout") is None:
            kwargs["timeout"] = self.timeout
        return super().send(request, **kwargs)

Then mount it on a requests.Session():

s = requests.Session()
s.mount('http://', TimeoutHTTPAdapter(timeout=5))  # 5 seconds
s.mount('https://', TimeoutHTTPAdapter(timeout=5))
...
r = s.get(link)
print(r.text)
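
To check that the adapter really fills in the default without touching the network, one option is to patch the parent send method and look at what it receives. This is only a sketch: the helper name timeout_seen_by_parent and the placeholder URL http://example.invalid/ are my own assumptions, and the adapter is restated here so the snippet runs on its own.

```python
from unittest import mock

import requests
from requests.adapters import HTTPAdapter


class TimeoutHTTPAdapter(HTTPAdapter):
    """Same idea as the adapter above: fall back to a default timeout."""

    def __init__(self, *args, timeout=None, **kwargs):
        self.timeout = timeout
        super().__init__(*args, **kwargs)

    def send(self, request, **kwargs):
        if kwargs.get("timeout") is None:
            kwargs["timeout"] = self.timeout
        return super().send(request, **kwargs)


def timeout_seen_by_parent(adapter, **send_kwargs):
    """Return the timeout that HTTPAdapter.send would receive (no network I/O)."""
    # example.invalid is a placeholder; the request is never actually sent.
    prepared = requests.Request("GET", "http://example.invalid/").prepare()
    with mock.patch.object(HTTPAdapter, "send", autospec=True) as parent_send:
        adapter.send(prepared, **send_kwargs)
    return parent_send.call_args.kwargs["timeout"]


adapter = TimeoutHTTPAdapter(timeout=5)
print(timeout_seen_by_parent(adapter))             # 5
print(timeout_seen_by_parent(adapter, timeout=1))  # 1
```

The second call shows that a timeout passed for a single request still takes precedence over the adapter's default.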

Or, similarly, you can use the EnhancedSession proposed by @GordonAitchJay:

with EnhancedSession(5) as s: # 5 seconds
    r = s.get(link)
    print(r.text)
Answered By: imbr

I’m not sure this is the right way as I could not find the usage of timeout in this documentation.

Scroll to the bottom. It’s definitely there. You can search for it in the page by pressing Ctrl+F and entering timeout.

You’re using timeout correctly in your code example.

You can actually specify the timeout in a few different ways, as explained in the documentation:

If you specify a single value for the timeout, like this:

r = requests.get('https://github.com', timeout=5)

The timeout value will be applied to both the connect and the read timeouts. Specify a tuple if you would like to set the values separately:

r = requests.get('https://github.com', timeout=(3.05, 27))

If the remote server is very slow, you can tell Requests to wait forever for a response, by passing None as a timeout value and then retrieving a cup of coffee.

r = requests.get('https://github.com', timeout=None)
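
The three forms above can be summarised with a small helper. This is only a sketch of the documented semantics: split_timeout is a hypothetical name, not part of Requests, and Requests does the equivalent work internally.

```python
def split_timeout(timeout):
    """Return (connect, read) as the three documented forms are interpreted.

    Hypothetical helper for illustration only; not a Requests API.
    """
    if isinstance(timeout, tuple):
        connect, read = timeout  # separate connect and read timeouts
        return connect, read
    # A single number (or None) applies to both phases.
    return timeout, timeout


print(split_timeout(5))           # (5, 5)
print(split_timeout((3.05, 27)))  # (3.05, 27)
print(split_timeout(None))        # (None, None): wait indefinitely
```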

Try using https://httpstat.us/200?sleep=5000 to test your code.

For example, this raises an exception because 0.2 seconds is not long enough to establish a connection with the server:

import requests

link = "https://httpstat.us/200?sleep=5000"

with requests.Session() as s:
    try:
        r = s.get(link, timeout=(0.2, 10))
        print(r.text)
    except requests.exceptions.Timeout as e:
        print(e)

Output:

HTTPSConnectionPool(host='httpstat.us', port=443): Read timed out. (read timeout=0.2)

This raises an exception because the server waits for 5 seconds before sending the response, which is longer than the 2-second read timeout:

import requests

link = "https://httpstat.us/200?sleep=5000"

with requests.Session() as s:
    try:
        r = s.get(link, timeout=(3.05, 2))
        print(r.text)
    except requests.exceptions.Timeout as e:
        print(e)

Output:

HTTPSConnectionPool(host='httpstat.us', port=443): Read timed out. (read timeout=2)
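
If httpstat.us is unreachable, a local stand-in can reproduce the read timeout. The sketch below assumes a throwaway server on 127.0.0.1 that waits one second before answering; SlowHandler and fetch_status are hypothetical names introduced for this example, and it uses plain HTTP rather than HTTPS.

```python
import threading
import time
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

import requests


class SlowHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a slow endpoint: waits, then answers 200."""

    delay = 1.0  # seconds to wait before responding

    def do_GET(self):
        time.sleep(self.delay)
        try:
            self.send_response(200)
            self.send_header("Content-Length", "2")
            self.end_headers()
            self.wfile.write(b"ok")
        except OSError:
            pass  # the client already gave up and closed the connection

    def log_message(self, *args):
        pass  # keep the demo output quiet


def fetch_status(url, timeout):
    """Return the status code, or a label naming which timeout fired."""
    with requests.Session() as s:
        s.trust_env = False  # ignore proxy environment variables for localhost
        try:
            return s.get(url, timeout=timeout).status_code
        except requests.exceptions.ConnectTimeout:
            return "connect timeout"
        except requests.exceptions.ReadTimeout:
            return "read timeout"


server = ThreadingHTTPServer(("127.0.0.1", 0), SlowHandler)  # port 0: any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/"

too_short = fetch_status(url, timeout=(3.05, 0.2))  # read timeout below the 1 s delay
long_enough = fetch_status(url, timeout=(3.05, 5))  # read timeout above the 1 s delay
print(too_short, long_enough)

server.shutdown()
server.server_close()
```

Catching ConnectTimeout and ReadTimeout separately, instead of their common base class Timeout, makes it clear which of the two values in the tuple was exceeded.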

You specifically mention using a timeout within a session. So maybe you want a session object which has a default timeout. Something like this:

import requests

link = "https://httpstat.us/200?sleep=5000"

class EnhancedSession(requests.Session):
    def __init__(self, timeout=(3.05, 4)):
        super().__init__()  # initialise the base Session first
        self.timeout = timeout  # default used when a call passes no timeout

    def request(self, method, url, **kwargs):
        print("EnhancedSession request")
        if "timeout" not in kwargs:
            kwargs["timeout"] = self.timeout
        return super().request(method, url, **kwargs)

session = EnhancedSession()

try:
    response = session.get(link)
    print(response)
except requests.exceptions.Timeout as e:
    print(e)

try:
    response = session.get(link, timeout=1)
    print(response)
except requests.exceptions.Timeout as e:
    print(e)

try:
    response = session.get(link, timeout=10)
    print(response)
except requests.exceptions.Timeout as e:
    print(e)

Output:

EnhancedSession request
HTTPSConnectionPool(host='httpstat.us', port=443): Read timed out. (read timeout=4)
EnhancedSession request
HTTPSConnectionPool(host='httpstat.us', port=443): Read timed out. (read timeout=1)
EnhancedSession request
<Response [200]>
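
To confirm which timeout actually reaches the transport layer without depending on an external site, one option is to mount an adapter that only records what it is given. This is a sketch under my own assumptions: RecordingAdapter and DefaultTimeoutSession are hypothetical names, the URL is a placeholder that is never contacted, and the session class is a variant of the one above without the print call.

```python
import io

import requests
from requests.adapters import BaseAdapter


class RecordingAdapter(BaseAdapter):
    """Hypothetical test double: records timeouts instead of connecting."""

    def __init__(self):
        super().__init__()
        self.seen_timeouts = []

    def send(self, request, stream=False, timeout=None, verify=True,
             cert=None, proxies=None):
        self.seen_timeouts.append(timeout)
        # Build a minimal canned response so the session can finish normally.
        response = requests.Response()
        response.status_code = 200
        response.url = request.url
        response.request = request
        response.raw = io.BytesIO(b"ok")
        return response

    def close(self):
        pass  # nothing to release


class DefaultTimeoutSession(requests.Session):
    """Session that fills in a timeout only when the caller gave none."""

    def __init__(self, timeout=(3.05, 4)):
        super().__init__()
        self.timeout = timeout

    def request(self, method, url, **kwargs):
        kwargs.setdefault("timeout", self.timeout)
        return super().request(method, url, **kwargs)


recorder = RecordingAdapter()
with DefaultTimeoutSession(timeout=5) as session:
    session.mount("http://", recorder)
    session.get("http://example.invalid/")             # uses the default
    session.get("http://example.invalid/", timeout=1)  # per-call override

print(recorder.seen_timeouts)  # [5, 1]
```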
Answered By: GordonAitchJay