aiohttp: set maximum number of requests per second

Question:

How can I set maximum number of requests per second (limit them) in client side using aiohttp?

Asked By: v18o

||

Answers:

I found one possible solution here: http://compiletoi.net/fast-scraping-in-python-with-asyncio.html

Doing 3 requests at the same time is cool, doing 5000, however, is not so nice. If you try to do too many requests at the same time, connections might start to get closed, or you might even get banned from the website.

To avoid this, you can use a semaphore. It is a synchronization tool that can be used to limit the number of coroutines that do something at some point. We’ll just create the semaphore before creating the loop, passing as an argument the number of simultaneous requests we want to allow:

sem = asyncio.Semaphore(5)

Then, we just replace:

page = yield from get(url, compress=True)

by the same thing, but protected by a semaphore:

with (yield from sem):
    page = yield from get(url, compress=True)

This will ensure that at most 5 requests can be done at the same time.

Answered By: v18o

Although it’s not exactly a limit on the number of requests per second, note that since v2.0, when using a ClientSession, aiohttp automatically limits the number of simultaneous connections to 100.

You can modify the limit by creating your own TCPConnector and passing it into the ClientSession. For instance, to create a client limited to 50 simultaneous requests:

import aiohttp

connector = aiohttp.TCPConnector(limit=50)
client = aiohttp.ClientSession(connector=connector)

In case it’s better suited to your use case, there is also a limit_per_host parameter (which is off by default) that you can pass to limit the number of simultaneous connections to the same "endpoint". Per the docs:

limit_per_host (int) – limit for simultaneous connections to the same endpoint. Endpoints are the same if they are have equal (host, port, is_ssl) triple.

Example usage:

import aiohttp

connector = aiohttp.TCPConnector(limit_per_host=50)
client = aiohttp.ClientSession(connector=connector)
Answered By: Mark Amery

This is an example without aiohttp, but you can wrap any async method or aiohttp.request using the Limit decorator

import asyncio
import time


class Limit(object):
    def __init__(self, calls=5, period=1):
        self.calls = calls
        self.period = period
        self.clock = time.monotonic
        self.last_reset = 0
        self.num_calls = 0

    def __call__(self, func):
        async def wrapper(*args, **kwargs):
            if self.num_calls >= self.calls:
                await asyncio.sleep(self.__period_remaining())

            period_remaining = self.__period_remaining()

            if period_remaining <= 0:
                self.num_calls = 0
                self.last_reset = self.clock()

            self.num_calls += 1

            return await func(*args, **kwargs)

        return wrapper

    def __period_remaining(self):
        elapsed = self.clock() - self.last_reset
        return self.period - elapsed


@Limit(calls=5, period=2)
async def test_call(x):
    print(x)


async def worker():
    for x in range(100):
        await test_call(x + 1)


asyncio.run(worker())
Answered By: Andrew Nodermann
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.