How do I get Python to send as many concurrent HTTP requests as possible?

Question:

I’m trying to send HTTPS requests as quickly as possible. I know this will have to be done with concurrent requests, since my goal is 150 to 500+ requests a second. I’ve searched everywhere, but can’t find a Python 3.11+ answer, or one that doesn’t give me errors. I’m trying to avoid AIOHTTP, as the rigmarole of setting it up was a pain and it didn’t even work for me.

The input should be an array of URLs and the output an array of the HTML strings.

Asked By: Surgemus


Answers:

Hope this helps. This question asked: What is the fastest way to send 10000 HTTP requests?

I observed 15000 requests in 10 s. I used Wireshark to capture traffic on localhost, saved the packets to CSV, and counted only the packets that contained GET.

FILE: a.py

from treq import get
from twisted.internet import reactor

def done(response):
    # as soon as one request completes successfully, fire off the next one
    if response.code == 200:
        get("http://localhost:3000").addCallback(done)

# kick off the first request
get("http://localhost:3000").addCallback(done)

# stop the reactor (and the test) after 10 seconds
reactor.callLater(10, reactor.stop)
reactor.run()
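
Note that a single callback chain like this only has one request in flight at a time. If you want more concurrency with the same approach, a rough sketch (not part of the original answer; the chain count is just a placeholder) is to start several independent chains:

from treq import get
from twisted.internet import reactor

def done(response):
    if response.code == 200:
        get("http://localhost:3000").addCallback(done)

# start e.g. 50 independent request chains instead of one
for _ in range(50):
    get("http://localhost:3000").addCallback(done)

reactor.callLater(10, reactor.stop)
reactor.run()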

Run test like this:

pip3 install treq
python3 a.py  # code from above

Setup test website like this, mine was on port 3000

mkdir myapp
cd myapp
npm init
npm install express
node app.js

FILE: app.js

const express = require('express')
const app = express()
const port = 3000

app.get('/', (req, res) => {
  res.send('Hello World!')
})

app.listen(port, () => {
  console.log(`Example app listening on port ${port}`)
})

OUTPUT

grep GET wireshark.csv  | head
"5","0.000418","::1","::1","HTTP","139","GET / HTTP/1.1 "
"13","0.002334","::1","::1","HTTP","139","GET / HTTP/1.1 "
"17","0.003236","::1","::1","HTTP","139","GET / HTTP/1.1 "
"21","0.004018","::1","::1","HTTP","139","GET / HTTP/1.1 "
"25","0.004803","::1","::1","HTTP","139","GET / HTTP/1.1 "

grep GET wireshark.csv  | tail
"62145","9.994184","::1","::1","HTTP","139","GET / HTTP/1.1 "
"62149","9.995102","::1","::1","HTTP","139","GET / HTTP/1.1 "
"62153","9.995860","::1","::1","HTTP","139","GET / HTTP/1.1 "
"62157","9.996616","::1","::1","HTTP","139","GET / HTTP/1.1 "
"62161","9.997307","::1","::1","HTTP","139","GET / HTTP/1.1 "

Answered By: atl

This works, getting around 250+ requests a second, and it runs fine on Windows 10. You may have to pip install requests; concurrent.futures is part of the standard library.

import time
import requests
import concurrent.futures

start = int(time.time())  # get time before the requests are sent

urls = []       # input URLs/IPs array
responses = []  # output content of each request as string in an array

# create a list of 5000 sites to test with
for y in range(5000):
    urls.append("https://example.com")

def send(url):
    responses.append(requests.get(url).content)

with concurrent.futures.ThreadPoolExecutor(max_workers=10000) as executor:
    futures = []
    for url in urls:
        futures.append(executor.submit(send, url))

end = int(time.time())  # get time after everything finishes
print(str(round(len(urls) / (end - start), 0)) + "/sec")  # average requests per second

Output:
286.0/sec

Note: If your code needs to act on each response as soon as it completes, have send return requests.get(url).content instead of appending it, and replace the middle part with this:

with concurrent.futures.ThreadPoolExecutor(max_workers=10000) as executor:
    futures = []
    for url in urls:
        futures.append(executor.submit(send, url))
    # collect each result as soon as its request finishes
    for future in concurrent.futures.as_completed(futures):
        responses.append(future.result())

This is a modified version of an example shown on this site.

The secret sauce is the max_workers=10000; without it, the code averaged about 80/sec. That said, setting it beyond 1000 didn’t give any further boost in speed.
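
If you want to find the sweet spot on your own machine, a quick sketch (not from the original answer; the URL and counts are just placeholders) is to time a smaller batch at a few different max_workers values:

import time
import requests
import concurrent.futures

test_urls = ["https://example.com"] * 500  # smaller batch so the sweep stays quick

def fetch(url):
    return requests.get(url).content

for workers in (100, 1000, 10000):
    start = time.perf_counter()
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as executor:
        # executor.map blocks until every request has finished
        list(executor.map(fetch, test_urls))
    elapsed = time.perf_counter() - start
    print(f"max_workers={workers}: {len(test_urls) / elapsed:.0f} req/s")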

Answered By: Surgemus

It’s quite unfortunate that you couldn’t set up AIOHTTP properly, because this is one of the most efficient ways to do asynchronous requests in Python.

Setup is not that hard:

import asyncio
import aiohttp
from time import perf_counter


def urls(n_reqs: int):
    for _ in range(n_reqs):
        yield "https://python.org"

async def get(session: aiohttp.ClientSession, url: str):
    async with session.get(url) as response:
        _ = await response.text()
             
async def main(n_reqs: int):
    async with aiohttp.ClientSession() as session:
        await asyncio.gather(
            *[get(session, url) for url in urls(n_reqs)]
        )


if __name__ == "__main__":
    n_reqs = 10_000
    
    start = perf_counter()
    asyncio.run(main(n_reqs))
    end = perf_counter()
    
    print(f"{n_reqs / (end - start)} req/s")

You basically need to create a single ClientSession, which you then reuse to send the GET requests. The requests are made concurrently thanks to asyncio.gather(). You could also use the newer asyncio.TaskGroup:

async def main(n_reqs: int):
    async with aiohttp.ClientSession() as session:
        async with asyncio.TaskGroup() as group:
            for url in urls(n_reqs):
                group.create_task(get(session, url))

This easily achieves 500+ requests per second on my 7+ year old dual-core computer. Contrary to what other answers suggested, this solution does not require spawning thousands of threads, which are expensive.

You may improve the speed even more by using a custom connector in order to allow more concurrent connections (the default is 100) in a single session:

async def main(n_reqs: int):
    connector = aiohttp.TCPConnector(limit=0)
    async with aiohttp.ClientSession(connector=connector) as session:
        ...
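
Since the question asks for the output to be an array of HTML strings, a small variation of the code above (a sketch reusing the imports and the urls() helper from the earlier snippet) can return the bodies directly; asyncio.gather() preserves the order of its inputs:

async def fetch(session: aiohttp.ClientSession, url: str) -> str:
    async with session.get(url) as response:
        return await response.text()

async def main(n_reqs: int) -> list[str]:
    connector = aiohttp.TCPConnector(limit=0)
    async with aiohttp.ClientSession(connector=connector) as session:
        # the resulting list is in the same order as the input URLs
        return await asyncio.gather(
            *[fetch(session, url) for url in urls(n_reqs)]
        )

htmls = asyncio.run(main(1_000))  # list of HTML strings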

Answered By: Louis Lac