How do I get Python to send as many concurrent HTTP requests as possible?
Question:
I’m trying to send HTTPS requests as quickly as possible. I know this would have to be concurrent requests due to my goal being 150 to 500+ requests a second. I’ve searched everywhere, but get no Python 3.11+ answer or one that doesn’t give me errors. I’m trying to avoid AIOHTTP as the rigmarole of setting it up was a pain, which didn’t even work.
The input should be an array of URLs and the output an array of the HTML strings.
Answers:
Hope this helps. A similar question asked: What is the fastest way to send 10000 HTTP requests?
I observed 15,000 requests in 10 s, using Wireshark to capture on localhost. I saved the packets to CSV and counted only the packets that contained GET in them.
FILE: a.py
from treq import get
from twisted.internet import reactor

def done(response):
    if response.code == 200:
        get("http://localhost:3000").addCallback(done)

get("http://localhost:3000").addCallback(done)
reactor.callLater(10, reactor.stop)
reactor.run()
Run test like this:
pip3 install treq
python3 a.py # code from above
Set up the test website like this (mine was on port 3000); create the app.js shown below before running node app.js:
mkdir myapp
cd myapp
npm init
npm install express
node app.js
FILE: app.js
const express = require('express')
const app = express()
const port = 3000

app.get('/', (req, res) => {
  res.send('Hello World!')
})

app.listen(port, () => {
  console.log(`Example app listening on port ${port}`)
})
OUTPUT
grep GET wireshark.csv | head
"5","0.000418","::1","::1","HTTP","139","GET / HTTP/1.1 "
"13","0.002334","::1","::1","HTTP","139","GET / HTTP/1.1 "
"17","0.003236","::1","::1","HTTP","139","GET / HTTP/1.1 "
"21","0.004018","::1","::1","HTTP","139","GET / HTTP/1.1 "
"25","0.004803","::1","::1","HTTP","139","GET / HTTP/1.1 "
grep GET wireshark.csv | tail
"62145","9.994184","::1","::1","HTTP","139","GET / HTTP/1.1 "
"62149","9.995102","::1","::1","HTTP","139","GET / HTTP/1.1 "
"62153","9.995860","::1","::1","HTTP","139","GET / HTTP/1.1 "
"62157","9.996616","::1","::1","HTTP","139","GET / HTTP/1.1 "
"62161","9.997307","::1","::1","HTTP","139","GET / HTTP/1.1 "
This works, getting around 250+ requests a second.
This solution does work on Windows 10. You may have to pip install requests (concurrent.futures is part of the standard library).
import time
import requests
import concurrent.futures

start = time.time()  # time before the requests are sent
urls = ["https://example.com"] * 5000  # input URLs/IPs array (5000 test sites)
responses = []  # output: content of each request, one entry per URL

def send(url):
    responses.append(requests.get(url).content)

# leaving the with-block waits for all submitted tasks to finish
with concurrent.futures.ThreadPoolExecutor(max_workers=10000) as executor:
    futures = [executor.submit(send, url) for url in urls]

end = time.time()  # time after everything finishes
print(f"{round(len(urls) / (end - start))}/sec")  # average requests per second
Output:
286.0/sec
Note: If your code requires something extremely time dependent, have send() return the content instead of appending it (otherwise future.result() would be None) and replace the middle part with this:
def send(url):
    return requests.get(url).content

with concurrent.futures.ThreadPoolExecutor(max_workers=10000) as executor:
    futures = [executor.submit(send, url) for url in urls]
    for future in concurrent.futures.as_completed(futures):
        responses.append(future.result())
This is a modified version of an example this site showed. The secret sauce is max_workers=10000; otherwise, it would average about 80/sec. Setting it beyond 1000, however, gave no further boost in speed.
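If you want the responses back in the same order as the input URLs (the array-in, array-out shape the question asks for), executor.map is a simpler variant of the same thread-pool approach. This sketch spins up a throwaway local server so it is self-contained; the worker count is arbitrary and requests must be installed:

```python
import http.server
import threading
import concurrent.futures
import requests

# Throwaway local server (port 0 = pick any free port) so the sketch is self-contained.
handler = http.server.SimpleHTTPRequestHandler
server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), handler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

def fetch_all(url_list, max_workers=100):
    """Fetch every URL concurrently; executor.map preserves input order."""
    def send(url):
        return requests.get(url).content
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(send, url_list))

pages = fetch_all([f"http://127.0.0.1:{port}/"] * 20)
server.shutdown()
print(len(pages))  # → 20
```

pages[i] is the body fetched from url_list[i], so no extra bookkeeping is needed to match responses to URLs.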
It’s quite unfortunate that you couldn’t set up AIOHTTP properly, because it is one of the most efficient ways to do asynchronous requests in Python.
Setup is not that hard:
import asyncio
import aiohttp
from time import perf_counter

def urls(n_reqs: int):
    for _ in range(n_reqs):
        yield "https://python.org"

async def get(session: aiohttp.ClientSession, url: str):
    async with session.get(url) as response:
        _ = await response.text()

async def main(n_reqs: int):
    async with aiohttp.ClientSession() as session:
        await asyncio.gather(
            *[get(session, url) for url in urls(n_reqs)]
        )

if __name__ == "__main__":
    n_reqs = 10_000
    start = perf_counter()
    asyncio.run(main(n_reqs))
    end = perf_counter()
    print(f"{n_reqs / (end - start)} req/s")
You basically need to create a single ClientSession, which you then reuse to send the GET requests. The requests are made concurrently with asyncio.gather(). You could also use the newer asyncio.TaskGroup (Python 3.11+):
async def main(n_reqs: int):
    async with aiohttp.ClientSession() as session:
        async with asyncio.TaskGroup() as group:
            for url in urls(n_reqs):
                group.create_task(get(session, url))
This easily achieves 500+ requests per second on my 7+ year old dual-core computer. Contrary to what other answers suggested, this solution does not require spawning thousands of threads, which are expensive.
You may improve the speed even more by using a custom connector to allow more concurrent connections (the default is 100) in a single session:
async def main(n_reqs: int):
    connector = aiohttp.TCPConnector(limit=0)
    async with aiohttp.ClientSession(connector=connector) as session:
        ...
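Since the question asks for an array of HTML strings as output, note that asyncio.gather() returns the coroutines' results in input order, so a small change to the snippets above collects the bodies. This hedged sketch fetches from a throwaway local server so it can run anywhere; it assumes aiohttp is installed:

```python
import asyncio
import http.server
import threading
import aiohttp

# Throwaway local server so the sketch does not depend on an external site.
handler = http.server.SimpleHTTPRequestHandler
server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), handler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

async def fetch(session: aiohttp.ClientSession, url: str) -> str:
    async with session.get(url) as response:
        return await response.text()  # return the body instead of discarding it

async def fetch_all(url_list):
    # limit=0 lifts the default 100-connection cap on the session.
    connector = aiohttp.TCPConnector(limit=0)
    async with aiohttp.ClientSession(connector=connector) as session:
        # gather() preserves order: result i corresponds to url_list[i]
        return await asyncio.gather(*(fetch(session, u) for u in url_list))

htmls = asyncio.run(fetch_all([f"http://127.0.0.1:{port}/"] * 10))
server.shutdown()
print(len(htmls))  # → 10
```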