Proxy requests always slow

Question:

I need to make many requests to one URL, but after ~20 requests I get a 429 Too Many Requests response. So my plan was to send the requests through proxies. I have tried 3 things:

  • Tor-proxy using python
  • Free proxy lists
  • ScraperApi

But all of them (even the ScraperApi trial) are unbelievably slow, around 5-10 seconds per request. An example looks like this:

import requests

url = "https://httpbin.org/ip"
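proxies = {"https": "http://164.155.149.1:80"}  # free proxy, used for HTTPS requests
r = requests.get(url, proxies=proxies, timeout=15)  # time out instead of hanging on a dead proxy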
print(r.text)

The proxy IP was from a free proxy website. Sure, a proxy is an extra node in between, but I was hoping to find proxies that take 1 second at most per request.

Is there any way to solve this issue?

Thanks in advance

Asked By: codedor


Answers:

Codedor, one way I can think of is:

  • Create a pool of EC2 instances on AWS (or any other cloud service provider of your choice). These can be the cheapest ones, even spot instances on AWS.
  • Round-robin your requests across these VMs. Since each VM has its own public IP, you are less likely to hit "429 Too Many Requests" as quickly. The more instances you have, the less likely it gets.

For example:

  • Say you have 10 VMs.
  • On each VM you make 1 request every 5 s = 12 requests/min.
  • Altogether you make 12 × 10 = 120 requests/min.
  • Add reasonable delays.
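
The arithmetic above, plus one possible way to add delays, could be sketched roughly like this. The function names and the 2-second jitter are just my assumptions for illustration, not something the target site requires:

```python
import random


def aggregate_rate_per_min(num_vms, seconds_between_requests):
    """Total requests per minute if every VM sends one request per interval."""
    return num_vms * 60 / seconds_between_requests


def jittered_delay(base_seconds, max_jitter_seconds, rng=random):
    """Base delay plus a random extra, so the VMs do not all fire in lockstep."""
    return base_seconds + rng.uniform(0, max_jitter_seconds)


if __name__ == "__main__":
    # 10 VMs, one request every 5 seconds each
    print(aggregate_rate_per_min(10, 5))  # 120.0
    # Each VM would sleep for something like this between requests
    print(jittered_delay(5, 2))
```

On each VM you would call time.sleep(jittered_delay(5, 2)) between requests; whether 5 seconds is slow enough depends on the site's rate limit, which only testing will tell you.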

Distributing the jobs across the VMs would be a little trickier, but doable.
You can have a master node running a Python script that iterates through the VMs and runs the request command on each of them. To execute a command on a remote machine from Python you could use an SSH library such as paramiko, or call the ssh command-line client through subprocess.
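
A minimal sketch of such a master script, assuming paramiko is installed on the master, curl is installed on the VMs, and you log in with an SSH key. The host names, the "ubuntu" username and the helper names are all placeholders of mine:

```python
import shlex
from itertools import cycle


def build_command(url):
    """Shell command each VM runs; assumes curl is available on the VM."""
    return "curl -s --max-time 10 " + shlex.quote(url)


def run_over_ssh(host, command, username="ubuntu", key_filename=None):
    """Run one command on a remote host via paramiko and return its stdout."""
    import paramiko  # third-party; imported here so the other helpers work without it

    client = paramiko.SSHClient()
    # Accepting unknown host keys is convenient for a sketch, not for production.
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    try:
        client.connect(host, username=username, key_filename=key_filename, timeout=10)
        _stdin, stdout, _stderr = client.exec_command(command, timeout=30)
        return stdout.read().decode()
    finally:
        client.close()


def dispatch(urls, hosts, runner=run_over_ssh):
    """Send each URL to the next host in round-robin order and collect the outputs."""
    return [runner(host, build_command(url)) for host, url in zip(cycle(hosts), urls)]


if __name__ == "__main__":
    hosts = ["vm1.example.com", "vm2.example.com"]  # placeholder host names
    print(dispatch(["https://httpbin.org/ip"] * 4, hosts))
```

This version opens a new SSH connection per request and runs them one after another, which is simple but slow. Keeping one connection per VM open, or running a small worker script on each VM, would probably be the next step.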

Answered By: Loner

The way to solve this issue is using rotating residential proxies, like the ones Bright Data, SOAX, NetNut, etc. offer.

Not only are the requests routed via super-proxy servers for maximum speed, but these providers also offer proxy manager tools that allow thousands of concurrent connections. In effect, you have thousands of different IPs sending requests and fetching data simultaneously, without getting blocked.
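
As a rough sketch, assuming the provider gives you a single gateway address that picks a different exit IP for each request (check your provider's docs, since the exact setup differs), concurrent requests through it could look like this. The gateway URL and credentials are placeholders, not a real endpoint:

```python
from concurrent.futures import ThreadPoolExecutor

# Placeholder gateway address and credentials; your provider documents the real ones.
PROXY_URL = "http://USERNAME:PASSWORD@proxy.example.com:8000"


def build_proxies(proxy_url):
    """Route both HTTP and HTTPS traffic through the same gateway."""
    return {"http": proxy_url, "https": proxy_url}


def fetch(url, proxy_url=PROXY_URL, timeout=10):
    """Fetch one URL through the gateway; return (status code, elapsed seconds)."""
    import requests  # third-party, same library as in the question

    r = requests.get(url, proxies=build_proxies(proxy_url), timeout=timeout)
    return r.status_code, r.elapsed.total_seconds()


def fetch_all(urls, fetcher=fetch, max_workers=20):
    """Run the fetcher over all URLs concurrently, keeping the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetcher, urls))


if __name__ == "__main__":
    for status, seconds in fetch_all(["https://httpbin.org/ip"] * 10):
        print(status, seconds)
```

Even if a single proxied request still takes a few seconds, running them in parallel means the total time is much lower than sending them one by one.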

Answered By: Gidoneli