Reached error page: The server at x is taking too long to respond

Question:

I want to deploy my application on Heroku. My application scrapes data from an apartment website. For one URL, I have multiple selectors. The application is run using APScheduler. The logs show the following error:

2020-08-10T11:02:56.259319+00:00 app[clock.1]: Running main
2020-08-10T11:04:34.374167+00:00 app[clock.1]: Job "main (trigger: interval[3:00:00], next run at: 2020-08-10 14:02:56 UTC)" raised an exception
2020-08-10T11:04:34.374183+00:00 app[clock.1]: Traceback (most recent call last):
2020-08-10T11:04:34.374184+00:00 app[clock.1]: File "/app/.heroku/python/lib/python3.8/site-packages/apscheduler/executors/base.py", line 125, in run_job
2020-08-10T11:04:34.374184+00:00 app[clock.1]: retval = job.func(*job.args, **job.kwargs)
2020-08-10T11:04:34.374185+00:00 app[clock.1]: File "/app/scraper/common.py", line 70, in main
2020-08-10T11:04:34.374186+00:00 app[clock.1]: driver.get(listing.url)
2020-08-10T11:04:34.374187+00:00 app[clock.1]: File "/app/.heroku/python/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 333, in get
2020-08-10T11:04:34.374188+00:00 app[clock.1]: self.execute(Command.GET, {'url': url})
2020-08-10T11:04:34.374188+00:00 app[clock.1]: File "/app/.heroku/python/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
2020-08-10T11:04:34.374189+00:00 app[clock.1]: self.error_handler.check_response(response)
2020-08-10T11:04:34.374189+00:00 app[clock.1]: File "/app/.heroku/python/lib/python3.8/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
2020-08-10T11:04:34.374190+00:00 app[clock.1]: raise exception_class(message, screen, stacktrace)
2020-08-10T11:04:34.374191+00:00 app[clock.1]: selenium.common.exceptions.WebDriverException: Message: Reached error page: about:neterror?e=netTimeout&u=x&d=The%20server%20at%20x%20is%20taking%20too%20long%20to%20respond.

Decoded:

about:neterror?e=netTimeout&u=x&d=The server at x is taking too long to respond.
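
For reference, the decoding step can be reproduced with Python's standard library. This is only a minimal sketch; the helper name decode_neterror is made up for illustration and is not part of Selenium.

```python
from urllib.parse import parse_qs


def decode_neterror(url):
    """Return the query parameters of a Firefox about:neterror URL as a dict."""
    # Everything after the first '?' is an ordinary percent-encoded query string.
    query = url.partition("?")[2]
    # parse_qs decodes %20 etc. and returns a list per key; keep the first value.
    return {key: values[0] for key, values in parse_qs(query).items()}


if __name__ == "__main__":
    raw = ("about:neterror?e=netTimeout&u=x&d=The%20server%20at%20x%20is"
           "%20taking%20too%20long%20to%20respond.")
    for key, value in decode_neterror(raw).items():
        print(f"{key}: {value}")
```

The e field (netTimeout here) is the part that identifies the kind of network failure Firefox hit.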

If I go to the link, I can access it. I have disabled JavaScript and images so that pages load more quickly.

I am not sure what the problem is here.

Asked By: Matej J

Answers:

I think you want to wait until the element you are looking for is present:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.common.exceptions import TimeoutException

# browser is the WebDriver instance; delay is the maximum wait in seconds
try:
    my_element = WebDriverWait(browser, delay).until(
        EC.presence_of_element_located((By.ID, 'ID_of_element')))
    print("Page is ready")
except TimeoutException:
    print("Loading took too much time")
Answered By: Matěj Mudra

As it turned out, the target website was blocking Heroku. The solution is to use a proxy.
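
Since the about:neterror page comes from Firefox, one way to route the browser through a proxy is via Firefox's network.proxy.* preferences. The sketch below is an assumption about the setup, not the exact code used: the helper names and the proxy host and port (proxy.example.com, 8080) are placeholders.

```python
def firefox_proxy_prefs(host, port):
    """Build Firefox preferences that send HTTP and HTTPS traffic through one proxy."""
    return {
        "network.proxy.type": 1,  # 1 = manual proxy configuration
        "network.proxy.http": host,
        "network.proxy.http_port": port,
        "network.proxy.ssl": host,  # "ssl" is the preference used for HTTPS traffic
        "network.proxy.ssl_port": port,
    }


def make_proxied_driver(host, port):
    """Start Firefox with the proxy preferences applied (requires Selenium)."""
    # Imported here so firefox_proxy_prefs can be used without Selenium installed.
    from selenium import webdriver

    options = webdriver.FirefoxOptions()
    for name, value in firefox_proxy_prefs(host, port).items():
        options.set_preference(name, value)
    return webdriver.Firefox(options=options)


if __name__ == "__main__":
    # Placeholder proxy address; replace with a proxy you actually have access to.
    print(firefox_proxy_prefs("proxy.example.com", 8080))
```

Calling make_proxied_driver("proxy.example.com", 8080) would then return a driver whose driver.get(...) requests leave from the proxy's address rather than from Heroku's.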

Answered By: Matej J

I had the same problem, so this may help someone else avoid the same mistake. If the website is served over http but you request it over https, you get this exact error.

Example:

The correct address: http://some-website.com

driver.get('http://some-website.com')   # works

driver.get('https://some-website.com')  # fails with the "server is taking too long to respond" error
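
If you are not sure which scheme a site uses, a rough sketch of a fallback is to retry once with the other scheme when the first attempt raises. The function names here are hypothetical, and retrying on any WebDriverException is a simplification, since that exception covers more than timeouts.

```python
from urllib.parse import urlsplit, urlunsplit


def alternate_scheme(url):
    """Return the same URL with http and https swapped."""
    parts = urlsplit(url)
    swapped = {"http": "https", "https": "http"}.get(parts.scheme)
    if swapped is None:
        raise ValueError(f"unsupported scheme in {url!r}")
    return urlunsplit(parts._replace(scheme=swapped))


def get_with_scheme_fallback(driver, url):
    """Load url; on a WebDriverException, retry once with the other scheme.

    Returns the URL that was finally requested.
    """
    # Imported here so alternate_scheme stays usable without Selenium installed.
    from selenium.common.exceptions import WebDriverException

    try:
        driver.get(url)
        return url
    except WebDriverException:
        fallback = alternate_scheme(url)
        driver.get(fallback)  # a second failure propagates to the caller
        return fallback


if __name__ == "__main__":
    print(alternate_scheme("https://some-website.com"))
```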

Answered By: Michael Halim