Python Selenium Chromedriver not working with --headless option
Question:
I am running ChromeDriver to try and scrape some data off of a website. Everything works fine without the headless option. However, when I add the option the webdriver takes a very long time to load the URL, and when I try to find an element (one that is found when run without --headless), I receive an error.
Using print statements and getting the HTML after the URL "loaded", I find that there is no HTML; it's empty (see the output below).
import time

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as ec
from selenium.webdriver.support.ui import WebDriverWait

class Fidelity:
    def __init__(self):
        self.url = 'https://eresearch.fidelity.com/eresearch/gotoBL/fidelityTopOrders.jhtml'
        self.options = Options()
        self.options.add_argument("--headless")
        self.options.add_argument("--window-size=1500,1000")
        self.driver = webdriver.Chrome(executable_path=r'.\dependencies\chromedriver.exe', options=self.options)
        print("init")

    def initiate_browser(self):
        self.driver.get(self.url)
        time.sleep(5)
        script = self.driver.execute_script("return document.documentElement.outerHTML")
        print(script)
        print("got url")

    def find_orders(self):
        wait = WebDriverWait(self.driver, 15)
        data = wait.until(ec.visibility_of_element_located((By.CSS_SELECTOR, '[id*="t_trigger_TSLA"]')))  # ERROR ON THIS LINE
This is the entire output:
init
<html><head></head><body></body></html>
url
Traceback (most recent call last):
  File "C:\Users\Zachary\Documents\Python\Tesla Stock Info\Scraper.py", line 102, in <module>
    orders = scrape.find_tesla_orders()
  File "C:\Users\Zachary\Documents\Python\Tesla Stock Info\Scraper.py", line 75, in find_tesla_orders
    tesla = self.driver.find_element_by_xpath("//a[@href='https://qr.fidelity.com/embeddedquotes/redirect/research?symbol=TSLA']")
  File "C:\Program Files (x86)\Python37-32\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 394, in find_element_by_xpath
    return self.find_element(by=By.XPATH, value=xpath)
  File "C:\Program Files (x86)\Python37-32\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 978, in find_element
    'value': value})['value']
  File "C:\Program Files (x86)\Python37-32\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "C:\Program Files (x86)\Python37-32\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"//a[@href='https://qr.fidelity.com/embeddedquotes/redirect/research?symbol=TSLA']"}
  (Session info: headless chrome=74.0.3729.169)
  (Driver info: chromedriver=74.0.3729.6 (255758eccf3d244491b8a1317aa76e1ce10d57e9-refs/branch-heads/3729@{#29}),platform=Windows NT 10.0.17763 x86_64)
New error with updated code:
init
<html><head></head><body></body></html>
url
Traceback (most recent call last):
  File "C:\Users\Zachary\Documents\Python\Tesla Stock Info\Scraper.py", line 104, in <module>
    orders = scrape.find_tesla_orders()
  File "C:\Users\Zachary\Documents\Python\Tesla Stock Info\Scraper.py", line 76, in find_tesla_orders
    tesla = wait.until(ec.visibility_of_element_located((By.CSS_SELECTOR, '[id*="t_trigger_TSLA"]')))
  File "C:\Program Files (x86)\Python37-32\lib\site-packages\selenium\webdriver\support\wait.py", line 80, in until
    raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message:
I have tried finding the answer to this through Google, but none of the suggestions work. Is anyone else having this issue with certain websites? Any help is appreciated.
Update
Unfortunately, this script still does not work; for some reason the webdriver is not loading the page correctly while headless, even though everything works perfectly without the headless option.
Answers:
Add an explicit wait. You should also use another locator: the current one matches 3 elements. The element has a unique id attribute.
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as ec
from selenium.webdriver.common.by import By
wait = WebDriverWait(self.driver, timeout)
data = wait.until(ec.visibility_of_element_located((By.CSS_SELECTOR, '[id*="t_trigger_TSLA"]')))
For anyone in the future who is wondering about the fix for this: some websites just don't load correctly with Chrome's headless option. I don't think there is a way to fix this. Just use a different browser (like Firefox). Thanks to user8426627 for this.
Have you tried using a User-Agent?
I was experiencing the same error. The first thing I did was dump the page's HTML source in both headless and normal mode:
html = driver.page_source
with open("foo.html", "w") as file:
    file.write(html)
The HTML source for headless mode was a short file with this line near the end: "The page cannot be displayed. Please contact the administrator for additional information."
But the normal mode was the expected HTML.
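To pin down exactly what differs between the two dumps, a stdlib-only comparison works; this sketch inlines two hypothetical snippets standing in for the saved files:

```python
import difflib

# Hypothetical stand-ins for the two saved dumps (headless vs. normal mode).
headless_html = "<html><head></head><body>The page cannot be displayed.</body></html>"
normal_html = "<html><head></head><body><table id='t_trigger_TSLA'>...</table></body></html>"

# unified_diff works on lists of lines; splitlines keeps the example self-contained.
diff = list(difflib.unified_diff(
    headless_html.splitlines(), normal_html.splitlines(),
    fromfile="headless", tofile="normal", lineterm=""))
for line in diff:
    print(line)
```

With real files, read each with `open(...).read().splitlines()` instead of the inline strings.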
I solved the issue by adding a User-Agent:
user_agent = 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.2 (KHTML, like Gecko) Chrome/22.0.1216.0 Safari/537.2'
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument(f'user-agent={user_agent}')
driver = webdriver.Chrome(executable_path="your_path", chrome_options=chrome_options)
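As a Selenium-free sanity check, you can verify what User-Agent string a server actually receives. This sketch spins up a one-shot stdlib HTTP server that records the header, then sends it a request carrying the spoofed UA from above (a real browser configured with that flag would send the same header):

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

seen = {}  # records what the server observed

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        seen["user_agent"] = self.headers.get("User-Agent")
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok")

    def log_message(self, *args):  # silence per-request logging
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0 = pick any free port
threading.Thread(target=server.handle_request, daemon=True).start()

ua = "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.2 (KHTML, like Gecko) Chrome/22.0.1216.0 Safari/537.2"
req = urllib.request.Request(f"http://127.0.0.1:{server.server_port}/",
                             headers={"User-Agent": ua})
urllib.request.urlopen(req).read()
server.server_close()

print(seen["user_agent"])
```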
Try setting the window size as well as being headless. Add this:
chromeOptions.add_argument("--window-size=1920,1080")
The default size of the headless browser is tiny. If the code works when headless is not enabled, it might be because the element you want is outside the window.
some websites just don’t load correctly with the headless option of chrome.
The previous statement is actually wrong. I just ran into this problem where Chrome wasn't detecting the elements. When I saw @LuckyZakary's answer I was shocked, because someone had built a scraper for the same website with Node.js and didn't get this error.
@AtulGumar's answer helped on Windows, but on an Ubuntu server it failed, so it wasn't enough. After reading this, all the way to the bottom, what @AtulGumar missed was adding the --disable-gpu flag.
So it works for me on Windows and on an Ubuntu server with no GUI with these options:
webOptions = webdriver.ChromeOptions()
webOptions.headless = True
webOptions.add_argument("--window-size=1920,1080")
webOptions.add_argument("--disable-gpu")
driver = webdriver.Chrome(options=webOptions)
I also installed xvfb
and other packages as suggested here:
sudo apt-get -y install xorg xvfb gtk2-engines-pixbuf
sudo apt-get -y install dbus-x11 xfonts-base xfonts-100dpi xfonts-75dpi xfonts-cyrillic xfonts-scalable
and executed:
Xvfb -ac :99 -screen 0 1280x1024x16 &
export DISPLAY=:99