Selenium with chromedriver doesn't start via cron
Question:
A Python script using Selenium and Chromedriver in headless mode on CentOS 7 runs fine when called manually.
options = webdriver.ChromeOptions()
options.add_argument('headless')
options.add_argument('no-sandbox')
self.driver = webdriver.Chrome(chrome_options=options)
When the script is started from crontab, however, it throws this exception at line 4 of the snippet above (full traceback at the bottom):
selenium.common.exceptions.WebDriverException: Message: unknown error: Chrome failed to start: exited abnormally (Driver info: chromedriver=2.38.552522
Cron is set up with crontab -e:
* * * * * cd /to/path && /to/path/.virtualenvs/selenium/bin/python /to/path/script.py -t arg1 arg2 > /to/path/log.txt 2>&1
This initially produced errors such as chromedriver not being found. I then added the following to crontab -e:
1) Use bash instead of sh, although starting the Python script manually from sh works fine
2) Specify the path to chromedriver
SHELL=/bin/bash
PATH=/usr/local/bin/
I tried different suggestions found on the web, like adding the --no-sandbox option to chromedriver in my script. None of them helped. Please note that I am using Chrome in headless mode, so I think I don’t need the export DISPLAY=:0 setting in cron, or the Xvfb libs, as used to be necessary.
Python 3.6.1
Selenium 3.4.3
Chromedriver 2.38.552522
google-chrome-stable 65.0.3325.181
Full traceback
Exception in thread <name>:
Traceback (most recent call last):
File "/usr/lib64/python3.6/threading.py", line 916, in _bootstrap_inner
self.run()
File "/usr/lib64/python3.6/threading.py", line 864, in run
self._target(*self._args, **self._kwargs)
File "/path/to/script.py", line 53, in start
self.site_scrape(test_run)
File "/path/to/script.py", line 65, in site
self.driver = webdriver.Chrome(chrome_options=options)
File "/home/<user>/.virtualenvs/selenium/lib64/python3.6/site-packages/selenium/webdriver/chrome/webdriver.py", line 69, in __init__
desired_capabilities=desired_capabilities)
File "/home/<user>/.virtualenvs/selenium/lib64/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 98, in __init__
self.start_session(desired_capabilities, browser_profile)
File "/home/<user>/.virtualenvs/selenium/lib64/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 188, in start_session
response = self.execute(Command.NEW_SESSION, parameters)
File "/home/<user>/.virtualenvs/selenium/lib64/python3.6/site-packages/selenium/webdriver/remote/webdriver.py", line 256, in execute
self.error_handler.check_response(response)
File "/home/<user>/.virtualenvs/selenium/lib64/python3.6/site-packages/selenium/webdriver/remote/errorhandler.py", line 194, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: unknown error: Chrome failed to start: exited abnormally
(Driver info: chromedriver=2.38.552522 (437e6fbedfa8762dec75e2c5b3ddb86763dc9dcb),platform=Linux 4.14.12-x86_64-linode92 x86_64)
Answers:
Finally found the solution. Boy, this was bugging me for far too long. The issue was that the following directories were missing from PATH in cron: /usr/bin and /usr/sbin. The complete crontab now looks like this:
SHELL=/bin/bash
PATH=/usr/local/bin/:/usr/bin:/usr/sbin
* * * * * cd /to/path && /to/path/.virtualenvs/selenium/bin/python /to/path/script.py -t arg1 arg2 > /to/path/log.txt 2>&1
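If you suspect the same PATH problem, one way to confirm it is to check from inside the script which executables actually resolve under cron's environment. The following is only a minimal sketch using the standard library; the helper name find_missing and the binary names in the list are assumptions for illustration, so adjust them to your installation.

```python
import os
import shutil


def find_missing(names, path=None):
    """Return the executable names that cannot be resolved on the PATH string."""
    if path is None:
        path = os.environ.get("PATH", "")
    return [name for name in names if shutil.which(name, path=path) is None]


if __name__ == "__main__":
    # Binary names are assumptions; use the ones your setup relies on.
    wanted = ["chromedriver", "google-chrome"]
    print("PATH =", os.environ.get("PATH", ""))
    print("missing:", find_missing(wanted))
```

Run it once from your terminal and once from cron (redirecting output to a log file as in the crontab line above) and compare the two results.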
What helped for me were the following steps:
- Add ‘DISPLAY=:1’ to my crontab
- Set the correct shell in the crontab (I use ‘zsh’)
- Source the environment variables since crontab does not do this by default
- Use absolute path to the chromedriver
TLDR
The required modifications to your crontab would be:
SHELL=/bin/zsh
05 * * * * export DISPLAY=:<displayNumber> && source /home/<username>/.zshrc && cd <absoluteExecutableDirectory> && ./<pythonFile> >> log.log 2>&1
Use the following line to initialize the Selenium Chrome driver:
driver = webdriver.Chrome(<absoluteDriverPath>,...)
Replace everything within the angle brackets with the respective values.
1. Set display
To find out which display to add to your crontab, use:
env | grep 'DISPLAY'
Then add this piece to your crontab command:
export DISPLAY=:1
2. Set shell
If you use a non-default shell, set it in the crontab.
Find the location of your shell with one of these two commands:
which bash
which zsh
Then set the shell in your crontab to the output of the previous command:
SHELL=/bin/zsh
3. Source the environment variables
Add one of the following to your crontab command, depending on whether you use zsh or bash:
source /home/<username>/.zshrc
source /home/<username>/.bashrc
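To see which variables cron is actually missing, it can help to dump the environment from both contexts and diff the results. This is a rough sketch, not part of the original steps; the function name format_env is made up for illustration.

```python
import os


def format_env(env):
    """Render a mapping of environment variables as sorted KEY=value lines."""
    return "\n".join(f"{key}={env[key]}" for key in sorted(env))


if __name__ == "__main__":
    # Run once from an interactive shell and once from cron, each time
    # redirecting the output to a different file, then diff the two files.
    print(format_env(os.environ))
```

Variables that appear only in the interactive dump (PATH entries, DISPLAY, and so on) are the candidates to set in the crontab.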
4. Use an absolute path for your chromedriver
When initializing the driver, use the following line, where <absoluteDriverPath> points to the Selenium Chrome driver:
driver = webdriver.Chrome(<absoluteDriverPath>,options=options)
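Before handing the path to Selenium, you could check that it really is absolute and executable, so a bad path shows up as a clear message in the cron log. The sketch below is only an illustration; is_usable_driver is a hypothetical helper, and the path in the example is an assumption.

```python
import os


def is_usable_driver(driver_path):
    """Return True if driver_path is absolute and points to an executable file."""
    return (
        os.path.isabs(driver_path)
        and os.path.isfile(driver_path)
        and os.access(driver_path, os.X_OK)
    )


if __name__ == "__main__":
    # Hypothetical location; replace with your own <absoluteDriverPath>.
    candidate = "/usr/local/bin/chromedriver"
    print(candidate, "usable:", is_usable_driver(candidate))
```

If the check passes, pass the same string to webdriver.Chrome as in the line above.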
Other
>> log.log 2>&1
This means that all output, including errors, is appended to a file, which makes debugging cron jobs easier.
On Ubuntu 18.04, Python 3.6.9:
It’s on my main workstation, so I always have a logged-in X session running.
My Selenium invocation:
#!/usr/bin/python3
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
driver = webdriver.Firefox()
location = 'http://example.com'
driver.get(location)
What I put in crontab:
33 14 * * * DISPLAY=:0 lxterminal --working-directory=/home/user/Documents/ -e /home/user/bin/get.stats.by.zip.py > /home/user/gbzp.log 2>&1
lxterminal is a relatively simple terminal program, installed with apt-get, that I tend to use for most terminal work instead of, say, gnome-terminal.
I don’t know if I’m late, but I think my solution will help a lot of people. I found it after about an hour and a half of trying. The steps are:
- Open Terminal and run the command:
echo $DISPLAY
Suppose the output of this command is :0
- Open the crontab for editing:
crontab -e
- Inside the crontab, append these lines:
DISPLAY=:0
## If output of "echo $DISPLAY" is: ":1", then change the above line to: "DISPLAY=:1" (without quotes)
## if running the python file every 2 minutes:
# If firefox is used for selenium automation:
*/2 * * * * export PATH=$PATH:path_to_python_executable_folder:geckodriver_folder_path_for_firefox; python path_to_your_python_script.py
# If chrome is used for selenium automation:
*/2 * * * * export PATH=$PATH:path_to_python_executable_folder:chromedriver_folder_path; python path_to_your_python_script.py
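As an alternative to setting DISPLAY in the crontab, the script itself could fall back to a default display when none is set. This is just a sketch under the assumption that the driver process inherits the environment of the Python process that starts it; ensure_display is a hypothetical helper and ":0" is an assumed default.

```python
import os


def ensure_display(env, default=":0"):
    """Set DISPLAY in env only if it is missing or empty; return the value in use."""
    if not env.get("DISPLAY"):
        env["DISPLAY"] = default
    return env["DISPLAY"]


if __name__ == "__main__":
    # ":0" is an assumption; use whatever `echo $DISPLAY` prints in your session.
    # Call this before creating the webdriver so the driver starts with it set.
    print("DISPLAY =", ensure_display(os.environ))
```

An existing DISPLAY value is left untouched, so running the script manually behaves as before.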