Selenium leaves behind running processes?

Question:

When my Selenium program crashes because of an error, it seems to leave running processes behind.

For example, here is my process list:

carol    30186  0.0  0.0 103576  7196 pts/11   Sl   00:45   0:00 /home/carol/test/chromedriver --port=51789
carol    30322  0.0  0.0 102552  7160 pts/11   Sl   00:45   0:00 /home/carol/test/chromedriver --port=33409
carol    30543  0.0  0.0 102552  7104 pts/11   Sl   00:48   0:00 /home/carol/test/chromedriver --port=42567
carol    30698  0.0  0.0 102552  7236 pts/11   Sl   00:50   0:00 /home/carol/test/chromedriver --port=46590
carol    30938  0.0  0.0 102552  7496 pts/11   Sl   00:55   0:00 /home/carol/test/chromedriver --port=51930
carol    31546  0.0  0.0 102552  7376 pts/11   Sl   01:16   0:00 /home/carol/test/chromedriver --port=53077
carol    31549  0.5  0.0      0     0 pts/11   Z    01:16   0:03 [chrome] <defunct>
carol    31738  0.0  0.0 102552  7388 pts/11   Sl   01:17   0:00 /home/carol/test/chromedriver --port=55414
carol    31741  0.3  0.0      0     0 pts/11   Z    01:17   0:02 [chrome] <defunct>
carol    31903  0.0  0.0 102552  7368 pts/11   Sl   01:19   0:00 /home/carol/test/chromedriver --port=54205
carol    31906  0.6  0.0      0     0 pts/11   Z    01:19   0:03 [chrome] <defunct>
carol    32083  0.0  0.0 102552  7292 pts/11   Sl   01:20   0:00 /home/carol/test/chromedriver --port=39083
carol    32440  0.0  0.0 102552  7412 pts/11   Sl   01:24   0:00 /home/carol/test/chromedriver --port=34326
carol    32443  1.7  0.0      0     0 pts/11   Z    01:24   0:03 [chrome] <defunct>
carol    32691  0.1  0.0 102552  7360 pts/11   Sl   01:26   0:00 /home/carol/test/chromedriver --port=36369
carol    32695  2.8  0.0      0     0 pts/11   Z    01:26   0:02 [chrome] <defunct>

Here is my code:

from selenium import webdriver

browser = webdriver.Chrome("path/to/chromedriver")
browser.get("http://stackoverflow.com")
browser.find_element_by_id('...').click()

browser.close()

Sometimes, the browser doesn’t load the webpage elements quickly enough so Selenium crashes when it tries to click on something it didn’t find. Other times it works fine.

This is a simplified example, but with a more complex Selenium program, what is a guaranteed clean way of exiting without leaving running processes behind? It should exit cleanly both on an unexpected crash and on a successful run.

Asked By: warchest

Answers:

Chromedriver.exe crowds the Task Manager (on Windows) every time Selenium runs on Chrome. Sometimes it is not cleaned up even when the browser didn't crash.

I usually run a bat file or a cmd to kill all the existing chromedriver.exe processes before launching another one.

Take a look at this: release Selenium chromedriver.exe from memory

  • I know this is a Unix-related question but I am sure the way it has been handled in Windows can be applied there.
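A rough Python equivalent of such a cleanup script, as a sketch only: the helper names kill_command and kill_stale_drivers are hypothetical, and it assumes taskkill is available on Windows and pkill on Linux or macOS.

```python
import subprocess
import sys


def kill_command(image_name, platform=sys.platform):
    """Build the command that kills every process with the given name."""
    if platform.startswith("win"):
        # /F forces termination, /IM selects by image name, /T includes child processes
        return ["taskkill", "/F", "/IM", f"{image_name}.exe", "/T"]
    # pkill matches its pattern against the process name on Linux and macOS
    return ["pkill", image_name]


def kill_stale_drivers(image_name="chromedriver"):
    """Kill leftover driver processes; a non-zero exit code usually means none were running."""
    return subprocess.call(kill_command(image_name))
```

Calling kill_stale_drivers() before launching a new driver mirrors the bat-file approach. Note that it kills every matching process for the user, including drivers that belong to other runs still in progress.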
Answered By: Grace A

What's happening is that your code is throwing an exception, halting the Python process from continuing on. As such, the close/quit methods never get called on the browser object, so the chromedrivers just hang out indefinitely.

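You need to use a try/finally block to ensure the quit method is called every time, whether or not an exception is thrown. (An except block alone would only clean up after a failure and would leave the driver running after a successful run.) A very simplistic example is:

from selenium import webdriver

browser = webdriver.Chrome("path/to/chromedriver")
try:
    browser.get("http://stackoverflow.com")
    browser.find_element_by_id('...').click()
finally:
    # Runs on success and on failure. quit() ends the session and stops
    # chromedriver; close() only closes the current window.
    browser.quit()  # I exclusively use quit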

There are a number of much more sophisticated approaches you can take here, such as creating a context manager to use with the with statement, but it's difficult to recommend one without having a better understanding of your codebase.
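As an illustration of the context manager idea, here is a minimal sketch. The names managed_driver and factory are hypothetical; the only assumption about the driver is that it has a quit() method, so the helper itself does not import Selenium.

```python
from contextlib import contextmanager


@contextmanager
def managed_driver(factory):
    """Create a driver with factory() and always quit it when the block ends."""
    driver = factory()
    try:
        yield driver
    finally:
        # Runs on normal exit and when the with-block raises
        driver.quit()
```

With Selenium this could be used as `with managed_driver(lambda: webdriver.Chrome("path/to/chromedriver")) as browser:`, with the page interactions placed inside the block. Any exception still propagates to the caller, but only after quit() has run.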

Answered By: Levi Noecker

As already pointed out, you should run browser.quit().

But on Linux (inside Docker) this will leave defunct processes. These are typically not much of a problem, as they are merely entries in the process table and consume no other resources. But if you have many of them, you will run out of process IDs. Typically my server melts down at 65k processes.

It looks like this:

root@dockerhost1:~/odi/docker/bf1# ps -ef | grep -i defunct | wc -l
28599

root@dockerhost1:~/odi/docker/bf1# ps -ef | grep -i defunct | tail
root     32757 10839  0 Oct18 ?        00:00:00 [chrome] <defunct>
root     32758   895  0 Oct18 ?        00:00:02 [chrome] <defunct>
root     32759 15393  0 Oct18 ?        00:00:00 [chrome] <defunct>
root     32760 13849  0 01:23 ?        00:00:00 [chrome] <defunct>
root     32761   472  0 Oct18 ?        00:00:00 [chrome] <defunct>
root     32762 19360  0 01:35 ?        00:00:00 [chrome] <defunct>
root     32763 30701  0 00:34 ?        00:00:00 [chrome] <defunct>
root     32764 17556  0 Oct18 ?        00:00:00 [chrome] <defunct>
root     32766  8102  0 00:49 ?        00:00:00 [cat] <defunct>
root     32767  9490  0 Oct18 ?        00:00:00 [chrome] <defunct>

The following code will solve the problem:

import logging
import os

log = logging.getLogger(__name__)


def quit_driver_and_reap_children(driver):
    log.debug('Quitting session: %s', driver.session_id)
    driver.quit()
    try:
        while True:
            # Reap any exited child without blocking
            pid, _status = os.waitpid(-1, os.WNOHANG)
            # Wonka's fix to avoid an infinite loop: waitpid returns (0, 0)
            # when children still exist but none of them has exited yet
            if pid == 0:
                break
            log.debug("Reaped child: %s", pid)
    except ChildProcessError:
        # Raised when there are no child processes left to wait for
        pass
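To see why waiting on children removes the defunct entries, here is a small Linux-only sketch. It assumes /proc is mounted; process_state and zombie_demo are hypothetical helper names. It forks a child that exits immediately, checks that the child shows up in state Z, and then reaps it.

```python
import os
import time


def process_state(pid):
    """Return the one-letter state from /proc/<pid>/stat, or None if the entry is gone."""
    try:
        with open(f"/proc/{pid}/stat") as stat_file:
            # The state letter is the first field after the parenthesised command name
            return stat_file.read().rsplit(")", 1)[1].split()[0]
    except FileNotFoundError:
        return None


def zombie_demo(timeout=5.0):
    """Fork a child that exits at once; return its state before and after reaping."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: exit immediately without running the parent's cleanup code

    # Parent: poll until the kernel marks the exited child as a zombie ("Z")
    deadline = time.monotonic() + timeout
    while process_state(pid) != "Z" and time.monotonic() < deadline:
        time.sleep(0.01)
    before = process_state(pid)

    os.waitpid(pid, 0)  # reaping removes the process-table entry
    after = process_state(pid)
    return before, after
```

The child stays in state Z until its parent calls waitpid, which is the same state that ps reports as defunct.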
Answered By: Jimmy Engelbrecht

I see this is a pretty old thread, but maybe my case will be useful for somebody.
For various reasons I had to run a lot of scrapers in a Docker container with Xvfb, each request using a separate webdriver instance with a headful (non-headless) browser. Every request produced 2-3 zombie processes with Firefox (and 12 with Chromedriver).
So after a few minutes of scraping I had thousands of zombie processes.
driver.close() and driver.quit() did not help.
Jimmy Engelbrecht's solution is better, but it cleaned up only some of the processes.
So the only working method for me was to enable init in the Docker container.

docker run --init container

It protects you from software that accidentally creates zombie processes, which can (over time!) starve your entire system for PIDs (and make it unusable).

Answered By: sganabis

I encountered the same problem running chromedriver in Docker: when quit() is called, chromedriver becomes a zombie process.
I used dumb-init to solve my problem. I guess this problem is not specific to chromedriver. It comes from how Docker works: a container normally runs the application itself as PID 1 instead of an init system, so orphaned child processes are never reaped.

Add to the Dockerfile:

RUN wget https://github.com/Yelp/dumb-init/releases/download/v1.2.5/dumb-init_1.2.5_amd64.deb
RUN sudo dpkg -i dumb-init_*.deb
ENTRYPOINT ["/usr/bin/dumb-init", "--", "./entrypoint.sh"]

entrypoint.sh:

#!/bin/sh
echo "Arguments used: $*"
exec java -jar $JAR_NAME "$@"

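Both ENTRYPOINT and exec matter here: ENTRYPOINT makes dumb-init run as PID 1, and exec replaces the shell with the java process so that it runs as a direct child of dumb-init and receives the signals dumb-init forwards.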

Answered By: Zhen.Yu

I don’t think that’s a problem for the OP but it may help someone else landing here with a similar problem: deleting the arg no-sandbox fixed the issue for me (source)

C#:

public static string GetPageHeadless(string url, out string redirectedUrl)
{
    var options = new ChromeOptions();
    options.AddArguments(new List<string>() { "headless", "disable-gpu" });
    using var service = ChromeDriverService.CreateDefaultService();
    service.HideCommandPromptWindow = true;
    using var browser = new ChromeDriver(service, options);
    browser.Navigate().GoToUrl(url);
    var html = browser.ExecuteScript("return document.body.parentElement.outerHTML");

    redirectedUrl = browser.Url;
    browser.Quit();
    return html.ToString();
}
Answered By: Johann

Sometimes driver.close() or driver.quit() still leaves zombie processes behind. On Windows, you can kill those using taskkill like this:

import subprocess

subprocess.call("TASKKILL /f  /IM  CHROME.EXE /T")
subprocess.call("TASKKILL /f  /IM  CHROMEDRIVER.exe /T")

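In cases where the image name is not conventional (in my case I used undetected_chromedriver), I handle it like this, since taskkill does not accept a wildcard at the start of the string:

# List running processes and keep only the "Image Name:" lines that mention chrome
processes = subprocess.getoutput('tasklist /fo list | findstr "chrome"')
# Each line looks like "Image Name:   chrome.exe"; the third field is the image name
image_names = {line.split()[2] for line in processes.splitlines() if len(line.split()) > 2}
for image_name in image_names:
    subprocess.call(f"TASKKILL /f  /IM  {image_name} /T")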
  • /f: force kill
  • /im: image name
  • /t: kill children too

More in the taskkill documentation.

Answered By: Mjho