Python Code Coverage and Multiprocessing
Question:
I use coveralls in combination with coverage.py to track Python code coverage of my test scripts. I use the following commands:
coverage run --parallel-mode --source=mysource --omit=*/stuff/idont/need.py ./mysource/tests/run_all_tests.py
coverage combine
coveralls --verbose
This works quite nicely with the exception of multiprocessing. Code executed by worker pools or child processes is not tracked.
Is there a possibility to also track multiprocessing code? Any particular option I am missing? Maybe adding wrappers to the multiprocessing library to start coverage every time a new process is spawned?
EDIT:
I (and jonrsharpe, also 🙂) found a monkey-patch for multiprocessing.
However, this does not work for me: my Travis-CI build is killed almost right after the start. I checked the problem on my local machine, and apparently adding the patch to multiprocessing blows up my memory usage. Tests that take much less than 1GB of memory need more than 16GB with this fix.
EDIT2:
The monkey-patch does work after a small modification: removing the config_file parsing (config_file=os.environ['COVERAGE_PROCESS_START']) did the trick. This solved the issue of the bloated memory. Accordingly, the corresponding line simply becomes:
cov = coverage(data_suffix=True)
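For context, here is a minimal sketch of what such a patch might look like with the modified line in place. It is not the exact patch linked above: the names make_coverage_process and ProcessWithCoverage are made up for illustration, it uses the coverage.Coverage class name from Coverage 4.0 and later, it relies on the private Process._bootstrap hook, and it assumes the fork start method (a class defined inside a function cannot be pickled for spawn).

```python
import multiprocessing


def make_coverage_process():
    """Return a Process subclass that records coverage in the child.

    Returns None when coverage.py is not installed, so callers can skip
    the patch. Hypothetical helper, for illustration only.
    """
    try:
        import coverage  # third-party package; optional here
    except ImportError:
        return None

    class ProcessWithCoverage(multiprocessing.Process):
        def _bootstrap(self, *args, **kwargs):
            # data_suffix=True gives every process its own data file,
            # so `coverage combine` can merge them afterwards.
            # No config_file argument: this is the modification from EDIT2.
            cov = coverage.Coverage(data_suffix=True)
            cov.start()
            try:
                # _bootstrap is a private hook that runs the child's target.
                return super()._bootstrap(*args, **kwargs)
            finally:
                cov.stop()
                cov.save()

    return ProcessWithCoverage
```

Code that starts children through this class (or through multiprocessing.Process after rebinding that name to it) would then write one data file per child. Whether a worker pool picks up the replacement depends on how the pool creates its processes, so treat this as a starting point rather than a complete fix.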
Answers:
Coverage 4.0 includes a command-line option --concurrency=multiprocessing to deal with this. You must use coverage combine afterward. For instance, if your tests are in regression_tests.py, then you would simply do this at the command line:
coverage run --concurrency=multiprocessing regression_tests.py
coverage combine
I spent some time trying to make coverage work with multiprocessing.Pool, but it never did.
I have finally made a fix that makes it work. I would be happy if someone pointed out anything I am doing wrong.
https://gist.github.com/andreycizov/ee59806a3ac6955c127e511c5e84d2b6
One possible cause of missing coverage data from forked processes, even with concurrency=multiprocessing, is the way the multiprocessing.Pool is shut down. For example, the with statement leads to a terminate() call (see Pool.__exit__ in the multiprocessing source). As a consequence, pool workers have no time to save coverage data. I had to use a close(), timed join() (in a thread), terminate() sequence instead of with to get coverage results saved.
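The shutdown sequence described above can be sketched roughly as follows. This is an illustration under assumptions, not the answerer's actual code: the names square, shutdown_pool and run_squares and the 10-second default timeout are made up. Pool.join() takes no timeout argument, which is why the join runs in a helper thread.

```python
import multiprocessing
import threading


def square(x):
    # Hypothetical worker function, used only for illustration.
    return x * x


def shutdown_pool(pool, timeout=10.0):
    """Close the pool, wait up to `timeout` seconds, then terminate.

    Returns True if the workers exited on their own within the timeout.
    """
    pool.close()  # no new tasks; workers exit once the queue drains
    joiner = threading.Thread(target=pool.join, daemon=True)
    joiner.start()
    joiner.join(timeout)  # bounded wait on the otherwise unbounded pool.join()
    finished = not joiner.is_alive()
    pool.terminate()  # last resort if workers are still running
    return finished


def run_squares(values, timeout=10.0):
    # Deliberately avoids `with multiprocessing.Pool(...)`, whose exit
    # calls terminate() straight away.
    pool = multiprocessing.Pool(processes=2)
    try:
        results = pool.map(square, values)
    finally:
        clean = shutdown_pool(pool, timeout)
    return results, clean
```

Call run_squares from under an if __name__ == "__main__": guard so it also works with the spawn start method. Workers that exit on their own after close() get the chance to write their coverage data, while terminate() stays as a fallback for a hung worker.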