Python: Wait on all of `concurrent.futures.ThreadPoolExecutor`'s futures
Question:
I’ve given `concurrent.futures.ThreadPoolExecutor` a bunch of tasks, and I want to wait until they’re all completed before proceeding with the flow. How can I do that without having to save all the futures and call `wait` on them? (I want an action on the executor.)
Answers:
Bakuriu’s answer is correct. Just to extend it a little: as we all know, a context manager has `__enter__` and `__exit__` methods. Here is how the class `Executor` (`ThreadPoolExecutor`’s base class) is defined:

    class Executor(object):
        # other methods

        def shutdown(self, wait=True):
            """Clean-up the resources associated with the Executor.

            It is safe to call this method several times. Otherwise, no other
            methods can be called after this one.

            Args:
                wait: If True then shutdown will not return until all running
                    futures have finished executing and the resources used by the
                    executor have been reclaimed.
            """
            pass

        def __enter__(self):
            return self

        def __exit__(self, exc_type, exc_val, exc_tb):
            self.shutdown(wait=True)
            return False

And it is `ThreadPoolExecutor` that actually defines the `shutdown` method:

    class ThreadPoolExecutor(_base.Executor):
        def shutdown(self, wait=True):
            with self._shutdown_lock:
                self._shutdown = True
                self._work_queue.put(None)
            if wait:
                for t in self._threads:
                    t.join()

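To see what this means in practice, here is a minimal sketch of leaving a `with` block without keeping any futures. The `slow_task` function and the `completed` list are made up for illustration; they are not part of the library.

```python
import time
from concurrent.futures import ThreadPoolExecutor

completed = []  # list.append is atomic in CPython, so no lock is used here


def slow_task(n):
    # Hypothetical task: sleep briefly, then record that it ran.
    time.sleep(0.05)
    completed.append(n)


with ThreadPoolExecutor(max_workers=4) as executor:
    for n in range(8):
        executor.submit(slow_task, n)  # the returned futures are discarded

# __exit__ called shutdown(wait=True), so every submitted task has run by now.
print(sorted(completed))
```

Since `__exit__` only returns once the worker threads have been joined, the code after the block can rely on all eight tasks having run.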
Just call `Executor.shutdown`:

`shutdown(wait=True)`

Signal the executor that it should free any resources that it is using when the currently pending futures are done executing. Calls to `Executor.submit()` and `Executor.map()` made after shutdown will raise `RuntimeError`.

If `wait` is `True` then this method will not return until all the pending futures are done executing and the resources associated with the executor have been freed.
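As a rough sketch of the behaviour described in that excerpt (with a hypothetical `slow_task`, and with the futures kept only so the result can be checked afterwards):

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_task(n):
    # Hypothetical task used only for illustration.
    time.sleep(0.05)
    return n * 2


executor = ThreadPoolExecutor(max_workers=2)
futures = [executor.submit(slow_task, n) for n in range(4)]

# Blocks until every pending task has finished.
executor.shutdown(wait=True)
all_done = all(f.done() for f in futures)

# The executor cannot be reused after shutdown: submit() raises RuntimeError.
try:
    executor.submit(slow_task, 99)
    rejected = False
except RuntimeError:
    rejected = True

print(all_done, rejected)
```

The trade-off is that the executor is finished after this call, so a new one has to be created for any later batch of work.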
However, if you keep track of your futures in a list, you can avoid shutting the executor down, and keep it available for later use, by using the `futures.wait()` function:
`concurrent.futures.wait(fs, timeout=None, return_when=ALL_COMPLETED)`

Wait for the `Future` instances (possibly created by different `Executor` instances) given by `fs` to complete. Returns a named 2-tuple of sets. The first set, named `done`, contains the futures that completed (finished or were cancelled) before the wait completed. The second set, named `not_done`, contains uncompleted futures.
Note that if you don’t provide a `timeout` (and leave `return_when` at its default of `ALL_COMPLETED`), it waits until all futures have completed.
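A small sketch of this approach, assuming a made-up `slow_task` and an executor that is reused after the wait:

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def slow_task(n):
    # Hypothetical task used only for illustration.
    time.sleep(0.05)
    return n + 1


executor = ThreadPoolExecutor(max_workers=2)
first_batch = [executor.submit(slow_task, n) for n in range(4)]

# No timeout and the default return_when: blocks until all four are complete.
done, not_done = wait(first_batch)

# The executor was not shut down, so it can still accept new work.
second_result = executor.submit(slow_task, 10).result()

executor.shutdown()
print(len(done), len(not_done), second_result)
```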
You can also use `futures.as_completed()` instead; however, you’d have to iterate over it.
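For comparison, a sketch of the `as_completed()` variant (again with a hypothetical `slow_task`). The loop only ends once every future has completed, so finishing the iteration doubles as the wait:

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def slow_task(n):
    # Hypothetical task: larger inputs take slightly longer.
    time.sleep(0.01 * n)
    return n * n


executor = ThreadPoolExecutor(max_workers=3)
futures = [executor.submit(slow_task, n) for n in range(5)]

results = []
for future in as_completed(futures):
    # Futures are yielded as they finish, not in submission order.
    results.append(future.result())

executor.shutdown()
print(sorted(results))
```

This is mostly useful when you want to handle each result as soon as it is ready instead of waiting for the whole batch.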
As stated before, one can use `Executor.shutdown(wait=True)`, but also pay attention to the following note in the documentation:

You can avoid having to call this method explicitly if you use the `with` statement, which will shutdown the `Executor` (waiting as if `Executor.shutdown()` were called with `wait` set to `True`):

    import shutil
    from concurrent.futures import ThreadPoolExecutor

    # Leaving the with block waits for all four copies to finish.
    with ThreadPoolExecutor(max_workers=4) as e:
        e.submit(shutil.copy, 'src1.txt', 'dest1.txt')
        e.submit(shutil.copy, 'src2.txt', 'dest2.txt')
        e.submit(shutil.copy, 'src3.txt', 'dest3.txt')
        e.submit(shutil.copy, 'src4.txt', 'dest4.txt')