When using asyncio, how do you allow all running tasks to finish before shutting down the event loop?

Question:

I have the following code:

@asyncio.coroutine
def do_something_periodically():
    while True:
        asyncio.async(my_expensive_operation())
        yield from asyncio.sleep(my_interval)
        if shutdown_flag_is_set:
            print("Shutting down")
            break

I run this function until complete. The problem occurs when the shutdown flag is set: the function completes, and any pending tasks never run.

This is the warning asyncio prints for the abandoned task:

task: <Task pending coro=<report() running at script.py:33> wait_for=<Future pending cb=[Task._wakeup()]>>

How do I schedule a shutdown correctly?

To give some context, I’m writing a system monitor which reads from /proc/stat every 5 seconds, computes the cpu usage in that period, and then sends the result to a server. I want to keep scheduling these monitoring jobs until I receive sigterm, when I stop scheduling, wait for all current jobs to finish, and exit gracefully.
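The /proc/stat side of that monitor can be sketched roughly as follows. This is a hedged sketch, not the asker's code: the function names are my own, and it relies only on the documented layout of the first line of /proc/stat (aggregate CPU jiffies, with idle as the 4th field and iowait as the 5th):

```python
def read_cpu_times():
    """Return (idle, total) jiffies from the aggregate 'cpu' line of /proc/stat."""
    with open("/proc/stat") as f:
        fields = [int(x) for x in f.readline().split()[1:]]
    idle = fields[3] + fields[4]  # idle + iowait
    return idle, sum(fields)

def cpu_usage(prev, curr):
    """CPU usage fraction between two (idle, total) samples."""
    idle_delta = curr[0] - prev[0]
    total_delta = curr[1] - prev[1]
    return 1.0 - idle_delta / total_delta if total_delta else 0.0
```

Sampling read_cpu_times() every 5 seconds and feeding consecutive samples to cpu_usage() gives the per-interval usage described above.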

Asked By: derekdreery


Answers:

You can retrieve the unfinished tasks and run the loop again until they have finished, then close the loop or exit your program.

pending = asyncio.all_tasks(loop)  # pass the loop explicitly when no loop is running
loop.run_until_complete(asyncio.gather(*pending))
  • pending is the set of not-yet-finished tasks.
  • asyncio.gather() waits on several tasks at once.
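A self-contained sketch of that pattern (the job names and durations are illustrative): work is scheduled fire-and-forget, the main coroutine returns before it finishes, and a second run_until_complete() drains the leftovers.

```python
import asyncio

async def background_job(n):
    # Sleep long enough that no job finishes before main() returns.
    await asyncio.sleep(0.05)
    return n

async def main():
    for n in range(3):
        asyncio.create_task(background_job(n))  # fire and forget

loop = asyncio.new_event_loop()
loop.run_until_complete(main())

# main() has returned, but the background jobs are still pending on the loop.
pending = asyncio.all_tasks(loop)
results = loop.run_until_complete(asyncio.gather(*pending))
loop.close()
print(sorted(results))  # → [0, 1, 2]
```

Without the second run_until_complete(), closing the loop here would trigger the "Task was destroyed but it is pending!" warning from the question.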

If you want to ensure all the tasks are completed inside a coroutine (maybe you have a “main” coroutine), you can do it this way, for instance:

async def do_something_periodically():
    while True:
        asyncio.create_task(my_expensive_operation())
        await asyncio.sleep(my_interval)
        if shutdown_flag_is_set:
            print("Shutting down")
            break

    # exclude the current task, otherwise gather() waits on itself and never returns
    await asyncio.gather(*(asyncio.all_tasks() - {asyncio.current_task()}))

Also, in this case, since all the tasks are created in the same coroutine, you already have access to the tasks:

async def do_something_periodically():
    tasks = []
    while True:
        tasks.append(asyncio.create_task(my_expensive_operation()))
        await asyncio.sleep(my_interval)
        if shutdown_flag_is_set:
            print("Shutting down")
            break

    await asyncio.gather(*tasks)
Answered By: Martin Richard

As of Python 3.7 the code above uses multiple deprecated APIs (asyncio.async, Task.all_tasks, @asyncio.coroutine, yield from, etc.); you should instead write it like this:

import asyncio


async def my_expensive_operation(expense):
    print(await asyncio.sleep(expense, result="Expensive operation finished."))


async def do_something_periodically(expense, interval):
    while True:
        asyncio.create_task(my_expensive_operation(expense))
        await asyncio.sleep(interval)


loop = asyncio.get_event_loop()
coro = do_something_periodically(1, 1)

try:
    loop.run_until_complete(coro)
except KeyboardInterrupt:
    coro.close()
    tasks = asyncio.all_tasks(loop)
    # _coro is a private Task attribute; comparing coroutine names filters out this coroutine
    expensive_tasks = {task for task in tasks if task._coro.__name__ != coro.__name__}
    loop.run_until_complete(asyncio.gather(*expensive_tasks))

You might also consider using asyncio.shield, although doing it that way you won’t get ALL the running tasks finished, only the shielded ones. But it can still be useful in some scenarios.
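A small sketch of what asyncio.shield buys you (function names are illustrative): cancelling the shield wrapper raises CancelledError in the waiter, but the shielded inner task keeps running and can still be awaited.

```python
import asyncio

async def critical_work():
    await asyncio.sleep(0.05)
    return "done"

async def main():
    inner = asyncio.create_task(critical_work())
    shielded = asyncio.shield(inner)
    # Cancel only the shield wrapper shortly after starting.
    asyncio.get_running_loop().call_later(0.01, shielded.cancel)
    try:
        await shielded
    except asyncio.CancelledError:
        pass  # the wrapper was cancelled...
    # ...but the inner task survived the cancellation and still completes.
    return await inner

print(asyncio.run(main()))  # → done
```

Without the shield, cancelling the await would propagate into critical_work() and the second await would raise CancelledError instead.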

Besides that, as of Python 3.7 we can also use the high-level API method asyncio.run here, as Python core developer Yury Selivanov suggests:
https://youtu.be/ReXxO_azV-w?t=636
Note: the asyncio.run function was added to asyncio in Python 3.7 on a provisional basis.

Hope that helps!

import asyncio


async def my_expensive_operation(expense):
    print(await asyncio.sleep(expense, result="Expensive operation finished."))


async def do_something_periodically(expense, interval):
    while True:
        asyncio.create_task(my_expensive_operation(expense))
        # using asyncio.shield
        await asyncio.shield(asyncio.sleep(interval))


coro = do_something_periodically(1, 1)

if __name__ == "__main__":
    try:
        # using asyncio.run
        asyncio.run(coro)
    except KeyboardInterrupt:
        print('Cancelled!')
Answered By: Ramil Aglyautdinov

Use a wrapper coroutine that waits until the pending task count is 1 before returning.

async def loop_job():
    asyncio.create_task(do_something_periodically())
    while len(asyncio.all_tasks()) > 1:  # Any task besides loop_job() itself?
        await asyncio.sleep(0.2)

asyncio.run(loop_job())
Answered By: gilch

I’m not sure if this is what you’ve asked for, but I had a similar problem and here is the solution I eventually came up with.

The code is Python 3 compatible and uses only public asyncio APIs (meaning no hacky _coro and no deprecated APIs).

import asyncio

async def fn():
    await asyncio.sleep(1.5)
    print('fn')

async def main():
    print('main start')
    asyncio.create_task(fn()) # run in parallel
    await asyncio.sleep(0.2)
    print('main end')


def async_run_and_await_all_tasks(main):
    def get_pending_tasks():
        # asyncio.Task.all_tasks() is deprecated since 3.7; all_tasks() needs a running loop
        tasks = asyncio.all_tasks()
        pending = [task for task in tasks if task != run_main_task and not task.done()]
        return pending

    async def run_main():
        await main()

        while True:
            pending_tasks = get_pending_tasks()
            if len(pending_tasks) == 0:
                return
            await asyncio.gather(*pending_tasks)

    loop = asyncio.new_event_loop()
    run_main_coro = run_main()
    run_main_task = loop.create_task(run_main_coro)
    loop.run_until_complete(run_main_task)

# asyncio.run(main()) # doesn't print from fn task, because main finishes earlier
async_run_and_await_all_tasks(main)

output (as expected):

main start
main end
fn

That async_run_and_await_all_tasks function makes Python behave like Node.js: the process exits only when there are no unfinished tasks.

Answered By: grabantot

If you want a clean way to await all running tasks created within some local scope without leaking memory (and while preventing garbage-collection errors), you can maintain a set of running tasks and use task.add_done_callback(...) to remove each task from the set when it finishes. Here is a class that handles this for you:

import asyncio
from asyncio import Task
from typing import Coroutine


class TaskSet:
    def __init__(self):
        self.tasks = set()

    def add(self, coroutine: Coroutine) -> Task:
        task = asyncio.create_task(coroutine)
        self.tasks.add(task)
        task.add_done_callback(lambda _: self.tasks.remove(task))
        return task

    def __await__(self):
        return asyncio.gather(*self.tasks).__await__()

Which can be used like this:

async def my_function():
    await asyncio.sleep(0.5)


async def go():
    tasks = TaskSet()
    for i in range(10):
        tasks.add(my_function())
    await tasks

I noticed some answers suggested using asyncio.gather(*asyncio.all_tasks()), but the issue with that is a potential deadlock: gather() ends up waiting for asyncio.current_task() to complete, which is the very task doing the waiting. Some answers suggested complicated workarounds involving checking coro names or len(asyncio.all_tasks()), but it turns out it’s very simple to do by taking advantage of set operations:

async def main():
    # Create some tasks.
    for _ in range(10):
        asyncio.create_task(asyncio.sleep(10))
    # Wait for all other tasks to finish other than the current task i.e. main().
    await asyncio.gather(*asyncio.all_tasks() - {asyncio.current_task()})

My use case has some main tasks that spawn short-lived tasks. The approach above nicely returns as soon as the main tasks finish (along with some transient tasks), but I wanted to tidy up the remaining tasks too. A time delay wouldn’t work (additional tasks may be created in the meantime), so actively calling .cancel() seemed the right choice.

Code is:

import asyncio

MAX_TASKS = 10
task_maker_count = 0

async def task_maker():
    global task_maker_count
    task_maker_count += 1
    if len(asyncio.all_tasks()) < MAX_TASKS:
        asyncio.create_task(task_maker())
        asyncio.create_task(task_maker())

async def main_task():
    asyncio.create_task(task_maker())
    await asyncio.sleep(2.0)

async def main():
    global task_maker_count
    asyncio.create_task(main_task())
    asyncio.create_task(main_task())

    # Test: wait for the main tasks to finish, then cancel and drain whatever remains

    await asyncio.gather(*asyncio.all_tasks() - {asyncio.current_task()})
    for task in [*asyncio.all_tasks() - {asyncio.current_task()}]:
        task.cancel()
    await asyncio.gather(*asyncio.all_tasks() - {asyncio.current_task()},
                         return_exceptions=True)  # needed for CancelledError
    print(f'{task_maker_count} task_maker tasks created')

if __name__ == '__main__':
    asyncio.run(main())

Result on my computer is:

194672 task_maker tasks created

Not specifically relevant, but bumping MAX_TASKS into the thousands dramatically reduces the number of tasks completed.

Answered By: jwal