How to properly create and run concurrent tasks using Python's asyncio module?
Question:
I am trying to properly understand and implement two concurrently running Task objects using Python 3's relatively new asyncio module.
In a nutshell, asyncio seems designed to handle asynchronous processes and concurrent Task execution over an event loop. It promotes the use of await (applied in async functions) as a callback-free way to wait for and use a result without blocking the event loop. (Futures and callbacks are still a viable alternative.)
It also provides the asyncio.Task() class, a specialized subclass of Future designed to wrap coroutines, preferably created via the asyncio.ensure_future() function. The intended use of asyncio Tasks is to allow independently running tasks to run 'concurrently' with other tasks within the same event loop. My understanding is that Tasks are connected to the event loop, which then automatically keeps driving the coroutine between await statements.
I like the idea of being able to use concurrent Tasks without needing one of the Executor classes, but I haven't found much elaboration on implementation.
This is how I’m currently doing it:
import asyncio

print('running async test')

async def say_boo():
    i = 0
    while True:
        await asyncio.sleep(0)
        print('...boo {0}'.format(i))
        i += 1

async def say_baa():
    i = 0
    while True:
        await asyncio.sleep(0)
        print('...baa {0}'.format(i))
        i += 1

# wrap in Task object
# -> automatically attaches to event loop and executes
boo = asyncio.ensure_future(say_boo())
baa = asyncio.ensure_future(say_baa())

loop = asyncio.get_event_loop()
loop.run_forever()
While trying to concurrently run two looping Tasks, I've noticed that unless a Task has an internal await expression, it gets stuck in its while loop, effectively blocking other tasks from running (much like a normal while loop). However, as soon as the Tasks have to (a)wait, they seem to run concurrently without an issue.
Thus, the await statements seem to provide the event loop with a foothold for switching back and forth between the tasks, giving the effect of concurrency.
Example output with an internal await:
running async test
...boo 0
...baa 0
...boo 1
...baa 1
...boo 2
...baa 2
Example output without an internal await:
...boo 0
...boo 1
...boo 2
...boo 3
...boo 4
Questions
Does this implementation pass for a 'proper' example of concurrent looping Tasks in asyncio?
Is it correct that the only way this works is for a Task to provide a blocking point (an await expression) in order for the event loop to juggle multiple tasks?
Edit
2022 UPDATE: Please note that the asyncio API has changed substantially since this question was asked. See the newly accepted answer, which shows the correct use of the API as of Python 3.10. I still recommend the answer from @dano for a broader understanding of how this works under the hood.
Answers:
Yes, any coroutine that's running inside your event loop will block other coroutines and tasks from running, unless it:
- Calls another coroutine using yield from or await (if using Python 3.5+).
- Returns.
This is because asyncio is single-threaded; the only way for the event loop to run is for no other coroutine to be actively executing. Using yield from/await suspends the coroutine temporarily, giving the event loop a chance to work.
Your example code is fine, but in many cases you probably wouldn't want long-running code that isn't doing asynchronous I/O running inside the event loop in the first place. In those cases, it often makes more sense to use loop.run_in_executor to run the code in a background thread or process. ProcessPoolExecutor is the better choice if your task is CPU-bound; ThreadPoolExecutor is the one to use if you need to do some I/O that isn't asyncio-friendly.
Your two loops, for example, are completely CPU-bound and don't share any state, so the best performance would come from using ProcessPoolExecutor to run each loop in parallel across CPUs:
import asyncio
from concurrent.futures import ProcessPoolExecutor

print('running async test')

def say_boo():
    i = 0
    while True:
        print('...boo {0}'.format(i))
        i += 1

def say_baa():
    i = 0
    while True:
        print('...baa {0}'.format(i))
        i += 1

if __name__ == "__main__":
    executor = ProcessPoolExecutor(2)
    loop = asyncio.new_event_loop()
    boo = loop.run_in_executor(executor, say_boo)
    baa = loop.run_in_executor(executor, say_baa)
    loop.run_forever()
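For the I/O-bound case, a ThreadPoolExecutor variant might look like the sketch below (blocking_io is a hypothetical stand-in for any blocking call that isn't asyncio-aware, such as a classic database driver):

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

def blocking_io(label):
    # Stand-in for a blocking call that would otherwise stall the event loop
    time.sleep(0.1)
    return '{0} done'.format(label)

async def main():
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Both calls block their worker thread, but the event loop stays free
        return await asyncio.gather(
            loop.run_in_executor(pool, blocking_io, 'first'),
            loop.run_in_executor(pool, blocking_io, 'second'),
        )

results = asyncio.run(main())
print(results)
```

Threads are the right fit here because the work is waiting, not computing, so the GIL is not a bottleneck.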
You don't necessarily need a yield from x to give control over to the event loop.
In your example, I think the proper way would be to do a yield None or, equivalently, a simple yield, rather than a yield from asyncio.sleep(0.001):
import asyncio

@asyncio.coroutine
def say_boo():
    i = 0
    while True:
        yield None
        print("...boo {0}".format(i))
        i += 1

@asyncio.coroutine
def say_baa():
    i = 0
    while True:
        yield
        print("...baa {0}".format(i))
        i += 1

# asyncio.async() was the original spelling; since Python 3.7 `async` is a
# keyword, so asyncio.ensure_future() must be used instead
boo_task = asyncio.ensure_future(say_boo())
baa_task = asyncio.ensure_future(say_baa())

loop = asyncio.get_event_loop()
loop.run_forever()
Coroutines are just plain old Python generators. Internally, the asyncio event loop keeps a record of these generators and calls gen.send() on each of them, one by one, in a never-ending loop. Whenever you yield, the call to gen.send() completes and the loop can move on. (I'm simplifying; take a look around https://hg.python.org/cpython/file/3.4/Lib/asyncio/tasks.py#l265 for the actual code.)
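That gen.send() dance can be sketched with a toy round-robin scheduler. Everything here (say, run_round_robin, the log list) is illustrative and much simpler than asyncio's real machinery:

```python
from collections import deque

log = []

def say(name, n):
    # A plain generator standing in for a coroutine: each yield hands
    # control back to the toy "event loop" below.
    for i in range(n):
        yield
        log.append('...{0} {1}'.format(name, i))

def run_round_robin(*gens):
    # Minimal scheduler: send() into each generator in turn; a generator
    # that returns (raising StopIteration) is dropped from the ready queue.
    ready = deque(gens)
    while ready:
        gen = ready.popleft()
        try:
            gen.send(None)  # resume the generator until its next yield
        except StopIteration:
            continue
        ready.append(gen)  # re-schedule at the back of the queue

run_round_robin(say('boo', 3), say('baa', 3))
print('\n'.join(log))
```

Each send() runs one generator up to its next yield, then the scheduler moves on, which is exactly the cooperative interleaving seen in the question's output.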
That said, I would still go the run_in_executor route if you need to do CPU-intensive computation without sharing data.
asyncio.get_event_loop is deprecated as of Python 3.10, and asyncio.ensure_future is superseded by the higher-level asyncio.create_task (available since Python 3.7).
You can run the two coroutines say_boo and say_baa concurrently through asyncio.create_task:
async def main():
    boo = asyncio.create_task(say_boo())
    baa = asyncio.create_task(say_baa())
    await boo
    await baa

asyncio.run(main())
You can also use asyncio.gather:
async def main():
    await asyncio.gather(say_boo(), say_baa())

asyncio.run(main())
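The snippets above assume the say_boo/say_baa coroutines from the question. A fully self-contained version (with the infinite loops bounded to three iterations so it terminates, and a lines list added for illustration) might look like:

```python
import asyncio

lines = []

async def say_boo(n=3):
    for i in range(n):
        await asyncio.sleep(0)  # suspension point: lets the other task run
        lines.append('...boo {0}'.format(i))

async def say_baa(n=3):
    for i in range(n):
        await asyncio.sleep(0)
        lines.append('...baa {0}'.format(i))

async def main():
    # create_task schedules each coroutine on the running loop right away
    boo = asyncio.create_task(say_boo())
    baa = asyncio.create_task(say_baa())
    await boo
    await baa

asyncio.run(main())
print('\n'.join(lines))
```

asyncio.run() creates the event loop, runs main() to completion, and closes the loop, replacing the get_event_loop()/run_forever() pattern from the question.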