asyncio.sleep(0) does not yield control to the event loop

Question:

I have a simple async setup with two coroutines: light_job and heavy_job. light_job suspends partway through and heavy_job starts. I want heavy_job to yield control partway through so that light_job can finish, but asyncio.sleep(0) is not working as I expect.

This is the setup:

import asyncio
import time

loop = asyncio.get_event_loop()


async def light_job():
    print("hello ")
    print(time.time())
    await asyncio.sleep(1)
    print(time.time())
    print("world!")


async def heavy_job():
    print("heavy start")
    time.sleep(3)
    print("heavy halt started")
    await asyncio.sleep(0)
    print("heavy halt ended")
    time.sleep(3)
    print("heavy done")

loop.run_until_complete(asyncio.gather(
    light_job(),
    heavy_job()
))

If I run this code, light_job does not continue until after heavy_job is done. This is the output:

hello 
1668793123.159075
heavy start
heavy halt started
heavy halt ended
heavy done
1668793129.1706061
world!

But if I change asyncio.sleep(0) to asyncio.sleep(0.0001), the code works as expected:

hello 
1668793379.599066
heavy start
heavy halt started
1668793382.605899
world!
heavy halt ended
heavy done

Based on the documentation and related threads, I expect asyncio.sleep(0) to behave exactly like asyncio.sleep(0.0001). What is off here?

Asked By: Saeed Mofidi


Answers:

Call asyncio.sleep(0) 3 times:

import asyncio
import time


async def light_job():
    print("hello ")
    print(time.time())
    await asyncio.sleep(1)
    print(time.time())
    print("world!")


async def heavy_job():
    print("heavy start")
    time.sleep(3)
    print("heavy halt started")
    for _ in range(3):
        await asyncio.sleep(0)
    print("heavy halt ended")
    time.sleep(3)
    print("heavy done")


async def test():
    await asyncio.gather(
        light_job(),
        heavy_job()
    )

asyncio.run(test())

This results in:

hello 
1668844526.157173
heavy start
heavy halt started
1668844529.1575627
world!
heavy halt ended
heavy done

Looking at "asyncio/base_events.py", "_run_once" first moves expired timers onto the ready queue, then runs every callback that was queued at that point. asyncio.sleep(0) can only skip one iteration of the event loop. Multiple sleeps are required because asyncio.sleep(1) schedules a timer that resolves a future, and resolving that future takes one extra iteration before light_job gets control back: the timer callback only adds light_job back to the queue, and asyncio happens to run newly queued jobs last.
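
Rather than hard-coding the number of calls, one hedged alternative is to keep yielding until light_job reports that it has finished. The sketch below is only an illustration: the light_done flag, the yields counter and the shortened delays are assumptions added here and are not part of the original script. If the loop behaves as described above, the counter would be expected to end up at 3, but that number is an implementation detail and should not be relied on.

```python
import asyncio
import time

light_done = False  # hypothetical flag, set by light_job when it finishes
yields = 0          # how many times heavy_job had to yield


async def light_job():
    global light_done
    await asyncio.sleep(0.05)
    light_done = True


async def heavy_job():
    global yields
    time.sleep(0.2)  # blocking work; light_job's timer expires in the meantime
    # Yield one loop iteration at a time until light_job has actually run.
    while not light_done:
        await asyncio.sleep(0)
        yields += 1
    time.sleep(0.2)  # second blocking phase starts only after light_job is done


async def main():
    await asyncio.gather(light_job(), heavy_job())


asyncio.run(main())
print("sleep(0) calls needed:", yields)
```

Note that this loop spins without pausing if light_job's timer has not expired yet, so it only makes sense when the other task is known to be nearly ready.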

For a clearer picture, it is possible to add more print statements:

import asyncio
import time


async def light_job():
    print("hello ")
    print(time.time())
    await asyncio.sleep(1)
    print(time.time())
    print("world!")


async def heavy_job():
    print("heavy start")
    time.sleep(3)
    print("heavy halt started")
    # Sleep to yield to the event loop. light_job isn't detected as ready so this iteration of the loop will finish
    await asyncio.sleep(0)

    print("after 1 sleep")
    # We are still at the front of the ready queue. Yield so that the 1 second timer in light_job runs.
    # The timer callback fires because its deadline has passed, then puts light_job back onto the queue.
    await asyncio.sleep(0)

    # Again the current Python implementation puts us in front. Yield so that the light_job runs
    print("after 2 sleeps")
    await asyncio.sleep(0)

    print("heavy halt ended")
    time.sleep(3)
    print("heavy done")


async def test():
    await asyncio.gather(
        light_job(),
        heavy_job()
    )

asyncio.run(test())

Then add logging breakpoints in "def _run_once(self):" of "asyncio/base_events.py" (the line numbers below are from the Python 3.11 copy that appears in the trace). Add one that prints "loop start" on line 1842, at the start of the method ("sched_count ="). Add another that prints "loop end" on line 1910, at the end ("handle = None"). Finally, add one on line 1897 ("if self._debug:"), just before each handle is run, that evaluates and prints "_format_handle(handle)". This reveals the sequence of events:

loop start
<Task pending name='Task-1' coro=<test() running at /home/home/PycharmProjects/sandbox/notsync.py:34> cb=[_run_until_complete_cb() at /usr/lib/python3.11/asyncio/base_events.py:180]>
loop end
loop start
<Task pending name='Task-2' coro=<light_job() running at /home/home/PycharmProjects/sandbox/notsync.py:5> cb=[gather.<locals>._done_callback() at /usr/lib/python3.11/asyncio/tasks.py:759]>
hello 
1668844827.5052986
<Task pending name='Task-3' coro=<heavy_job() running at /home/home/PycharmProjects/sandbox/notsync.py:13> cb=[gather.<locals>._done_callback() at /usr/lib/python3.11/asyncio/tasks.py:759]>
heavy start
heavy halt started
loop end
loop start
<Task pending name='Task-3' coro=<heavy_job() running at /home/home/PycharmProjects/sandbox/notsync.py:18> cb=[gather.<locals>._done_callback() at /usr/lib/python3.11/asyncio/tasks.py:759]>
after 1 sleep
<TimerHandle when=37442.097934711 _set_result_unless_cancelled(<Future pendi...ask_wakeup()]>, None) at /usr/lib/python3.11/asyncio/futures.py:317>
loop end
loop start
<Task pending name='Task-3' coro=<heavy_job() running at /home/home/PycharmProjects/sandbox/notsync.py:23> cb=[gather.<locals>._done_callback() at /usr/lib/python3.11/asyncio/tasks.py:759]>
after 2 sleeps
<Task pending name='Task-2' coro=<light_job() running at /home/home/PycharmProjects/sandbox/notsync.py:8> wait_for=<Future finished result=None> cb=[gather.<locals>._done_callback() at /usr/lib/python3.11/asyncio/tasks.py:759]>
1668844830.9250844
world!
loop end
loop start
<Task pending name='Task-3' coro=<heavy_job() running at /home/home/PycharmProjects/sandbox/notsync.py:27> cb=[gather.<locals>._done_callback() at /usr/lib/python3.11/asyncio/tasks.py:759]>
heavy halt ended
heavy done
<Handle gather.<locals>._done_callback(<Task finishe...> result=None>) at /usr/lib/python3.11/asyncio/tasks.py:759>
loop end
loop start
<Handle gather.<locals>._done_callback(<Task finishe...> result=None>) at /usr/lib/python3.11/asyncio/tasks.py:759>
loop end
loop start
<Task pending name='Task-1' coro=<test() running at /home/home/PycharmProjects/sandbox/notsync.py:35> wait_for=<_GatheringFuture finished result=[None, None]> cb=[_run_until_complete_cb() at /usr/lib/python3.11/asyncio/base_events.py:180]>
loop end
loop start
<Handle _run_until_complete_cb(<Task finishe...> result=None>) at /usr/lib/python3.11/asyncio/base_events.py:180>
loop end
loop start
<Task pending name='Task-4' coro=<BaseEventLoop.shutdown_asyncgens() running at /usr/lib/python3.11/asyncio/base_events.py:539> cb=[_run_until_complete_cb() at /usr/lib/python3.11/asyncio/base_events.py:180]>
loop end
loop start
<Handle _run_until_complete_cb(<Task finishe...> result=None>) at /usr/lib/python3.11/asyncio/base_events.py:180>
loop end
loop start
<Task pending name='Task-5' coro=<BaseEventLoop.shutdown_default_executor() running at /usr/lib/python3.11/asyncio/base_events.py:564> cb=[_run_until_complete_cb() at /usr/lib/python3.11/asyncio/base_events.py:180]>
loop end
loop start
<Handle _run_until_complete_cb(<Task finishe...> result=None>) at /usr/lib/python3.11/asyncio/base_events.py:180>
loop end
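
If a debugger is not at hand, a rough equivalent of the "loop start" / "loop end" breakpoints can be sketched by shadowing the loop's _run_once method on the running loop instance. This is an assumption-heavy illustration: _run_once is a private CPython method that may change or be absent on other event loop implementations, and the trace list, the traced_run_once wrapper and the shortened delays are names and values made up for this example.

```python
import asyncio
import time

trace = []  # collected markers, printed at the end


async def light_job():
    trace.append("light: before sleep")
    await asyncio.sleep(0.05)
    trace.append("light: after sleep")


async def heavy_job():
    time.sleep(0.2)  # block long enough for light_job's timer to expire
    for n in range(1, 4):
        await asyncio.sleep(0)
        trace.append(f"heavy: after {n} sleep(s)")


async def main():
    loop = asyncio.get_running_loop()
    original_run_once = loop._run_once  # private CPython method, may change

    def traced_run_once():
        trace.append("loop start")
        try:
            original_run_once()
        finally:
            trace.append("loop end")

    loop._run_once = traced_run_once  # shadow the method on this instance only
    try:
        await asyncio.gather(light_job(), heavy_job())
    finally:
        del loop._run_once  # remove the shadow so the original method is used again


asyncio.run(main())
print("\n".join(trace))
```

Unlike the breakpoint on line 1897, this sketch does not print each handle, so it shows only which task steps fall into which loop iteration.
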
Answered By: Daniel T

I think this subject needs some more discussion. I intend this post as an appendix to Daniel T’s excellent and very clever answer – that’s a fine piece of work. But Dan Getz’s comment made me think that some more detail would be helpful.

Dan suggests that there is no general way to yield to another task. This is correct because there is no guarantee that any other Task is ready to run, nor is there any guarantee of the execution order of the various Tasks. The example program fails to meet expectations because of details in the event loop implementation, which I discuss below.

There are, however, tools for unambiguously synchronizing work between different Tasks. It’s probably a bad idea to rely on time intervals in asyncio.sleep() for this purpose. Consider the following program, which uses an asyncio.Event to force light_job() to finish before heavy_job() can enter its second time.sleep delay. This will always work because the program logic is explicit:

import asyncio
import time

event = asyncio.Event()

async def light_job():
    print("hello ")
    print(time.time())
    await asyncio.sleep(1)
    print(time.time())
    print("world!")
    event.set()


async def heavy_job():
    print("heavy start")
    time.sleep(3)
    print("heavy halt started")
    # await asyncio.sleep(0)
    await event.wait()
    print("heavy halt ended")
    time.sleep(3)
    print("heavy done")
    
async def main():
    await asyncio.gather(light_job(), heavy_job())

asyncio.run(main())

Even simpler is this approach, which avoids the use of Event and even of gather:

import asyncio
import time

async def light_job():
    print("hello ")
    print(time.time())
    await asyncio.sleep(1)
    print(time.time())
    print("world!")

async def heavy_job():
    light = asyncio.create_task(light_job())
    print("heavy start")
    time.sleep(3)
    print("heavy halt started")
    # await asyncio.sleep(0)
    await light
    print("heavy halt ended")
    time.sleep(3)
    print("heavy done")
    
async def main():
    await heavy_job()

asyncio.run(main())

As for why the original script failed, the explanation can be found in the event loop implementation. An event loop keeps track of two things: a list of "ready" items, representing Tasks that are able to execute right now; and a list of "scheduled" items, representing Tasks that are waiting for some time interval to expire.

Every time the event loop goes through a cycle, its first step is to examine the list of scheduled items and see if any are ready to proceed. It appends any of those items to the "ready" list. Then it executes this simple loop to run all the ready Tasks (I have omitted some diagnostic code; this is from the Python 3.10 standard library module base_events.py). Here, _ready is a deque. The items in the queue all have a _run method that causes the Task to take one step forward, or in other words, to resume the Task at the point where it previously was suspended (typically an await expression).

    ntodo = len(self._ready)
    for i in range(ntodo):
        handle = self._ready.popleft()
        if handle._cancelled:
            continue
        else:
            handle._run()
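
The cycle described above can be modelled with a toy loop. This is a simplified sketch and not asyncio's actual code: ToyLoop, its simulated clock now, and the callbacks first and second are all hypothetical names invented for the illustration. It shows the two properties that matter here: the number of items to run is fixed at the start of a cycle, and expired timers are appended behind whatever is already in the ready queue.

```python
import heapq
from collections import deque


class ToyLoop:
    """Toy model of one event loop cycle (not the real asyncio code)."""

    def __init__(self):
        self.ready = deque()  # callbacks that can run right now
        self.scheduled = []   # heap of (when, seq, callback) timers
        self.now = 0.0        # simulated clock, advanced by hand
        self._seq = 0         # tie-breaker so callbacks are never compared

    def call_soon(self, callback):
        self.ready.append(callback)

    def call_later(self, delay, callback):
        heapq.heappush(self.scheduled, (self.now + delay, self._seq, callback))
        self._seq += 1

    def run_once(self):
        # Step 1: move expired timers to the END of the ready queue.
        while self.scheduled and self.scheduled[0][0] <= self.now:
            _, _, callback = heapq.heappop(self.scheduled)
            self.ready.append(callback)
        # Step 2: run only the items that are queued at this moment.
        ntodo = len(self.ready)
        for _ in range(ntodo):
            self.ready.popleft()()


log = []
loop = ToyLoop()


def first():
    log.append("first")
    loop.call_soon(second)  # queued during the cycle, so it runs in the next one


def second():
    log.append("second")


loop.call_later(1.0, lambda: log.append("timer"))
loop.call_soon(first)

loop.run_once()
after_first_cycle = list(log)
print(after_first_cycle)  # ['first']

loop.now = 1.0  # pretend one second has passed
loop.run_once()
print(log)  # ['first', 'second', 'timer']
```

In the second cycle the expired timer runs after second, because second was already waiting in the ready queue when the timer was moved over.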

It’s also the case that await asyncio.sleep(0) is implemented differently from await asyncio.sleep(x) where x > 0. In the first case, the await expression yields a value of None, and the Task object simply appends an item to the "ready" list. In the second case, sleep creates a Future and calls loop.call_later, which appends an item to the "scheduled" list; the Task then waits on that Future. Here is the implementation of asyncio.sleep in tasks.py:

@types.coroutine
def __sleep0():
    """Skip one event loop run cycle.

    This is a private helper for 'asyncio.sleep()', used
    when the 'delay' is set to 0.  It uses a bare 'yield'
    expression (which Task.__step knows how to handle)
    instead of creating a Future object.
    """
    yield


async def sleep(delay, result=None):
    """Coroutine that completes after a given time (in seconds)."""
    if delay <= 0:
        await __sleep0()
        return result

    loop = events.get_running_loop()
    future = loop.create_future()
    h = loop.call_later(delay,
                        futures._set_result_unless_cancelled,
                        future, result)
    try:
        return await future
    finally:
        h.cancel()
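
The zero-delay branch can be observed without an event loop by stepping the coroutine by hand, which is roughly what a Task does on each step. This is a small sketch for illustration; the variable names and the "done" result value are arbitrary choices made here.

```python
import asyncio

coro = asyncio.sleep(0, result="done")

# First step: runs up to the bare `yield` in the private helper.
yielded = coro.send(None)
print("yielded:", yielded)  # yielded: None

# Second step: the coroutine finishes and returns `result`.
try:
    coro.send(None)
except StopIteration as stop:
    returned = stop.value
print("returned:", returned)  # returned: done
```

No Future and no timer are involved on this path, which is why the Task can put itself straight back on the "ready" list.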

So in the example script in the original post, the event loop starts with two items in its "ready" list: [light_job, heavy_job]. The scheduled list is empty. light_job starts and hits await asyncio.sleep(1), so an item is appended to the "scheduled" list that represents this time delay. Now heavy_job runs for three seconds and hits await asyncio.sleep(0), so an item is appended to the "ready" list which indicates that this Task is to proceed without delay. That’s the end of one full cycle of the event loop. The cycle ends even though the ready list isn’t empty at that point, because the loop only runs the items that were in the list when the cycle began (ntodo above), and heavy_job was appended after that.

In the next cycle of the event loop, the ready list has one item, which was placed there on the previous cycle: [heavy_job]. The scheduled list also has one item: [light_job]. The event loop examines the scheduled list and sees that light_job is now ready, so it appends light_job to the ready list, which now looks like this: [heavy_job, light_job]. (Strictly speaking, as the trace in Daniel T’s answer shows, the scheduled item is a timer callback that queues light_job when it runs, which costs light_job one more cycle.) So the code logic has essentially caused the order of the Tasks to get switched. Result: heavy_job runs twice in a row, once at the end of the first cycle and once at the beginning of the second.

This also explains what happened when you replaced await asyncio.sleep(0) with await asyncio.sleep(0.0001). In that case, the Task got appended to the scheduled list rather than the ready list. Then ready=[] and scheduled=[light_job, heavy_job]. On the next cycle of the loop both Tasks are ready, but the order will once again be [light_job, heavy_job].
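
The two cases can be compared side by side with a shortened version of the script. This is a sketch under assumptions: the order list, the run_with helper and the reduced delays are introduced here for illustration, and the ordering for the zero-delay case depends on the event loop implementation. Following the explanation above, the sleep(0) run would be expected to record heavy_job resuming first, and the sleep(0.0001) run to record light_job resuming first.

```python
import asyncio
import time


async def light_job(order):
    await asyncio.sleep(0.05)
    order.append("light resumed")


async def heavy_job(order, delay):
    time.sleep(0.2)             # block past light_job's timer
    await asyncio.sleep(delay)  # the yield under test
    order.append("heavy resumed")


async def run_with(delay):
    order = []
    await asyncio.gather(light_job(order), heavy_job(order, delay))
    return order


order_zero = asyncio.run(run_with(0))
order_tiny = asyncio.run(run_with(0.0001))
print("sleep(0):     ", order_zero)
print("sleep(0.0001):", order_tiny)
```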

This machinery is invisible to client code, as it should be, but it has a weird consequence in this particular script. Whether or not this should be called a "bug" is a matter of debate. I assume there are good performance reasons why asyncio.sleep(0) is implemented differently from asyncio.sleep(nonzero).

Answered By: Paul Cornelius