Why can’t an asynchronous sleep task and a CPU-bound task proceed concurrently?

Question:

I need to send HTTP requests and do some CPU-intensive work while waiting for the responses. I tried to mock the situation with asyncio.sleep and a CPU-bound loop below:

import asyncio

async def main():
    loop = asyncio.get_event_loop()
    start = loop.time()
    task = asyncio.create_task(asyncio.sleep(1))

    # ------Useless CPU-Bound Task------ #
    for n in range(10 ** 7):
        n **= 7
    # ---------------------------------- #

    print(f"CPU-bound process finished in {loop.time()-start:.2f} seconds.")

    await task
    print(f"Finished in {loop.time()-start:.2f} seconds.")

asyncio.run(main())

Output:

CPU-bound process finished in 2.12 seconds.
Finished in 3.12 seconds.

I expected the sleeping task to make progress during the CPU-bound work, but apparently they ran one after the other. This also makes me worry about the requests I need to send: the CPU-bound work might begin and block them completely, so that they aren’t sent to the server until it finishes.

So the question is: why does this happen, and how can I prevent it?

I’ve also read somewhere that asyncio only switches context upon await calls. Does this have disadvantages in a situation like this and, if so, what are they?

Addendum: Would using threading have any advantages over asyncio in this scenario? I know it’s many questions, but I’m really confused.

Asked By: UpTheIrons


Answers:

Asyncio tasks provide cooperative concurrency rather than preemptive concurrency: a task keeps running until it voluntarily gives up control.

Your sleeper task won’t actually start running until you "yield" control to it, which is usually done with an await call. Since that happens after your main (CPU-intensive) code is finished, there will be an extra second after that before everything is actually done.

An await asyncio.sleep(0) between sleeper task creation and CPU-intensive work will allow the sleeper task to commence. It will then immediately yield back to the main task and they’ll run "concurrently".
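
To see the ordering rather than the timing, here is a minimal sketch that records when each side runs, with and without the await asyncio.sleep(0). The names sleeper and run, the event strings, and the deliberately small workload are illustrative assumptions, not part of the question's code:

```python
import asyncio


async def sleeper(events):
    # Stand-in for the question's sleep task; logs when it starts and ends.
    events.append("sleeper started")
    await asyncio.sleep(0.01)
    events.append("sleeper finished")


async def run(yield_first):
    events = []
    task = asyncio.create_task(sleeper(events))  # Scheduled, not yet running.
    if yield_first:
        await asyncio.sleep(0)                   # Give the sleeper its first turn.
    events.append("cpu work begins")
    sum(n ** 7 for n in range(10 ** 4))          # Small CPU-bound stand-in.
    events.append("cpu work ends")
    await task
    return events


without_yield = asyncio.run(run(yield_first=False))
with_yield = asyncio.run(run(yield_first=True))
print(without_yield)
print(with_yield)
```

Without the yield, the sleeper only starts once the CPU work has ended and the main task reaches await task. With it, "sleeper started" is logged before "cpu work begins", so the sleeper's timer is already counting down while the loop is busy.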

Of course, a CPU-bound async task sort of defeats the purpose of asyncio since it won’t yield to allow other tasks to run in a timely manner. That doesn’t really matter for this sleeper but, if it was a task that had to do thirty things, one per second, that would be a problem.
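
A rough sketch of that problem, under stated assumptions: heartbeat is a hypothetical periodic task that wants to run every 50 ms, and a blocking time.sleep(0.3) stands in for CPU-bound work that never yields. The intervals are arbitrary choices for illustration:

```python
import asyncio
import time


async def heartbeat(stamps):
    # Hypothetical periodic task: record a timestamp roughly every 50 ms.
    while True:
        stamps.append(time.monotonic())
        await asyncio.sleep(0.05)


async def main():
    stamps = []
    task = asyncio.create_task(heartbeat(stamps))
    await asyncio.sleep(0)     # Let the heartbeat record its first beat.
    time.sleep(0.3)            # Blocking stand-in for work that never yields.
    await asyncio.sleep(0.12)  # Now the loop is free to run a few more beats.
    task.cancel()
    gaps = [later - earlier for earlier, later in zip(stamps, stamps[1:])]
    return max(gaps)


worst_gap = asyncio.run(main())
print(f"Longest gap between beats: {worst_gap:.2f} s (target was 0.05 s)")
```

The longest gap comes out at about the length of the blocking section, because no beat can be scheduled while the main task holds the event loop.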

If you need to do anything like that, it’s a good idea to either choose one of the other forty-eight ways of doing concurrency in Python :-), or yield enough in the main task so that other tasks can run. In other words, something like:

import time

# Inside a coroutine (for example, main() from the question):
yield_cycle = 0.1                              # Cycle time.
then = time.monotonic()                        # Base time.
for n in range(10 ** 7):
    n **= 7
    if time.monotonic() - then > yield_cycle:  # Check cycle time.
        await asyncio.sleep(0)                 # Yield if exceeded.
        then = time.monotonic()                # Prep next cycle.

In fact, we have a helper function in our own code base which does exactly this. I can’t give you the actual source code but I think it’s (hopefully) simple enough to recite from memory:

import asyncio
import time

async def play_nice(secs: float, base: float) -> float:
    """Yield periodically in intensive task.
    Initial call can use negative base to yield immediately.
    Args:
        secs: Minimum run time before yield will happen.
        base: Base monotonic time to use for calculations.
    Returns:
        New base time to use.
    """

    if base < 0:
        base = time.monotonic() - secs
    if time.monotonic() - base >= secs:
        await asyncio.sleep(0)
        return time.monotonic()
    return base

# Your code is then:

then = await play_nice(secs=0.1, base=-1)        # Initial yield.
for n in range(10 ** 7):
    n **= 7
    then = await play_nice(secs=0.1, base=then)  # Subsequent ones.
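
As for the threading addendum in the question, one of the "other ways" is to move the CPU-bound code into a plain function and hand it to an executor with loop.run_in_executor, so the event loop itself never runs the heavy loop. This is only a sketch under assumptions: cpu_work, ticker and the iteration count are made-up names and values, and passing None selects the loop's default thread pool.

```python
import asyncio


def cpu_work(limit):
    # Plain (non-async) function so it can run in a worker thread.
    total = 0
    for n in range(limit):
        total += n ** 7
    return total


async def ticker(counter):
    # Hypothetical background task; counts how often the loop lets it run.
    while True:
        await asyncio.sleep(0.01)
        counter["ticks"] += 1


async def main():
    loop = asyncio.get_running_loop()
    counter = {"ticks": 0}
    tick_task = asyncio.create_task(ticker(counter))
    # None selects the loop's default ThreadPoolExecutor.
    result = await loop.run_in_executor(None, cpu_work, 10 ** 6)
    tick_task.cancel()
    return result, counter["ticks"]


result, ticks = asyncio.run(main())
print(f"Ticker ran {ticks} times while the CPU work was in progress.")
```

Because of the GIL, a worker thread keeps the loop responsive but does not make pure-Python number crunching any faster. If real parallelism is needed, a concurrent.futures.ProcessPoolExecutor can be passed as the first argument instead of None, provided the function and its arguments can be pickled.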
Answered By: paxdiablo

The reason is that your CPU-intensive code keeps control of the event loop until it yields. You can force it to yield using asyncio.sleep. As the asyncio documentation puts it:

sleep() always suspends the current task, allowing other tasks to run.

Setting the delay to 0 provides an optimized path to allow other tasks to run. This can be used by long-running functions to avoid blocking the event loop for the full duration of the function call.

import asyncio

async def test_sleep(n):
    await asyncio.sleep(n)

async def main():
    loop = asyncio.get_event_loop()
    start = loop.time()
    task = asyncio.create_task(asyncio.sleep(1))
    await asyncio.sleep(0)

    # ------Useless CPU-Bound Task------ #
    for n in range(10 ** 7):
        n **= 7
    # ---------------------------------- #

    print(f'CPU-bound process finished in {loop.time()-start:.2f} seconds.')

    await task
    print(f"Finished in {loop.time()-start:.2f} seconds.")

asyncio.run(main())  # In a notebook or asyncio REPL, use `await main()` instead.

Output:

CPU-bound process finished in 4.21 seconds.
Finished in 4.21 seconds.
Answered By: Alex Bochkarev