Executing an awaitable / async function in Python RQ

Question:

My jobs are all a series of requests that need to be made per object. I.e., it's a User with several data points (potentially hundreds) that need to be added to that user with requests. I had originally written those requests to run synchronously, but it was blocking and slow. I was sending each User job to Python RQ and had 10 workers going through the Users sent down the queue: 1 worker, 1 user, blocking requests.

I’ve rewritten my User job to use aiohttp instead of python requests, and it’s significantly faster. The Python RQ documentation says that ‘Any Python function call can be put on an RQ queue.’ but I can’t figure out how to send my async function down the queue.


async def get_prices(calls: List[dict]) -> List[dict]:
    async with aiohttp.ClientSession() as session:
        for price in prices.items():
            price_type, date = price
            price = await pg.get_price(
                session=session, lookup_date=date
            )
        do_some_other_stuff()
        await session.close()


from core.extensions import test_queue
from prices import get_prices
job = test_queue.enqueue(get_prices, kwargs={"username":'username'})

The problem is that get_prices is never awaited; it just remains a coroutine object. How can I await my function on the queue?

Asked By: phil0s0pher


Answers:

Since python-rq won’t support asyncio directly, you can use a synchronous function that calls asyncio.run instead.

async def _get_prices(calls: List[dict]) -> List[dict]:
    # ...

def get_prices(*args, **kwargs):
    # Run the coroutine to completion and return its result to the caller.
    return asyncio.run(_get_prices(*args, **kwargs))

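Note, however, that asyncio.run only works if no other event loop is running in the current thread. If you expect an asyncio loop to already be running, use loop.create_task instead. Keep in mind that create_task only schedules the coroutine on that loop and returns immediately, without waiting for the result.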

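def get_prices(*args, **kwargs):
    # Raises RuntimeError if no event loop is running in this thread.
    loop = asyncio.get_running_loop()
    coro = _get_prices(*args, **kwargs)
    # Schedule the coroutine; the returned Task can be awaited by the caller.
    return loop.create_task(coro)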

Then when python-rq calls get_prices it will cause the async function to be executed.
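To make the wrapper pattern concrete, here is a minimal, self-contained sketch. It assumes a hypothetical fetch_price coroutine that stands in for the real aiohttp call (simulated here with asyncio.sleep), and it uses asyncio.gather so the lookups run concurrently rather than one after another. The name fetch_price and the sample dates are illustrative only and not part of the original code.

```python
import asyncio
from typing import List


async def fetch_price(lookup_date: str) -> dict:
    # Hypothetical stand-in for the real aiohttp request.
    await asyncio.sleep(0.01)
    return {"date": lookup_date, "price": len(lookup_date)}


async def _get_prices(dates: List[str]) -> List[dict]:
    # gather() schedules all lookups at once and preserves input order.
    return await asyncio.gather(*(fetch_price(d) for d in dates))


def get_prices(dates: List[str]) -> List[dict]:
    # Plain synchronous entry point: this is the function to enqueue.
    return asyncio.run(_get_prices(dates))


if __name__ == "__main__":
    print(get_prices(["2021-01-01", "2021-01-02"]))
```

With RQ, the synchronous get_prices would then be enqueued like any other function, for example test_queue.enqueue(get_prices, dates).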

Another option would be to not use asyncio for the requests at all and instead use something that works with synchronous functions, such as grequests or threads.
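As a rough illustration of the thread-based alternative, the sketch below uses the standard library's ThreadPoolExecutor. The fetch_price function is a hypothetical placeholder for a blocking HTTP call (for instance one made with requests), and max_workers=10 is an arbitrary choice, not a recommendation.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def fetch_price(lookup_date: str) -> dict:
    # Hypothetical blocking call; real code might use requests.get(...) here.
    return {"date": lookup_date, "price": len(lookup_date)}


def get_prices(dates: List[str], max_workers: int = 10) -> List[dict]:
    # map() runs fetch_price in worker threads and yields results in input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch_price, dates))
```

Because get_prices is an ordinary synchronous function here, it can be enqueued without any asyncio wrapper.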

Answered By: sytech

You might consider using arq.

Created by the maintainer of Pydantic, it is not the same thing as rq, but it was inspired by it.

Besides, it’s still Redis and queues (with asyncio now).

From the docs:

Job queues and RPC in python with asyncio and redis.

arq was conceived as a simple, modern and performant successor to rq.

Simple usage:

import asyncio
from aiohttp import ClientSession
from arq import create_pool
from arq.connections import RedisSettings

async def download_content(ctx, url):
    session: ClientSession = ctx['session']
    async with session.get(url) as response:
        content = await response.text()
        print(f'{url}: {content:.80}...')
    return len(content)

async def startup(ctx):
    ctx['session'] = ClientSession()

async def shutdown(ctx):
    await ctx['session'].close()

async def main():
    redis = await create_pool(RedisSettings())
    for url in ('https://facebook.com', 'https://microsoft.com', 'https://github.com'):
        await redis.enqueue_job('download_content', url)

# WorkerSettings defines the settings to use when creating the worker;
# it's used by the arq CLI.
# For a list of available settings, see https://arq-docs.helpmanual.io/#arq.worker.Worker
class WorkerSettings:
    functions = [download_content]
    on_startup = startup
    on_shutdown = shutdown

if __name__ == '__main__':
    asyncio.run(main())

Answered By: Ramon Dias

Following up on @sytech’s answer: what he suggested is now supported in RQ after the introduction of this PR: https://github.com/rq/rq/pull/1405
You don’t need to do anything extra, as long as your job function is an async coroutine (async def get_prices).

Note, however, that this doesn’t make the worker itself asynchronous; it only means the worker can run job functions that are coroutines. As expected, the worker blocks until the coroutine is done and does nothing else in the meantime. Any concurrency happens inside the coroutine itself.
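The blocking behaviour can be illustrated with a rough sketch. This is not RQ's actual implementation; run_job is a hypothetical helper that only mimics the idea: call the job function and, if it returned a coroutine, drive it to completion on an event loop before moving on to anything else.

```python
import asyncio
import inspect


def run_job(func, *args, **kwargs):
    # Hypothetical helper, not part of RQ.
    result = func(*args, **kwargs)
    if inspect.iscoroutine(result):
        # Blocks the caller until the coroutine has finished.
        return asyncio.run(result)
    return result


async def async_job(x: int) -> int:
    await asyncio.sleep(0.01)
    return x * 2


def sync_job(x: int) -> int:
    return x + 1
```

Both run_job(async_job, 2) and run_job(sync_job, 2) return only after the job has finished, which is why one worker still handles one job at a time.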

Answered By: Gera Zenobi