Running an async background task in Tornado

Question:

Reading the Tornado documentation, it’s very clear how to call an async function to return a response:

class GenAsyncHandler(RequestHandler):
    @gen.coroutine
    def get(self):
        http_client = AsyncHTTPClient()
        response = yield http_client.fetch("http://example.com")
        do_something_with_response(response)
        self.render("template.html")

What’s missing is how to make an asynchronous call to a background task that has nothing to do with the current request:

class GenAsyncHandler(RequestHandler):
    @gen.coroutine
    def _background_task():
        pass  # do lots of background stuff

    @gen.coroutine
    def get(self):
        _dont_care = yield self._background_task()
        self.render("template.html")

This code would be expected to work, except that it runs synchronously and the request waits on it until it’s finished.

What is the right way to asynchronously call this task, while immediately returning the current request?

Asked By: Yuval Adam


Answers:

Simply do:

self._background_task()

The _background_task coroutine returns a Future which is unresolved until the coroutine completes. If you don’t yield the Future, and instead simply execute the next line immediately, then get() doesn’t wait for _background_task to finish.

An interesting detail is that, until _background_task finishes, it maintains a reference to self. (Don’t forget to add self as a parameter, by the way.) Your RequestHandler won’t be garbage collected until after _background_task completes.
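The Future behaviour described above can be sketched with the standard library alone. This is a minimal illustration, not Tornado's own API: it assumes Tornado 5 or later on Python 3, where the IOLoop runs on the asyncio event loop, and the names background_task and handler are hypothetical. One difference worth noting: a native async def coroutine does not start when called, so it has to be scheduled explicitly (here with asyncio.ensure_future), whereas a @gen.coroutine function starts running as soon as it is called.

```python
import asyncio


async def background_task():
    # Stand-in for real background work.
    await asyncio.sleep(0.05)
    return "finished"


async def handler():
    # Schedule the task without awaiting it; the handler carries on immediately.
    task = asyncio.ensure_future(background_task())
    # The Future is still unresolved at the moment the handler returns.
    return task, task.done()


async def main():
    task, done_at_return = await handler()
    # Awaited here only so the demo can observe the task completing.
    result = await task
    return done_at_return, task.done(), result


done_at_return, done_after_wait, result = asyncio.run(main())
```

The handler returns while the task's Future is still pending; the Future only resolves once the background coroutine has run to completion.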

Update: Since Tornado 4.0 (July 2014), the below functionality is available in the IOLoop.spawn_callback method.

Unfortunately it’s kind of tricky. You need to both detach the background task from the current request (so that a failure in the background task doesn’t result in a random exception thrown into the request) and ensure that something is listening to the background task’s result (to log its errors if nothing else). This means something like this:

from tornado.ioloop import IOLoop
from tornado.stack_context import run_in_stack_context, NullContext
IOLoop.current().add_future(run_in_stack_context(NullContext(), self._background_task),
                            lambda f: f.result())

Something like this will probably be added to tornado itself in the future.
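The second requirement above (something must listen to the background task's result so that its errors are at least logged) can be sketched as follows. This is an illustration using the standard asyncio module, assuming Tornado 5 or later where the IOLoop runs on the asyncio event loop; it is not the code Tornado uses internally, and report_failure and flaky_task are hypothetical names.

```python
import asyncio
import logging

log = logging.getLogger("background")
failures = []  # kept only so the demo can inspect what was reported


def report_failure(task):
    # Done-callback: look at the outcome so a failure is never silently dropped.
    if task.cancelled():
        return
    exc = task.exception()
    if exc is not None:
        failures.append(exc)
        log.error("background task failed: %r", exc)


async def flaky_task():
    await asyncio.sleep(0)
    raise ValueError("boom")


async def main():
    # The "request" schedules the task and attaches the listener, then moves on.
    task = asyncio.ensure_future(flaky_task())
    task.add_done_callback(report_failure)
    await asyncio.sleep(0.05)  # give the background task time to fail
    return "response sent"


response = asyncio.run(main())
```

The exception ends up in the log instead of propagating into the code that scheduled the task, which is the separation the answer asks for.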

Answered By: Ben Darnell

I recommend using toro. It provides a relatively simple mechanism for setting up a background queue of tasks.

The following code (put in queue.py, for example) starts a simple worker() that waits until there is something in its queue. Calling queue.add(function, is_async, *args, **kwargs) adds an item to the queue, which wakes up worker(), which then kicks off the task.

I added the is_async parameter so that this supports both background tasks wrapped in @gen.coroutine and plain functions.

import toro
import tornado.gen

queue = toro.Queue()

@tornado.gen.coroutine
def add(function, is_async, *args, **kwargs):
   # The flag is called is_async because "async" is a reserved word since Python 3.7.
   item = dict(function=function, is_async=is_async, args=args, kwargs=kwargs)
   yield queue.put(item)

@tornado.gen.coroutine
def worker():
   while True:
      print("worker() sleeping until I get next item")
      item = yield queue.get()
      print("worker() waking up to process: %s" % item)
      try:
         if item['is_async']:
            # Coroutine task: wait for it to finish before taking the next item.
            yield item['function'](*item['args'], **item['kwargs'])
         else:
            # Plain function: call it directly.
            item['function'](*item['args'], **item['kwargs'])
      except Exception as e:
         print("worker() failed to run item: %s, received exception:\n%s" % (item, e))

@tornado.gen.coroutine
def start():
   yield worker()

In your main tornado app:

import queue
queue.start()

And now you can schedule a background task quite simply:

def my_func(arg1,somekwarg=None):
   print("in my_func() with %s %s" % (arg1,somekwarg))

queue.add(my_func,False,somearg,somekwarg=someval)
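The same worker-queue idea can be sketched without toro, using asyncio.Queue from the standard library. This is an illustration under assumptions, not the answer's code: it assumes Python 3.7 or later, and add_sync and add_async are hypothetical example tasks. Instead of an explicit flag, the worker checks whether the call returned an awaitable.

```python
import asyncio
import inspect


async def worker(queue, results):
    # Sleep until an item arrives, run it, and never let one failure stop the loop.
    while True:
        func, args, kwargs = await queue.get()
        try:
            outcome = func(*args, **kwargs)
            if inspect.isawaitable(outcome):
                # Coroutine task: wait for it before taking the next item.
                outcome = await outcome
            results.append(outcome)
        except Exception as exc:
            results.append("failed: %s" % exc)
        finally:
            queue.task_done()


def add_sync(a, b):
    return a + b


async def add_async(a, b):
    await asyncio.sleep(0)
    return a + b


async def main():
    queue = asyncio.Queue()
    results = []
    worker_task = asyncio.ensure_future(worker(queue, results))
    await queue.put((add_sync, (1, 2), {}))
    await queue.put((add_async, (3, 4), {}))
    await queue.put((add_sync, (1, "x"), {}))  # raises TypeError inside the worker
    await queue.join()  # demo only: wait until every queued item was processed
    worker_task.cancel()
    return results


results = asyncio.run(main())
```

Because a single worker takes items one at a time, tasks run in the order they were queued, and a failing task is recorded rather than killing the worker.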

Answered By: Mike N

I have a time-consuming task in a POST request that may take more than 30 minutes, but the client needs a response immediately.

First, I tried IOLoop.current().spawn_callback. It works, but while the first request's task is running, the second request's task is blocked. With spawn_callback every task runs on the main event loop, so a task that executes synchronously blocks all the others.
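The blocking effect described above can be reproduced with a small sketch. It uses the standard asyncio module (assuming Tornado 5 or later, where the IOLoop runs on the asyncio event loop), and the names blocking, cooperative and timed are hypothetical. Three tasks that call time.sleep run one after another, while three tasks that await asyncio.sleep overlap.

```python
import asyncio
import time


async def blocking(delay):
    # time.sleep never yields to the event loop, so nothing else can run meanwhile.
    time.sleep(delay)


async def cooperative(delay):
    # asyncio.sleep suspends this task and lets the loop run the others.
    await asyncio.sleep(delay)


async def timed(task_factory, delay, count):
    # Run `count` tasks on the same loop and measure the total wall-clock time.
    start = time.monotonic()
    await asyncio.gather(*(task_factory(delay) for _ in range(count)))
    return time.monotonic() - start


async def main():
    blocking_elapsed = await timed(blocking, 0.1, 3)
    cooperative_elapsed = await timed(cooperative, 0.1, 3)
    return blocking_elapsed, cooperative_elapsed


blocking_elapsed, cooperative_elapsed = asyncio.run(main())
```

The blocking variant takes roughly the sum of the delays, the cooperative one roughly a single delay. Scheduling on the event loop only helps when the task itself yields control.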

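In the end I used a thread pool from concurrent.futures (ThreadPoolExecutor). Example:

import datetime
import time
from concurrent.futures import ThreadPoolExecutor

from tornado.ioloop import IOLoop
import tornado.web

# Worker threads run the blocking tasks off the IOLoop thread.
executor = ThreadPoolExecutor(8)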


class Handler(tornado.web.RequestHandler):

    def get(self):
        def task(arg):
            for i in range(10):
                time.sleep(1)
                print(arg, i)

        executor.submit(task, datetime.datetime.now())
        self.write('request accepted')


def make_app():
    return tornado.web.Application([
        (r"/", Handler),
    ])


if __name__ == "__main__":
    app = make_app()
    app.listen(8000, '0.0.0.0')
    IOLoop.current().start()

Visit http://127.0.0.1:8000 a few times and the output shows the tasks running concurrently:

2017-01-17 22:42:10.983632 0
2017-01-17 22:42:10.983632 1
2017-01-17 22:42:10.983632 2
2017-01-17 22:42:13.710145 0
2017-01-17 22:42:10.983632 3
2017-01-17 22:42:13.710145 1
2017-01-17 22:42:10.983632 4
2017-01-17 22:42:13.710145 2
2017-01-17 22:42:10.983632 5
2017-01-17 22:42:16.694966 0
2017-01-17 22:42:13.710145 3
2017-01-17 22:42:10.983632 6
2017-01-17 22:42:16.694966 1
2017-01-17 22:42:13.710145 4
2017-01-17 22:42:10.983632 7
2017-01-17 22:42:16.694966 2
2017-01-17 22:42:13.710145 5
2017-01-17 22:42:10.983632 8
2017-01-17 22:42:16.694966 3
2017-01-17 22:42:13.710145 6
2017-01-17 22:42:19.790646 0
2017-01-17 22:42:10.983632 9
2017-01-17 22:42:16.694966 4
2017-01-17 22:42:13.710145 7
2017-01-17 22:42:19.790646 1
2017-01-17 22:42:16.694966 5
2017-01-17 22:42:13.710145 8
2017-01-17 22:42:19.790646 2
2017-01-17 22:42:16.694966 6
2017-01-17 22:42:13.710145 9
2017-01-17 22:42:19.790646 3
2017-01-17 22:42:16.694966 7
2017-01-17 22:42:19.790646 4
2017-01-17 22:42:16.694966 8
2017-01-17 22:42:19.790646 5
2017-01-17 22:42:16.694966 9
2017-01-17 22:42:19.790646 6
2017-01-17 22:42:19.790646 7
2017-01-17 22:42:19.790646 8
2017-01-17 22:42:19.790646 9
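When the result of the blocking work is needed later, the executor can also be awaited from a coroutine. The sketch below shows the idea with the standard library's loop.run_in_executor (Tornado 5.0 and later offer a similar IOLoop.run_in_executor method). It is an illustration, with blocking_task a hypothetical stand-in for the real work and the delays shortened.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_task(n):
    # Ordinary blocking code; it runs in a worker thread, not on the event loop.
    time.sleep(0.2)
    return n


async def main():
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=4) as pool:
        start = time.monotonic()
        # Four blocking calls are handed to the pool and awaited together.
        results = await asyncio.gather(
            *(loop.run_in_executor(pool, blocking_task, n) for n in range(4))
        )
        elapsed = time.monotonic() - start
    return results, elapsed


results, elapsed = asyncio.run(main())
```

With four worker threads the four calls overlap, so the total time stays close to that of a single call while the event loop remains free to serve other requests.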

Hope this helps!

Answered By: Legolas Bloom

Here’s my answer in 2019!

Start with some slow non-blocking code. See: http://www.tornadoweb.org/en/stable/faq.html#id2

async def _do_slow_task(self, pk):
    await asyncio.sleep(pk)
    logger.info(f'Finished slow task after {pk} seconds')

See here to understand the difference between async and blocking: http://www.tornadoweb.org/en/stable/guide/async.html#asynchronous-and-non-blocking-i-o

Then make your request method a coroutine with the async/await syntax, so that it is non-blocking and can handle multiple requests concurrently.

async def post(self):
    """Make a few requests with different pks and you should see that
        the numbers logged are in ascending order.
    """
    pk = self.get_query_argument('pk')
    try:
        record = await self.db_queryone(
            f"SELECT * FROM records WHERE id = {int(pk)};"
        )
    except Exception as e:
        self.set_status(400)
        self.write(str(e))
        return
    await self._do_slow_task(int(pk))  # pk arrives as a string; asyncio.sleep needs a number
    self.write(f'Received {pk}')

Now, modify the method a bit to run in the background, or “fire and forget” a coroutine without waiting for its result, so that the client receives a response right away. See: http://www.tornadoweb.org/en/stable/guide/coroutines.html#how-to-call-a-coroutine

async def post(self):
    """Make a few requests with different pks and you should see responses
        right away, and eventually log messages with the numbers in
        ascending order.
    """
    pk = self.get_query_argument('pk')
    try:
        record = await self.db_queryone(
            f"SELECT * FROM records WHERE id = {int(pk)};"
        )
    except Exception as e:
        self.set_status(400)
        self.write(str(e))
        return
    IOLoop.current().spawn_callback(self._do_slow_task, int(pk))  # pk arrives as a string
    self.write(f'Received {pk}')
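The behaviour the docstring describes (responses right away, log messages later in ascending order) can be checked with a standalone sketch. It uses the standard asyncio module in place of a running Tornado server, on the assumption that Tornado 5 or later shares the asyncio event loop. The names handle_request, finished and background are hypothetical, and the delays are scaled down.

```python
import asyncio

finished = []       # order in which the background tasks complete
background = set()  # strong references so pending tasks are not garbage collected


async def do_slow_task(pk):
    await asyncio.sleep(pk / 20)  # scaled-down stand-in for `pk` seconds of work
    finished.append(pk)


async def handle_request(pk):
    # Fire and forget: schedule the task and answer without waiting for it.
    task = asyncio.ensure_future(do_slow_task(pk))
    background.add(task)
    task.add_done_callback(background.discard)
    return "Received %d" % pk


async def main():
    responses = [await handle_request(pk) for pk in (3, 1, 2)]
    finished_at_response = list(finished)  # nothing has completed yet
    await asyncio.gather(*background)      # demo only: wait to inspect the order
    return responses, finished_at_response


responses, finished_at_response = asyncio.run(main())
```

All three responses are produced before any task finishes, and the tasks then complete in order of their delay rather than in the order they were requested.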

Answered By: valem