Can you perform multi-threaded tasks within Django?

Question:

The sequence I would like to accomplish:

  1. A user clicks a button on a web page
  2. Some functions in model.py start to run. For example, gathering some data by crawling the internet
  3. When the functions are finished, the results are returned to the user.

Should I open a new thread inside of model.py to execute my functions? If so, how do I do this?

Asked By: Robert

||

Answers:

  1. Yes it can multi-thread, but generally one uses Celery to do the equivalent. You can read about how in the celery-django tutorial.
  2. It is rare that you actually want to force the user to wait for the website. While it’s better than risks a timeout.

Here’s an example of what you’re describing.

User sends request
Django receives => spawns a thread to do something else.
main thread finishes && other thread finishes 
... (later upon completion of both tasks)
response is sent to user as a package.

Better way:

User sends request
Django receives => lets Celery know "hey! do this!"
main thread finishes
response is sent to user
...(later)
user receives balance of transaction 
Answered By: cwallenpoole

If you don’t want to add some overkill framework to your project, you can simply use subprocess.Popen:

def my_command(request):
    command = '/my/command/to/run'  # Can even be 'python manage.py somecommand'
    subprocess.Popen(command, shell=True)
    command = '/other/command/to/run'
    subprocess.Popen(command, shell=True)
    return HttpResponse(status=204)

[edit] As mentioned in the comments, this will not start a background task and return the HttpResponse right away. It will execute both commands in parallel, and then return the HttpResponse once both are complete. Which is what OP asked.

Answered By: Thierry J.

As shown in this answer you can use the threading package to perform an asynchronous task. Everyone seems to recommend Celery, but it is often overkill for performing simple but long running tasks. I think it’s actually easier and more transparent to use threading.

Here’s a simple example for asyncing a crawler:

#views.py
import threading
from .models import Crawl

def startCrawl(request):
    task = Crawl()
    task.save()
    t = threading.Thread(target=doCrawl,args=[task.id])
    t.setDaemon(True)
    t.start()
    return JsonResponse({'id':task.id})

def checkCrawl(request,id):
    task = Crawl.objects.get(pk=id)
    return JsonResponse({'is_done':task.is_done, result:task.result})

def doCrawl(id):
    task = Crawl.objects.get(pk=id)
    # Do crawling, etc.

    task.result = result
    task.is_done = True
    task.save()

Your front end can make a request for startCrawl to start the crawl, it can make an Ajax request to check on it with checkCrawl which will return true and the result when it’s finished.


Update for Python3:

The documentation for the threading library recommends passing the daemon property as a keyword argument rather than using the setter:

t = threading.Thread(target=doCrawl,args=[task.id],daemon=True)
t.start()

Update for Python <3.7:

As discussed here, this bug can cause a slow memory leak that can overflow a long running server. The bug was fixed for Python 3.7 and above.

Answered By: nbwoodward
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.