Dump Django stack trace on Gunicorn timeout

Question:

I’m trying to debug rare hangs in a Django application. So far I have failed to isolate the problem; it happens about once a day in production, and Gunicorn restarts the process with the message:

[CRITICAL] WORKER TIMEOUT

Is there a way to configure Django or Gunicorn to dump a stack trace of a process that is restarted?

Asked By: Jan Wrobel


Answers:

Try making your Gunicorn log more verbose: setting the log level to INFO or DEBUG might shed more light on the problem.

You could also take a look at dogslow, which logs slow requests: https://pypi.python.org/pypi/dogslow

For better logging in general, try Sentry: https://www.getsentry.com/welcome/

Random question: are any cron jobs, backups, or similar tasks running on the server at that time?

Answered By: krak3n

timeout is not meant as a request timeout; it is a liveness check for workers. For sync workers it effectively acts as a request timeout, because the worker cannot do anything other than process the request. Asynchronous workers keep sending heartbeats even while handling long-running requests, so an async worker is only killed if it actually blocks or freezes.
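
The settings involved can be sketched as a small Gunicorn config file. This is only an illustration: the values shown are Gunicorn's documented defaults for `timeout` and `worker_class`, the file name is an assumption, and `loglevel = "debug"` is included only because a hook that logs at debug level would otherwise stay silent.

```python
# gunicorn.conf.py (file name is an assumption; pass it with `gunicorn -c`)

# Seconds a worker may stay silent before the master kills and restarts it.
# 30 is Gunicorn's documented default.
timeout = 30

# "sync" is the default worker class. A sync worker cannot heartbeat while it
# is busy, so a request running longer than `timeout` ends in WORKER TIMEOUT.
worker_class = "sync"

# Emit debug-level messages in the error log (the default level is "info").
loglevel = "debug"
```

Raising `timeout` only hides a hang for longer, so for debugging it is usually more useful to leave it alone and capture a stack trace when the worker is aborted.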

Gunicorn has a server hook called worker_abort (see the Gunicorn docs below), which you can define in your Gunicorn config file:

import sys
import threading
import traceback

def worker_abort(worker):
    worker.log.info("worker received abort signal")
    # Map thread ids to names so each stack can be labelled.
    id2name = {th.ident: th.name for th in threading.enumerate()}
    code = []
    # sys._current_frames() maps each thread id to that thread's current frame.
    for thread_id, frame in sys._current_frames().items():
        code.append("\n# Thread: %s(%d)" % (id2name.get(thread_id, ""), thread_id))
        for filename, lineno, name, line in traceback.extract_stack(frame):
            code.append('File: "%s", line %d, in %s' % (filename, lineno, name))
            if line:
                code.append("  %s" % line.strip())
    # Logged at debug level, so Gunicorn's log level must be "debug" to see it.
    worker.log.debug("\n".join(code))

The Gunicorn docs describe this hook as follows: it is called when a worker receives the SIGABRT signal, which generally happens on timeout. The callable needs to accept one instance variable for the initialised Worker.
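
Since the hook only needs an object with a `log` attribute, it can be tried locally without waiting for a production timeout. The following is a rough sketch, not Gunicorn's own test setup: `format_thread_stacks`, `fake_worker`, and the logger name are hypothetical, and a standard-library logger stands in for the worker's real logger.

```python
import io
import logging
import sys
import threading
import traceback
from types import SimpleNamespace


def format_thread_stacks():
    """Return the current stack of every live thread as one string."""
    names = {th.ident: th.name for th in threading.enumerate()}
    lines = []
    for thread_id, frame in sys._current_frames().items():
        lines.append("\n# Thread: %s(%d)" % (names.get(thread_id, ""), thread_id))
        # format_stack() yields ready-made 'File "...", line N, in func' entries.
        lines.extend(entry.rstrip() for entry in traceback.format_stack(frame))
    return "\n".join(lines)


def worker_abort(worker):
    # Same signature as the Gunicorn hook: one argument, the worker.
    worker.log.critical(format_thread_stacks())


# Stand-in for the real worker (assumption: only `.log` is used by the hook).
stream = io.StringIO()
logger = logging.getLogger("fake_gunicorn_worker")
logger.addHandler(logging.StreamHandler(stream))
logger.setLevel(logging.DEBUG)
logger.propagate = False
fake_worker = SimpleNamespace(log=logger)

worker_abort(fake_worker)
output = stream.getvalue()
print(output)
```

Running this as a plain script prints one `# Thread:` section per live thread, which is a quick way to confirm the formatting before deploying the hook.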

Sources:

http://docs.gunicorn.org/en/stable/settings.html, https://github.com/benoitc/gunicorn/issues/1493, https://github.com/benoitc/gunicorn/blob/master/examples/example_config.py

Answered By: Sami Start

This will log a stack trace of the worker at the time it gets killed. Create a Gunicorn config file and paste the following function into it.

import traceback
import io

def worker_abort(worker):
    debug_info = io.StringIO()
    debug_info.write("Traceback at time of timeout:\n")
    traceback.print_stack(file=debug_info)
    worker.log.critical(debug_info.getvalue())
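
Note that traceback.print_stack reports only the thread that runs the hook. If the worker uses several threads, one possible variation (a different technique from the answer above, shown only as a sketch) is the standard-library faulthandler module, which can dump every thread at once. `dump_all_threads` is a hypothetical helper name, and the sketch assumes the worker's stderr is a real file that ends up somewhere you can read.

```python
import faulthandler
import sys


def dump_all_threads(file=None):
    """Write the stacks of all threads to `file` (default: stderr)."""
    if file is None:
        file = sys.stderr
    # faulthandler writes straight to the file descriptor, so `file` must be
    # a real file with fileno(); an io.StringIO will not work here.
    faulthandler.dump_traceback(file=file, all_threads=True)


def worker_abort(worker):
    worker.log.critical("worker aborted; dumping all thread stacks to stderr")
    dump_all_threads()
```

faulthandler lists frames with the most recent call first, which is the reverse of the order used by the traceback module.
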
Answered By: Nick ODell