Retry Celery tasks with exponential back off

Question:

For a task like this:

from celery.decorators import task

@task()
def add(x, y):
    if not x or not y:
        raise Exception("test error")
    return self.wait_until_server_responds(

If it throws an exception and I want to retry it from the daemon side, how can I apply an exponential back-off algorithm, i.e. retry after 2^2, 2^3, 2^4, etc. seconds?

Also, is the retry maintained on the server side, so that if the worker happens to get killed, the next worker that spawns will pick up the retry task?

Asked By: Quintin Par


Answers:

The task.request.retries attribute contains the number of retries so far (starting at 0),
so you can use it to implement exponential back-off:

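# Legacy import path; newer Celery versions expose this as app.task or celery.shared_task.
from celery.task import task

# bind=True passes the task instance as self, which gives access to self.request and self.retry.
@task(bind=True, max_retries=3)
def update_status(self, auth, status):
    try:
        Twitter(auth).update_status(status)
    except Twitter.WhaleFail as exc:
        raise self.retry(exc=exc, countdown=2 ** self.request.retries)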
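
The question asks for delays of 2^2, 2^3, 2^4 seconds, whereas 2 ** self.request.retries starts at 1 second. A minimal sketch of the arithmetic, assuming a hypothetical helper backoff_countdown (not part of Celery) whose offset parameter shifts the exponent:

```python
def backoff_countdown(retries, base=2, offset=0):
    """Return the delay in seconds before the next retry.

    Hypothetical helper, not a Celery API. `retries` plays the role of
    self.request.retries, which is 0 when the task fails for the first time.
    """
    return base ** (retries + offset)


# Schedule produced by 2 ** retries, as in the task above.
default_schedule = [backoff_countdown(r) for r in range(3)]

# Schedule from the question: 2^2, 2^3, 2^4.
shifted_schedule = [backoff_countdown(r, offset=2) for r in range(3)]

print(default_schedule)  # [1, 2, 4]
print(shifted_schedule)  # [4, 8, 16]
```

Inside the task, the shifted value could be passed as countdown=backoff_countdown(self.request.retries, offset=2).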

To prevent a Thundering Herd Problem, you may consider adding a random jitter to your exponential backoff:

import random
self.retry(exc=exc, countdown=int(random.uniform(2, 4) ** self.request.retries))
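
A rough sketch of what that jittered expression produces, assuming a hypothetical helper jittered_countdown (not a Celery API) that wraps the same formula so its bounds can be checked outside a task:

```python
import random


def jittered_countdown(retries, low=2.0, high=4.0, rng=random):
    """Delay in seconds: a random base in [low, high] raised to `retries`.

    Hypothetical helper mirroring int(random.uniform(2, 4) ** retries).
    """
    return int(rng.uniform(low, high) ** retries)


rng = random.Random(0)  # seeded only to make the demonstration repeatable
for retries in range(4):
    delay = jittered_countdown(retries, rng=rng)
    # The delay always falls between 2 ** retries and 4 ** retries.
    print(retries, delay, (2 ** retries, 4 ** retries))
```

Note that with retries equal to 0 the result is always 1, so the jitter only takes effect from the second retry onward.
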
Answered By: asksol

As of Celery 4.2 you can configure your tasks to use an exponential backoff automatically: http://docs.celeryproject.org/en/master/userguide/tasks.html#automatic-retry-for-known-exceptions

@app.task(autoretry_for=(Exception,), retry_backoff=2)
def add(x, y):
    ...

(The Celery 4.1 docs already described this feature, but it was not actually released in that version; see the merge request.)

Answered By: Rupert Angermeier

FYI, Celery has a utility function, get_exponential_backoff_interval (in celery.utils.time), that calculates the exponential backoff time with optional jitter, so you don't need to write your own.

import random


def get_exponential_backoff_interval(
    factor,
    retries,
    maximum,
    full_jitter=False
):
    """Calculate the exponential backoff wait time."""
    # Will be zero if factor equals 0
    countdown = min(maximum, factor * (2 ** retries))
    # Full jitter according to
    # https://www.awsarchitectureblog.com/2015/03/backoff.html
    if full_jitter:
        countdown = random.randrange(countdown + 1)
    # Adjust according to maximum wait time and account for negative values.
    return max(0, countdown)
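
To make the formula concrete, the following sketch recomputes the no-jitter schedule with a hypothetical helper capped_backoff (a simplified copy for illustration, not the Celery function itself). The values factor=2 and maximum=600 are assumptions chosen for the demonstration:

```python
import random


def capped_backoff(factor, retries, maximum):
    """Same formula as the function above, without the jitter step."""
    return min(maximum, factor * (2 ** retries))


# With factor=2 and maximum=600 the delay doubles until it hits the cap.
schedule = [capped_backoff(2, r, 600) for r in range(11)]
print(schedule)
# [2, 4, 8, 16, 32, 64, 128, 256, 512, 600, 600]

# Full jitter then draws an integer uniformly from 0..countdown inclusive.
jittered = [random.randrange(c + 1) for c in schedule]
print(all(0 <= j <= c for j, c in zip(jittered, schedule)))  # True
```

With full jitter the delay can be as low as 0 seconds, so the cap bounds the wait from above but nothing bounds it from below.
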
Answered By: lgylym