Flask: getting a Celery task via AsyncResult(task_id) returns an incorrect task

Question:

I’m trying to get Celery running with Flask and then show the result, following this tutorial: https://blog.miguelgrinberg.com/post/using-celery-with-flask. But after the task has successfully finished, the Flask app still sees a "pending" task. Grabbing the task via its id apparently does not return the same task object.

When I hook into the longtask() function, task.state is first "PENDING" and then, after 15 seconds, "SUCCESS", as it should be. The Celery worker also returns the result, so that part works. But in the taskstatus() function, where I fetch the task via task = long_task.AsyncResult(task_id), task.state always stays "PENDING" and other attributes like task.info stay None. Why does this happen, and how can I access my task object properly?

Python 3.8.16 
Flask 2.2.2 
celery 5.2.7
RabbitMQ 3.11.9 

Unfortunately my system is Windows, but in general it should still work according to this post. So I start my Celery worker like this:

celery -A app.celery worker --loglevel=info --pool=eventlet

Code:

import time
from flask import Flask, url_for, jsonify
from celery import Celery

app = Flask(__name__)

app.config['CELERY_BROKER_URL'] = 'amqp://celery:celery@localhost:5672/' 
app.config['result_backend'] = 'rpc://celery:celery@localhost:5672/' 

celery = Celery(app.name, broker=app.config['CELERY_BROKER_URL'])
celery.conf.update(app.config)
celery.set_default()

@celery.task(bind=True)
def long_task(self):
    for i in range(15):
        message = 'Working on step {0} of {1}...'.format(i + 1, 15)
        # report intermediate progress to the result backend
        self.update_state(state='PROGRESS',
                          meta={'current': i, 'total': 15,
                                'status': message})
        time.sleep(1)
    return {'status': 'Done'}

@app.route('/longtask', methods=['POST'])
def longtask():
    task = long_task.apply_async()  # after 15 seconds: task.state == "SUCCESS"
    return jsonify({}), 202, {'Location': url_for('taskstatus', task_id=task.id)}

@app.route('/status/<task_id>')
def taskstatus(task_id):
    task = long_task.AsyncResult(task_id)  # task.state always "PENDING" 
    return jsonify({'result': task.state})
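For reference, kicking the task off and polling it from a second terminal looks like this (assuming Flask's default dev address; the task id comes from the Location header of the first response):

curl -i -X POST http://localhost:5000/longtask
curl http://localhost:5000/status/37a4e58c-857b-470c-823e-d6b9759458e3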

The Celery worker reports success after 15 seconds:

[2023-02-22 16:06:57,697: INFO/MainProcess] Task app.long_task[37a4e58c-857b-470c-823e-d6b9759458e3] received
[2023-02-22 16:07:11,847: INFO/MainProcess] Task app.long_task[37a4e58c-857b-470c-823e-d6b9759458e3] succeeded in 14.140999999945052s: {'status': 'Done'}
Asked By: Hi_its_me


Answers:

After 1.5 days of research and despair, I found the answer myself. The reason is the rpc result backend, which does not actually store task states but sends them back as messages (see this post), as described in the Celery docs. Those messages go to the client that initiated the task, so another process polling AsyncResult(task_id) never sees a state and gets Celery's default for an unknown id: "PENDING".
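To see that "PENDING" really just means "no stored state", here is a minimal sketch (the id below is made up, not from the original app):

# Celery cannot distinguish "no state stored" from "not started yet":
# an id that never existed reports the same state my stuck task did
res = long_task.AsyncResult('00000000-0000-0000-0000-000000000000')
print(res.state)  # "PENDING"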

I’m still wondering: since amqp seems deprecated as a backend, is there another way to use RabbitMQ as a backend in which task results are stored? Or do I have to set up Redis or MySQL now?

Edit:
I did set up Redis and it works. I had to install Ubuntu and Redis, and then simply change the backend URL:

app.config['CELERY_BROKER_URL'] = 'amqp://celery:celery@localhost:5672/'
app.config['result_backend'] = 'redis://localhost/' 
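With a persistent backend in place, the taskstatus() endpoint can also surface the PROGRESS metadata set via update_state(). A minimal sketch of an expanded endpoint (the JSON keys are my own choice, not from the tutorial):

@app.route('/status/<task_id>')
def taskstatus(task_id):
    task = long_task.AsyncResult(task_id)
    if task.state == 'PROGRESS':
        # the meta dict passed to update_state() is exposed as task.info
        return jsonify({'state': task.state,
                        'current': task.info.get('current', 0),
                        'total': task.info.get('total', 1)})
    if task.state == 'SUCCESS':
        # for a finished task, task.info holds the task's return value
        return jsonify({'state': task.state, 'result': task.info})
    return jsonify({'state': task.state})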
Answered By: Hi_its_me