Celery does not release memory
Question:
It looks like Celery does not release memory after a task finishes. Every time a task finishes, 5–10 MB of memory leaks, so with thousands of tasks it will soon use up all the memory.
BROKER_URL = 'amqp://user@localhost:5672/vhost'
# CELERY_RESULT_BACKEND = 'amqp://user@localhost:5672/vhost'
CELERY_IMPORTS = (
    'tasks.tasks',
)
CELERY_IGNORE_RESULT = True
CELERY_DISABLE_RATE_LIMITS = True
# CELERY_ACKS_LATE = True
CELERY_TASK_RESULT_EXPIRES = 3600
# maximum time for a task to execute
CELERYD_TASK_TIME_LIMIT = 600
CELERY_DEFAULT_ROUTING_KEY = "default"
CELERY_DEFAULT_QUEUE = 'default'
CELERY_DEFAULT_EXCHANGE = "default"
CELERY_DEFAULT_EXCHANGE_TYPE = "direct"
# CELERYD_MAX_TASKS_PER_CHILD = 50
CELERY_DISABLE_RATE_LIMITS = True
CELERYD_CONCURRENCY = 2
This might be the same as the following issue, but it does not have an answer:
RabbitMQ/Celery/Django Memory Leak?
I am not using django, and my packages are:
Chameleon==2.11
Fabric==1.6.0
Mako==0.8.0
MarkupSafe==0.15
MySQL-python==1.2.4
Paste==1.7.5.1
PasteDeploy==1.5.0
SQLAlchemy==0.8.1
WebOb==1.2.3
altgraph==0.10.2
amqp==1.0.11
anyjson==0.3.3
argparse==1.2.1
billiard==2.7.3.28
biplist==0.5
celery==3.0.19
chaussette==0.9
distribute==0.6.34
flower==0.5.1
gevent==0.13.8
greenlet==0.4.1
kombu==2.5.10
macholib==1.5.1
objgraph==1.7.2
paramiko==1.10.1
pycrypto==2.6
pyes==0.20.0
pyramid==1.4.1
python-dateutil==2.1
redis==2.7.6
repoze.lru==0.6
requests==1.2.3
six==1.3.0
tornado==3.1
translationstring==1.1
urllib3==1.6
venusian==1.0a8
wsgiref==0.1.2
zope.deprecation==4.0.2
zope.interface==4.0.5
I just added a test task like the following, where test_string is a big string, and it still leaks memory:
@celery.task(ignore_result=True)
def process_crash_xml(test_string, client_ip, request_timestamp):
    logger.info("%s %s" % (client_ip, request_timestamp))
    test = [test_string] * 5
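One way to confirm whether Python-level objects are actually accumulating (as opposed to a leak in a C extension) is to count live objects with the standard-library gc module before and after running the task. This is only a diagnostic sketch; the helper name is my own, not part of Celery:

```python
import gc

def object_count_by_type(limit=5):
    """Return the `limit` most common live object types as (name, count) pairs.

    Call this before and after running a task; types whose counts keep
    growing across many runs are leak candidates.
    """
    counts = {}
    for obj in gc.get_objects():
        name = type(obj).__name__
        counts[name] = counts.get(name, 0) + 1
    return sorted(counts.items(), key=lambda kv: -kv[1])[:limit]

print(object_count_by_type())
```

If the counts stay flat while resident memory still grows, the leak is likely below the Python layer (e.g. in a C extension), which is exactly the case the per-child recycling options in the answers below are meant for.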
Answers:
You might be hitting this issue in librabbitmq. Please check whether Celery is using librabbitmq>=1.0.1.
A simple fix to try is: pip install 'librabbitmq>=1.0.1' (the quotes keep the shell from treating >= as a redirect).
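A quick sanity check from Python could look like the sketch below. Note this only verifies that the library is importable and reports its version; it does not prove Celery actually selected it as the AMQP transport:

```python
# Check whether librabbitmq is importable, without crashing if it is not.
try:
    import librabbitmq
    print("librabbitmq", librabbitmq.__version__)
except ImportError:
    print("librabbitmq not installed; kombu will fall back to py-amqp")
```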
It was this config option that made my worker not release memory:
CELERYD_TASK_TIME_LIMIT = 600
This was an issue in Celery which I think has since been fixed.
Please refer to: https://github.com/celery/celery/issues/2927
Set worker_max_tasks_per_child in your settings.
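With the Celery 4+ lowercase setting names this could look like the sketch below (the value 50 is an arbitrary example; the old uppercase equivalent is CELERYD_MAX_TASKS_PER_CHILD):

```python
# celeryconfig.py -- sketch, Celery 4+ setting name
worker_max_tasks_per_child = 50  # recycle each child process after 50 tasks
```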
There are two settings which can help you mitigate the growing memory consumption of Celery workers:
- Max tasks per child setting (v2.0+):
With this option you can configure the maximum number of tasks a worker can execute before it's replaced by a new process. This is useful if you have memory leaks you have no control over, for example from closed-source C extensions.
- Max memory per child setting (v4.0+):
With this option you can configure the maximum amount of resident memory a worker can consume before it's replaced by a new process.
This is useful if you have memory leaks you have no control over, for example from closed-source C extensions.
However, those options only work with the default pool (prefork).
For safeguarding against memory leaks with the threads and gevent pools, you can add a utility process called memmon, which is part of the superlance extension to supervisor.
Memmon can monitor all running worker processes and will restart them automatically when they exceed a predefined memory limit.
Here is an example configuration for your supervisor.conf:
[eventlistener:memmon]
command=/path/to/memmon -p worker=512MB
events=TICK_60
When you start your worker, set the max-tasks-per-child option like this to restart worker processes after every task:
celery -A app worker --loglevel=info --max-tasks-per-child=1
Here’s the documentation:
https://docs.celeryproject.org/en/latest/userguide/workers.html#max-tasks-per-child-setting
With this option you can configure the maximum amount of resident memory a worker can consume before it's replaced by a new process.
This is useful if you have memory leaks you have no control over, for example from closed-source C extensions.
The option can be set using the worker's --max-memory-per-child argument or the worker_max_memory_per_child setting.
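As a sketch of the settings-file form (the limit value is an arbitrary example; per the Celery docs the unit is kilobytes):

```python
# celeryconfig.py -- sketch; value is in kilobytes, so this is roughly 200 MB
worker_max_memory_per_child = 200000
```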