FastAPI with Gunicorn/Uvicorn stops responding
Question:
I’m currently using FastAPI with Gunicorn/Uvicorn as my server engine.
I’m using the following config for Gunicorn:
TIMEOUT 0
GRACEFUL_TIMEOUT 120
KEEP_ALIVE 5
WORKERS 10
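The settings above map onto a `gunicorn.conf.py` roughly like the following (a sketch; Gunicorn reads lowercase setting names from its Python config file, and the `worker_class` line is an assumption — it is the usual choice for running FastAPI under Gunicorn but is not shown in the question):

```python
# gunicorn.conf.py -- sketch matching the settings above
timeout = 0              # disable the worker timeout entirely
graceful_timeout = 120   # seconds to let workers finish during a restart
keepalive = 5            # seconds to hold keep-alive connections open
workers = 10             # number of worker processes
worker_class = "uvicorn.workers.UvicornWorker"  # assumed, not in the question
```

Note that `timeout = 0` disables Gunicorn's worker heartbeat check, so a hung or dead worker is never killed and replaced.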
Uvicorn has all default settings and is started in the Docker container in the usual way:
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
Everything is packed in a Docker container.
The problem is the following:
After some time (somewhere between one day and one week, depending on load) my app stops responding; even a simple curl http://0.0.0.0:8000 hangs forever. The Docker container keeps running, there are no application errors in the logs, and there are no connection issues, but none of my workers ever receive the request (so I never get a response). It seems like the request is lost somewhere between the server engine and my application. Any ideas how to fix this?
UPDATE: I’ve managed to reproduce this behaviour with a custom Locust load profile:
The scenario was the following:
- In the first 15 minutes, ramp up to 50 users (30 of them send GPU-bound requests at 1 rps, and 20 send requests that do not require the GPU at 10 rps)
- Work for another 4 hours
As the plot shows, the API stops responding after about 30 minutes. (And still, there are no error messages or warnings in the output.)
UPDATE 2:
Could there be a hidden memory leak or deadlock caused by an incorrect Gunicorn setup or a bug (such as https://github.com/tiangolo/fastapi/issues/596)?
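If a slow leak is the suspicion, one common defensive Gunicorn setting is worker recycling via `max_requests`, which restarts each worker after a given number of requests so a leak cannot accumulate forever (a sketch; the numbers are arbitrary examples, not recommendations):

```python
# gunicorn.conf.py additions -- sketch; numbers are arbitrary examples
max_requests = 1000        # recycle each worker after ~1000 requests
max_requests_jitter = 100  # stagger restarts so workers don't all recycle at once
```

This masks a leak rather than fixing it, but it can keep a service alive while the root cause is hunted down.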
UPDATE 4:
I got inside my container and executed the ps command. It shows:
PID TTY TIME CMD
120 pts/0 00:00:00 bash
134 pts/0 00:00:00 ps
This means my Gunicorn server process silently died. There is also a binary file named core in the app directory, which means something crashed and dumped core.
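One way to get a Python traceback instead of just a silent core file is the stdlib faulthandler module, enabled once at app startup (a sketch; in this setup it would go near the top of app/main.py — note it cannot catch a SIGKILL from the kernel's OOM killer, but it does explain core-dumping crashes):

```python
import faulthandler
import sys

# Dump the Python traceback of every thread on fatal signals
# (SIGSEGV, SIGFPE, SIGABRT, SIGBUS, SIGILL) instead of dying silently.
faulthandler.enable(file=sys.stderr, all_threads=True)
```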
Answers:
It was an Out-Of-Memory (OOM) error. The leak was caused by the Elastic APM middleware; once I removed it, the leak disappeared.
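For anyone debugging a similar leak, the stdlib tracemalloc module can point at the allocation sites that grow between two snapshots (a sketch; the "leaked" list below just simulates a middleware holding on to per-request buffers):

```python
import tracemalloc

tracemalloc.start()
baseline = tracemalloc.take_snapshot()

# Simulate a leaking middleware that retains per-request buffers.
leaked = [bytearray(1024) for _ in range(1000)]

snapshot = tracemalloc.take_snapshot()
top_stats = snapshot.compare_to(baseline, "lineno")

# Entries with a large positive size_diff point at the growing allocation sites.
for stat in top_stats[:3]:
    print(stat)
```

Running a snapshot comparison like this in a long-lived worker (for example, triggered by a debug endpoint) would have surfaced the middleware's growing allocations.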