Gunicorn: stuck at booting new workers

Question:

I have a rather simple FastAPI application that loads a numpy array and defines some API endpoints.

import numpy as np
import pandas as pd
import logging

from fastapi import FastAPI

app = FastAPI()
logging.basicConfig(level=logging.DEBUG)

logging.info('Loading texts')
texts = pd.read_csv('cleaned.csv')
logging.info('Loading embeddings')
embeddings = np.load('laser-2020-04-30.npy') # 3.7G
logging.info('Loading completed!')

# some API endpoints below...

I can run this app directly with Python 3.7 without any issues, and it also runs fine under vanilla gunicorn. The problem arises when everything runs in a Docker container (still using gunicorn): it appears to get stuck loading the large numpy array and keeps booting new workers.

[2020-05-11 08:33:20 +0000] [1] [INFO] Starting gunicorn 20.0.4
[2020-05-11 08:33:20 +0000] [1] [DEBUG] Arbiter booted
[2020-05-11 08:33:20 +0000] [1] [INFO] Listening at: http://0.0.0.0:80 (1)
[2020-05-11 08:33:20 +0000] [1] [INFO] Using worker: sync
[2020-05-11 08:33:20 +0000] [7] [INFO] Booting worker with pid: 7
[2020-05-11 08:33:20 +0000] [1] [DEBUG] 1 workers
INFO:root:Loading texts
INFO:root:Loading embeddings
[2020-05-11 08:33:35 +0000] [18] [INFO] Booting worker with pid: 18
INFO:root:Loading texts
INFO:root:Loading embeddings
[2020-05-11 08:33:51 +0000] [29] [INFO] Booting worker with pid: 29
INFO:root:Loading texts
INFO:root:Loading embeddings
[2020-05-11 08:34:05 +0000] [40] [INFO] Booting worker with pid: 40
INFO:root:Loading texts
INFO:root:Loading embeddings
[2020-05-11 08:34:19 +0000] [51] [INFO] Booting worker with pid: 51
INFO:root:Loading texts
INFO:root:Loading embeddings
[2020-05-11 08:34:36 +0000] [62] [INFO] Booting worker with pid: 62

I set the number of workers to 1 and increased the timeout to 900 seconds. Nevertheless, gunicorn keeps booting a new worker every 10-15 seconds.
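If the workers are dying while booting rather than actually hitting the timeout, the timeout settings never come into play, since the arbiter replaces a dead worker immediately. A quick way to check whether memory is the culprit is to watch the container while the app starts; a sketch using standard Docker commands, with my_container standing in for the actual container name:

docker stats my_container            # live memory usage against the container's limit
docker events --filter event=oom     # prints an event when the kernel OOM-kills a process in a container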

The command that runs the application in my Dockerfile looks like the following:

CMD ["gunicorn","-b 0.0.0.0:8080", "main:app", "--timeout 900", "--log-level", "debug", "--workers", "1", "--graceful-timeout", "900"]
Asked By: Isbister


Answers:

To solve this, I simply increased the amount of RAM the Docker container was allowed to use. The default on my 2019 MacBook's Docker installation was 2 GB; since the numpy array alone is 3.7 GB, the kernel's OOM killer terminated each worker partway through loading it, and the arbiter kept booting replacements.

docker run -m=8g -t my_docker
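
If the array is only ever read by the endpoints, an alternative worth noting (not what I did above, just a sketch) is to memory-map the file instead of loading it eagerly, so pages are read lazily from disk and shared across workers through the OS page cache rather than each worker holding a private 3.7 GB copy:

import numpy as np

# mmap_mode='r' maps the .npy file read-only; data is paged in on access
# and shared between worker processes via the page cache.
embeddings = np.load('laser-2020-04-30.npy', mmap_mode='r')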
Answered By: Isbister