Sharing Memory in Gunicorn?

Question:

I have a large read-only data structure (a graph loaded in networkx, though this shouldn’t be important) that I use in my web service. The webservice is built in Flask and then served through Gunicorn. Turns out that for every gunicorn worker I spin up, that worked holds its own copy of my data-structure. Thus, my ~700mb data structure which is perfectly manageable with one worker turns into a pretty big memory hog when I have 8 of them running. Is there any way I can share this data structure between gunicorn processes so I don’t have to waste so much memory?

Asked By: Eli

||

Answers:

It looks like the easiest way to do this is to tell gunicorn to preload your application using the preload_app option. This assumes that you can load the data structure as a module-level variable:

from flask import Flask
from your.application import CustomDataStructure

CUSTOM_DATA_STRUCTURE = CustomDataStructure('/data/lives/here')

# @app.routes, etc.

Alternatively, you could use a memory-mapped file (if you can wrap the shared memory with your custom data structure), gevent with gunicorn to ensure that you’re only using one process, or the multi-processing module to spin up your own data-structure server which you connect to using IPC.

Answered By: Sean Vieira
Categories: questions Tags: , , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.