How to make lightweight docker image for python app with pipenv
Question:
I can produce a working image for my Python app with the following simple Dockerfile:
FROM python:3.7
WORKDIR /myapp
COPY Pipfile* ./
RUN pip install pipenv
RUN pipenv install --system --deploy
COPY src .
CMD ["python3", "app.py"]
However, this produces a ~1 GB image, which may contain temporary files and is heavy to deploy. I only need the full python image for building. My app runs successfully on the alpine variant, so I can write a two-stage Dockerfile:
FROM python:3.7 as builder
COPY Pipfile* ./
# pipenv is not preinstalled in the python image, so install it first
RUN pip install pipenv
RUN pipenv lock --requirements > requirements.txt
RUN python3 -m venv /venv
RUN /venv/bin/pip install --upgrade pip
RUN /venv/bin/pip install -r requirements.txt
FROM python:3.7-alpine
COPY --from=builder /venv /venv
WORKDIR /myapp
COPY src .
CMD ["/venv/bin/python3", "app.py"]
So far so good: it also works and is 6 times smaller. But this scheme was considered a “stub”, with some drawbacks:
- It has an unnecessary extra COPY --from=builder step
- It does not use pipenv alone but also needs pip for installing (+1 extra step; pipenv lock + pip install is always slower than just pipenv install)
- It does not install system-wide but into /venv, which is to be avoided inside a container
- Minor: the build pollutes the intermediate-image cache more, and requires downloading both image variants.
How can I combine these two approaches to get a lightweight alpine-based image with pipenv that lacks the drawbacks mentioned above?
Or can you share ideas from your production Dockerfile?
Answers:
How about,
FROM python:3.7-alpine
WORKDIR /myapp
COPY Pipfile* ./
RUN pip install --no-cache-dir pipenv && \
    pipenv install --system --deploy --clear
COPY src .
CMD ["python3", "app.py"]
- It uses the smaller Alpine variant.
- No unnecessary cache files are left over, thanks to the --no-cache-dir option for pip and the --clear option for pipenv.
- You also deploy outside of a venv.
You can also add && pip uninstall pipenv -y after pipenv install --system --deploy --clear in the same RUN command to reclaim the space taken by pipenv, if that extra image size bothers you.
The problem comes when you need things like ciso8601, or other libraries that require a build step. Build tools are not included in either the slim or the alpine variant, to keep their footprint small.
So to install dependencies, you’ll have to:
- Install build tools
- Deploy dependencies from Pipfile.lock system-wide
- Uninstall build tools and clean caches
And do all 3 actions inside a single RUN layer, like the following:
FROM python:3.7-slim
WORKDIR /app
# both files are explicitly required!
COPY Pipfile Pipfile.lock ./
RUN pip install pipenv && \
    apt-get update && \
    apt-get install -y --no-install-recommends gcc python3-dev libssl-dev && \
    pipenv install --deploy --system && \
    apt-get remove -y gcc python3-dev libssl-dev && \
    apt-get autoremove -y && \
    pip uninstall pipenv -y
COPY app ./
CMD ["python", "app.py"]
- Installing and then removing the build tools costs you around 300MiB and some extra time
- Uninstalling pipenv saves you another 20MiB (which is 10% of the resulting size).
- Splitting this into separate RUN commands would not delete data from earlier layers, and would result in a ~500MiB image. That is how Docker layers work.
The result is a perfectly working ~200MiB image, which
- is 5 times smaller than the original python:3.7 (which is >1.0GiB)
- has no alpine incompatibilities (these are typically tied to the glibc replacement)
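The layer arithmetic behind these figures can be sketched with a toy model. It is only an illustration, not how Docker actually measures images: the function names are hypothetical, and the 200/300 MiB split is an assumption taken from the approximate figures in this answer.

```python
def layer_size(ops):
    """Size of one layer: only files still present when the layer ends count."""
    kept = {}
    for op, name, size in ops:
        if op == "add":
            kept[name] = size
        else:  # "remove" only helps for files added in this same layer
            kept.pop(name, None)
    return sum(kept.values())


def image_size(layers):
    """An image is the sum of its layers; later layers cannot shrink earlier ones."""
    return sum(layer_size(ops) for ops in layers)


# Assumed figures (MiB): ~200 for everything that stays, ~300 for build tools.
single_run = [[
    ("add", "base-and-deps", 200),
    ("add", "build-tools", 300),
    ("remove", "build-tools", 0),
]]
separate_runs = [
    [("add", "base-and-deps", 200), ("add", "build-tools", 300)],
    [("remove", "build-tools", 0)],
]

print(image_size(single_run), image_size(separate_runs))
```

In this model the cleanup only pays off when it happens in the same layer that added the files, which is why the answer chains everything into one RUN.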
At the time of writing, we’re fine with the slim (debian buster) build variants, preferring slim over alpine (for better compatibility). If you’re really after further size optimization, I’d recommend taking a look at some excellent builds of these guys:
I am using micropipenv for the job, which describes itself as
“A lightweight wrapper for pip to support requirements.txt, Pipenv and Poetry lock files or converting them to pip-tools compatible output. Designed for containerized Python applications but not limited to them.”
An image created with it would look like the following.
Since the alpine base image lacks a toml parser, we have to use the version of micropipenv that includes the toml extras (micropipenv[toml] instead of micropipenv).
FROM python:3.9-alpine
WORKDIR /myapp
COPY Pipfile Pipfile.lock ./
# Install dependencies
RUN pip install --no-cache-dir micropipenv[toml] \
    && micropipenv install --deploy \
    && pip uninstall -y micropipenv[toml]
COPY src .
CMD ["python3", "app.py"]
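To make the idea of turning a lock file into something pip can install more concrete, here is a rough sketch of reading Pipfile.lock (a JSON file) and emitting pinned requirement lines. This is not micropipenv's implementation; the function name lock_to_requirements is hypothetical, hashes are ignored, and entries without a "version" field (VCS or path dependencies) are simply skipped.

```python
import json


def lock_to_requirements(lock_text, section="default"):
    """Return pinned 'name==version' lines from a Pipfile.lock document."""
    lock = json.loads(lock_text)
    lines = []
    for name, spec in sorted(lock.get(section, {}).items()):
        version = spec.get("version")  # looks like "==1.2.3"
        if version is None:
            continue  # VCS/path entries are out of scope for this sketch
        line = name + version
        markers = spec.get("markers")
        if markers:
            line += "; " + markers
        lines.append(line)
    return lines
```

It could be used as print("\n".join(lock_to_requirements(open("Pipfile.lock").read()))), with the output fed to pip install -r.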
“It has an unnecessary extra COPY --from=builder step”
That directive is harmless and actually makes your final-stage image even more lightweight: only the virtualenv is copied, with no build toolchains, no cached wheels, and not even pipenv in the final stage!
“It does not use pipenv alone but also needs pip for installing (+1 extra step; pipenv lock + pip install is always slower than just pipenv install)”
Generate the virtualenv with pipenv in the build stage!
FROM python:3 as builder
# pipenv is not preinstalled in the python image, so install it first
RUN pip install pipenv
COPY Pipfile* /
# The presence of a .venv folder triggers pipenv to use it by default
RUN mkdir /.venv
RUN pipenv install --deploy
FROM python:3-slim
COPY --from=builder /.venv /.venv
WORKDIR /myapp
COPY src .
CMD ["/.venv/bin/python3", "app.py"]
“It does not install system-wide but into /venv, which is to be avoided inside a container”
While not using venvs inside Docker is a common practice, venvs still have some benefits and absolutely no drawbacks. Stop listening to people who say venvs should not be used inside containers. Pipenv’s current recommendation is to not do system-wide installs in containers: https://github.com/pypa/pipenv/pull/2762
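If you want the app to confirm at start-up that it really runs from the copied virtualenv rather than the system interpreter, the standard library is enough. A minimal sketch; the helper name in_virtualenv is hypothetical, and the optional arguments exist only so the check can be exercised without an actual venv.

```python
import sys


def in_virtualenv(prefix=None, base_prefix=None):
    """True when the interpreter prefix differs from the base installation."""
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    return prefix != base_prefix


if __name__ == "__main__":
    print("running inside a virtualenv:", in_virtualenv())
```

With a system-wide install (pipenv install --system) the two prefixes are the same, so the check reports False.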
“Minor: the build pollutes the intermediate-image cache more, and requires downloading both image variants.”
Simply optimize your cache setup in the CI system.
BUT
For the love of God, use the same platform for the build and final stages.
Either
- use python:3-alpine for both stages, bloating the build stage with as many apk packages as you need, or
- settle for python:3 for building and python:3-slim for the final stage. It is not that big.
Alpine images use musl instead of glibc, and that means a different ABI for Python packages (https://peps.python.org/pep-0656/). Do not mix alpine images with non-alpine images, just as you would not mix python:3.A images with a different python:3.B.
Otherwise some of the components installed in the build stage will be unusable in the final stage.
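One way to check that two stages match is to print a small fingerprint in each and compare them. The sketch below is an assumption-laden illustration: the function names are hypothetical, and it relies on platform.libc_ver() reporting glibc by name while typically returning an empty name on musl-based images, so an empty value is labelled "unknown".

```python
import platform
import sys


def stage_fingerprint():
    """Collect the properties that compiled Python packages depend on."""
    libc_name, _libc_version = platform.libc_ver()
    return {
        "python": "%d.%d" % sys.version_info[:2],
        "libc": libc_name or "unknown",  # empty name: probably not glibc
        "machine": platform.machine(),
    }


def stages_compatible(builder, final):
    """Both stages must agree on Python minor version, libc and CPU."""
    return all(builder[key] == final[key] for key in ("python", "libc", "machine"))


if __name__ == "__main__":
    print(stage_fingerprint())
```

Running the script once in the build image and once in the final image, then comparing the two dictionaries, should flag a python:3 builder paired with an alpine final stage.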