How to make lightweight docker image for python app with pipenv

Question:

I can produce a working image for my Python app with the following simple Dockerfile:

FROM python:3.7
WORKDIR /myapp
COPY Pipfile* ./
RUN pip install pipenv
RUN pipenv install --system --deploy
COPY src .
CMD ["python3", "app.py"]

However, it produces a ~1 GB image, which may contain temporary files and is heavy to deploy. I only need the full Python image for building purposes. My app runs successfully on the Alpine variant, so I can write a two-stage Dockerfile:

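FROM python:3.7 as builder
COPY Pipfile* ./
# pipenv is not preinstalled in the python image
RUN pip install pipenv
# Newer pipenv releases replace this subcommand with `pipenv requirements`
RUN pipenv lock --requirements > requirements.txt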
RUN python3 -m venv /venv
RUN /venv/bin/pip install --upgrade pip
RUN /venv/bin/pip install -r requirements.txt

FROM python:3.7-alpine
COPY --from=builder /venv /venv
WORKDIR /myapp
COPY src .
CMD ["/venv/bin/python3", "app.py"]

So far so good: it also works and is 6 times smaller. But this scheme was considered a kind of “stub” with some drawbacks:

  • It has an unnecessary extra COPY --from=builder step
  • It does not fully utilize pipenv but also needs pip for installing (+1 extra step; pipenv lock + pip install is always slower than just pipenv install)
  • It does not install system-wide but into /venv, which is to be avoided inside a container
  • Minor: the build pollutes the intermediate-image cache more and requires downloading both image variants.

How can I combine these two approaches to get a lightweight Alpine-based image with pipenv that lacks the mentioned drawbacks?

Or can you offer your production Dockerfile ideas?

Asked By: xakepp35

||

Answers:

How about,

FROM python:3.7-alpine

WORKDIR /myapp

COPY Pipfile* ./

RUN pip install --no-cache-dir pipenv && \
    pipenv install --system --deploy --clear

COPY src .
CMD ["python3", "app.py"]
  1. It utilises the smaller Alpine version.
  2. You won’t have any unnecessary cache files left over, thanks to the --no-cache-dir option for pip and the --clear option for pipenv.
  3. You also deploy outside of venv.

You can also add && pip uninstall pipenv -y after pipenv install --system --deploy --clear in the same RUN command to eliminate space taken by pipenv if that extra image size bothers you.
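
A sketch of that variant, assuming the same Pipfile, Pipfile.lock and src layout as in the question. Everything happens in one RUN so that pipenv never persists in a layer:

```dockerfile
FROM python:3.7-alpine

WORKDIR /myapp

COPY Pipfile* ./

# Install pipenv, install the locked dependencies system-wide,
# then remove pipenv again before the layer is committed.
RUN pip install --no-cache-dir pipenv && \
    pipenv install --system --deploy --clear && \
    pip uninstall pipenv -y

COPY src .
CMD ["python3", "app.py"]
```

Keep in mind that pip uninstall removes only pipenv itself, not the packages pipenv pulled in as its own dependencies, so the saving may be smaller than hoped.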

Answered By: Jay

The problem comes when you need packages such as ciso8601, or other libraries that require a build step. Build tools are not included in either the slim or the alpine variant, to keep their footprint small.

So to install deps, you’ll have to:

  • Install build tools
  • Deploy dependencies from Pipfile.lock system-wide
  • Uninstall build tools and clean caches

Do all three actions inside a single RUN layer, like the following:

FROM python:3.7-slim

WORKDIR /app

# both files are explicitly required!
COPY Pipfile Pipfile.lock ./

RUN pip install pipenv && \
  apt-get update && \
  apt-get install -y --no-install-recommends gcc python3-dev libssl-dev && \
  pipenv install --deploy --system && \
  apt-get remove -y gcc python3-dev libssl-dev && \
  apt-get autoremove -y && \
  pip uninstall pipenv -y

COPY app ./

CMD ["python", "app.py"]
  • Installing the build tools takes around 300 MiB and some extra build time
  • Uninstalling pipenv saves another 20 MiB (about 10% of the resulting size).
  • Splitting this into separate RUN commands would not remove the data from earlier layers and would result in a ~500 MiB image. That is how Docker layers work.
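
For contrast, here is a sketch of the split layout that the last point warns about, using the same packages as above. It is shown only to illustrate which layer holds what; the sizes were not measured here.

```dockerfile
FROM python:3.7-slim

WORKDIR /app
COPY Pipfile Pipfile.lock ./

# Anti-pattern: every RUN commits its own layer.
# Layer 1 stores the compiler toolchain.
RUN apt-get update && \
  apt-get install -y --no-install-recommends gcc python3-dev libssl-dev

# Layer 2 stores pipenv and the installed dependencies.
RUN pip install pipenv && pipenv install --deploy --system

# Layer 3 only records that the files are gone; the bytes written
# by layers 1 and 2 still ship with the image.
RUN apt-get remove -y gcc python3-dev libssl-dev && \
  apt-get autoremove -y && \
  pip uninstall pipenv -y

COPY app ./
CMD ["python", "app.py"]
```

Running docker history on the resulting image should show the per-layer sizes and make the difference from the single-RUN version visible.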

That results in a perfectly working image of ~200 MiB, which:

  • is 5 times smaller than the original python:3.7 (which is >1.0 GiB)
  • has no Alpine incompatibilities (these are typically tied to the glibc replacement)

At the time, we’re fine with the slim (Debian buster) variants, preferring slim over alpine for better compatibility. If you are really after further size optimization, I’d recommend taking a look at some excellent builds by these guys:

Answered By: xakepp35

I am using micropipenv for the job, which describes itself as

A lightweight wrapper for pip to support requirements.txt, Pipenv and Poetry lock files or converting them to pip-tools compatible output. Designed for containerized Python applications but not limited to them.

An image created from it would look like the following.
Since the alpine base image lacks a toml parser we have to use the version of micropipenv that includes the toml extras (micropipenv[toml] instead of micropipenv).

FROM python:3.9-alpine

WORKDIR /myapp
COPY Pipfile Pipfile.lock ./

# Install dependencies, then remove micropipenv again in the same layer
RUN pip install --no-cache-dir micropipenv[toml] \
  && micropipenv install --deploy \
  && pip uninstall -y micropipenv[toml]

COPY src .
CMD ["python3", "app.py"]
Answered By: ofhouse

It has unnecessary extra COPY --from=builder step

That directive is harmless and actually makes your final image even more lightweight: only the virtualenv is copied, with no build toolchain, no cached wheels, and not even pipenv in the final stage!

It does not utilizes pipenv but needs also pip for installing (+1 extra step, pipenv lock+pip install is always slower than just pipenv install)

Generate the virtualenv with pipenv in the building stage!

FROM python:3 as builder
# pipenv is not preinstalled in the official python images
RUN pip install pipenv
COPY Pipfile* /
# The presence of a .venv folder triggers pipenv to use it by default
RUN mkdir /.venv
RUN pipenv install --deploy

FROM python:3-slim
COPY --from=builder /.venv /.venv
WORKDIR /myapp
COPY src .
CMD ["/.venv/bin/python3", "app.py"]

It does not install system-wide, but into /venv, which is to be avoided inside a container

While not using venvs inside Docker is a common practice, they still have some benefits and absolutely zero drawbacks. Stop listening to people who say venvs should not be used inside containers. Pipenv's current recommendation is not to do system-wide installs in containers: https://github.com/pypa/pipenv/pull/2762
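
One way to make a venv nearly invisible at runtime is to put its bin directory first on PATH. This is a sketch under that assumption, not something pipenv requires; the ENV line is the only addition compared with the two-stage layout in this answer.

```dockerfile
FROM python:3 AS builder
RUN pip install --no-cache-dir pipenv
COPY Pipfile* /
# An existing .venv folder next to the Pipfile makes pipenv install into it
RUN mkdir /.venv && pipenv install --deploy

FROM python:3-slim
# Same path in both stages, so the scripts inside the venv keep working
COPY --from=builder /.venv /.venv
# Assumption: prepending the venv to PATH stands in for "activating" it,
# so plain python3 below resolves to /.venv/bin/python3
ENV PATH="/.venv/bin:$PATH"
WORKDIR /myapp
COPY src .
CMD ["python3", "app.py"]
```

With this, the CMD and any docker exec session behave as if the packages were installed system-wide, while the system site-packages stay untouched.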

Minor: Build pollutes intermediate-images cache more, and requires downloading both image variants..

Simply optimize your caches setup in the CI system.

BUT

For the love of God use the same platform for the building and final stages.

Either

  • use python:3-alpine for both stages, bloating the building stage with as many apk packages as you need
  • settle for python:3 for building and python:3-slim for the final stage. It is not that big

Alpine images use musl instead of glibc, which means a different ABI for Python packages (https://peps.python.org/pep-0656/). Do not mix Alpine images with non-Alpine images, just as you would not mix a python:3.A image with a different python:3.B.

Otherwise some of the components installed in the building stage will be unusable by the final stage.
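
A sketch of the first option, with Alpine in both stages. The build-base package is an assumed starting point (it provides gcc, make and the libc headers); the actual list of apk packages depends on what your dependencies compile against.

```dockerfile
FROM python:3-alpine AS builder
# Hypothetical package choice: add further -dev packages as your
# dependencies require them
RUN apk add --no-cache build-base
RUN pip install --no-cache-dir pipenv
COPY Pipfile* /
# An existing .venv folder next to the Pipfile makes pipenv install into it
RUN mkdir /.venv && pipenv install --deploy

# Same base image and tag as the builder, so the musl ABI and the
# Python version match
FROM python:3-alpine
COPY --from=builder /.venv /.venv
WORKDIR /myapp
COPY src .
CMD ["/.venv/bin/python3", "app.py"]
```

If a compiled extension links against a shared library at runtime, the final stage would presumably also need the matching runtime package installed with apk add --no-cache; the toolchain itself can stay behind in the builder.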

Answered By: N1ngu