Large size of Python image in Docker

Question:

I want to test my app with Docker, so I have this in my Dockerfile:

FROM python:3-onbuild
CMD [ "python", "./test.py" ]

test.py:

print(123)

Then I run:

docker build -t my_test_app .

The result is one big image. docker images returns:

REPOSITORY          TAG                 IMAGE ID        CREATED    VIRTUAL SIZE
python              3-onbuild           b258eb0a5195    8 days ago 757 MB

Why is the file size so large?

Is that file size normal?

Asked By: Leon


Answers:

I just checked on my machine: the standard ubuntu:trusty image is 188 MB, and the image with all the Python stuff is 480 MB. I see 800 MB images quite often; those are usually ones that contain some meaningful application.

However, these images are based on our private images; the official Docker library image seems much larger for some reason. The maintainers are aware of this and are trying to reduce it. Look at the discussion on the subject here

If you need a slightly smaller image, try ‘rouge8/pythons’; it is about 100 MB smaller.

rouge8/pythons latest … 680.3 MB

Keep in mind that Docker images are arranged as a hierarchy of layers. If you reuse the same underlying base image for many containers, the size that each individual container adds is quite small: only the difference between the base and whatever you added to that specific container.
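To make the layer-sharing point concrete, here is a minimal Python sketch. Only the 188 MB base figure comes from this answer; the layer names, the 292 MB "Python stack" layer (480 minus 188, assuming the 480 MB image sits on that base) and the two small application layers are hypothetical.

```python
def total_disk_usage(images):
    """Sum each distinct layer once, however many images reference it."""
    unique_layers = {}
    for layers in images.values():
        for layer_id, size_mb in layers:
            unique_layers[layer_id] = size_mb
    return sum(unique_layers.values())


# Hypothetical layers: (layer id, size in MB).
base = ("ubuntu-trusty-base", 188)
python_stack = ("python-stack", 292)  # assumed: 480 MB image minus 188 MB base

images = {
    "app_a": [base, python_stack, ("app-a-code", 5)],
    "app_b": [base, python_stack, ("app-b-code", 8)],
}

# What the per-image sizes add up to if every image is counted in full.
naive = sum(size for layers in images.values() for _, size in layers)
# What is actually stored when shared layers are kept only once.
shared = total_disk_usage(images)

print(naive, shared)
```

Under these assumed numbers the two images list as 485 MB and 488 MB, but together they occupy 493 MB rather than 973 MB, because the base and Python layers are stored once.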

Answered By: Vlad

Yes, that’s a normal size. The image contains an operating system image and various packages, which is why it is so large.

Answered By: barunsthakur

They add various system packages for things like database clients, image file manipulation and XML parsing libraries. This is so a user has no extra work to do if they want to install Python packages such as psycopg2, MySQLdb, Pillow or lxml. Adding those extra packages means the image will be fatter, though, which is a waste of space if you don’t need them.

They also don’t attempt to trim stuff out of the Python installation which isn’t really required, such as all the standard library test code directories. Even the .pyc files can be trimmed to save on space without any real impact as a web application generally loads up once for the life of the container, so having .pyc files doesn’t really benefit you much.
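As a rough way to see how much such trimming could save, the sketch below walks a directory tree and totals the bytes held in .pyc files and in directories named test or tests. The function name and the choice of directory names are assumptions made for illustration; this is not how any particular image is built.

```python
import os


def reclaimable_bytes(root):
    """Return (pyc_bytes, test_dir_bytes) found under root.

    Files inside a 'test' or 'tests' directory are counted as test code;
    .pyc files elsewhere are counted as compiled bytecode.
    """
    pyc_bytes = 0
    test_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        parts = os.path.relpath(dirpath, root).split(os.sep)
        in_test_dir = any(part in ("test", "tests") for part in parts)
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # skip symlinks so nothing is counted twice
            size = os.path.getsize(path)
            if in_test_dir:
                test_bytes += size
            elif name.endswith(".pyc"):
                pyc_bytes += size
    return pyc_bytes, test_bytes
```

To inspect the running interpreter's standard library, one could call `reclaimable_bytes(sysconfig.get_paths()["stdlib"])` after `import sysconfig`.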

As a comparison, have a look at the ‘python:X.Y-slim’ variants and the size of those. There isn’t an onbuild variant for the slim images, though.

You could also look at my own Docker images for Python with bundled Apache/mod_wsgi support. These are trimmed and rely on additional packages being installed by the user as build hooks only if required. For those, the size of the Python 3.4 onbuild image specifically for a WSGI application is:

grahamdumpleton/mod-wsgi-docker python-3.4-onbuild ... 409.9 MB

The size given even includes Apache and mod_wsgi, giving you a proper production grade WSGI server with capabilities to handle static file content and much more.

If not running a WSGI application, start with the base image instead.

You can find the mod_wsgi docker images at:

Various blog posts about how to use these images for WSGI applications and constructing Docker images for Python and WSGI applications can be found linked from the image description on Docker hub. Also keep an eye on my blog site in general as I will be posting more about Docker and Python as time goes by.

Answered By: Graham Dumpleton

Alpine Linux is a very lean distro available for Docker. Without Python, it’s around 5 MB. With Python, I’m getting images between 60 and 120 MB. The following Dockerfile yields a 110 MB image.

FROM alpine:3.4

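# Install build tools, build what pip needs, then remove the build tools
# again in the same layer so they do not add to the image size.
RUN apk --update add \
      build-base python-dev \
      ca-certificates python \
      ttf-droid \
      py-pip \
      py-jinja2 \
      py-twisted \
      py-dateutil \
      py-tz \
      py-requests \
      py-pillow \
      py-rrd && \
    pip install --upgrade arrow \
                          pymongo \
                          websocket-client \
                          XlsxWriter && \
    apk del build-base python-dev && \
    rm -rf /var/cache/apk/* && \
    adduser -D -u 1001 noroot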

USER noroot

CMD ["/bin/sh"]

Also, it’s very well maintained.


A word of warning, though. Alpine uses musl libc instead of glibc, and some Python modules rely on glibc, but this usually isn’t a problem.
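If you are unsure which C library a given container uses, a quick check from inside Python can help. This is a minimal sketch using the standard library's `platform.libc_ver()`; treating every non-glibc answer as "possibly musl" is an assumption, since the function does not identify musl by name.

```python
import platform


def libc_family():
    """Return 'glibc' when the interpreter reports glibc, else a hedged guess."""
    name, _version = platform.libc_ver()
    if name == "glibc":
        return "glibc"
    # An empty or unknown name is what one would expect on Alpine, but it
    # is not proof of musl.
    return "other (possibly musl)"


print(libc_family())
```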

A bigger issue is that, because of this, manylinux wheels are not available for Alpine, so the modules need to be compiled upon installation (pip install). In some cases this can make a difference in build time between 20 seconds on Debian and 9 minutes or more on Alpine. The grpcio module is notorious for that; it takes forever to compile.

There is a (somewhat unreliable) workaround where you tell Python that it is manylinux compatible.
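One mechanism that matches this description is the `_manylinux` override module described in PEP 513: if an importable module named `_manylinux` defines `manylinux1_compatible = True`, pip treats the platform as manylinux1 compatible. Whether this is the exact workaround the answer had in mind is an assumption, and the helper name below is hypothetical. A minimal sketch:

```python
import os


def write_manylinux_override(site_packages_dir):
    """Create a _manylinux.py that claims manylinux1 compatibility (PEP 513).

    Wheels installed this way were built against glibc, so on a musl-based
    system they may fail at import time. That is why the workaround is
    unreliable.
    """
    path = os.path.join(site_packages_dir, "_manylinux.py")
    with open(path, "w") as f:
        f.write("manylinux1_compatible = True\n")
    return path
```

In a Dockerfile this would run before `pip install`, pointed at the interpreter's site-packages directory, for example `sysconfig.get_paths()["purelib"]`.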

Answered By: Daniel F

You can try the python:{version}-alpine version. It’s much smaller:

>> docker image ls |grep python
python    3.6-alpine     89.4 MB
python    3.6            689 MB
python    3.5            689 MB
python    3.5.2          687 MB
python    3.4            833 MB
python    2.7            676 MB
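To put numbers on the difference, the short sketch below parses a few rows of the listing above and compares the 3.6 and 3.6-alpine tags. The parsing assumes the simple four-column layout shown here with sizes in MB, not the full output format of the docker CLI.

```python
def parse_sizes(listing):
    """Map 'repo:tag' to size in MB for lines like 'python 3.6 689 MB'."""
    sizes = {}
    for line in listing.strip().splitlines():
        repo, tag, size, unit = line.split()
        if unit != "MB":
            raise ValueError(f"unexpected unit in line: {line!r}")
        sizes[f"{repo}:{tag}"] = float(size)
    return sizes


listing = """
python    3.6-alpine     89.4 MB
python    3.6            689 MB
python    3.4            833 MB
python    2.7            676 MB
"""

sizes = parse_sizes(listing)
saved = sizes["python:3.6"] - sizes["python:3.6-alpine"]
ratio = sizes["python:3.6"] / sizes["python:3.6-alpine"]
print(f"{saved:.1f} MB saved, {ratio:.1f}x smaller")
```

For these figures the Alpine variant is roughly 600 MB, or a factor of about 7.7, smaller than the default 3.6 image.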

At the time of writing, it looks like the official image supports -alpine on all Python versions.

https://hub.docker.com/_/python/

Answered By: toast38coza

Yes. It is normal.

I see there is an answer mentioning Alpine. Alpine Linux is quite minimal, and it is quite common to see people suggest Alpine for production.

However, I recently read this article: https://pythonspeed.com/articles/base-image-python-docker-images/, and it suggests using slim instead of Alpine.

To use Python slim, simply put this at the top of your Dockerfile:

FROM python:3.8-slim-buster

With this base, my image containing fastapi + sqlalchemy + pipenv is relatively small compared to your original Python image.

[image: size of the resulting Docker image]

Answered By: goFrendiAsgard