What's the difference between Docker and Python virtualenv?

Question:

From what I understand about Docker, it’s a tool used for virtual environments. In their lingo, it’s called “containerization”. This is more or less what Python’s virtualenv does. However, you can use virtualenv in Docker. So, is it a virtual environment inside a virtual environment? I’m confused as to how this would even work, so could someone please clarify?

Asked By: danielschnoll


Answers:

A Python virtual environment “containerizes” only the Python runtime, i.e. the Python interpreter and Python libraries, whereas Docker isolates the whole system (the whole file system, all user-space libraries, network interfaces). Docker is therefore much closer to a virtual machine than to a virtual environment.
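To make the "only the Python runtime" point concrete, here is a minimal sketch that reports what a virtual environment changes and what it leaves alone. It assumes the standard library `venv` module or a recent virtualenv, both of which set `sys.base_prefix`; the helper names `in_virtualenv` and `describe_isolation` are made up for illustration.

```python
import platform
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a virtual environment."""
    # Inside an environment, sys.prefix points at the environment directory
    # while sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def describe_isolation() -> dict:
    """Collect the Python-level and OS-level facts visible to this process."""
    return {
        "in_virtualenv": in_virtualenv(),
        "python_prefix": sys.prefix,
        "base_prefix": sys.base_prefix,
        # These come from the operating system the process runs on;
        # activating a virtual environment does not change them.
        "os": platform.system(),
        "kernel_release": platform.release(),
    }


if __name__ == "__main__":
    for key, value in describe_isolation().items():
        print(f"{key}: {value}")
```

Run it with and without an activated environment: the two prefix values should differ only in the first case, while the OS fields stay the same. Inside a Docker container the OS-level view (file system, installed libraries) is the container's, not the host's.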

Answered By: jil

A virtualenv only encapsulates Python dependencies. A Docker container encapsulates an entire OS user space (everything except the kernel, which it shares with the host).

With a Python virtualenv, you can easily switch between Python versions and dependencies, but you’re stuck with your host OS.

With a Docker image, you can swap out the entire OS – install and run Python on Ubuntu, Debian, Alpine, even Windows Server Core.

There are Docker images out there with every combination of OS and Python versions you can think of, ready to pull down and use on any system with Docker installed.

Answered By: sp0gg

Adding to the above: there is a case for combining Docker and venv. Some OSs ship with Python installed to provide ‘OS-near’ apps, e.g., to my knowledge, apt on Debian (and its derivatives). A Python venv lets a developer ship a Python app that requires a different interpreter version without affecting the Python shipped with the OS. Now, since Docker ‘isolates the whole OS’ as stated above, the same applies to a Docker image. Hence, in my view, if a Docker image is required/desired, it is best practice to create a venv inside the Docker image for your Python app.

Answered By: Blindfreddy

"a virtual environment, a self-contained directory tree that contains a Python installation for a particular version of Python, plus a number of additional packages"

A Docker container provides a higher level of abstraction/isolation: it can have its own "process space, file system, network space, ipc space, etc."
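The "self-contained directory tree" part can be seen directly with the standard library `venv` module. The sketch below creates a throwaway environment in a temporary directory and lists what ends up in it; the directory name `demo-venv` and the helper names are illustrative assumptions, and `with_pip=False` is used only to keep the example fast and offline.

```python
import tempfile
import venv
from pathlib import Path


def create_demo_venv(parent: Path) -> Path:
    """Create a bare virtual environment under `parent` and return its path."""
    env_dir = parent / "demo-venv"  # hypothetical name, for illustration only
    # with_pip=False skips bootstrapping pip, so no network access is needed.
    venv.create(env_dir, with_pip=False)
    return env_dir


def top_level_entries(env_dir: Path) -> list[str]:
    """Return the names of the files and directories at the top of the venv."""
    return sorted(p.name for p in env_dir.iterdir())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        env_dir = create_demo_venv(Path(tmp))
        print(top_level_entries(env_dir))
        # pyvenv.cfg records, among other things, where the base Python lives.
        print((env_dir / "pyvenv.cfg").read_text())
```

Everything the environment adds lives in that one directory; nothing here creates a separate process space, file system or network stack, which is the extra isolation a container provides.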

Answered By: Michael.Sun

A virtual environment is the integration of a collection of dependencies that ensures one or more applications can work together seamlessly. It provides a set of runtime guarantees for the applications of interest. Virtual environments isolate the given set of dependencies from the system’s applications, allowing users and developers to have as many application contexts as desired.

In particular, Python virtual environments are designed to isolate a particular set of dependencies tied to a particular version of Python from the system Python. In this way, a user can have multiple system Python versions, each of which can have a corresponding set of virtual environments, each with independent dependencies. Because Python virtual environments only apply to Python, any non-Python application installed in the system will be seen by all Python virtual environments in exactly the same way. In concrete terms, one can have Python3.7 to Python3.11 installed as system Pythons while at the same time having four Python3.10 virtual environments (venv1 to venv4), each with a different version of the requests library. Applications within venv2 may only work with the particular version of requests in that virtual environment and no other.

On the other hand, Anaconda virtual environments extend the set of dependencies beyond Python to include virtually any application. This means that they can include system (non-Python) applications together with the complete set of underlying libraries, completely independent of system applications. For example, an Anaconda virtual environment on, say, Apple Silicon with the HDF5 library will have the complete build of HDF5 for Apple Silicon only within the virtual environment. Any application that is run outside this Anaconda virtual environment will be completely ignorant of the existence of HDF5.

Docker, and containers in general, are a completely different technology, native to Linux, in which applications are isolated from one another by kernel namespaces. Each container provides a virtual process space only accessible by applications in that container, which are externally managed by the container runtime (e.g. Docker, containerd, Podman etc.). Therefore, it is possible to have containers with Anaconda virtual environments. The container runtime can also be ported to non-Linux OSes, but a Linux kernel is still needed to provide the underlying process namespaces. This is why Docker works on Windows and macOS even though neither natively supports Linux process namespaces.
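On Linux, the namespaces a process belongs to are visible as symbolic links under `/proc/self/ns`. The following is a small sketch, assuming that path is available; on systems without it (macOS, Windows) it simply returns an empty mapping. The function name `current_namespaces` is made up for illustration.

```python
import os
from pathlib import Path

NS_DIR = Path("/proc/self/ns")  # Linux-only view of this process's namespaces


def current_namespaces() -> dict[str, str]:
    """Map each namespace type (e.g. 'pid', 'net', 'mnt') to its identifier.

    Returns an empty dict when /proc/self/ns does not exist.
    """
    if not NS_DIR.is_dir():
        return {}
    namespaces = {}
    for entry in sorted(NS_DIR.iterdir()):
        try:
            # Each entry is a symlink whose target names the namespace.
            namespaces[entry.name] = os.readlink(entry)
        except OSError:
            # Skip entries that cannot be read with current permissions.
            continue
    return namespaces


if __name__ == "__main__":
    for kind, ident in current_namespaces().items():
        print(f"{kind}: {ident}")
```

Running this once on the host and once inside a container should show different identifiers for whichever namespaces the container runtime has set up separately. Running it with and without an activated Python virtual environment on the same machine should show no difference, since a virtual environment does not touch namespaces.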

In summary, there are two main types of virtualisation:

  • application dependency virtualisation (Python and Anaconda virtual environments), which can be thought of as ‘static’ virtualisation, and
  • process virtualisation through containers, which can be thought of as ‘dynamic’ virtualisation.

But, I could be wrong…

Answered By: polarise