Specific reasons to favor pip vs. conda when installing Python packages

Question:

I use Miniconda as my default Python installation. What is the current (2019) wisdom regarding when to install something with conda vs. pip?

My usual behavior is to install everything with pip, and only use conda if a package is not available through pip or the pip version doesn’t work correctly.

Are there advantages to always favoring conda install? Are there issues associated with mixing the two installers? What factors should I be considering?


OBJECTIVITY: This is not an opinion-based question! When I have the option to install a Python package with either pip or conda, how do I make an informed decision? I am not asking “tell me which is better,” but rather “why would I use one over the other, and will oscillating back and forth cause problems or inefficiencies?”

Asked By: Dustin Michels


Answers:

I find I use conda first simply because it installs a prebuilt binary, then try pip if the package isn’t there. For instance, psycopg2 is far easier to install with conda than with pip.

https://jakevdp.github.io/blog/2016/08/25/conda-myths-and-misconceptions/

Pip, which stands for Pip Installs Packages, is Python’s officially-sanctioned package manager, and is most commonly used to install packages published on the Python Package Index (PyPI). Both pip and PyPI are governed and supported by the Python Packaging Authority (PyPA).

In short, pip is a general-purpose manager for Python packages; conda is a language-agnostic cross-platform environment manager. For the user, the most salient distinction is probably this: pip installs python packages within any environment; conda installs any package within conda environments. If all you are doing is installing Python packages within an isolated environment, conda and pip+virtualenv are mostly interchangeable, modulo some difference in dependency handling and package availability. By isolated environment I mean a conda-env or virtualenv, in which you can install packages without modifying your system Python installation.

If we focus on just installation of Python packages, conda and pip serve different audiences and different purposes. If you want to, say, manage Python packages within an existing system Python installation, conda can’t help you: by design, it can only install packages within conda environments. If you want to, say, work with the many Python packages which rely on external dependencies (NumPy, SciPy, and Matplotlib are common examples), while tracking those dependencies in a meaningful way, pip can’t help you: by design, it manages Python packages and only Python packages.

Conda and pip are not competitors, but rather tools focused on different groups of users and patterns of use.
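To make the "isolated environment" distinction above concrete, here is a minimal sketch that guesses which kind of environment the running interpreter lives in. It relies on two heuristics: conda keeps its package metadata in a conda-meta directory under the environment prefix, and venv-style environments have sys.prefix differing from sys.base_prefix. The function name describe_environment is made up for illustration, and very old virtualenv releases that set sys.real_prefix instead are not handled.

```python
import os
import sys


def describe_environment(prefix=None, base_prefix=None):
    """Classify an interpreter prefix as 'conda', 'venv', or 'system'.

    Heuristic only: the arguments default to the running interpreter's
    values and exist mainly so the function can be tried on other paths.
    """
    if prefix is None:
        prefix = sys.prefix
    if base_prefix is None:
        base_prefix = getattr(sys, "base_prefix", sys.prefix)

    # conda environments carry a conda-meta directory with package records.
    if os.path.isdir(os.path.join(prefix, "conda-meta")):
        return "conda"
    # venv (and recent virtualenv) point sys.prefix at the environment
    # while sys.base_prefix still points at the underlying installation.
    if prefix != base_prefix:
        return "venv"
    return "system"


if __name__ == "__main__":
    print(describe_environment())
```

If this reports "system", pip is the only one of the two tools that can install into that interpreter, since conda only installs into conda environments.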

Answered By: eatmeimadanish

This is what I do:

  1. Activate your conda virtual env
  2. Use pip to install into your virtual env
  3. If you face any compatibility issues, use conda

I recently ran into this when numpy / matplotlib freaked out and I used the conda build to resolve the issue.

Answered By: Scott Skiles

Concur with eatmeimadanish. Conda first, then pip makes the most sense given your *conda starting point.

The TL;DR Backstory

Anaconda (the distribution) and Conda (the package manager) were designed to solve installation and integration problems that the status quo did not.

The status quo here covers enormous ground: whatever combination of Python binaries (either OS-provided or downloaded from Python.org), system-level package installers (e.g. apt-get, yum, Homebrew), Python-focused package installers (e.g. easy_install and pip), and setup frameworks (e.g. setuptools and distutils) you might happen to be using. And this status quo has evolved mightily over the years, with some parts (e.g. easy_install, distutils) falling away, and new parts (e.g. wheels, twine) coming onboard. It’s not seen the vast and persistent flux of the JavaScript ecosystem, but Python packaging and installation has never really been “a solved problem,” and the preferred contestants for solving the problem(s) have varied greatly over time. You could argue that some or most of the problems that the native Python tools used to have are now basically solved. At least some are, but the *conda community would very much disagree that they’ve been eclipsed.

If you are starting from Anaconda or miniconda, I assume you enjoy their virtues (their arguably higher ease of installation, better integration, etc.). Otherwise you’d probably choose a more “stock” or vanilla Python distribution, or maybe a different “better than the base Python because X, Y, and Z” distribution (e.g. ActivePython, Enthought Canopy, …). Given that, I’d think you’d want to use conda first, falling back to pip, rather than the other way around.

You can of course try installing each package with pip and fall back to conda only if pip disappoints, but that seems to circumvent your primary choice of starting with and favoring the *conda tools and ecosystem.

Answered By: Jonathan Eunice

When working in the Anaconda ecosystem, you should always prefer conda over pip.

The docs specifically mention this: (emphasis mine)

If a package is not available from conda or Anaconda.org, you may be
able to find and install the package with another package manager like
pip.

Pip packages do not have all the features of conda packages and we
recommend first trying to install any package with conda. If the
package is unavailable through conda, try installing it with pip. The
differences between pip and conda packages cause certain unavoidable
limits in compatibility but conda works hard to be as compatible with
pip as possible.

Using conda packages can help your environment stay consistent, especially if you require managing a lot of dependencies within the same environment (or don’t have an environment but use the base environment directly).
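If you want to check how mixed an existing environment already is, one rough approach is to look at the output of conda list --json. The sketch below partitions that output by installer. It assumes that conda reports pip-installed packages with the channel value "pypi", which is what I have seen in recent conda versions but is worth verifying against your own output; the function name split_by_installer is hypothetical.

```python
import json


def split_by_installer(conda_list_json):
    """Split `conda list --json` text into (conda_names, pip_names)."""
    conda_pkgs, pip_pkgs = [], []
    for entry in json.loads(conda_list_json):
        # Assumption: packages installed with pip show up with channel "pypi".
        if entry.get("channel") == "pypi":
            pip_pkgs.append(entry["name"])
        else:
            conda_pkgs.append(entry["name"])
    return sorted(conda_pkgs), sorted(pip_pkgs)
```

You could feed it the text captured from running conda list --json inside the activated environment. A long pip list is a hint that recreating the environment with conda packages first may be worthwhile.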

Answered By: Paritosh Singh

Note: The following recommendations are now part of the official documentation.


"What is the current (2019) wisdom regarding when to install something with conda vs. pip?"

Anaconda Inc’s Jonathan Helmus sums this up quite nicely in the post "Using Pip in a Conda Environment." Here’s an excerpt from the final best practices recommendation:

Best Practices Checklist

Use pip only after conda

  • install as many requirements as possible with conda, then use pip
  • pip should be run with --upgrade-strategy "only-if-needed" (the default)
  • Do not use pip with the --user argument, avoid all “users” installs

Use Conda environments for isolation

  • create a Conda environment to isolate any changes pip makes
  • environments take up little space thanks to hard links
  • care should be taken to avoid running pip in the root [base] environment

Recreate the environment if changes are needed

  • once pip has been used conda will be unaware of the changes
  • to install additional Conda packages it is best to recreate the environment

Store conda and pip requirements in text files

  • package requirements can be passed to conda via the --file argument
  • pip accepts a list of Python packages with -r or --requirement
  • conda env will export or create environments based on a file with conda and pip requirements
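
As an illustration of the last item in the checklist, here is a minimal sketch that writes out the kind of combined file conda env works with: conda packages listed first, and pip packages nested under a pip: entry. The helper name render_environment_yml, the environment name, and the package some-pypi-only-package are all placeholders, not anything from the post.

```python
def render_environment_yml(name, conda_deps, pip_deps=()):
    """Return the text of an environment.yml with conda and pip sections."""
    lines = ["name: " + name, "dependencies:"]
    lines += ["  - " + dep for dep in conda_deps]
    if pip_deps:
        # List pip itself as a conda dependency so the pip: section
        # is installed by the environment's own pip.
        if "pip" not in conda_deps:
            lines.append("  - pip")
        lines.append("  - pip:")
        lines += ["    - " + dep for dep in pip_deps]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    text = render_environment_yml(
        "demo",                            # placeholder environment name
        ["python=3.7", "numpy=1.18.1"],    # installed by conda
        ["some-pypi-only-package==1.0"],   # hypothetical PyPI-only package
    )
    with open("environment.yml", "w") as handle:
        handle.write(text)
```

The resulting file can then be passed to conda env create -f environment.yml, which fits the "recreate the environment if changes are needed" advice above.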

Answered By: merv

As an add-on to @eatmeimadanish’s and @merv’s recommendation to “use conda first, then try pip”,
here is the corresponding code to run this from the command line of a Linux system:

while IFS= read -r requirement; do conda install --yes "$requirement" || pip install "$requirement"; done < requirements.txt

This assumes that all packages, with their desired version numbers, are listed in a file called “requirements.txt”. The entries look like this, for example:

matplotlib==2.0.0
numpy==1.18.1

Note that the version is pinned with a double equals sign (==), not a single one (=): pip only accepts ==, while conda accepts both forms.
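
If you would rather not depend on the shell, the same conda-first, pip-fallback loop can be sketched in Python. This is only a rough equivalent under a few assumptions: conda is on your PATH, the file contains simple pinned specifiers like the ones above (no URLs or pip options), and the function names read_requirements and install_conda_first are my own. The run parameter exists so the logic can be tried without actually installing anything.

```python
import subprocess
import sys


def read_requirements(text):
    """Return requirement specifiers, skipping blank lines and # comments."""
    specs = []
    for line in text.splitlines():
        # Simple specs only: everything after "#" is treated as a comment.
        line = line.split("#", 1)[0].strip()
        if line:
            specs.append(line)
    return specs


def install_conda_first(specs, run=subprocess.call):
    """Try conda for each spec and fall back to pip if conda fails.

    Returns a dict mapping each spec to 'conda', 'pip', or 'failed'.
    """
    results = {}
    for spec in specs:
        if run(["conda", "install", "--yes", spec]) == 0:
            results[spec] = "conda"
        # Use the current interpreter's pip so the package lands in
        # the same environment this script is running in.
        elif run([sys.executable, "-m", "pip", "install", spec]) == 0:
            results[spec] = "pip"
        else:
            results[spec] = "failed"
    return results
```

To use it, read requirements.txt, pass its contents through read_requirements, and hand the result to install_conda_first from inside the activated environment. The returned dict tells you which installer ended up handling each package.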

Answered By: Agile Bean