I use miniconda as my default python installation. What is the current (2019) wisdom regarding when to install something with conda vs. pip?
My usual behavior is to install everything with pip, and only use conda if a package is not available through pip or the pip version doesn’t work correctly.
Are there advantages to always favoring conda install? Are there issues associated with mixing the two installers? What factors should I be considering?
OBJECTIVITY: This is not an opinion-based question! My question is: when I have the option to install a python package with pip or conda, how do I make an informed decision? Not “tell me which is better”, but “Why would I use one over the other, and will oscillating back & forth cause problems / inefficiencies?”
I find I use conda first simply because it installs the binary, then try pip if the package isn’t there. For instance, psycopg2 is far easier to install with conda than with pip.
Pip, which stands for Pip Installs Packages, is Python’s officially-sanctioned package manager, and is most commonly used to install packages published on the Python Package Index (PyPI). Both pip and PyPI are governed and supported by the Python Packaging Authority (PyPA).
In short, pip is a general-purpose manager for Python packages; conda is a language-agnostic cross-platform environment manager. For the user, the most salient distinction is probably this: pip installs python packages within any environment; conda installs any package within conda environments. If all you are doing is installing Python packages within an isolated environment, conda and pip+virtualenv are mostly interchangeable, modulo some difference in dependency handling and package availability. By isolated environment I mean a conda-env or virtualenv, in which you can install packages without modifying your system Python installation.
If we focus on just installation of Python packages, conda and pip serve different audiences and different purposes. If you want to, say, manage Python packages within an existing system Python installation, conda can’t help you: by design, it can only install packages within conda environments. If you want to, say, work with the many Python packages which rely on external dependencies (NumPy, SciPy, and Matplotlib are common examples), while tracking those dependencies in a meaningful way, pip can’t help you: by design, it manages Python packages and only Python packages.
Conda and pip are not competitors, but rather tools focused on different groups of users and patterns of use.
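To make the "isolated environment" distinction concrete, here is a minimal sketch that guesses which kind of environment the running interpreter lives in. It assumes that a conda environment can be recognized by the conda-meta directory under its prefix, and that a venv (or a recent virtualenv) shows up as sys.prefix differing from sys.base_prefix. The function name describe_environment is made up for this illustration.

```python
import os
import sys


def describe_environment(prefix=sys.prefix, base_prefix=sys.base_prefix):
    """Classify an interpreter prefix as 'conda', 'virtualenv', or 'system'.

    Assumption: conda keeps its package records in <prefix>/conda-meta,
    and venv/virtualenv make sys.prefix differ from sys.base_prefix.
    """
    if os.path.isdir(os.path.join(prefix, "conda-meta")):
        return "conda"
    if prefix != base_prefix:
        return "virtualenv"
    return "system"


print(describe_environment())
```

If this reports "system", pip would be modifying the system Python installation, which is the case the isolated environments are meant to avoid.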
This is what I do:
I recently ran into this when numpy / matplotlib freaked out and I used the conda build to resolve the issue.
Concur with eatmeimadanish. Conda first, then pip makes the most sense given your *conda starting point.
The TL;DR Backstory
Anaconda (the distribution) and Conda (the package manager) were designed to solve installation and integration problems that the status quo did not.
If you are starting from Anaconda or miniconda, I assume you enjoy their virtues (their arguably higher ease of installation, better integration, etc.). Otherwise you’d probably choose a more “stock” or vanilla Python distribution, or maybe a different “better than the base Python because X, Y, and Z” distribution (e.g. ActivePython, Enthought Canopy, …). Given that, I’d think you’d want to use conda first, falling back to pip, rather than the other way around.
You can of course try installing each package with pip and fall back to conda only if pip disappoints, but that seems to circumvent your primary choice of starting with and favoring the *conda tools and ecosystem.
When using an anaconda ecosystem, you should always prefer conda before pip.
The docs specifically mention this: (emphasis mine)
If a package is not available from conda or Anaconda.org, you may be
able to find and install the package with another package manager like pip.
Pip packages do not have all the features of conda packages and we
recommend first trying to install any package with conda. If the
package is unavailable through conda, try installing it with pip. The
differences between pip and conda packages cause certain unavoidable
limits in compatibility but conda works hard to be as compatible with
pip as possible.
Using conda packages can help your environment stay consistent, especially if you require managing a lot of dependencies within the same environment (or don’t have an environment but use the base environment directly).
Note: The following recommendations are now part of the official documentation.
"What is the current (2019) wisdom regarding when to install something with conda vs. pip?"
Anaconda Inc’s Jonathan Helmus sums this up quite nicely in the post "Using Pip in a Conda Environment." Here’s an excerpt from the final best practices recommendation:
Best Practices Checklist
Use pip only after conda
- install as many requirements as possible with conda, then use pip
- pip should be run with --upgrade-strategy "only-if-needed" (the default)
- Do not use pip with the --user argument, avoid all “users” installs
Use Conda environments for isolation
- create a Conda environment to isolate any changes
- environments take up little space thanks to hard links
- care should be taken to avoid running pip in the root [base] environment
Recreate the environment if changes are needed
- once pip has been used, conda will be unaware of the changes
- to install additional Conda packages it is best to recreate the environment
Store conda and pip requirements in text files
- package requirements can be passed to conda via the --file argument
- pip accepts a list of Python packages with -r or --requirement
- conda env will export or create environments based on a file with conda and pip requirements
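A few of the checklist items can be expressed as a small guard around pip. The sketch below only builds the command; it does not run it. It assumes that an activated conda environment exposes its name in the CONDA_DEFAULT_ENV variable (with "base", or "root" in older releases, for the base environment). The helper name pip_install_command is hypothetical.

```python
import os
import sys


def pip_install_command(packages, environ=None):
    """Build (but do not run) a pip command that follows the checklist.

    Assumption: CONDA_DEFAULT_ENV holds the active conda environment name.
    """
    environ = os.environ if environ is None else environ
    if environ.get("CONDA_DEFAULT_ENV") in ("base", "root"):
        # Checklist: avoid running pip in the root [base] environment.
        raise RuntimeError("refusing to run pip in the conda base environment")
    # "python -m pip" targets the interpreter of the active environment.
    # No --user flag, and the upgrade strategy is spelled out explicitly.
    return [sys.executable, "-m", "pip", "install",
            "--upgrade-strategy", "only-if-needed", *packages]
```

The resulting list can be handed to subprocess.run once the conda packages for the environment are already in place.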
As an add-on to @eatmeimadanish’s and @merve’s recommendation “use conda first, then try pip”,
here is the corresponding code to run this from the command line of a Linux system:
while read -r requirement; do conda install --yes "$requirement" || pip install "$requirement"; done < requirements.txt
This assumes that all packages, each with its desired version number, are listed in a file called “requirements.txt”. The entries look like this, for example:
Note that the equal sign is double (==), not single (=).
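For anyone who prefers Python to a shell loop, here is a rough equivalent. It is a sketch, not a drop-in replacement: unlike the one-liner, it makes two passes (every requirement is offered to conda first, and pip only sees the leftovers), so pip runs after all conda installs. It assumes conda is on the PATH and that each line is a plain name==version entry. The names read_requirements and install_conda_first are made up for this example.

```python
import subprocess
import sys


def read_requirements(lines):
    """Yield requirement strings, skipping blank lines and '#' comments."""
    for line in lines:
        line = line.split("#", 1)[0].strip()
        if line:
            yield line


def install_conda_first(requirements, run=subprocess.call):
    """Offer every requirement to conda, then give the leftovers to pip.

    `run` takes an argument list and returns an exit code; it is a
    parameter so the logic can be exercised without installing anything.
    """
    results = {}
    leftovers = []
    for req in requirements:
        if run(["conda", "install", "--yes", req]) == 0:
            results[req] = "conda"
        else:
            leftovers.append(req)
    for req in leftovers:
        ok = run([sys.executable, "-m", "pip", "install", req]) == 0
        results[req] = "pip" if ok else "failed"
    return results


# Usage, assuming a requirements.txt in the current directory:
#     with open("requirements.txt") as fh:
#         print(install_conda_first(read_requirements(fh)))
```

The returned dictionary records which installer handled each requirement, which helps when the environment has to be recreated later.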