Equivalent of `package.json` and `package-lock.json` for `pip`
Question:
Package managers for JavaScript, like npm and yarn, use a `package.json` to specify ‘top-level’ dependencies, and create a lock file to keep track of the specific versions of all packages (i.e. top-level and sub-level dependencies) that are installed as a result. In addition, `package.json` allows us to distinguish between types of top-level dependencies, such as production and development.
For Python, on the other hand, we have `pip`. I suppose the `pip` equivalent of a lock file would be the result of `pip freeze > requirements.txt`.
However, if you maintain only this single `requirements.txt` file, it is difficult to distinguish between top-level and sub-level dependencies (you would need e.g. `pipdeptree -r` to figure those out). This can be a real pain if you want to remove or change top-level dependencies, as it is easy to be left with orphaned packages (as far as I know, `pip` does not remove sub-dependencies when you `pip uninstall` a package).
Now, I wonder: is there some convention for dealing with different types of these requirements files and distinguishing between top-level and sub-level dependencies with `pip`?
For example, I can imagine having a `requirements-prod.txt` that contains only the top-level requirements for the production environment, as the (simplified) equivalent of `package.json`, and a `requirements-prod.lock`, which contains the output of `pip freeze` and acts as my lock file. In addition, I could have a `requirements-dev.txt` for development dependencies, and so on.
I would like to know if this is the way to go, or if there is a better approach.
P.S. The same question could be asked for `conda`'s `environment.yml`.
Answers:
There are at least three good options available today:
- Poetry uses `pyproject.toml` and `poetry.lock` files, much in the same way that `package.json` and lock files work in the JavaScript world. This is now my preferred solution.
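As a rough sketch of that workflow (assuming Poetry ≥ 1.2 is installed; the package names are just examples), shown as comments rather than run here:

```shell
# Illustrative Poetry workflow; assumes Poetry >= 1.2, package names are examples.
#   poetry init                    # create pyproject.toml interactively
#   poetry add requests            # add a top-level dependency, updating poetry.lock
#   poetry add --group dev pytest  # add a development-only dependency
#   poetry install                 # install exactly what poetry.lock pins
```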
- Pipenv uses `Pipfile` and `Pipfile.lock`, also much like the JavaScript files you describe.
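The Pipenv equivalent, sketched the same way (assumes Pipenv is installed; package names are examples):

```shell
# Illustrative Pipenv workflow, shown as comments rather than run here.
#   pipenv install requests        # writes Pipfile and updates Pipfile.lock
#   pipenv install --dev pytest    # add a development-only dependency
#   pipenv sync                    # install exactly what Pipfile.lock pins
```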
Both Poetry and Pipenv do more than just dependency management. Out of the box, they also create and maintain virtual environments for your projects.
- pip-tools provides the `pip-compile` and `pip-sync` commands. Here, `requirements.in` lists your direct dependencies, often with loose version constraints, and `pip-compile` generates locked-down `requirements.txt` files from your `.in` files.
This used to be my preferred solution. It's backwards-compatible (the generated `requirements.txt` can be processed by `pip`), and the `pip-sync` tool ensures that the virtualenv exactly matches the locked versions, removing anything that isn't in your "lock" file.
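A minimal sketch of the pip-tools round trip (assumes pip-tools is installed; the dependency name is an example):

```shell
# Illustrative pip-tools workflow, shown as comments rather than run here.
#   echo 'requests>=2.28' > requirements.in  # direct deps with loose constraints
#   pip-compile requirements.in              # writes a fully pinned requirements.txt
#   pip-compile --upgrade requirements.in    # re-resolve to the newest allowed versions
#   pip-sync requirements.txt                # make the virtualenv match it exactly
```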
I had the same question and came up with a simpler, more generic solution: I use the well-known `requirements.txt` for all explicit dependencies, and a `requirements.lock` as a list of all packages, including sub-dependencies.
I personally like to manage `python`, `pip` and `setuptools` via the distribution's built-in package manager, and to install pip dependencies inside a virtual environment.
Usually you would start by installing all directly required dependencies; this will pull in all sub-dependencies as well. If you are not using a virtual environment, make sure to add the `--user` flag.
# If you already have a requirements file
pip3 install -r requirements.txt
# If you start from scratch
pip3 install <package>
If you want to upgrade your packages, you have multiple options here as well. Since I am using a virtual environment, I always update all packages. However, you are free to update only your direct requirements: if they need an update of their dependencies, those will be pulled in as well, and everything else will be left untouched.
# Update all outdated packages (excluding pip and setuptools themselves;
# note that --exclude takes one package per occurrence)
pip3 install -r <(pip3 list --outdated --format freeze --exclude pip --exclude setuptools | cut -d '=' -f1) --upgrade
# Update explicitly installed packages; update sub-dependencies only if required.
pip3 install -r <(cut -d '=' -f1 requirements.txt) --upgrade
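To see what the `<(cut ...)` process substitution actually feeds to pip, here is a self-contained demo on a fabricated requirements file (package names and versions are made up):

```shell
# Fabricated requirements.txt, for demonstration only.
printf 'requests==2.31.0\nclick==8.1.7\n' > /tmp/requirements-demo.txt

# cut keeps everything before the first '=', i.e. the bare package names,
# which pip then resolves to the newest versions it is allowed to install.
cut -d '=' -f1 /tmp/requirements-demo.txt
```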
Now we come to the tricky part: saving back our requirements file. Make sure the previous requirements file is checked into git, so that if anything goes wrong you have a backup.
Remember that we want to differentiate between explicitly installed packages (`requirements.txt`) and packages including all their dependencies (`requirements.lock`).
If you have not yet set up a `requirements.txt`, I suggest running the following command. Note that it will not include a package if it is already satisfied as a sub-dependency of another package; `requests`, for example, would be left out of the list if something else you installed already depends on it. You might still want to add such a package manually if your script explicitly relies on it.
pip3 list --not-required --format freeze --exclude pip --exclude setuptools > requirements.txt
If you already have a `requirements.txt`, you can update it using this sed trick. It leaves out all sub-dependencies, which we will only include in the `requirements.lock` in the next step. (Note that `sponge` is part of moreutils.)
pip3 freeze -r requirements.txt | sed -n '/## The following requirements were added by pip freeze:/q;p' | sponge requirements.txt
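To illustrate what the sed filter does, here is a self-contained demo on fabricated `pip freeze -r` output (package names are made up; `sponge` is omitted because the demo only prints):

```shell
# Fabricated `pip3 freeze -r requirements.txt` output: everything after the
# marker line is a sub-dependency that pip freeze appended.
freeze_output='requests==2.31.0
click==8.1.7
## The following requirements were added by pip freeze:
certifi==2023.7.22
idna==3.4'

# With -n, sed prints nothing by default; `q` quits at the marker line
# without printing it, and `p` prints every line seen before that.
printf '%s\n' "$freeze_output" \
  | sed -n '/## The following requirements were added by pip freeze:/q;p'
```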
Finally, we can output all dependencies to a `requirements.lock` file, which will be our complete list of all packages and their versions. If we need to reproduce an issue, we can always come back to this lock file and run our code with the previously working dependencies.
# It is important to use the -r option here, so pip will differentiate between directly required packages and their dependencies.
pip3 freeze -r requirements.txt > requirements.lock