Equivalent of `package.json` and `package-lock.json` for `pip`

Question:

Package managers for JavaScript like npm and yarn use a package.json to specify ‘top-level’ dependencies, and create a lock-file to keep track of the specific versions of all packages (i.e. top-level and sub-level dependencies) that are installed as a result.

In addition, the package.json allows us to make a distinction between types of top-level dependencies, such as production and development.

For Python, on the other hand, we have pip. I suppose the pip equivalent of a lock-file would be the result of pip freeze > requirements.txt.

However, if you maintain only this single requirements.txt file, it is difficult to distinguish between top-level and sub-level dependencies (you would need e.g. pipdeptree -r to figure those out). This can be a real pain if you want to remove or change top-level dependencies, as it is easy to be left with orphaned packages (as far as I know, pip does not remove sub-dependencies when you pip uninstall a package).

Now, I wonder: Is there some convention for dealing with different types of these requirements files and distinguishing between top-level and sub-level dependencies with pip?

For example, I can imagine having a requirements-prod.txt which contains only the top-level requirements for the production environment, as the (simplified) equivalent of package.json, and a requirements-prod.lock, which contains the output of pip freeze, and acts as my lock-file. In addition I could have a requirements-dev.txt for development dependencies, and so on and so forth.
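As a sketch of that scheme (the file names and the requests pin are illustrative, not an established pip convention):

```shell
# Hypothetical two-file workflow: a hand-maintained top-level list plus
# a frozen snapshot of the complete environment.
printf '# top-level deps, e.g.:\n# requests>=2.28\n' > requirements-prod.txt

pip3 install -r requirements-prod.txt    # install only what you declared
pip3 freeze > requirements-prod.lock     # pin everything, incl. sub-deps
```

Reinstalling from requirements-prod.lock would then reproduce the exact environment, much like npm ci does from package-lock.json.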

I would like to know if this is the way to go, or if there is a better approach.

p.s. The same question could be asked for conda's environment.yml.

Asked By: djvg


Answers:

There are at least three good options available today:

  1. Poetry uses pyproject.toml and poetry.lock files, much in the same way that package.json and lock files work in the JavaScript world.

    This is now my preferred solution.

  2. Pipenv uses Pipfile and Pipfile.lock, also much like you describe the JavaScript files.

Both Poetry and Pipenv do more than just dependency management. Out of the box, they also create and maintain virtual environments for your projects.
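To make the package.json parallel concrete: in a Poetry pyproject.toml, regular and development dependencies live in separate groups (a sketch; package names and versions are made up):

```toml
[tool.poetry.dependencies]
python = "^3.11"
requests = "^2.28"

[tool.poetry.group.dev.dependencies]
pytest = "^7.0"
```

poetry lock pins everything (sub-dependencies included) into poetry.lock, and poetry install --without dev skips the development group, much like npm install --omit=dev.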

  3. pip-tools provides pip-compile and pip-sync commands. Here, requirements.in lists your direct dependencies, often with loose version constraints, and pip-compile generates locked-down requirements.txt files from your .in files.

    This used to be my preferred solution. It’s backwards-compatible (the generated requirements.txt can be processed by pip) and the pip-sync tool ensures that the virtualenv exactly matches the locked versions, removing things that aren’t in your "lock" file.
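A minimal requirements.in for that workflow might look like this (package names and constraints are just examples):

```text
# requirements.in - direct dependencies only, loosely constrained
django>=4.2,<5
requests
```

pip-compile requirements.in then emits a requirements.txt in which every package, transitive ones included, is pinned to an exact version and annotated with the package that pulled it in.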

Answered By: Chris

I had the same question and came up with a simpler, more generic solution: I use the well-known requirements.txt for all explicit dependencies, and a requirements.lock as the list of all installed packages, including sub-dependencies.

I personally like to manage python, pip and setuptools via the distribution's built-in package manager, and to install pip dependencies inside a virtual environment.
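For instance, a per-project environment can be created like this (the directory name .venv is just a common choice, not a requirement):

```shell
# Create an isolated environment; once activated, pip3 installs into
# .venv/ instead of the system site-packages.
python3 -m venv .venv
. .venv/bin/activate
```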

Usually you would start by installing all directly required dependencies; this pulls in all sub-dependencies as well. If you are not using a virtual environment, make sure to add the --user flag.

# If you already have a requirements file
pip3 install -r requirements.txt

# If you start from scratch
pip3 install <package>

If you want to upgrade your packages, you have multiple options here as well. Since I am using a virtual environment, I always update all packages. However, you are free to update only your direct requirements: if those need newer versions of their dependencies, these will be pulled in as well, and everything else is left untouched.

# Update all outdated packages (excluding pip and setuptools themselves)
pip3 install -r <(pip3 list --outdated --format freeze --exclude pip --exclude setuptools | cut -d '=' -f1) --upgrade

# Update explicitly installed packages; update sub-dependencies only if required.
pip3 install -r <(cut -d '=' -f1 requirements.txt) --upgrade

Now we come to the tricky part: saving our requirements file back. Make sure the previous requirements file is checked into git, so you have a backup if anything goes wrong.

Remember that we want to differentiate between explicitly installed packages (requirements.txt) and all packages including their dependencies (requirements.lock).

If you have not yet set up a requirements.txt, I suggest running the following command. Note that it only lists packages that no other installed package depends on. For example, requests will not appear if another installed package already requires it; you might still want to add it manually if your script relies on it explicitly.

pip3 list --not-required --format freeze --exclude pip --exclude setuptools > requirements.txt

If you already have a requirements.txt, you can update it with this sed trick. It leaves out all sub-dependencies, which we only record in requirements.lock in the next step.

pip3 freeze -r requirements.txt | sed -n '/## The following requirements were added by pip freeze:/q;p' | sponge requirements.txt
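Note that sponge is not a standard utility; it ships with the moreutils package. Where it is unavailable, a temporary file does the same job (a sketch; redirecting straight back into requirements.txt would truncate the file while it is still being read):

```shell
: >> requirements.txt   # ensure the file exists (no-op if it already does)

pip3 freeze -r requirements.txt \
  | sed -n '/## The following requirements were added by pip freeze:/q;p' \
  > requirements.txt.tmp && mv requirements.txt.tmp requirements.txt
```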

Finally, we can write all dependencies to a requirements.lock file, our complete list of all packages and versions. If we later need to reproduce an issue, we can always come back to this lock file and run our code with the previously working dependencies.

# It is important to use the -r option here, so pip will differentiate between directly required packages and their dependencies.
pip3 freeze -r requirements.txt > requirements.lock
Answered By: NicoHood