How to have python libraries already installed in python project?

Question:

I am working on a python project that requires a few libraries. This project will further be shared with other people.

The problem I have is that I can’t use the usual pip install 'library' as part of my code because the project could be shared with offline computers and the work proxy could block the download.

So what I first thought of was shipping .whl files and running pip install 'my_file.whl', but this is limited since some .whl files work on some computers but not on others, so this can’t be the solution to my problem.

I tried sharing my project and ran into an error with a .whl file that worked on one computer but not on the other.

What I am looking for is to have all the libraries I need already bundled before sharing my project, so that when the project is shared, my peers can launch it without needing to download the libraries.

Is this possible, or is there something else that can solve my problem?

Asked By: clemdcz


Answers:

There are different approaches to the issue here, depending on what the constraints are:


1. Defined Online Dependencies

It is good practice to define the dependencies of your project (not only when it is shared). Python offers different methods for this.

In this scenario every developer has network access to a PyPI repository, usually the official mirrors (i.e. via the internet). New packages need to be pulled from there whenever the dependencies change;
repository (internet) access is only needed when pulling new packages.

Below are the most common ones:

1.1 requirements.txt

The requirements.txt is a plain text list of required packages and versions, e.g.

# requirements.txt
matplotlib==3.6.2
numpy==1.23.5
scipy==1.9.3

When you check this in along with your source code, users can freely decide how to install it. The simplest (and most conflict-prone) way is to install it into the base Python environment via

pip install -r requirements.txt

If you have lost track of your dependencies, you can even generate such a file automatically with pipreqs. The result is usually very good. However, a manual cleanup afterwards is recommended.
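For example, assuming pipreqs is installed (the project path below is a placeholder), the file can be generated or refreshed like this:

pip install pipreqs
pipreqs /path/to/project   # writes requirements.txt next to the sources

Alternatively, pip freeze > requirements.txt snapshots everything currently installed in the environment, which usually needs more manual cleanup afterwards.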

Benefits:

  • Package dependency is clear
  • Installation is a one line task

Downsides:

  • Possible conflicts with multiple projects
  • No guarantee that everyone has exactly the same versions if flexible version specifiers are allowed (the default)

1.2 Pipenv

There is a nice and almost complete answer about Pipenv, and the Pipenv documentation itself is very good.
In a nutshell: Pipenv gives you virtual environments, so version conflicts between different projects are gone for good. The Pipfile used to define such an environment also allows separating production and development dependencies.
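A minimal Pipfile could look like the following; the package names and versions are just examples:

[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[packages]
matplotlib = "==3.6.2"
numpy = "==1.23.5"

[dev-packages]
pytest = "*"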

Users now only need to run the following commands in the folder with the source code:

pip install pipenv # only needed first time
pipenv install

And then, to activate the virtual environment:

pipenv shell

Benefits:

  • Separation between projects
  • Separation of development/testing and production packages
  • Everyone uses the exact same version of the packages
  • Configuration is flexible but easy

Downsides:

  • Users need to activate the environment

1.3 conda environment

If you are using Anaconda, a conda environment definition can also be shared as a configuration file. See this SO answer for details.

This scenario is like the Pipenv one, but with conda as the package manager. It is recommended not to mix pip and conda.
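For reference, a minimal environment file (environment.yml; the environment name and packages below are just examples) could be:

name: myproject
channels:
  - defaults
dependencies:
  - python=3.10
  - matplotlib=3.6.2
  - numpy=1.23.5

Users then recreate and activate the environment with:

conda env create -f environment.yml
conda activate myproject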

1.4 setup.py

If you are implementing a library anyway, you should have a look at how to configure the dependencies via the setup.py file.
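A minimal sketch of such a setup.py (the project name and dependencies below are placeholders) could be:

from setuptools import setup, find_packages

setup(
    name="myproject",            # placeholder project name
    version="0.1.0",
    packages=find_packages(),
    install_requires=[           # runtime dependencies
        "matplotlib>=3.6",
        "numpy>=1.23",
    ],
)

Users can then install the project together with its dependencies by running pip install . in the project folder.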


2. Defined Local Dependencies

In this scenario the developers do not have access to the internet (e.g. they are "air-gapped" in a special network where they cannot communicate with the outside world). In this case all the scenarios from 1. can still be used, but now we need to set up our own mirror/proxy. There are good guides (and even complete off-the-shelf software such as devpi or pypiserver) out there, depending on which scenario from above you want to use.
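Once such a mirror or proxy is running, clients only need to point pip at it; the index URL below is a placeholder for your local server:

pip config set global.index-url http://pypi.internal.example/simple
pip install -r requirements.txt

After that, all the workflows from 1. work without internet access.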

Benefits:

  • Users don’t need internet access
  • Packages on the local proxy can be trusted (cannot be corrupted / deleted anymore)
  • The clean and flexible scenarios from above can be used for setup

Downsides:

  • Network connection to the proxy is still required
  • Maintenance of the proxy is extra effort

3. Turnkey Environments

Last, but not least, there are solutions to share the complete and installed environment between users/computers.

3.1 Copy virtual-env folders

If (and only if) all users (are forced to) use an identical setup (OS, install paths, user paths, libraries, locales, …), then one can copy the virtual environment folders for Pipenv (1.2) or conda (1.3) between PCs.

These "pre-compiled" environments are very fragile, as a sall change can cause the setup to malfunction. So this is really not recommended.

Benefits:

  • Can be shared between users without network (e.g. USB stick)

Downsides:

  • Very fragile

3.2 Virtualisation

The cleanest way to support this is some kind of virtualisation technique (virtual machine, Docker container, etc.).
Install Python and the needed dependencies inside it and share the complete container.
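A minimal sketch with Docker (the entry point main.py is just an assumption) could be a Dockerfile like:

FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["python", "main.py"]

The image is built once on a machine with internet access and then moved to the offline machines, e.g.:

docker build -t myproject .
docker save -o myproject.tar myproject
docker load -i myproject.tar   # on the target machine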

Benefits:

  • Users can just use the provided container

Downsides:

  • Complex setup
  • Complex maintenance
  • Virtualisation layer needed
  • Code and environment may become convoluted

Note: This answer was compiled from the summary of (mostly my own) comments.

Answered By: Cpt.Hook