How to install a different version of a package (transformers) without internet in a Kaggle notebook, without killing the kernel and while keeping variables in memory?

Question:

I have prepared an inference pipeline for a Kaggle competition, and it has to run without an internet connection.

I’m trying to use different versions of transformers in the same notebook, but I ran into issues with the installation.

Kaggle’s default transformers version is 4.26.1. I start by installing a different branch of transformers (4.18.0.dev0) like this:

!pip install ./packages/sacremoses-0.0.53
!pip install /directory/to/packages/transformers-4.18.0.dev0-py3-none-any.whl --find-links /directory/to/packages

This installs transformers-4.18.0.dev0 without any problem. I use this version of the package to run inference with some models. Then I want to use another package, open_clip_torch-2.16.0, which is compatible with transformers-4.27.3, so I install both with:

!pip install /directory/to/packages/transformers-4.27.3-py3-none-any.whl --no-index --find-links /directory/to/packages
!pip install /directory/to/packages/open_clip_torch-2.16.0-py3-none-any.whl --no-index --find-links /directory/to/packages/

I get the message Successfully installed for both transformers-4.27.3 and open_clip_torch-2.16.0.

!pip list | grep transformers outputs transformers 4.27.3 but when I do

import transformers
transformers.__version__

the version is still '4.18.0.dev0'. Because of that, I can’t use open_clip: some of the code breaks because it keeps using the old version of transformers even though I installed a newer one. How can I resolve this issue?

Asked By: gunesevitan


Answers:

Following https://www.kaggle.com/code/samuelepino/pip-installing-packages-with-no-internet

  1. From https://pypi.org/project/transformers/#files, download transformers-4.27.3-py3-none-any.whl

  2. Upload .whl file to Kaggle as dataset

  3. Install it offline from that dataset: !pip install -U transformers --no-index --find-links=/kaggle/input/transformers-wheels

  4. Restart the kernel’s runtime or reload the module, using one of these tricks: https://stackoverflow.com/a/37993787/610569 or https://realpython.com/lessons/reloading-module/. According to the comments, importlib.reload() works.

  5. Check the transformers version
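For step 3, the same install can be driven from Python instead of a `!pip` cell. This is only a minimal sketch: the helper names `offline_pip_command` and `install_offline` are made up for illustration, and the wheel directory is the one used in step 3.

```python
import subprocess
import sys


def offline_pip_command(package, wheel_dir):
    """Build a pip command that resolves packages only from local wheel files."""
    return [
        sys.executable, "-m", "pip", "install",
        "--upgrade",                # same as -U in step 3
        "--no-index",               # never try to contact PyPI
        "--find-links", wheel_dir,  # look for wheels in this directory instead
        package,
    ]


def install_offline(package, wheel_dir):
    """Run the install; raises subprocess.CalledProcessError if pip fails."""
    subprocess.check_call(offline_pip_command(package, wheel_dir))


# Hypothetical usage inside the Kaggle notebook:
# install_offline("transformers", "/kaggle/input/transformers-wheels")
```

Calling pip through `sys.executable -m pip` makes sure the package lands in the same interpreter the notebook kernel is running. Note that this only changes the files on disk; a module that was already imported stays in memory until it is reloaded.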

Answered By: alvas

When you first import a module in a Python session, it is cached in sys.modules. Subsequent imports are served from that cache rather than read from disk, which is why you keep seeing the old version even after pip has replaced the files.

import sys
import transformers
sys.modules['transformers'].__version__
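To confirm that this is what is happening, you can compare the version pip recorded on disk with the version of the module held in memory. The helper below is a rough sketch; `installed_vs_loaded` is a name invented here, and it assumes the distribution name and the import name are the same (true for transformers, not for open_clip_torch, which is imported as open_clip).

```python
import sys
from importlib import metadata


def installed_vs_loaded(package_name):
    """Return (version in the installed metadata, version of the module in memory)."""
    try:
        installed = metadata.version(package_name)
    except metadata.PackageNotFoundError:
        installed = None  # nothing with this name is installed

    # Look only at the cache; do not trigger a fresh import.
    module = sys.modules.get(package_name)
    loaded = getattr(module, "__version__", None)
    return installed, loaded


print(installed_vs_loaded("transformers"))
```

In the situation from the question, the two values should differ: the first reflects what `pip list` reports, the second what `transformers.__version__` reports.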

A possible solution is to attempt to reload the module using importlib.reload.

import importlib
importlib.reload(transformers)
sys.modules['transformers'].__version__

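Read the importlib.reload documentation so that you are aware of the caveats of this method: it re-executes only the module you pass in, not the submodules that were already imported, and objects created before the reload keep referring to the old classes and functions.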
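Because importlib.reload re-executes only the one module object it is given, a large package with many already-imported submodules can end up in a mixed state. A commonly used alternative, sketched here under that assumption and not taken from the answer above, is to remove the package and all of its submodules from sys.modules and import it again. The function name `purge_and_reimport` is made up for this example.

```python
import importlib
import sys


def purge_and_reimport(package_name):
    """Drop a package and all of its submodules from sys.modules, then import it again."""
    prefix = package_name + "."
    stale = [name for name in sys.modules
             if name == package_name or name.startswith(prefix)]
    for name in stale:
        del sys.modules[name]

    # Let the import system notice files that changed since the last import.
    importlib.invalidate_caches()
    return importlib.import_module(package_name)


# Hypothetical usage after installing the new wheel:
# transformers = purge_and_reimport("transformers")
# print(transformers.__version__)
```

The kernel and all other variables stay alive, but anything built with the old version (models, tokenizers, classes imported with `from transformers import ...`) still points at the old code and has to be re-created or re-imported. Compiled extension modules generally cannot be swapped this way.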

Answered By: Dan Nagle