How to cache pip packages within Azure Pipelines

Question:

Although this source covers caching in Azure Pipelines in detail, it does not make clear how to cache pip packages for a Python project.

How should I go about caching pip packages in an Azure Pipelines build?

According to this, the pip cache may be enabled by default in the future. As far as I know, that is not yet the case.

Answers:

I used the pre-commit documentation as inspiration and configured the following Python pipeline with Anaconda:

pool:
  vmImage: 'ubuntu-latest'

variables:
  CONDA_ENV: foobar-env
  CONDA_HOME: /usr/share/miniconda/envs/$(CONDA_ENV)/

steps:
- script: echo "##vso[task.prependpath]$CONDA/bin"
  displayName: Add conda to PATH

- task: Cache@2
  displayName: Use cached Anaconda environment
  inputs:
    key: conda | environment.yml
    path: $(CONDA_HOME)
    cacheHitVar: CONDA_CACHE_RESTORED

- script: conda env create --file environment.yml
  displayName: Create Anaconda environment (if not restored from cache)
  condition: eq(variables.CONDA_CACHE_RESTORED, 'false')

- script: |
    source activate $(CONDA_ENV)
    pytest
  displayName: Run unit tests
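
The key `conda | environment.yml` works because Cache@2 treats a segment that is a file path as the hash of that file's contents, so editing environment.yml produces a new key and a cache miss. The Python sketch below only illustrates that idea. The helper `cache_key` is hypothetical and the use of SHA-256 is an assumption; it is not the algorithm Azure Pipelines uses internally.

```python
import hashlib
from pathlib import Path


def cache_key(*segments, base_dir=Path(".")):
    """Build an illustrative cache key (hypothetical helper, not an Azure API).

    Segments that name an existing file are replaced by a hash of the
    file's contents; every other segment is kept as a literal string.
    """
    parts = []
    for segment in segments:
        candidate = Path(base_dir) / segment
        if candidate.is_file():
            # Assumption: SHA-256 stands in for whatever hash the service uses.
            parts.append(hashlib.sha256(candidate.read_bytes()).hexdigest())
        else:
            parts.append(segment)
    return " | ".join(parts)
```

Calling `cache_key("conda", "environment.yml")` twice returns the same string as long as the file is unchanged, which is the property the pipeline relies on to reuse the cached environment.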
Answered By: Marek Grzenkowicz

To cache a standard pip install, use this:

variables:
  # variables are automatically exported as environment variables with
  # upper-cased names, so pip sees this as PIP_CACHE_DIR and uses it
  # instead of its default cache dir
  - name: pip_cache_dir
    value: $(Pipeline.Workspace)/.pip

steps:
  - task: Cache@2
    inputs:
      key: 'pip | "$(Agent.OS)" | requirements.txt'
      restoreKeys: |
        pip | "$(Agent.OS)"
      path: $(pip_cache_dir)
    displayName: Cache pip

  - script: |
      pip install -r requirements.txt
    displayName: "pip install"
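
The variables block above depends on how Azure Pipelines exposes variables to scripts: the name is upper-cased and periods are replaced with underscores, which is why `pip_cache_dir` reaches pip as `PIP_CACHE_DIR`. Here is a minimal Python sketch of that mapping; the function name `to_env_name` is made up for illustration and is not part of any Azure tooling.

```python
def to_env_name(variable_name: str) -> str:
    """Approximate the pipeline-variable to environment-variable mapping.

    Hypothetical helper: upper-case the name and turn '.' into '_'.
    """
    return variable_name.replace(".", "_").upper()


print(to_env_name("pip_cache_dir"))       # PIP_CACHE_DIR
print(to_env_name("Pipeline.Workspace"))  # PIPELINE_WORKSPACE
```

Naming the variable `PIP_CACHE_DIR` directly in the YAML should behave the same way and makes the intent easier to spot.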
Answered By: Ted Elliott

I wasn’t very happy with the standard pip cache approach described in the official documentation. You still install your dependencies the normal way on every run, which means pip performs many checks before it finds the cached builds (*.whl, *.tar.gz), and all of that takes time. You could use venv or conda instead, but for me that led to buggy situations with unexpected behaviour. What I ended up doing instead was running pip download and pip install as separate steps:

variables:
  pipDownloadDir: $(Pipeline.Workspace)/.pip

steps:
- task: Cache@2
  displayName: Load cache
  inputs:
    key: 'pip | "$(Agent.OS)" | requirements.txt'
    path: $(pipDownloadDir)
    cacheHitVar: cacheRestored

- script: pip download -r requirements.txt --dest=$(pipDownloadDir)
  displayName: "Download requirements"
  condition: eq(variables.cacheRestored, 'false')

- script: pip install -r requirements.txt --no-index --find-links=$(pipDownloadDir)
  displayName: "Install requirements"
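
To try the same download-then-install split on a local machine, the two steps can be sketched in Python. This is only an approximation under stated assumptions: the helper `build_pip_commands` is hypothetical, and an empty or missing download directory stands in for the pipeline's `cacheRestored` check. The pip flags are the ones used in the pipeline above.

```python
import sys
from pathlib import Path


def build_pip_commands(requirements, download_dir):
    """Return the pip commands to run, in order (hypothetical helper).

    Assumption: an empty or missing download directory is treated as a
    cache miss, in place of the pipeline's cacheRestored variable.
    """
    download_dir = Path(download_dir)
    pip = [sys.executable, "-m", "pip"]
    commands = []
    if not download_dir.is_dir() or not any(download_dir.iterdir()):
        # Cache miss: fetch all distributions into the directory first.
        commands.append(
            pip + ["download", "-r", requirements, "--dest", str(download_dir)]
        )
    # Always install strictly from the local directory, never the index.
    commands.append(
        pip + ["install", "-r", requirements,
               "--no-index", "--find-links", str(download_dir)]
    )
    return commands


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch is side-effect free.
    for command in build_pip_commands("requirements.txt", ".pip"):
        print(" ".join(command))
```

Each returned list can be passed to `subprocess.run(command, check=True)` to actually execute it. Because the install step uses `--no-index`, it fails if the directory is missing a required package rather than silently falling back to the network.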
Answered By: J. Paalman