GitHub Actions – Retrieving Dependencies from Cache

Question:

I have set up a GitHub Action to run a Python project in my own repo.

The project needs external libraries, and it runs correctly when I install them on every run.

However, I tried to cache them after installation and restore them from the cache on each run, and this is not working: the cache is hit every time, but the dependencies are still installed again.

Am I missing a step?

name: Daily Python Script

on:
  schedule:
    - cron: "0 19 * * *"
  workflow_dispatch: #Manual trigger
  
jobs:
  build:
    runs-on: ubuntu-latest
    
    permissions:
        # Give the default GITHUB_TOKEN write permission to commit and push the
        # added or changed files to the repository.
        contents: write

    steps:
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: 3.9

      - name: Update Node.js to v16
        uses: actions/setup-node@v3
        with:
          node-version: 16

      - name: Checkout Code
        uses: actions/checkout@v4

      - name: Restore Cached Dependencies
        uses: actions/cache@v3
        with:
          path: ~/.cache/pip
          key: ${{ runner.os }}-pip-${{ hashFiles('requirements.txt') }}
                    
      - name: Debug Cache
        run: |
            echo "Cache hit: ${{ steps.cache-dependencies.outputs.cache-hit }}"
            ls -l ~/.cache/pip

      - name: Install Dependencies
        if: steps.cache-dependencies.outputs.cache-hit != 'true'
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt
        
      - name: Run Python Script
        run: python main_v1.py

      - name: Commit and Push Logs
        uses: stefanzweifel/git-auto-commit-action@v4
        with:
          commit_message: Automated Change
          repository: .
          commit_user_name: My GitHub Actions Bot # defaults to "github-actions[bot]"
          commit_user_email: [email protected] # defaults to "41898282+github-actions[bot]@users.noreply.github.com"
          commit_author: Author <[email protected]> # defaults to author of the commit that triggered the run

The requirements.txt file includes the following:

numpy==1.21.6
oauth2client==4.1.3
pandas==1.5.3
pytz==2022.7.1
Requests==2.31.0
selenium==4.9.0
trycourier==4.4.0
Asked By: tatojunior


Answers:

  • The "Restore Cached Dependencies" step has no id set; the later expressions expect it to be cache-dependencies.
  • Without an id, there is no way to read output values from that step.
  • ${{ steps.cache-dependencies.outputs.cache-hit }} therefore always evaluates to an empty string.
  • The "Install Dependencies" condition becomes if: "" != "true", which always evaluates to true.
  • So the dependencies are installed on every run, making the user sad.
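
The points above can be sketched as the following fragment. It is a minimal sketch rather than a definitive fix: it only adds the id that the question's expressions already reference (cache-dependencies) so that the cache-hit output becomes readable. The install step is deliberately left unconditional, on the assumption that ~/.cache/pip holds pip's downloaded files and not installed packages.

```yaml
      - name: Restore Cached Dependencies
        id: cache-dependencies # the id is what makes steps.cache-dependencies.* resolvable
        uses: actions/cache@v3
        with:
          path: ~/.cache/pip
          key: ${{ runner.os }}-pip-${{ hashFiles('requirements.txt') }}

      - name: Debug Cache
        run: |
          # With the id in place this prints 'true' on an exact key match
          echo "Cache hit: ${{ steps.cache-dependencies.outputs.cache-hit }}"

      - name: Install Dependencies
        # No "if" here: the restored directory only saves downloads,
        # so pip still has to install the packages into the environment.
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt
```

If the install step were skipped on a cache hit, the packages would be present in pip's cache but not importable by the script.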
Answered By: Samira

SOLVED

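My original approach was not up to date, which is what led to the ModuleNotFoundError: since Nov 23rd, 2021, setup-python has been able to cache and restore dependencies itself when its cache input is set. Note that the pip cache stores downloaded packages rather than installed ones, so the install step still has to run every time; it simply reuses the cached files instead of downloading them again.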
(https://github.blog/changelog/2021-11-23-github-actions-setup-python-now-supports-dependency-caching/)

More information can be found here: https://docs.github.com/en/actions/using-workflows/caching-dependencies-to-speed-up-workflows

Here is the updated yml file:

name: Daily Python Script

on:
  schedule:
    - cron: "0 19 * * *"
  workflow_dispatch: #Manual trigger

jobs:
  build:
    runs-on: ubuntu-latest

    permissions:
      # Give the default GITHUB_TOKEN write permission to commit and push the
      # added or changed files to the repository.
      contents: write

    steps:
      - name: Checkout
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.9'
          cache: 'pip' # caching pip dependencies

      - name: Install Dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt

      - name: Run Python Script
        run: python main_v1.py

      - name: Commit and Push Logs
        uses: stefanzweifel/git-auto-commit-action@v4
        with:
          commit_message: Automated Change
          repository: .
          commit_user_name: My GitHub Actions Bot # defaults to "github-actions[bot]"
          commit_user_email: [email protected] # defaults to "41898282+github-actions[bot]@users.noreply.github.com"
          commit_author: Author <[email protected]> # defaults to author of the commit that triggered the run

In the build log, when the cache is hit and requirements.txt has not changed, the Install Dependencies step shows pip reusing the cached packages instead of downloading them:

Collecting numpy==1.21.6 (from -r requirements.txt (line 1))
  Using cached numpy-1.21.6-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (15.7 MB) 
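
To confirm from the workflow itself whether the cache was restored, one option is to give the setup-python step an id and print its cache-hit output. This is a minimal sketch under stated assumptions: the step id setup-python and the "Report Cache Status" step are hypothetical names added for illustration, and cache-dependency-path is optional, shown only to make explicit which file is hashed for the cache key.

```yaml
      - name: Set up Python
        id: setup-python # hypothetical id, needed to read this step's outputs
        uses: actions/setup-python@v4
        with:
          python-version: '3.9'
          cache: 'pip'
          cache-dependency-path: requirements.txt # assumed to sit at the repository root

      - name: Report Cache Status
        # Hypothetical helper step: prints the cache-hit output of the step above
        run: |
          echo "pip cache hit: ${{ steps.setup-python.outputs.cache-hit }}"
```

The checkout step has to run before this one, since the cache key is derived from a file in the repository.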

User is happy again!

Answered By: tatojunior