pytest, xdist, and sharing generated file dependencies

Question:

I have multiple tests that require an expensive-to-generate file.
I’d like the file to be re-generated on every test run, but no more than once.
To complicate matters, both the tests and the file depend on an input parameter.

from pathlib import Path

from pytest import mark

def expensive(param) -> Path:
    # Generate the file and return its path.
    ...

@mark.parametrize('input', TEST_DATA)
class TestClass:

    def test_one(self, input) -> None:
        check_expensive1(expensive(input))

    def test_two(self, input) -> None:
        check_expensive2(expensive(input))

How can I make sure that this file is not regenerated by each worker process when running these tests in parallel?
For context, I’m porting test infrastructure that uses Makefiles to pytest.

I’d be OK with using file-based locks to synchronize, but I’m sure someone else has had this problem, and I would rather use an existing solution.

Using functools.cache works great within a single process. Fixtures with scope="module" don’t work at all, because the parameter input is function-scoped.

Asked By: nishantjr


Answers:

There’s an existing solution in the pytest-xdist documentation section "Making session-scoped fixtures execute only once":

import json

import pytest
from filelock import FileLock


@pytest.fixture(scope="session")
def session_data(tmp_path_factory, worker_id):
    if worker_id == "master":
        # not executing with multiple workers, just produce the data and let
        # pytest's fixture caching do its job
        return produce_expensive_data()

    # get the temp directory shared by all workers
    root_tmp_dir = tmp_path_factory.getbasetemp().parent

    fn = root_tmp_dir / "data.json"
    with FileLock(str(fn) + ".lock"):
        if fn.is_file():
            data = json.loads(fn.read_text())
        else:
            data = produce_expensive_data()
            fn.write_text(json.dumps(data))
    return data

Note that filelock is not part of the standard library, but is available from PyPI.
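The fixture above caches a single value, while the question needs one file per input parameter. One possible adaptation, offered as a rough sketch rather than anything taken from the pytest-xdist documentation, is to apply the same check-then-generate step under a lock to a file whose name is derived from the parameter. The helper names path_for and generate_once are hypothetical, and the lock is passed in as an argument so the sketch itself needs only the standard library.

```python
import hashlib
from contextlib import AbstractContextManager
from pathlib import Path
from typing import Callable


def path_for(param: object, shared_dir: Path) -> Path:
    """Map a parameter to a stable file name inside the shared directory.

    Assumption: repr(param) is identical in every worker process, which
    holds for strings, numbers, and tuples of those.
    """
    digest = hashlib.sha256(repr(param).encode("utf-8")).hexdigest()[:16]
    return shared_dir / f"expensive-{digest}.out"


def generate_once(
    target: Path,
    generate: Callable[[Path], None],
    lock: AbstractContextManager,
) -> Path:
    """Create `target` via `generate` unless it already exists; return it."""
    with lock:
        if not target.is_file():
            # Write to a scratch name first so a crash mid-generation
            # never leaves a half-written file at `target`.
            scratch = target.with_name(target.name + ".partial")
            generate(scratch)
            scratch.replace(target)
    return target
```

With filelock, the lock argument could be FileLock(str(target) + ".lock"), and shared_dir could be tmp_path_factory.getbasetemp().parent as in the fixture above. When worker_id is "master", tmp_path_factory.getbasetemp() itself is probably the better choice, since its parent outlives a single test run and the file would then not be regenerated on the next run.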

Answered By: cjs