Pytest where to store expected data

Question:

When testing a function, I need to pass parameters and check that the output matches the expected output.

This is easy when the function’s response is just a small array or a one-line string that can be defined inside the test function. But suppose the function I test modifies a config file, which can be huge, or the resulting array is about 4 lines long if I define it explicitly. Where do I store that so my tests remain clean and easy to maintain?

Right now, if the expected data is a string, I just put a file next to the test’s .py file and open() it inside the test:

def test_if_it_works():
    with open('expected_asnwer_from_some_function.txt') as res_file:
        expected_data = res_file.read()
    input_data = ... # Maybe loaded from a file as well
    assert expected_data == if_it_works(input_data)

I see many problems with such an approach, such as keeping this file up to date. It looks bad as well.
I can probably improve things by moving this to a fixture:

import pytest

@pytest.fixture
def expected_data():
    with open('expected_asnwer_from_some_function.txt') as res_file:
        expected_data = res_file.read()
    return expected_data

@pytest.fixture
def input_data():
    return '1,2,3,4'

def test_if_it_works(input_data, expected_data):
    assert expected_data == if_it_works(input_data)

That just moves the problem to another place. Usually I also need to test how the function behaves with empty input, input with a single item, and input with multiple items, so I have to create either one big fixture covering all three cases or multiple fixtures. In the end the code gets quite messy.

If a function expects a complicated dictionary as input, or returns a dictionary of the same huge size, the test code becomes ugly:

 @pytest.fixture
 def input_data():
     # It's just an example
     return [
         {'one_value': 3, 'anotherky': 3, 'somedata': 'somestring'},
         {'login': 3, 'ip_address': 32, 'value': 53, 'one_value': 3},
         {'one_vae': 3, 'password': 13, 'lue': 3},
     ]

It’s quite hard to read tests with such fixtures and keep them up to date.

Update

After searching for a while, I found a library that solved part of the problem for the case where, instead of big config files, I had large HTML responses. It’s betamax.

For easier usage I created a fixture:

import os

import pytest
import requests
from betamax import Betamax

@pytest.fixture
def session(request):
    session = requests.Session()
    recorder = Betamax(session)
    # Store the cassette under fixtures/, named after the test function
    recorder.use_cassette(os.path.join(os.path.dirname(__file__), 'fixtures', request.function.__name__))
    recorder.start()
    request.addfinalizer(recorder.stop)
    return session

So now in my tests I just use the session fixture, and every request I make is automatically serialized to the fixtures/test_name.json file. The next time I execute the test, instead of making a real HTTP request, the library loads the response from the filesystem:

def test_if_response_is_ok(session):
   r = session.get("http://google.com")

It’s quite handy because in order to keep these fixtures up to date I just need to clean the fixtures folder and rerun my tests.

Asked By: Glueon

Answers:

Consider whether the whole contents of the config file really need to be tested.

If only several values or substrings must be checked, prepare an expected template for that config. The tested places will be marked as “variables” with some special syntax. Then prepare a separate expected list of the values for the variables in the template. This expected list can be stored as a separate file or directly in the source code.

Example for the template:

ALLOWED_HOSTS = ['{host}']
DEBUG = {debug}
DEFAULT_FROM_EMAIL = '{email}'

Here, the template variables are placed inside curly braces.

The expected values can look like:

host = www.example.com
debug = False
email = [email protected]

or even as a simple comma-separated list:

www.example.com, False, [email protected]

Then your testing code can produce the expected file from the template by replacing the variables with the expected values. And the expected file is compared with the actual one.

Maintaining the template and expected values separately has the advantage that you can have many testing data sets using the same template.
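
The template approach described above can be sketched with Python’s built-in `str.format`. This is only a minimal illustration, not the answerer’s code: the helper name `render_expected`, the data-set names, and the sample values (including the e-mail addresses) are assumptions made up for the example.

```python
# Template with placeholders in curly braces, mirroring the example above.
TEMPLATE = (
    "ALLOWED_HOSTS = ['{host}']\n"
    "DEBUG = {debug}\n"
    "DEFAULT_FROM_EMAIL = '{email}'\n"
)

# Hypothetical data sets: several sets of expected values share one template.
EXPECTED_VALUES = {
    "production": {
        "host": "www.example.com",
        "debug": False,
        "email": "admin@example.com",
    },
    "development": {
        "host": "localhost",
        "debug": True,
        "email": "dev@example.com",
    },
}


def render_expected(template, values):
    """Build the expected config text by filling in the template placeholders."""
    return template.format(**values)


# The rendered text is what a test would compare against the actual config file.
expected = render_expected(TEMPLATE, EXPECTED_VALUES["production"])
print(expected)
```

Keeping the values in a dictionary per data set means a new test case only needs a new entry, not a new expected file.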

Testing only variables

An even better approach is for the config generation method to produce only the values needed for the config file. These values can then be easily inserted into the template by another method. The advantage is that the testing code can compare each config variable directly, separately and clearly.
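
A sketch of this idea might look as follows. The function `build_config_values` is hypothetical and merely stands in for the code under test; the point is only that it returns plain values rather than rendered text.

```python
def build_config_values(hostname, production):
    """Hypothetical function under test: returns config values, not text."""
    return {
        "host": hostname,
        "debug": not production,
        "email": "admin@" + hostname,
    }


def test_production_values():
    values = build_config_values("example.com", production=True)
    # Each value is asserted on its own, so a failure points at one setting.
    assert values["host"] == "example.com"
    assert values["debug"] is False
    assert values["email"] == "admin@example.com"
```

With this split, no expected file is needed at all for the value logic; only the (much simpler) rendering step touches the template.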

Templates

While it is easy to replace the variables in the template with the needed values by hand, there are ready-made template libraries that let you do it in one line. Here are just a few examples: Django, Jinja, Mako.

If you only have a few tests, then why not include the data as a string literal:

expected_data = """
Your data here...
"""

If you have a handful, or the expected data is really long, I think your use of fixtures makes sense.

However, if you have many, then perhaps a different solution would be better. In fact, for one project I have over one hundred input and expected-output files, so I built my own testing framework (more or less). I used Nose, but PyTest would work as well. I created a test generator which walked the directory of test files. For each input file, a test was yielded which compared the actual output with the expected output (PyTest calls this parametrizing). Then I documented my framework so others could use it. To review and/or edit the tests, you only edit the input and/or expected-output files and never need to look at the Python test file.

To allow different input files to have different options defined, I also created a YAML config file for each directory (JSON would work as well and would keep the dependencies down). The YAML data consists of a dictionary where each key is the name of an input file and the value is a dictionary of keywords that get passed to the function being tested along with the input file. If you’re interested, here is the source code and documentation. I recently played with the idea of defining the options as Unittests here (requires only the built-in unittest lib), but I’m not sure if I like it.
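
The directory-walking idea can be sketched roughly as below. This is not the framework mentioned above; the helper names `collect_cases` and `check_case` and the `.in` / `.expected` naming convention are assumptions chosen for the example.

```python
from pathlib import Path


def collect_cases(directory):
    """Return sorted (input_path, expected_path) pairs found in `directory`.

    Assumed naming convention: `name.in` holds the input and
    `name.expected` holds the expected output. Inputs without a
    matching expected file are skipped.
    """
    cases = []
    for input_path in sorted(Path(directory).glob("*.in")):
        expected_path = input_path.with_suffix(".expected")
        if expected_path.exists():
            cases.append((input_path, expected_path))
    return cases


def check_case(func, input_path, expected_path):
    """Run `func` on the input file's text and compare with the expected text."""
    actual = func(input_path.read_text())
    assert actual == expected_path.read_text()


# With pytest, one test per file pair could be generated like this
# ("cases" and function_under_test are placeholders):
#
#     @pytest.mark.parametrize("input_path, expected_path", collect_cases("cases"))
#     def test_case(input_path, expected_path):
#         check_case(function_under_test, input_path, expected_path)
```

Adding a test then means dropping two files into the directory; the Python test module stays untouched.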

Answered By: Waylan

I had a similar problem once, where I had to test a configuration file against an expected file. This is how I solved it:

  1. Create a folder with the same name as your test module, in the same location. Put all your expected files inside that folder.

    test_foo/
        expected_config_1.ini
        expected_config_2.ini
    test_foo.py
    
  2. Create a fixture responsible for copying the contents of this folder to a temporary directory. I used the tmpdir fixture for this.

    from __future__ import unicode_literals
    from distutils import dir_util
    from pytest import fixture
    import os
    
    
    @fixture
    def datadir(tmpdir, request):
        '''
        Fixture responsible for searching a folder with the same name of test
        module and, if available, moving all contents to a temporary directory so
        tests can use them freely.
        '''
        filename = request.module.__file__
        test_dir, _ = os.path.splitext(filename)
    
        if os.path.isdir(test_dir):
            dir_util.copy_tree(test_dir, bytes(tmpdir))
    
        return tmpdir
    

    Important: If you are using Python 3, replace dir_util.copy_tree(test_dir, bytes(tmpdir)) with dir_util.copy_tree(test_dir, str(tmpdir)).

  3. Use your new fixture.

    def test_foo(datadir):
        expected_config_1 = datadir.join('expected_config_1.ini')
        expected_config_2 = datadir.join('expected_config_2.ini')
    

Remember: datadir is just the same as the tmpdir fixture, plus the ability to work with your expected files placed in a folder with the same name as the test module.
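
Note that `distutils` was removed from the standard library in Python 3.12. On recent Python versions, the same copy step can be sketched with `shutil.copytree` instead. The helper name `copy_datadir` is an assumption made for this example; it is written as a plain function so it can be tried without pytest.

```python
import os
import shutil


def copy_datadir(module_file, destination):
    """Copy the folder named after `module_file` (minus extension) into `destination`.

    Does nothing if no such folder exists. Returns `destination`.
    """
    test_dir, _ = os.path.splitext(module_file)
    if os.path.isdir(test_dir):
        # dirs_exist_ok (Python 3.8+) allows copying into an existing directory.
        shutil.copytree(test_dir, str(destination), dirs_exist_ok=True)
    return destination


# Inside a pytest fixture this could be used as:
#
#     @pytest.fixture
#     def datadir(tmp_path, request):
#         return copy_datadir(request.module.__file__, tmp_path)
```

With `tmp_path`, the fixture returns a `pathlib.Path`, so files are accessed as `datadir / 'expected_config_1.ini'` rather than with `datadir.join(...)`.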

Answered By: Fabio Menegazzo

I believe pytest-datafiles can be of great help. Unfortunately, it seems not to be maintained much anymore. For the time being, it’s working nicely.

Here’s a simple example taken from the docs:

import os
import pytest

@pytest.mark.datafiles('/opt/big_files/film1.mp4')
def test_fast_forward(datafiles):
    path = str(datafiles)  # Convert from py.path object to path (str)
    assert len(os.listdir(path)) == 1
    assert os.path.isfile(os.path.join(path, 'film1.mp4'))
    #assert some_operation(os.path.join(path, 'film1.mp4')) == expected_result

    # Using py.path syntax
    assert len(datafiles.listdir()) == 1
    assert (datafiles / 'film1.mp4').check(file=1)
Answered By: Dror