Using pytest to reuse the same dataframe across modules in a class

Question:

I am trying to reuse the same dataframe across several tests in pytest. I currently initialise it in the class's __init__ method, but I would like to turn it into a pytest fixture and pass it to each of the test methods. I am struggling to work out how to do this with pytest.

import pytest
import utils
import numpy as np
import pandas as pd
from datetime import datetime
class TestGetDatesCorrespondingToReference:
    # @pytest.fixture(
    # scope="module",
    # params=df_input)
    def __init__(self) -> None:
        self.df_input = pd.DataFrame({
                'table_id' : ['all_legs_predictions_20220914_20220607',
                              'all_legs_predictions_20210914_20210607'] ,
                'prefix' : [datetime(2022, 9, 14, 0, 0),
                            datetime(2021, 9, 14, 0, 0)] ,
                'suffix' : [datetime(2022, 6, 7, 0, 0), 
                            datetime(2021, 6, 7, 0, 0)]
                })

    
    def test_get_dates_corresponding_to_reference(self):
        actual = (
            self.df_input
            .pipe(utils.get_dates_corresponding_to_reference, "20221105")
        )
        
        expected = pd.DataFrame({
                'table_id' : ['all_legs_predictions_20220914_20220607'] ,
                'prefix' : [datetime(2022, 9, 14, 0, 0)],
                'suffix' : [datetime(2022, 6, 7, 0, 0)] })
        pd.testing.assert_frame_equal(actual, expected)

Asked By: user4933

Answers:

You should define a new function with the @pytest.fixture decorator.

Fixtures are usually defined in a file named conftest.py. However, you can also define them in the test files themselves.

So, based on your example, I believe you want to achieve something like this:

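import pytest
import pandas as pd
from datetime import datetime

import utils  # the module under test, as in the question


class TestGetDatesCorrespondingToReference:
    def test_get_dates_corresponding_to_reference(self, some_dataframe):
        # Parentheses let the method chain continue on the next line.
        actual = (
            some_dataframe
            .pipe(utils.get_dates_corresponding_to_reference, "20221105")
        )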
        expected = pd.DataFrame({
            'table_id': ['all_legs_predictions_20220914_20220607'],
            'prefix': [datetime(2022, 9, 14, 0, 0)],
            'suffix': [datetime(2022, 6, 7, 0, 0)]})
        pd.testing.assert_frame_equal(actual, expected)


@pytest.fixture(scope="module")
def some_dataframe():
    return pd.DataFrame({
        'table_id': ['all_legs_predictions_20220914_20220607',
                     'all_legs_predictions_20210914_20210607'],
        'prefix': [datetime(2022, 9, 14, 0, 0),
                   datetime(2021, 9, 14, 0, 0)],
        'suffix': [datetime(2022, 6, 7, 0, 0),
                   datetime(2021, 6, 7, 0, 0)]
    })

Please note that you use the fixture's name as an argument of the test method; pytest then passes the fixture's return value in through that argument.
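
Because the fixture is looked up by argument name, every test method in a class can request the same fixture, which is what the __init__ in the question was trying to achieve. A minimal sketch, assuming a hypothetical helper build_input_frame and made-up test names, with the frame trimmed to two columns for brevity:

```python
from datetime import datetime

import pandas as pd
import pytest


def build_input_frame():
    # Hypothetical helper: builds the shared input data (trimmed from the question).
    return pd.DataFrame({
        "table_id": ["all_legs_predictions_20220914_20220607",
                     "all_legs_predictions_20210914_20210607"],
        "prefix": [datetime(2022, 9, 14), datetime(2021, 9, 14)],
    })


@pytest.fixture
def some_dataframe():
    # No scope given, so this uses function scope: rebuilt for each test.
    return build_input_frame()


class TestSharedDataFrame:
    # pytest matches the argument name to the fixture name.
    # The class needs no __init__; pytest does not collect test classes that define one.
    def test_row_count(self, some_dataframe):
        assert len(some_dataframe) == 2

    def test_prefix_is_datetime(self, some_dataframe):
        assert pd.api.types.is_datetime64_any_dtype(some_dataframe["prefix"])
```

Each method simply lists some_dataframe after self; nothing has to be stored on the instance.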

Also make sure this is the fixture scope you want. With scope="module", the fixture runs once per module and every test in that module receives the same DataFrame object. So if one test modifies the DataFrame in place, the change stays visible to all later tests in that module.

If you want the fixture value to be recomputed for each test, use function scope, which is the default.
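
The choice of scope matters most when a test modifies the frame in place. One possible way to combine both scopes, sketched here with hypothetical fixture names base_frame and fresh_frame and a made-up two-row frame, is to build the data once per module and hand each test its own copy:

```python
import pandas as pd
import pytest


def build_frame():
    # Hypothetical stand-in for an input frame that is expensive to build.
    return pd.DataFrame({"table_id": ["a", "b"], "value": [1, 2]})


@pytest.fixture(scope="module")
def base_frame():
    # Built once per module; every test in the module shares this object.
    return build_frame()


@pytest.fixture
def fresh_frame(base_frame):
    # Function scope (the default): a new copy for each test, so in-place
    # changes made by one test cannot leak into another.
    return base_frame.copy()


def test_can_mutate_copy(fresh_frame):
    fresh_frame.loc[0, "value"] = 99
    assert fresh_frame.loc[0, "value"] == 99


def test_sees_original_values(fresh_frame):
    # Still the original data, regardless of test order.
    assert fresh_frame["value"].tolist() == [1, 2]
```

If building the frame is cheap, a single function-scoped fixture is simpler and avoids the shared object altogether.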

You can learn more about fixtures and their scopes in the pytest documentation.

Answered By: pL3b