pandas unit test AssertionError: DataFrame.index are different

Question:

I have this function that I want to test:

from typing import List

def filter_df(df, column_name: str, skill: List):
    return df.query(f"{column_name} in {skill}")

This is my test:

import pandas as pd
from pandas.testing import assert_frame_equal

def test_filter_df():
    df = pd.DataFrame({"col1": ["sap", "hi", "abc"], "col2": [3, 4, 4]})
    expected = pd.DataFrame({"col1": ["hi", "abc"], "col2": [4, 4]})
    assert_frame_equal(filter_df(df, "col1", ["hi", "abc"]), expected)

The assert_frame_equal(filter_df(df, "col1", ["hi", "abc"]), expected) call fails with an AssertionError (DataFrame.index are different), but I don't see why the DataFrames aren't identical.

Asked By: Omega


Answers:

You need to reset the index in filter_df:

df.query(f"{column_name} in {skill}").reset_index(drop=True)

At the moment, the returned DataFrame keeps the original index labels of the selected rows, which in your case are 1 and 2, not 0 and 1 as in the expected DataFrame.
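To see the mismatch concretely, here is a minimal sketch that reuses the data from the question. The names result_labels and expected_labels are only illustrative, and the reset is applied on the test side here instead of inside filter_df, purely for demonstration:

```python
from typing import List

import pandas as pd
from pandas.testing import assert_frame_equal


def filter_df(df: pd.DataFrame, column_name: str, skill: List[str]) -> pd.DataFrame:
    # Same query as in the question; surviving rows keep their original labels.
    return df.query(f"{column_name} in {skill}")


df = pd.DataFrame({"col1": ["sap", "hi", "abc"], "col2": [3, 4, 4]})
expected = pd.DataFrame({"col1": ["hi", "abc"], "col2": [4, 4]})

result = filter_df(df, "col1", ["hi", "abc"])

# Illustrative names: the row labels on each side of the comparison.
result_labels = list(result.index)      # labels carried over from df
expected_labels = list(expected.index)  # fresh default labels

# Dropping the old labels makes the two frames comparable.
assert_frame_equal(result.reset_index(drop=True), expected)
```

The values are the same on both sides; only the row labels differ, which is what the "DataFrame.index are different" message points at.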

Alternatively, if keeping the original index is the intended behavior of the function, change the expected DataFrame so that it has the matching index.
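A sketch of that second option, assuming filter_df stays exactly as in the question. The only change is the explicit index=[1, 2] on the expected frame, which mirrors the positions of the "hi" and "abc" rows in the input:

```python
from typing import List

import pandas as pd
from pandas.testing import assert_frame_equal


def filter_df(df: pd.DataFrame, column_name: str, skill: List[str]) -> pd.DataFrame:
    # Unchanged from the question: the original row labels are preserved.
    return df.query(f"{column_name} in {skill}")


def test_filter_df_keeps_index():
    df = pd.DataFrame({"col1": ["sap", "hi", "abc"], "col2": [3, 4, 4]})
    # The expected frame carries the labels of the rows that survive the filter.
    expected = pd.DataFrame(
        {"col1": ["hi", "abc"], "col2": [4, 4]},
        index=[1, 2],
    )
    assert_frame_equal(filter_df(df, "col1", ["hi", "abc"]), expected)


test_filter_df_keeps_index()
```

Which option fits depends on whether callers of filter_df rely on the original row labels. If they do, asserting on the preserved index documents that behavior in the test.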

Answered By: SiP