Confirming equality of two pandas dataframes?

Question:

How can I assert that the following two dataframes, df1 and df2, are equal?

import pandas as pd
df1 = pd.DataFrame([1, 2, 3])
df2 = pd.DataFrame([1.0, 2, 3])

The output of df1.equals(df2) is False.
So far, I know of two workarounds:

print((df1 == df2).all()[0])

or

df1 = df1.astype(float)
print(df1.equals(df2))

It seems a little bit messy. Is there a better way to do this comparison?

Answers:

Building on @Divakar's idea, NumPy's allclose() handles the numeric columns:

In [128]: df1
Out[128]:
   0    s  n
0  1  aaa  1
1  2  aaa  2
2  3  aaa  3

In [129]: df2
Out[129]:
     0    s    n
0  1.0  aaa  1.0
1  2.0  aaa  2.0
2  3.0  aaa  3.0

In [130]: (np.allclose(df1.select_dtypes(exclude=[object]), df2.select_dtypes(exclude=[object]))
   .....:  &
   .....:  df1.select_dtypes(include=[object]).equals(df2.select_dtypes(include=[object]))
   .....: )
Out[130]: True

select_dtypes() separates the object (string) columns from all other dtypes, so the numeric part can be compared with a tolerance and the string part exactly.
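
The same split can be wrapped in a small function. This is only a sketch: frames_close is a hypothetical helper name, and it selects columns with np.number rather than excluding object, which is an assumption made here so that only truly numeric columns reach allclose(). It also checks column labels and shape first, since allclose() does not compare labels.

```python
import numpy as np
import pandas as pd


def frames_close(left, right):
    """Hypothetical helper: numeric columns compared with a tolerance, the rest exactly."""
    # Split each frame into numeric and non-numeric columns.
    left_num = left.select_dtypes(include=[np.number])
    right_num = right.select_dtypes(include=[np.number])
    left_other = left.select_dtypes(exclude=[np.number])
    right_other = right.select_dtypes(exclude=[np.number])

    # np.allclose ignores labels, so compare column names and shape up front.
    if list(left_num.columns) != list(right_num.columns):
        return False
    if left_num.shape != right_num.shape:
        return False

    numeric_ok = np.allclose(left_num.to_numpy(), right_num.to_numpy())
    return bool(numeric_ok and left_other.equals(right_other))


# Frames shaped like the ones in the session above.
df1 = pd.DataFrame({0: [1, 2, 3], "s": ["aaa"] * 3, "n": [1, 2, 3]})
df2 = pd.DataFrame({0: [1.0, 2.0, 3.0], "s": ["aaa"] * 3, "n": [1.0, 2.0, 3.0]})

result = frames_close(df1, df2)
changed = frames_close(df1, df2.assign(n=[1.0, 2.0, 4.0]))
```

Note that the row index of the numeric part is not compared in this sketch; add an explicit left.index.equals(right.index) check if that matters for your data.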

You can use assert_frame_equal with check_dtype=False, so that the column dtypes are not compared.

# For pandas versions before 0.20.3:
# from pandas.util.testing import assert_frame_equal

from pandas.testing import assert_frame_equal

assert_frame_equal(df1, df2, check_dtype=False)
Answered By: Alexander
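
To see what check_dtype=False changes, here is a minimal sketch using the frames from the question. The variable names strict_passed and values_passed, and the extra frame df3, are made up for illustration. assert_frame_equal raises AssertionError on a mismatch and returns nothing on success, so each call is wrapped in try/except.

```python
import pandas as pd
from pandas.testing import assert_frame_equal

df1 = pd.DataFrame([1, 2, 3])
df2 = pd.DataFrame([1.0, 2, 3])

# Default behaviour (check_dtype=True): int64 vs float64 fails the assertion.
try:
    assert_frame_equal(df1, df2)
    strict_passed = True
except AssertionError:
    strict_passed = False

# Relaxed check: values are compared, column dtypes are ignored.
# No exception is raised here.
assert_frame_equal(df1, df2, check_dtype=False)

# A real difference in values is still caught with check_dtype=False.
df3 = pd.DataFrame([1.0, 2, 4])  # hypothetical frame with one changed value
try:
    assert_frame_equal(df1, df3, check_dtype=False)
    values_passed = True
except AssertionError:
    values_passed = False
```

This form fits naturally into a test suite, because a failing comparison surfaces as an ordinary assertion error with a message describing the mismatch.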