`df.select_dtypes` works with `float` but not `int`

Question:

I just came across this strange behaviour of pd.DataFrame.select_dtypes.

My pd.DataFrame is:

df = pd.DataFrame({'a': [1, 2, 3, 4], 'b': ['a', 'b', 'c', 'd'], 'c': [1.2, 3.4, 5.6, 7.8]})

Now if I want to select the numeric columns, I would do:

df.select_dtypes([int, float])

But the output only contains the float column:

     c
0  1.2
1  3.4
2  5.6
3  7.8

Why is that? I listed both float and int, so why doesn’t it include the integer column?

Here are the dtypes:

>>> df.dtypes
a      int64
b     object
c    float64
dtype: object
>>> 

As you can see, both dtypes end with 64, but only float works.

More tests:

>>> df.select_dtypes(int)
Empty DataFrame
Columns: []
Index: [0, 1, 2, 3]
>>> df.select_dtypes(float)
     c
0  1.2
1  3.4
2  5.6
3  7.8
>>> 

Why does this happen?


I know I could just do:

df.select_dtypes(['int64', 'float64'])

But I want to know the reason for this behavior.

Asked By: U12-Forward

||

Answers:

If you need all integer and all floating-point columns, select by the generic NumPy types instead:

'integer' matches int16, int32 and int64, and the same principle applies to 'floating' for floats:

print (df.select_dtypes(['integer', 'floating']))
   a    c
0  1  1.2
1  2  3.4
2  3  5.6
3  4  7.8
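
To illustrate the point about widths, here is a minimal sketch. The frame `wide` and its column names are made up for this example and are not part of the question. It assumes that the generic names 'integer', 'floating' and 'number' are resolved to the corresponding NumPy abstract types, so every width is matched.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with integer and float columns of several widths.
wide = pd.DataFrame({
    "i16": np.array([1, 2, 3], dtype="int16"),
    "i32": np.array([1, 2, 3], dtype="int32"),
    "i64": np.array([1, 2, 3], dtype="int64"),
    "f32": np.array([1.5, 2.5, 3.5], dtype="float32"),
    "f64": np.array([1.5, 2.5, 3.5], dtype="float64"),
    "txt": ["a", "b", "c"],
})

# 'integer' selects every integer width, 'floating' every float width.
int_cols = list(wide.select_dtypes(["integer"]).columns)
float_cols = list(wide.select_dtypes(["floating"]).columns)

# 'number' selects both groups at once and skips the text column.
numeric_cols = list(wide.select_dtypes(["number"]).columns)

print(int_cols)
print(float_cols)
print(numeric_cols)
```

If the goal is simply "all numeric columns", 'number' is the shortest spelling and does not depend on how the built-in int is interpreted.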

Reason: the NumPy documentation on scalar types contains this warning:

Warning

The int_ type does not inherit from the int built-in under Python 3, because type int is no longer a fixed-width integer type.
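
The warning can be checked directly. This is only a sketch of the class relationships, not a trace of what select_dtypes does internally. Which NumPy dtype the built-in int maps to depends on the platform and the NumPy version, so the last line is printed without an expected value.

```python
import numpy as np

# NumPy integer scalars are not subclasses of the Python built-in int ...
int_is_subclass = issubclass(np.int64, int)

# ... whereas float64 does inherit from the built-in float.
float_is_subclass = issubclass(np.float64, float)

# The abstract NumPy type covers every fixed-width integer.
generic_match = all(
    np.issubdtype(t, np.integer) for t in (np.int16, np.int32, np.int64)
)

print(int_is_subclass)    # False
print(float_is_subclass)  # True
print(generic_match)      # True

# Platform and version dependent: may be int32 or int64.
print(np.dtype(int))
```

If np.dtype(int) prints int32 on your machine while the column is int64, that would be consistent with select_dtypes(int) returning no columns.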

Answered By: jezrael