Python using Pandas – Retrieving the name of all columns that contain numbers

Question:

I searched for a solution on the site, but I couldn’t find anything relevant, only outdated code. I am new to the Pandas library and I have the following dataframe as an example:

A B C D E
142 0.4 red 108 front
164 1.3 green 98 rear
71 -1.0 blue 234 front
109 0.2 black 120 front

I would like to extract the name of the columns that contain numbers (integers and floats). It is completely fine to use the first row to achieve this.
So the result should look like this: ['A', 'B', 'D']

I tried the following command to get some of the columns that contained numbers:

dataframe.loc[0, dataframe.dtypes == 'int64']

Out:
A 142
D 108

There are two problems with this. First of all, I just need the name of the columns, but not the values. Second, this captures only the integer columns. My next attempt just gave an error:

dataframe.loc[0, dataframe.dtypes == 'int64' or dataframe.dtypes == 'float64']
Asked By: Adrian

Answers:

Based on Marcelo’s comment, you can use:

from pandas.api.types import is_numeric_dtype

numeric_columns = []
for column in df.columns:
    if is_numeric_dtype(df[column]):
        numeric_columns.append(column)
print(numeric_columns)
Answered By: Minh-Long Luu
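
For anyone who wants to run this end to end, here is a minimal sketch. It assumes the frame is called df and rebuilds it by hand from the table in the question; the list comprehension is just a compact form of the loop above.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Example frame rebuilt by hand from the table in the question (assumption:
# the real data has the same column types).
df = pd.DataFrame({
    "A": [142, 164, 71, 109],
    "B": [0.4, 1.3, -1.0, 0.2],
    "C": ["red", "green", "blue", "black"],
    "D": [108, 98, 234, 120],
    "E": ["front", "rear", "front", "front"],
})

# Same check as the loop, written as a list comprehension.
numeric_columns = [col for col in df.columns if is_numeric_dtype(df[col])]
print(numeric_columns)  # ['A', 'B', 'D']
```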

You can use .dtype and then .kind to filter the column names with a list comprehension.

# import pandas as pd
# df = pd.read_html('https://stackoverflow.com/questions/75909965')[0] # scraped your q

[c for c in df.columns if df[c].dtype.kind in 'iufc']

should return ['A', 'B', 'D']. Note that 'iufc' covers signed integers (i), unsigned integers (u), real floating-point numbers (f), and complex floating-point numbers (c). Add 'b' if you want to cover Booleans as well, since bool is a subclass of int in Python.

Answered By: Driftr95
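
A small sketch of how the kind codes behave, in case it helps. The frame here is a cut-down stand-in for the one in the question, and the "flag" column is hypothetical, added only to show the effect of including 'b'.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [142, 164],           # integer column, kind 'i'
    "B": [0.4, 1.3],           # float column, kind 'f'
    "C": ["red", "green"],     # text column, not matched by 'iufc'
    "flag": [True, False],     # hypothetical Boolean column, kind 'b'
})

# Map each column name to the one-letter kind code of its dtype.
kinds = {col: df[col].dtype.kind for col in df.columns}

# Without 'b' the Boolean column is skipped; with 'b' it is kept.
without_bool = [c for c in df.columns if df[c].dtype.kind in "iufc"]
with_bool = [c for c in df.columns if df[c].dtype.kind in "iufcb"]

print(without_bool)  # ['A', 'B']
print(with_bool)     # ['A', 'B', 'flag']
```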

Using select_dtypes:

dataframe.select_dtypes('number').columns.tolist()

Output:

['A', 'B', 'D']
Answered By: mozway

Use the expression below. It first selects all numeric columns, then takes their column names, and finally converts them to a list.

df.select_dtypes(include="number").columns.to_list()
Answered By: Sajil Alakkalakath
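
As a rough illustration of what "number" matches in select_dtypes, the sketch below uses a cut-down stand-in for the question's frame plus a hypothetical Boolean column named "flag". Boolean columns are not selected by "number" alone, so they have to be listed explicitly if they are wanted.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [142, 164],
    "B": [0.4, 1.3],
    "C": ["red", "green"],
    "flag": [True, False],  # hypothetical column, not in the question
})

# "number" selects integer and float columns only.
numbers_only = df.select_dtypes(include="number").columns.tolist()

# Pass a list to include Boolean columns as well.
numbers_and_bools = df.select_dtypes(include=["number", "bool"]).columns.tolist()

print(numbers_only)       # ['A', 'B']
print(numbers_and_bools)  # ['A', 'B', 'flag']
```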

Another possible solution:

import re

df.columns[
    [re.match(r'^(int|float)', x.name) is not None for x in df.dtypes]].to_list()

Output:

['A', 'B', 'D']
Answered By: PaulS
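
On the error in the question's second attempt: Python's or tries to reduce each Series to a single True/False, which pandas refuses to do. Element-wise combination needs the | operator with each comparison in parentheses. A minimal sketch, assuming the frame from the question is rebuilt by hand as df:

```python
import pandas as pd

df = pd.DataFrame({
    "A": [142, 164, 71, 109],
    "B": [0.4, 1.3, -1.0, 0.2],
    "C": ["red", "green", "blue", "black"],
    "D": [108, 98, 234, 120],
    "E": ["front", "rear", "front", "front"],
})

is_int = df.dtypes == "int64"
is_float = df.dtypes == "float64"

# `or` asks for the truth value of a whole Series, which raises ValueError.
try:
    combined = is_int or is_float
except ValueError as exc:
    error_message = str(exc)

# `|` combines the two Boolean masks element by element.
mask = is_int | is_float
numeric_columns = df.columns[mask.to_numpy()].tolist()
print(numeric_columns)  # ['A', 'B', 'D']
```

This only matches the two exact dtypes named, so columns stored as int32 or float32 would be missed; the select_dtypes and is_numeric_dtype approaches above do not have that limitation.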