Find row positions and column names of cells containing inf in a pandas DataFrame

Question:

How can I retrieve the column names and the rows of all the cells that contain inf in a multi-column pandas DataFrame df?

I have tried

inds = np.where(np.isinf(df)==True)

but I don't get the expected result.

Asked By: gabboshow


Answers:

row positions:

df.index[np.isinf(df).any(axis=1)]

column names:

df.columns.to_series()[np.isinf(df).any()]

Demo:

In [163]: df
Out[163]:
minor             AAPL                        GS
             Adj Close        Volume   Adj Close     Volume
Date
2017-03-01  139.789993  3.627240e+07  252.710007  5218300.0
2017-03-02  138.960007           inf         inf  3020000.0
2017-03-03  139.779999  2.110810e+07  252.889999  3163700.0
2017-03-06  139.339996           inf         inf  2462300.0
2017-03-07  139.520004  1.726750e+07  250.899994  2414900.0
2017-03-08  139.000000           inf         inf  3574400.0
2017-03-09  138.679993  2.206520e+07  250.179993  3055700.0
2017-03-10  139.139999  1.948800e+07  248.380005  3357800.0
2017-03-13  139.199997  1.704240e+07  248.160004  1782700.0

In [164]: df.index[np.isinf(df).any(1)]
Out[164]: DatetimeIndex(['2017-03-02', '2017-03-06', '2017-03-08'], dtype='datetime64[ns]', name='Date', freq=None)

In [165]: df.columns.to_series()[np.isinf(df).any()]
Out[165]:
minor
AAPL   Volume        (AAPL, Volume)
GS     Adj Close    (GS, Adj Close)
dtype: object
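The two expressions above report which rows and which columns contain inf, but not which individual cells. If exact (row label, column name) pairs are wanted, one possible approach is to take the positional output of np.where and map it back to labels. This is a minimal sketch on a small made-up frame; the data and the variable names (mask, row_pos, col_pos, cells) are illustrative assumptions, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Small illustrative frame (hypothetical data) with two infinite cells.
df = pd.DataFrame(
    {"a": [1.0, np.inf, 3.0], "b": [-np.inf, 2.0, 3.0], "c": [1.0, 2.0, 3.0]},
    index=["x", "y", "z"],
)

# Boolean array marking +inf and -inf cells.
mask = np.isinf(df.to_numpy())

# np.where returns integer positions, not labels, so translate them
# through df.index and df.columns.
row_pos, col_pos = np.where(mask)
cells = list(zip(df.index[row_pos], df.columns[col_pos]))

# Row labels and column names that contain at least one inf.
rows_with_inf = df.index[mask.any(axis=1)]
cols_with_inf = df.columns[mask.any(axis=0)]

print(cells)
print(list(rows_with_inf), list(cols_with_inf))
```

This also suggests why the np.where attempt in the question may have looked wrong: it returns two arrays of integer positions rather than index labels and column names.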

@MaxU's answer is helpful, but it raises an error if the DataFrame has non-numeric columns. To work around that, select the numeric columns first:

numerics = ['int16', 'int32', 'int64', 'float16', 'float32', 'float64']
newdf = df.select_dtypes(include=numerics)
newdf.columns.to_series()[np.isinf(newdf).any()]
Answered By: dberi
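As a self-contained illustration of the numeric-column workaround, the sketch below builds a small mixed-dtype frame and checks only its numeric columns. The frame contents and names (numeric, inf_mask, inf_columns, inf_rows) are assumptions made for the example; it uses include="number" as a shorthand for all numeric dtypes instead of the explicit dtype list.

```python
import numpy as np
import pandas as pd

# Hypothetical mixed-dtype frame: one string column, one int, one float.
df = pd.DataFrame(
    {
        "name": ["a", "b", "c"],
        "count": [1, 2, 3],
        "price": [1.5, np.inf, 2.5],
    }
)

# Keep only numeric columns so np.isinf does not fail on strings.
numeric = df.select_dtypes(include="number")

# Boolean frame marking infinite cells in the numeric columns.
inf_mask = np.isinf(numeric)

# Column names and row labels that contain at least one inf.
inf_columns = numeric.columns[inf_mask.any().to_numpy()].tolist()
inf_rows = numeric.index[inf_mask.any(axis=1).to_numpy()].tolist()

print(inf_columns, inf_rows)
```

Integer columns can never hold inf, so restricting the selection to float columns (for example include="floating") should give the same result with slightly less work.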