Only format df rows that are not NaN

Question

I have the following df:

df = pd.DataFrame({'A': [0.0137, 0.1987, 'NaN', 0.7653]})

Output:

I am trying to format each value in column A as a percentage, e.g. 1.37%, using .iloc (because I have many columns in my actual code).

However, if I run

df.iloc[:, 0] = (df.iloc[:, 0] * 100).astype(float).map('{:,.2f}%'.format)

all the NaN rows receive a trailing '%', yielding 'NaN%'.

So if I try:

df.iloc[:, 0] = df.iloc[:, 0].apply(
        lambda x: (x * 100).astype(float).map('{:,.2f}%'.format) if x.notna()
        else None)

I get IndexError: single positional indexer is out-of-bounds.

How can I properly format every row of my df that is not a NaN?

Note: I'm specifically using df.iloc on the left-hand side of the assignment because I only want to change those columns in place.

Asked By: Luiz Scheuer

Answer 1

Use df.loc to select the rows that are not NA, then apply the logic you have already built.

import numpy as np

# The DataFrame definition stores 'NaN' as a string, so convert it to np.nan first
df.replace('NaN', np.nan, inplace=True)

# Select the rows where the value for A is not NA and
# apply the formatting

df.loc[df['A'].notna(), 'A'] = (df.iloc[:, 0] * 100).astype(float).map('{:,.2f}%'.format)
df

    A
0   1.37%
1   19.87%
2   NaN
3   76.53%
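
Since the question mentions many columns selected by position, the same idea can be sketched for several columns at once. This is only an illustration under assumptions: the columns 'B' and 'label' and the `positions` list are hypothetical and not part of the original data, and pd.to_numeric is swapped in for df.replace to turn the 'NaN' strings into real missing values.

```python
import pandas as pd

# Hypothetical frame: 'B' and 'label' are invented for illustration only.
df = pd.DataFrame({
    'A': [0.0137, 0.1987, 'NaN', 0.7653],
    'B': [0.5, 'NaN', 0.25, 1.0],
    'label': ['w', 'x', 'y', 'z'],
})

positions = [0, 1]  # positional indices of the columns to format (assumed)

for col in df.columns[positions]:
    # Turn 'NaN' strings (and anything else non-numeric) into real NaN
    numeric = pd.to_numeric(df[col], errors='coerce')
    # Format every value, then blank out the rows that were missing
    formatted = (numeric * 100).map('{:,.2f}%'.format)
    df[col] = formatted.where(numeric.notna())

print(df)
```

Assigning with df[col] replaces the whole column, so the untouched columns stay as they are and the missing rows remain NaN rather than becoming 'nan%'.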

Answered By: Naveed

Answer 2

Try this:

df.loc[~df['A'].isna(), 'A'] = (df.loc[~df['A'].isna(), 'A'] * 100).apply('{:,.2f}%'.format)

But be careful: you are storing the NaN value as a string. I recommend using NumPy's np.nan instead. The code should then be:

import pandas as pd
import numpy as np

df = pd.DataFrame({'A': [0.0137, 0.1987, np.nan, 0.7653]})
df.loc[~df['A'].isna(), 'A'] = (df.loc[~df['A'].isna(), 'A'] * 100).apply('{:,.2f}%'.format)
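
As a possible alternative, Series.map accepts na_action='ignore', which passes missing values through without calling the formatter, so no mask is needed. The sketch below assumes the column already holds real np.nan values and uses Python's '%' format type, which multiplies by 100 and appends the percent sign itself. Depending on the pandas version, writing strings into a float64 column through .loc may trigger a dtype warning, so this version replaces the whole column instead.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [0.0137, 0.1987, np.nan, 0.7653]})

col = df.columns[0]  # pick the column by position, as in the question

# '{:,.2%}' scales by 100 and adds '%'; NaN rows are skipped by na_action
df[col] = df[col].map('{:,.2%}'.format, na_action='ignore')

print(df)
```

The column ends up holding strings for the formatted rows and NaN for the missing one, so it is no longer numeric; keep a copy of the original values if further calculations are needed.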

Answered By: ErnestBidouille
