Case-insensitive pandas.Series.replace

Question:

I want to replace some values in categorical data columns with np.nan. What is the best method for replacing values in a case-insensitive manner while maintaining the same categories (in the same order)?

import pandas as pd 
import numpy as np 

# set up a DF with ordered categories
values = ['one','two','three','na','Na','NA']
df = pd.DataFrame({
    'categ' : values
})
df['categ'] = df['categ'].astype('category')
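# reorder the categories to match `values`; assigning to .cat.categories would rename them instead
df['categ'] = df['categ'].cat.set_categories(values)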


# replace values
df['categ'].replace(
    to_replace='na',
    value=np.nan
)
Asked By: filups21


Answers:

One option is to do the replacement before converting to category:

import pandas as pd 
import numpy as np 

# set up a DF with ordered categories
values = ['one','two','three','na','Na','NA']
df = pd.DataFrame({
    'categ' : values
})


df['categ'] = df['categ'].str.lower().replace('na',np.nan)

Output

  categ
0    one
1    two
2  three
3    NaN
4    NaN
5    NaN
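
The snippet above stops before the conversion to category. As a rough sketch of how the two steps might be combined, the following compares against a lowercased copy, so the other values keep their original casing, and then builds the categorical with an explicit category list. The names `is_na`, `cleaned`, `kept` and `categ` are illustrative only, and it assumes the 'na' variants should not remain as categories:

```python
import numpy as np
import pandas as pd

values = ['one', 'two', 'three', 'na', 'Na', 'NA']
s = pd.Series(values)

# Case-insensitive mask built from a lowercased copy;
# the original strings in `s` are left unchanged.
is_na = s.str.lower() == 'na'
cleaned = s.mask(is_na, np.nan)

# Keep the remaining labels in their original order of appearance.
kept = [v for v in values if v.lower() != 'na']
categ = pd.Series(pd.Categorical(cleaned, categories=kept))
```

Passing `categories=kept` explicitly is what fixes the order; without it pandas would sort the labels.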
Answered By: Chris

You can also use a case-insensitive regex flag, like so:

df['categ'].replace(
    to_replace=r'(?i:na)',
    regex=True,
    value=np.nan
)
Answered By: William Rosenbaum