How to replace column values that are not in a particular range with null values using a conditional in Python

Question:

I have a dataframe that contains a column for age. Some of the values are outside my desired range and I want to replace them with null values. Specifically, any age that is not between 20 and 50 should be replaced with a null value.

This is what I tried, but it doesn’t work:

import pandas as pd
import numpy as np

age_range = (df['age'] < 20) | (df['age'] > 50)
df[age_range = np.nan]
Asked By: mb611


Answers:

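This is a simple syntax error: the assignment cannot go inside the square brackets. Use df.loc with the boolean mask to select the out-of-range rows, then assign to them: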

import pandas as pd
import numpy as np

df = pd.DataFrame({'age': [18, 25, 35, 40, 55]})

age_range = (df['age'] < 20) | (df['age'] > 50)
df.loc[age_range, 'age'] = np.nan

print(df)

which gives

    age
0   NaN
1  25.0
2  35.0
3  40.0
4   NaN
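
As a possible variation on the above, the same result can be sketched with Series.between and Series.where. This assumes the same example dataframe; the name in_range is only illustrative. Note that between is inclusive on both ends by default, so 20 and 50 are kept, which matches the < 20 / > 50 condition.

```python
import pandas as pd

# Same example data as above
df = pd.DataFrame({'age': [18, 25, 35, 40, 55]})

# True for ages inside the closed range [20, 50]
in_range = df['age'].between(20, 50)

# where() keeps values where the condition is True and puts NaN elsewhere
df['age'] = df['age'].where(in_range)

print(df)
```

Here the mask describes the values to keep rather than the values to drop, which some people find easier to read.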

You can also do this with np.where, which returns NaN where the condition is true and the original value otherwise:

import pandas as pd
import numpy as np

df = pd.DataFrame({'age': [18, 22, 35, 55, 42]})

df['age'] = np.where((df['age'] < 20) | (df['age'] > 50), np.nan, df['age'])
print(df)

Output:

    age
0   NaN
1  22.0
2  35.0
3   NaN
4  42.0
Answered By: Abdulmajeed
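
Both answers turn the column into floats (25.0, 35.0, ...) because NaN is a float value. If integer ages are preferred, one option is pandas' nullable integer dtype. The following is only a sketch, assuming a reasonably recent pandas version where the 'Int64' dtype is available; the name out_of_range is illustrative.

```python
import pandas as pd

df = pd.DataFrame({'age': [18, 22, 35, 55, 42]})

# True for ages outside the range 20 to 50
out_of_range = (df['age'] < 20) | (df['age'] > 50)

# Convert to the nullable integer dtype, then blank out the masked rows;
# missing entries become pd.NA and the rest stay integers
df['age'] = df['age'].astype('Int64').mask(out_of_range)

print(df)
```

Missing entries are then stored as pd.NA, and isna() and dropna() still work as usual.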