Replace string with np.nan if condition is met
Question:
I am trying to replace a string occurrence in a column if a condition is met.
My sample input dataset:
Series Name     Type
Food            ACG
Drinks          FEG
Food at Home    BON
I want to replace the strings in the Series Name column with NaN or a blank whenever the string in the Type column is either ACG or BON. I tried the following code, which uses conditions, without much success.
Code:
df.loc[((df['Type'] == 'ACG') | (df['Type'] == 'BON')),
df['Series Name'].replace(np.nan)]
Desired output:
Series Name Type
ACG
FEG
Food at Home BON
Answers:
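Since you want to set the whole cell to NaN, use the condition as the row selector in .loc, the column name as the column selector, and assign np.nan directly: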
df.loc[((df['Type'] == 'ACG') | (df['Type'] == 'BON')), 'Series Name'] = np.nan
Output:
Series Name Type
0 NaN ACG
1 Drinks FEG
2 NaN BON
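As a self-contained sketch of the assignment above, the snippet below rebuilds the sample data from the question and covers both outcomes the question mentions (NaN or blank). The names rows, df_nan and df_blank are illustrative only, and isin is used here as a shorter equivalent of the two == comparisons joined with |.

```python
import numpy as np
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'Series Name': ['Food', 'Drinks', 'Food at Home'],
    'Type': ['ACG', 'FEG', 'BON'],
})

# Boolean selector: True where Type is ACG or BON
rows = df['Type'].isin(['ACG', 'BON'])

# Variant 1: set the selected cells to NaN
df_nan = df.copy()
df_nan.loc[rows, 'Series Name'] = np.nan

# Variant 2: set the selected cells to an empty string instead
df_blank = df.copy()
df_blank.loc[rows, 'Series Name'] = ''

print(df_nan)
print(df_blank)
```

Working on copies keeps the original df intact, so both variants can be compared side by side.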
Update:
Regarding your question in the comments: if you only want to change part of the string, you can use replace like this:
# new input
df = pd.DataFrame({
    'Series Name': ['Food to go', 'Fast Food', 'Food at Home'],
    'Type': ['ACG', 'FEG', 'BON']
})
Series Name Type
0 Food to go ACG
1 Fast Food FEG
2 Food at Home BON
mask = df['Type'].isin(['ACG', 'BON'])
df.loc[mask, 'Series Name'] = (df.loc[mask, 'Series Name']
                               .replace(to_replace="Food", value='NEWVAL', regex=True))
print(df)
Series Name Type
0 NEWVAL to go ACG
1 Fast Food FEG
2 NEWVAL at Home BON
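The regex=True argument matters here. A small sketch of the difference, using a made-up Series s (not part of the original answer): without regex=True, Series.replace only matches entire cell values, so substrings are left alone. Series.str.replace with regex=False is shown as an alternative for literal substring replacement.

```python
import pandas as pd

# Hypothetical data for illustration
s = pd.Series(['Food to go', 'Fast Food', 'Food'])

# Without regex=True, only cells that equal 'Food' exactly are replaced
whole = s.replace(to_replace='Food', value='NEWVAL')

# With regex=True, 'Food' is treated as a pattern and replaced inside each string
partial = s.replace(to_replace='Food', value='NEWVAL', regex=True)

# Literal substring replacement through the .str accessor
literal = s.str.replace('Food', 'NEWVAL', regex=False)

print(whole.tolist())
print(partial.tolist())
print(literal.tolist())
```

If the text to replace contains regex metacharacters such as . or (, the .str.replace(..., regex=False) form avoids having to escape them.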
Another option is to use Series.mask:
mask = df['Type'].isin(['ACG', 'BON'])
df['Series Name'] = df['Series Name'].mask(mask)
Output:
Series Name Type
0 NaN ACG
1 Drinks FEG
2 NaN BON
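Series.mask also accepts a replacement value as its second argument, which covers the "blank" case from the question. The sketch below rebuilds the question's sample data; the names blank and kept are illustrative only. Series.where is the mirror image of mask: it keeps values where the condition is True, so the condition is inverted with ~.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    'Series Name': ['Food', 'Drinks', 'Food at Home'],
    'Type': ['ACG', 'FEG', 'BON'],
})
mask = df['Type'].isin(['ACG', 'BON'])

# Replace masked cells with an empty string instead of the default NaN
blank = df['Series Name'].mask(mask, '')

# Equivalent NaN result via where: keep values where the condition holds
kept = df['Series Name'].where(~mask)

print(blank.tolist())
print(kept.isna().tolist())
```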