Apply a condition to multiple columns to replace the
Question:
I have a data frame with the columns below. Some of the IGFREAS columns already have data in them, but I'll only be replacing the data that doesn't exist, based on these conditions (see screenshot).
Where EV_EM == 0, I want to copy EV_RND into all of the IGFREAS4X columns.
I got this to work for one column:
df["IGFREAS41"]=np.where(df['EV_EM'] == 0, df['EV_RND'], df["IGFREAS41"])
I tried this:
columns = ["IGFREAS41", "IGFREAS43", "IGFREAS44", "IGFREAS42"]
np.where(df['EV_EM'] == 0, df['EV_RND'], df[columns])
I got this error:
---> 13 np.where(df['EV_EM'] == 0, df['EV_RND'], df[columns])
File <__array_function__ internals>:180, in where(*args, **kwargs)
ValueError: operands could not be broadcast together with shapes (13,) (13,) (13,4)
Answers:
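If I understand the problem correctly, the following works:
import pandas as pd

# Sample data: EV_EM is the flag, EV_RND the value to copy
data = {
    "EV_EM": [0, 1, 0, 0, 1, 0, 0, 1, 1],
    "EV_RND": ["EM Not Available", "", "EM Not Available1", "EM Not Available2", "", "EM Not Available3", "EM Not Available4", "", ""],
    "IGFREAS41": ["", "", "", "", "", "", "", "", ""],
    "IGFREAS42": ["", "", "", "", "", "", "", "", ""],
    "IGFREAS43": ["", "", "", "", "", "", "", "", ""],
}
df = pd.DataFrame(data)

# Rows where the condition holds
mask = df["EV_EM"] == 0
# Every target column, selected by its name prefix
cols_to_fill = [x for x in df.columns if x.startswith("IGFREAS")]
# Copy EV_RND into all target columns for the masked rows
df.loc[mask, cols_to_fill] = df["EV_RND"]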
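As for why the original np.where call failed: the condition and EV_RND are 1-D with shape (13,), while df[columns] is 2-D with shape (13, 4), and NumPy cannot line those shapes up. A minimal sketch of the np.where route, assuming you would rather stay with NumPy, is to reshape the two 1-D inputs to (n, 1) so they broadcast across the columns. The small frame and the "keep41"/"keep42"/"old" values below are made up for illustration and are not from the question.

```python
import numpy as np
import pandas as pd

# Made-up sample frame; values are illustrative only.
df = pd.DataFrame({
    "EV_EM": [0, 1, 0],
    "EV_RND": ["EM Not Available", "", "EM Not Available1"],
    "IGFREAS41": ["", "keep41", ""],
    "IGFREAS42": ["", "keep42", "old"],
})
columns = ["IGFREAS41", "IGFREAS42"]

# Reshape the 1-D condition and source from (n,) to (n, 1) so they
# broadcast against the (n, len(columns)) block of target columns.
cond = (df["EV_EM"] == 0).to_numpy()[:, None]
source = df["EV_RND"].to_numpy()[:, None]

# Where cond is True take EV_RND, otherwise keep the existing value.
df[columns] = np.where(cond, source, df[columns].to_numpy())
print(df)
```

Like the single-column version in the question, this overwrites every target cell in the rows where EV_EM == 0 and leaves the other rows untouched.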