Applying a function to columns within a dataframe based on a condition
Question:
The goal of the code below is to loop through five columns of data and apply a function to each row of a column, choosing the function according to the value in the ‘Condition’ column for that row.
I have a data frame in which I have set up a column as the result of a condition:
def WanCon(y):
    if (30 <= y <= 48):
        return 1
    else:
        return 2
time = four70['WY WEEK']
Wandata = pd.DataFrame(time)
Wandata['Condition'] = list(map(WanCon,Wandata['WY WEEK']))
I have then copied the values of five columns into the new data frame Wandata, where objects a-e are the x values used in my functions below.
name = 'WANAP'
a = four40[name]
b = eight40[name]
c = four70[name]
d = eight70[name]
e = Historical[name]
Wandata[['440','840','470','870','Historical']] = [a,b,c,d,e]
Wandata.replace("",float('NaN'),inplace=True)
I then have two formulas I would like to apply to the newly added columns based on the above condition: for any row where Condition is 1, Waneq1() is applied; otherwise Waneq2() is applied:
def Waneq1(x):
    return((model1.c[0]*x**3)+(model1.c[1]*x**2)+(model1.c[2]*x)+(model1.c[3]))
def Waneq2(x):
    return((model2.c[0]*x**3)+(model2.c[1]*x**2)+(model2.c[2]*x)+(model2.c[3]))
for column in Wandata[['440','840','470','870','Historical']]:
    if Wandata['Condition']==1:
        Waneq1()
    else:
        Waneq2()
I’m fairly new to Python and this is the farthest I have gotten. I am wondering if anyone knows of a better way to achieve this, as step three is fairly challenging for me and I’ve hit a roadblock.
Answers:
It is unclear what specifically you are trying to do (we also don’t know what the four40 etc. objects are), but the right way to apply conditions on a pandas dataframe is generally:
x = Wandata['440']  # the input column; repeat for each of the five data columns
Wandata['output column'] = (model2.c[0]*x**3)+(model2.c[1]*x**2)+(model2.c[2]*x)+(model2.c[3])  # The default value of the column
Wandata.loc[Wandata['Condition'] == 1, 'output column'] = (model1.c[0]*x**3)+(model1.c[1]*x**2)+(model1.c[2]*x)+(model1.c[3])  # The value if the condition is true
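For anyone who wants to try this pattern end to end, here is a minimal sketch. It assumes model1 and model2 are numpy.poly1d cubics (the question only shows that they have a .c attribute), and the coefficients, the sample data and the '440_out' column name are made up for illustration.

```python
import numpy as np
import pandas as pd

# Assumption: model1/model2 are numpy.poly1d objects; coefficients are invented.
model1 = np.poly1d([1.0, 0.0, 0.0, 0.0])   # x**3
model2 = np.poly1d([0.5, 0.0, 0.0, 1.0])   # 0.5*x**3 + 1

def Waneq1(x):
    return (model1.c[0]*x**3) + (model1.c[1]*x**2) + (model1.c[2]*x) + model1.c[3]

def Waneq2(x):
    return (model2.c[0]*x**3) + (model2.c[1]*x**2) + (model2.c[2]*x) + model2.c[3]

# Hypothetical stand-in for the real Wandata frame.
Wandata = pd.DataFrame({
    'Condition': [1, 2, 1, 2],
    '440': [1.0, 2.0, 3.0, 4.0],
})

x = Wandata['440']
# Start with the "else" formula for every row...
Wandata['440_out'] = Waneq2(x)
# ...then overwrite only the rows where Condition is 1.
# The right-hand side is a full Series; .loc aligns it on the index.
mask = Wandata['Condition'] == 1
Wandata.loc[mask, '440_out'] = Waneq1(x)

print(Wandata)
```

Both formulas are evaluated on whole columns at once, so no Python-level loop over rows is needed.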
I was able to find a solution to my problem by using .apply() with a lambda expression on the df.
The generalized solution is as below:
df['f(x)'] = df.apply(lambda r: function1(r['x']) if r['Condition'] == 1 else function2(r['x']), axis=1)
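Since the question asks about several columns, one possible way to extend this is to loop over the column names and pick between the two results with numpy.where, which avoids calling a Python function once per row. This is only a sketch: function1 and function2 are stood in for by made-up numpy.poly1d cubics, and the sample data and the '_fx' suffix are assumptions, not part of the original problem.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for function1/function2; poly1d objects are callable.
function1 = np.poly1d([1.0, 0.0, 0.0, 0.0])   # x**3
function2 = np.poly1d([0.5, 0.0, 0.0, 1.0])   # 0.5*x**3 + 1

# Invented sample frame with two of the five data columns.
df = pd.DataFrame({
    'Condition': [1, 2, 1],
    '440': [1.0, 2.0, 3.0],
    '840': [2.0, 4.0, np.nan],
})

use_first = (df['Condition'] == 1).to_numpy()

for col in ['440', '840']:
    x = df[col].to_numpy()
    # Evaluate both formulas on the whole column, then choose per row.
    df[col + '_fx'] = np.where(use_first, function1(x), function2(x))

print(df)
```

Missing values simply propagate as NaN in the output columns. For small frames the .apply() version above is perfectly fine; the difference only matters when the number of rows gets large.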