How to replace values within each group using per-group quantile values in Python
Question:
I have a df as shown below:
>>> df.head()
group_type value
G1 125.23
G1 107.19
G1 117.37
G1 102.68
G2 185.58
G1 82.31
G2 21.82
G2 168.21
G2 134.17
G1 71.45
I have calculated the quantile values within each group as below:
>>> lowtail = df.groupby('group_type')['value'].quantile(0.25)
>>> lowtail
group_type
G1 103.8075
G2 59.0425
Name: value, dtype: float64
>>> hightail = df.groupby('group_type')['value'].quantile(0.75)
>>> hightail
group_type
G1 123.2650
G2 172.5525
Name: value, dtype: float64
Now I need to replace value in df within each group_type with the calculated quantile values, lowtail and hightail, based on these conditions:
- if value is less than the lowtail value of its group_type, replace it with that lowtail value
- if value is greater than the hightail value of its group_type, replace it with that hightail value
The desired output looks like:
group_type value new_value
G1 125.23 123.2650
G1 107.19 107.19
G1 117.37 117.37
G1 102.68 103.8075
G2 185.58 172.5525
G1 82.31 103.8075
G2 21.82 59.0425
G2 168.21 168.21
G2 134.17 134.17
G1 71.45 103.8075
I am able to do a simple replace with fixed values:
df.loc[df['value'] < lowtail, ['value']] = lowtail
but I could not apply the condition and the replacement per group using groupby. Can anyone help here?
Answers:
Is this what you are looking for?
grp = df.groupby('group_type')['value']
low, high = grp.quantile(0.25), grp.quantile(0.75)

def f(x):
    # x is one row of df; look up the quantiles by the row's group_type
    # (x.name would be the row's index label, not its group)
    g = x['group_type']
    if x['value'] < low[g]:
        return low[g]
    elif x['value'] > high[g]:
        return high[g]
    else:
        return x['value']

df['new_value'] = df.apply(f, axis=1)
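A row-wise apply can be slow on large frames. As an alternative (not part of the answer above), here is a minimal vectorized sketch. It assumes the ten rows shown in the question are the whole frame, so the quantiles it computes will not necessarily match the ones quoted in the question. It broadcasts each group's quantiles back onto the rows with transform and then limits the values with Series.clip:

```python
import pandas as pd

# Sample frame rebuilt from the ten rows printed in the question (assumption:
# these rows are the complete data for this sketch).
df = pd.DataFrame({
    "group_type": ["G1", "G1", "G1", "G1", "G2", "G1", "G2", "G2", "G2", "G1"],
    "value": [125.23, 107.19, 117.37, 102.68, 185.58,
              82.31, 21.82, 168.21, 134.17, 71.45],
})

grp = df.groupby("group_type")["value"]

# transform returns a Series aligned with df's index, so every row
# carries the quantile of its own group.
low = grp.transform(lambda s: s.quantile(0.25))
high = grp.transform(lambda s: s.quantile(0.75))

# clip accepts Series bounds and applies them row by row:
# values below `low` become `low`, values above `high` become `high`.
df["new_value"] = df["value"].clip(lower=low, upper=high)
```

Values that already lie between the two quantiles of their group are left unchanged, which matches the else branch of the row-wise function.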