Perform a mathematical operation based on a label in a dataframe

Question:

I have a dataframe like this:

import pandas as pd

data = {'label': ['y', 'x', 'z', 'y', 'z', 'x'],
        'x_score': [0.35, 0.7, 0.05, 0.12, 0.2, 0.9],
        'y_score': [0.6, 0.2, 0.45, 0.58, 0.3, 0.05],
        'z_score': [0.05, 0.1, 0.5, 0.3, 0.5, 0.05]}

df = pd.DataFrame(data)

For each row, one of three operations should be applied depending on the value in the label column, and the outcome stored in a separate column, say result:

  • If the label is x, then x_score is stored in the result column as is.
  • If the label is z, then the negative of z_score, i.e. -1 × z_score, is stored in the result column.
  • If the label is y, then (x_score – y_score) is stored in the result column.

The output should look like this:

  label  x_score  y_score  z_score  result
0     y     0.35     0.60     0.05    0.30
1     x     0.70     0.20     0.10    0.70
2     z     0.05     0.45     0.50   -0.50
3     y     0.12     0.58     0.30   -0.18
4     z     0.20     0.30     0.50   -0.50
5     x     0.90     0.05     0.05    0.90

Please help me with this.

Asked By: Starlord22


Answers:

You can use np.select:

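import numpy as np

label = df['label']
# One boolean mask per label, paired by position with the value to use
condlist = [label.eq('x'), label.eq('z'), label.eq('y')]
choicelist = [df['x_score'], -df['z_score'], df['x_score'] - df['y_score']]

# Rows whose label matches none of the conditions get NaN
df['result'] = np.select(condlist, choicelist, default=np.nan)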

Output:

  label  x_score  y_score  z_score  result
0     y     0.35     0.60     0.05   -0.25
1     x     0.70     0.20     0.10    0.70
2     z     0.05     0.45     0.50   -0.50
3     y     0.12     0.58     0.30   -0.46
4     z     0.20     0.30     0.50   -0.50
5     x     0.90     0.05     0.05    0.90
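
For reference, here is the same approach as a self-contained sketch that can be run as is. The helper name add_result is only illustrative and is not part of pandas or NumPy. It returns a copy instead of modifying the input, which is a design choice of this sketch rather than something the question requires.

```python
import numpy as np
import pandas as pd


def add_result(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with a 'result' column derived from 'label'.

    add_result is a hypothetical helper name used only for this sketch.
    """
    label = df['label']
    # condlist[i] selects the rows that take their value from choicelist[i]
    condlist = [label.eq('x'), label.eq('z'), label.eq('y')]
    choicelist = [
        df['x_score'],                   # label x: x_score unchanged
        -df['z_score'],                  # label z: negative of z_score
        df['x_score'] - df['y_score'],   # label y: x_score minus y_score
    ]
    out = df.copy()
    # Rows whose label matches no condition fall back to NaN
    out['result'] = np.select(condlist, choicelist, default=np.nan)
    return out


data = {'label': ['y', 'x', 'z', 'y', 'z', 'x'],
        'x_score': [0.35, 0.7, 0.05, 0.12, 0.2, 0.9],
        'y_score': [0.6, 0.2, 0.45, 0.58, 0.3, 0.05],
        'z_score': [0.05, 0.1, 0.5, 0.3, 0.5, 0.05]}

df = add_result(pd.DataFrame(data))
print(df)
```

np.select checks the conditions in order and, for each row, takes the value from the first choice whose condition is True. The labels here are mutually exclusive, so the order of the three conditions does not change the result. Note that the sketch follows the rule as stated in the question (x_score – y_score for label y), which gives -0.25 and -0.46 for rows 0 and 3; the values 0.30 and -0.18 in the question's expected table correspond to x_score – z_score instead.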
Answered By: Vladimir Fokow