Conditional calculation python

Question:

A dataframe is given, and a calculation shall be done as stated below:

DF as-is

enter image description here

DF to-be

enter image description here

Column Epsilon shall be calculated:

  • Where beta = same values, the two rows belong together. (10 -> 1,1 or 20 -> 2,2)
  • Column epsilon of such a pair is the sum of the 2 values (12+5=17; 13+6=19)
  • Print the result (17, 19) in the row of the pair (1,1 or 2,2) where delta = yes
  • Set epsilon of the row to NULL where delta = no
    row 50 is unchanged, since it does not belong to any pair.

I hope i was clear 🙂 if not, please ask me. Tnx for any input!

Asked By: hugo999

||

Answers:

You can use a boolean mask. Sort values by delta column then keep the track of duplicated beta values.

m = df.assign(delta=df['delta'].eq('x')).sort_values('delta').duplicated('beta')
epsilon = df.groupby(['beta'])['epsilon'].transform(sum)

df.loc[~m, 'epsilon'] = epsilon
df.loc[m, 'epsilon'] = 0

Output:

>>> df
   alpha  beta gamma delta  epsilon
0     10     1     x             17
1     20     1           x        0
2     30     2     x             19
3     40     2           x        0
4     50     3                   18

Input dataframe:

data = {'alpha': [10, 20, 30, 40, 50],
        'beta': [1, 1, 2, 2, 3],
        'gamma': ['x', '', 'x', '', ''],
        'delta': ['', 'x', '', 'x', ''],
        'epsilon': [12, 5, 13, 6, 18]}
df = pd.DataFrame(data)
print(df)

# Output
   alpha  beta gamma delta  epsilon
0     10     1     x             12
1     20     1           x        5
2     30     2     x             13
3     40     2           x        6
4     50     3                   18
Answered By: Corralien
Categories: questions Tags: , , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.