change values in dataframe row based on condition

Question:

I have this dataframe

    Region      2021    2022    2023
0   Europe      0.00    0.00    0.00
1   N.Amerca    0.50    0.50    0.50
2   N.Amerca    4.40    4.40    4.40
3   N.Amerca    0.00    8.00    8.00
4   Asia        0.00    0.00    1.75
5   Asia        0.00    0.00    0.00
6   Asia        0.00    0.00    2.00
7   N.Amerca    0.00    0.00    0.50
8   Eurpoe      6.00    6.00    6.00
9   Asia        7.50    7.50    7.50
10  Asia        3.75    3.75    3.75
11  Asia        3.50    3.50    3.50
12  Asia        3.80    3.80    3.80
13  Asia        0.00    0.00    0.00
14  Europe      6.52    6.52    6.52

Once a value in 2021 is found it should carry a 0 to the rest (2022 and 2023)
and if a value in 2022 is found -it should carry 0 to the rest. In other words, once value in found in columns 2021 and forth it should zero the rest on the right.

expected result would be:

    Region      2021    2022    2023
0   Europe      0.00    0.00    0.00
1   N.Amerca    0.50    0.00    0.00
2   N.Amerca    4.40    0.00    0.00
3   N.Amerca    0.00    8.00    0.00
4   Asia        0.00    0.00    1.75
5   Asia        0.00    0.00    0.00
6   Asia        0.00    0.00    2.00
7   N.Amerca    0.00    0.00    0.50
8   Eurpoe      6.00    0.00    0.00
9   Asia        7.50    0.00    0.00
10  Asia        3.75    0.00    0.00
11  Asia        3.50    0.00    0.00
12  Asia        3.80    0.00    0.00
13  Asia        0.00    0.00    0.00
14  Europe      6.52    0.00    0.00

I have tried to apply a lambda:

def foo(r):
   #if r['2021')>0: then 2020 and forth should be zero) 

df = df.apply(lambda x: foo(x), axis=1)

but the challange is that there are 2021 – to 2030 and the foo becomes a mess)

Asked By: ProcolHarum

||

Answers:

Let us try duplicated

df = df.mask(df.T.apply(pd.Series.duplicated).T,0)
Out[57]: 
      Region  2021  2022  2023
0     Europe  0.00   0.0  0.00
1   N.Amerca  0.50   0.0  0.00
2   N.Amerca  4.40   0.0  0.00
3   N.Amerca  0.00   8.0  0.00
4       Asia  0.00   0.0  1.75
5       Asia  0.00   0.0  0.00
6       Asia  0.00   0.0  2.00
7   N.Amerca  0.00   0.0  0.50
8     Eurpoe  6.00   0.0  0.00
9       Asia  7.50   0.0  0.00
10      Asia  3.75   0.0  0.00
11      Asia  3.50   0.0  0.00
12      Asia  3.80   0.0  0.00
13      Asia  0.00   0.0  0.00
14    Europe  6.52   0.0  0.00
Answered By: BENY

This is another way:

df2 = df.set_index('Region').diff(axis=1).reset_index()
df2['2021'] = df['2021']

or:

df.iloc[:,1:].where(df.iloc[:,1:].ne(0).cumsum(axis=1).eq(1),0)

Output:

    2021  2022  2023
0   0.00   0.0  0.00
1   0.50   0.0  0.00
2   4.40   0.0  0.00
3   0.00   8.0  0.00
4   0.00   0.0  1.75
5   0.00   0.0  0.00
6   0.00   0.0  2.00
7   0.00   0.0  0.50
8   6.00   0.0  0.00
9   7.50   0.0  0.00
10  3.75   0.0  0.00
11  3.50   0.0  0.00
12  3.80   0.0  0.00
13  0.00   0.0  0.00
14  6.52   0.0  0.00
Answered By: rhug123
Categories: questions Tags: ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.