How to create a conditional column in Python that picks data from a different column based on the year?

Question:

My dataset looks something like this:

Product ID  Sales Year  Sales_percentage_2016  Sales_percentage_2017  Sales_percentage_2018
         1        2016                      5                      8                      5
         2        2017                      7                      9                      6

Output should be something like this:

Product ID  Sales Year  Sales_Percentage
         1        2016                 5
         2        2017                 9

That way the unwanted data can be removed and only the relevant data kept.

Asked By: user17475933

Answers:

You can use an indexing lookup, picking one value per row by position:

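import numpy as np
import pandas as pd

# map each row's year to an integer position (idx) among the unique years (cols)
idx, cols = pd.factorize(df['Sales Year'])
# build the matching column names, e.g. 'Sales_percentage_2016'
cols = 'Sales_percentage_' + cols.astype(str)
# select those columns, then pick one value per row by position
df['Sales'] = df.reindex(cols, axis=1).to_numpy()[np.arange(len(df)), idx]
# drop the per-year columns that are no longer needed
df = df.filter(regex=r'^(?!Sales_percentage_)')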

output:

   Product ID  Sales Year  Sales
0           1        2016      5
1           2        2017      9
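
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same lookup. The sample frame is rebuilt from the question, and the helper name pick_by_year, its parameters, and the output column name Sales_Percentage are illustrative assumptions rather than part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (assumed layout of the real frame).
df = pd.DataFrame({
    "Product ID": [1, 2],
    "Sales Year": [2016, 2017],
    "Sales_percentage_2016": [5, 7],
    "Sales_percentage_2017": [8, 9],
    "Sales_percentage_2018": [5, 6],
})


def pick_by_year(frame, year_col="Sales Year",
                 prefix="Sales_percentage_", out_col="Sales_Percentage"):
    """Hypothetical helper: keep only the per-year value matching each row's year."""
    # idx[i] is the position of row i's year within the unique years
    idx, years = pd.factorize(frame[year_col])
    # column names for the unique years, in the same order as `years`
    cols = prefix + years.astype(str)
    # 2-D array with one column per unique year
    # (a year with no matching column would yield NaN here)
    values = frame.reindex(cols, axis=1).to_numpy()
    # keep every column that does not start with the prefix
    result = frame.loc[:, ~frame.columns.str.startswith(prefix)].copy()
    # for each row, take the value from the column of that row's year
    result[out_col] = values[np.arange(len(frame)), idx]
    return result


out = pick_by_year(df)
print(out)
```

With the sample data, this should give the two rows asked for in the question: 5 for product 1 (2016) and 9 for product 2 (2017). Returning a new frame instead of modifying df in place is only a design choice for the sketch.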
Answered By: mozway