One-hot encoding a categorical column using the values of another DataFrame column (instead of the value 1)

Question:

(This is my first question on Stack Overflow, so please be indulgent.)

I am coding an ANN on a dataset that contains, among others, the following columns:

[... , 'labels_column', 'Content %']

I would like labels_column to be encoded as numeric columns (as with a OneHotEncoder, which I am using now), but with the values taken from the 'Content %' column instead of 1.

For example:

labels_column  Content %
label_1        37
label_2        24
label_3        12
label_2        60

After the transform, this should become:

label_1  label_2  label_3
37       0        0
0        24       0
0        0        12
0        60       0

And not:

label_1  label_2  label_3  Content %
1        0        0        37
0        1        0        24
0        0        1        12
0        1        0        60

I haven’t managed to do it yet with masks or other tricks…

Thanks a lot for your help!

Asked By: Alex


Answers:

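You could use a broadcasting trick: multiply the one-hot columns by the 'Content %' values, taken as a column vector so that each row is scaled by its own value:

import pandas as pd

df = pd.DataFrame({'labels_column': ['label_1','label_2','label_3','label_2'],
                   'Content %': [37, 24, 12, 60]})

# df[['Content %']].values has shape (4, 1), so it broadcasts across the dummy columns
pd.get_dummies(df['labels_column']) * df[['Content %']].values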
Answered By: Z Li
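
For reference, the same idea can be written with index-aligned multiplication instead of a raw NumPy array. This is only a sketch of one possible variant, not part of the original answer: the names `dummies`, `weighted`, and `result` are illustrative, and `dtype=int` is passed on the assumption that integer output is wanted (recent pandas versions return boolean dummies by default).

```python
import pandas as pd

df = pd.DataFrame({
    "labels_column": ["label_1", "label_2", "label_3", "label_2"],
    "Content %": [37, 24, 12, 60],
})

# One-hot encode the labels as integers rather than booleans.
dummies = pd.get_dummies(df["labels_column"], dtype=int)

# Scale each row of the dummies by that row's 'Content %' value.
# axis=0 matches the Series to the DataFrame on the row index.
weighted = dummies.mul(df["Content %"], axis=0)

# Optionally put the weighted columns back next to any remaining columns.
result = df.drop(columns=["labels_column", "Content %"]).join(weighted)

print(weighted)
```

Because `mul(..., axis=0)` aligns on the row index, this variant should keep working if the rows have been filtered or reordered, as long as both pieces come from the same DataFrame.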