New dataframe column using existing column names as values
Question:
I have the following pandas dataframe:
ID | Class | LR | XG | SV | BEST_R2 |
---|---|---|---|---|---|
1 | Class1 | .76 | .78 | .99 | .99 |
2 | Class2 | .92 | .89 | .91 | .92 |
3 | Class3 | .87 | .95 | .87 | .95 |
This is a dataframe with the R2 of each of a series of machine learning models (LR/XG/SV) for each ID. The column "BEST_R2" represents the best R2 score for that ID across models (.max(axis=1)). I need another column with the name of the model that achieved the best score, as in the dataframe below. Any tips on how to achieve this programmatically?
ID | Class | LR | XG | SV | BEST_R2 | BEST MODEL |
---|---|---|---|---|---|---|
1 | Class1 | .76 | .78 | .99 | .99 | SV |
2 | Class2 | .92 | .89 | .91 | .92 | LR |
3 | Class3 | .87 | .95 | .87 | .95 | XG |
Answers:
Assuming that ID is the index and the remaining columns are all numeric (the result below omits the string column Class), you can do
df["Best Model"] = df.idxmax(axis=1)
Result:
LR XG SV BEST_R2 Best Model
ID
1 0.76 0.78 0.99 0.99 SV
2 0.92 0.89 0.91 0.92 LR
3 0.87 0.95 0.87 0.95 XG
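If the frame still contains the Class column, or you would rather not depend on BEST_R2 sitting after the model columns, one option is to restrict both reductions to the model columns explicitly. The following is a minimal sketch that rebuilds the question's data; the `model_cols` list is an assumption made for illustration and is not part of the original answer.

```python
import pandas as pd

# Rebuild the example frame from the question, with ID as the index.
df = pd.DataFrame(
    {
        "ID": [1, 2, 3],
        "Class": ["Class1", "Class2", "Class3"],
        "LR": [0.76, 0.92, 0.87],
        "XG": [0.78, 0.89, 0.95],
        "SV": [0.99, 0.91, 0.87],
    }
).set_index("ID")

# Assumed for illustration: the columns that hold per-model R2 scores.
model_cols = ["LR", "XG", "SV"]

# Row-wise maximum score, and the label of the column where it occurs.
df["BEST_R2"] = df[model_cols].max(axis=1)
df["BEST MODEL"] = df[model_cols].idxmax(axis=1)

print(df)
```

Selecting `df[model_cols]` first keeps non-numeric columns such as Class out of the comparison. When two models tie on a row, `idxmax` reports the first one in column order.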