Python pandas – convert multiple columns to row

Question:

I have a table like this

id  non-binary  female  male   others
1   True        False   False  False
2   False       True    False  False
3   False       False   True   False
4   True        True    True   False
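
For anyone who wants to reproduce this, a minimal sketch that rebuilds the table as a DataFrame. The variable name df is an assumption chosen to match the answers below, and id is kept as an ordinary column here.

```python
import pandas as pd

# Sample data copied from the table above; "id" is a regular column.
df = pd.DataFrame(
    {
        "id": [1, 2, 3, 4],
        "non-binary": [True, False, False, True],
        "female": [False, True, False, True],
        "male": [False, False, True, True],
        "others": [False, False, False, False],
    }
)

print(df)
```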

I want to collapse these columns into a single column, like the table below.
If more than one column is True, only the first True column should be used. E.g., for id=4, the first True is in column "non-binary", so the gender value in the new table should be non-binary.
How can I convert it to the table below using pandas?

id  Gender
1   non-binary
2   female
3   male
4   non-binary
Asked By: Lychee


Answers:

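Since boolean values are numeric (True == 1, False == 0), you can use idxmax to get the index label of the maximum value. Applied along the column axis, it returns the name of the first column in which True appears, because idxmax reports the first occurrence of the maximum. Note that a row with no True at all would also return the first column, so this assumes every row has at least one True.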

# set your index to the id column if it is not already
df = df.set_index('id') 

out = df.idxmax(axis='columns').rename('Gender')

print(out)
id
1    non-binary
2        female
3          male
4    non-binary
Name: Gender, dtype: object


# Can use .reset_index() or .to_frame() to convert back to a DataFrame 
#   (depending on what you want to do with your Index)
print(out.reset_index())
   id      Gender
0   1  non-binary
1   2      female
2   3        male
3   4  non-binary
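
One edge case worth sketching: idxmax returns the first column even for a row that is entirely False. The example below adds a hypothetical row id=5 (not in the original data) to show this, and masks such rows with where so they come out as missing instead. Treat it as a sketch under those assumptions, not the only way to handle it.

```python
import pandas as pd

# Sample data from the question plus a hypothetical all-False row (id=5).
df = pd.DataFrame(
    {
        "id": [1, 2, 3, 4, 5],
        "non-binary": [True, False, False, True, False],
        "female": [False, True, False, True, False],
        "male": [False, False, True, True, False],
        "others": [False, False, False, False, False],
    }
).set_index("id")

# idxmax picks the first column holding the row maximum, so an all-False
# row still yields the first column name.
first_true = df.idxmax(axis="columns")

# Keep the label only where the row has at least one True; other rows
# become missing values.
out = first_true.where(df.any(axis="columns")).rename("Gender")

print(out)
```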
Answered By: Cameron Riddell

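I haven’t tested this, but pandas 1.5.0 added a function called from_dummies. It expects id to be the index and exactly one True per row, so a row with several True values (such as id=4) would raise a ValueError rather than pick the first one.

Try this:

pd.from_dummies(df.astype(int))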
Answered By: SomeDude

What about just using .loc:

df2 = pd.DataFrame({'id': df['id']})
# Each condition excludes all earlier columns, so the first True column wins
df2.loc[df['non-binary'], 'Gender'] = 'non-binary'
df2.loc[~df['non-binary'] & df['female'], 'Gender'] = 'female'
df2.loc[~df['non-binary'] & ~df['female'] & df['male'], 'Gender'] = 'male'
df2.loc[~df[['non-binary', 'female', 'male']].any(axis=1) & df['others'], 'Gender'] = 'others'

print(df2)

Output:

   id      Gender
0   1  non-binary
1   2      female
2   3        male
3   4  non-binary
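
A more compact variant of the same first-match-wins idea can be sketched with numpy.select, which evaluates its conditions in order and takes the first one that is True. This swaps in a different technique from the .loc assignments above. The default label "unknown" for rows with no True is an assumption, not something the question specifies.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "id": [1, 2, 3, 4],
        "non-binary": [True, False, False, True],
        "female": [False, True, False, True],
        "male": [False, False, True, True],
        "others": [False, False, False, False],
    }
)

# Columns listed in priority order: the first True column wins.
cols = ["non-binary", "female", "male", "others"]
conditions = [df[c] for c in cols]

# np.select uses the first matching condition per row; "unknown" is a
# hypothetical fallback for rows where every column is False.
df2 = pd.DataFrame(
    {"id": df["id"], "Gender": np.select(conditions, cols, default="unknown")}
)

print(df2)
```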
Answered By: Carmoreno