Change a dataframe of floats and objects into a binary dataframe whilst retaining string values of column and row headers

Question:

This is my dataset:

Name Test1 Test3 Test2 Quiz
Boo 0.9 0 0 1.0
Buzz 0.8 0.7 0 0
Bree 0 0 1.0 0

How I want my result dataset:

Name Test1 Test3 Test2 Quiz
Boo 1 0 0 1
Buzz 1 1 0 0
Bree 0 0 1 0

I tried df.astype('int64'), but this changed all values below 1 to 0. I also tried:

df1 = df.apply(pd.to_numeric, errors='coerce')

but this caused my first column to become NaN values. I also tried:

df.where(df <= 0.4, 1, inplace=True)

but I got an error saying the comparison isn’t possible between str and float. I had called set_index() on the Name column, so I didn’t expect this error. I can’t seem to figure this out and need major help :((

Asked By: Mimikyu o_0

||

Answers:

It depends on the threshold. If you need 1 for values greater than, say, 0.4, compare against it to get a boolean mask, then convert the mask to integers so that True/False map to 1/0:

#if necessary
#df = df.set_index('Name')

df1 = df.apply(pd.to_numeric, errors='coerce').gt(0.4).astype(int)
print (df1)
      Test1  Test3  Test2  Quiz
Name                           
Boo       1      0      0     1
Buzz      1      1      0     0
Bree      0      0      1     0
Answered By: jezrael

df.set_index('Name').astype('float').gt(0.4).astype('int').reset_index()

output:

    Name    Test1   Test3   Test2   Quiz
0   Boo     1       0       0       1
1   Buzz    1       1       0       0
2   Bree    0       0       1       0
Answered By: Panda Kim
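
For anyone who wants to run this end to end, here is a minimal, self-contained sketch that rebuilds the sample data from the question and applies the same compare-then-cast idea used in both answers. The 0.4 cut-off is an assumption taken from the df.where attempt in the question, and the names THRESHOLD, scores, binary and result are only illustrative. It also hints at why the original attempts failed: astype('int64') truncates toward zero (so 0.9 becomes 0), pd.to_numeric with errors='coerce' turns the non-numeric Name strings into NaN, and set_index() returns a new dataframe by default, so if its result was not assigned back, Name would still be a regular column when df.where ran.

```python
import pandas as pd

# Rebuild the sample data from the question.
df = pd.DataFrame(
    {
        "Name": ["Boo", "Buzz", "Bree"],
        "Test1": [0.9, 0.8, 0.0],
        "Test3": [0.0, 0.7, 0.0],
        "Test2": [0.0, 0.0, 1.0],
        "Quiz": [1.0, 0.0, 0.0],
    }
)

THRESHOLD = 0.4  # assumed cut-off, borrowed from the question's df.where attempt

# set_index returns a new frame, so the result has to be assigned.
# With Name in the index, only numeric columns take part in the comparison.
scores = df.set_index("Name")

# gt() yields a boolean frame; astype(int) maps True/False to 1/0.
binary = scores.gt(THRESHOLD).astype(int)

# Optional: turn Name back into an ordinary column.
result = binary.reset_index()
print(result)
```

If the score columns were read in as strings rather than floats, converting them first (for example with astype('float') or pd.to_numeric, as in the answers above) before calling gt() should give the same result.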