Pandas mapping to TRUE/FALSE as String, not Boolean

Question:

When I try to convert some columns in a pandas dataframe from ‘0’ and ‘1’ to ‘FALSE’ and ‘TRUE’, pandas automatically detects dtype as boolean. I want to keep dtype as string, with the strings ‘TRUE’ and ‘FALSE’.

booleanColumns = pandasDF.select_dtypes(include=[bool]).columns.values.tolist()
booleanDictionary = {'1': 'TRUE', '0': 'FALSE'}

pandasDF.to_string(columns = booleanColumns)

for column in booleanColumns:
    pandasDF[column].map(booleanDictionary)

Unfortunately, python automatically converts dtype to boolean with the last operation. How can I prevent this?

Asked By: Dendrobates

||

Answers:

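If you need to replace the boolean values True and False, map each boolean column and assign the result back (map returns a new Series rather than modifying the column in place):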

booleandf = pandasDF.select_dtypes(include=[bool])
booleanDictionary = {True: 'TRUE', False: 'FALSE'}

for column in booleandf:
    pandasDF[column] = pandasDF[column].map(booleanDictionary)

Sample:

import pandas as pd

pandasDF = pd.DataFrame({'A':[True,False,True],
                   'B':[4,5,6],
                   'C':[False,True,False]})

print (pandasDF)
       A  B      C
0   True  4  False
1  False  5   True
2   True  6  False

booleandf = pandasDF.select_dtypes(include=[bool])
booleanDictionary = {True: 'TRUE', False: 'FALSE'}

# iterating over a DataFrame iterates over its column names, same as: for column in booleandf.columns
for column in booleandf:
    pandasDF[column] = pandasDF[column].map(booleanDictionary)

print (pandasDF)
       A  B      C
0   TRUE  4  FALSE
1  FALSE  5   TRUE
2   TRUE  6  FALSE
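
To check that the mapped columns now hold strings rather than booleans, a quick inspection along these lines may help. This is a minimal sketch that rebuilds the sample frame above; the names remainingBool and allStrings are only illustrative, and because the dtype label pandas reports for string columns can vary by version, the check looks at the values themselves:

```python
import pandas as pd

pandasDF = pd.DataFrame({'A': [True, False, True],
                         'B': [4, 5, 6],
                         'C': [False, True, False]})

booleanDictionary = {True: 'TRUE', False: 'FALSE'}

# Remember which columns were boolean before converting them
booleanColumns = pandasDF.select_dtypes(include=[bool]).columns.tolist()

for column in booleanColumns:
    # map returns a new Series, so the result is assigned back
    pandasDF[column] = pandasDF[column].map(booleanDictionary)

# No column should be selected as boolean any more
remainingBool = pandasDF.select_dtypes(include=[bool]).columns.tolist()
print(remainingBool)

# Every converted value should be a Python str
allStrings = all(isinstance(value, str)
                 for column in booleanColumns
                 for value in pandasDF[column])
print(allStrings)
```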

EDIT:

A simpler solution is replace with a dict:

booleanDictionary = {True: 'TRUE', False: 'FALSE'}
pandasDF = pandasDF.replace(booleanDictionary)
print (pandasDF)
       A  B      C
0   TRUE  4  FALSE
1  FALSE  5   TRUE
2   TRUE  6  FALSE
Answered By: jezrael

You can replace values in multiple columns in a single replace call.

mapping = {'1': 'TRUE', '0': 'FALSE'}
df[['A','B']] = df[['A','B']].replace(mapping)
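
As a small runnable sketch of that call, assuming a hypothetical frame df whose columns A and B hold the strings '1' and '0' (column C is added only to show that unselected columns stay untouched):

```python
import pandas as pd

# Hypothetical sample data: 'A' and 'B' store flags as the strings '1' and '0'
df = pd.DataFrame({'A': ['1', '0', '1'],
                   'B': ['0', '0', '1'],
                   'C': ['x', 'y', 'z']})

mapping = {'1': 'TRUE', '0': 'FALSE'}

# With a plain dict, replace matches whole cell values,
# and only the selected columns are touched
df[['A', 'B']] = df[['A', 'B']].replace(mapping)

print(df)
```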

If you’re changing boolean columns into the strings 'TRUE' and 'FALSE', there is no need to replace anything; just cast the columns to str and upper-case the result.

df[['A', 'B']] = df[['A','B']].astype(str).apply(lambda x: x.str.upper())
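
A minimal sketch of this dtype-change approach, assuming a hypothetical frame with boolean columns A and B and an integer column C that should stay as it is. astype(str) turns True/False into 'True'/'False', and str.upper() then upper-cases each column:

```python
import pandas as pd

# Hypothetical sample data: 'A' and 'B' are boolean, 'C' is left alone
df = pd.DataFrame({'A': [True, False, True],
                   'B': [False, False, True],
                   'C': [4, 5, 6]})

# Cast the booleans to strings, then upper-case them column by column
df[['A', 'B']] = df[['A', 'B']].astype(str).apply(lambda x: x.str.upper())

print(df)
```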
Answered By: cottontail