Pandas replace a character in all column names

Question:

I have data frames with column names (coming from .csv files) containing ( and ) and I’d like to replace them with _.

How can I do that in place for all columns?

Asked By: Cedric H.


Answers:

Use str.replace:

df.columns = df.columns.str.replace(r"[()]", "_", regex=True)

Sample:

import pandas as pd

df = pd.DataFrame({'(A)':[1,2,3],
                   '(B)':[4,5,6],
                   'C)':[7,8,9]})

print(df)
   (A)  (B)  C)
0    1    4   7
1    2    5   8
2    3    6   9

# regex=True is required on pandas 2.0+, where str.replace defaults to a literal match
df.columns = df.columns.str.replace(r"[()]", "_", regex=True)
print(df)
   _A_  _B_  C_
0    1    4   7
1    2    5   8
2    3    6   9
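
As a rough illustration of why the regex flag matters, here is a small sketch that runs the same pattern both ways on the sample frame above. The variable names as_regex and as_literal are made up for this example. It assumes a pandas version where str.replace accepts the regex keyword.

```python
import pandas as pd

df = pd.DataFrame({"(A)": [1, 2, 3], "(B)": [4, 5, 6], "C)": [7, 8, 9]})

# regex=True: "[()]" is a character class that matches "(" or ")".
as_regex = df.columns.str.replace(r"[()]", "_", regex=True)

# regex=False: "[()]" is searched for as a literal four-character string,
# which none of these column names contain, so nothing changes.
as_literal = df.columns.str.replace("[()]", "_", regex=False)

print(list(as_regex))    # ['_A_', '_B_', 'C_']
print(list(as_literal))  # ['(A)', '(B)', 'C)']
```

Passing regex explicitly keeps the result the same regardless of which default the installed pandas version uses.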
Answered By: jezrael

The square brackets define a character class: the pattern matches any single one of the characters listed inside them. For example:

r"[Nn]ational"

will match both “National” and “national”, i.e. the first character can be either N or n. In the same way, [()] matches either ( or ).
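
A quick way to see this behaviour is with Python's standard re module. This is only a sketch, and the sample sentence is invented for the example:

```python
import re

# "[Nn]" matches exactly one character: either "N" or "n".
pattern = re.compile(r"[Nn]ational")
text = "The National Gallery is a national museum."

matches = pattern.findall(text)
print(matches)  # ['National', 'national']

# The same idea applied to parentheses: "[()]" matches "(" or ")".
cleaned = re.sub(r"[()]", "_", "(A)")
print(cleaned)  # _A_
```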

Answered By: agbalutemi

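Older pandas versions don’t work with the accepted answer above. Something like this is needed instead. Note that Python’s built-in str.replace matches its argument literally rather than as a regular expression, so each character has to be replaced separately:

df.columns = [c.replace("(", "_").replace(")", "_") for c in df.columns]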
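
If a regular expression is preferred in the list-comprehension approach, one possible sketch uses re.sub, since the built-in str.replace only does literal matching. The helper name clean_column_names is hypothetical and not part of pandas. It works on any iterable of names, so it does not depend on the pandas version.

```python
import re

def clean_column_names(names):
    """Return a list of names with every "(" or ")" replaced by "_"."""
    # str(name) guards against non-string labels such as integers.
    return [re.sub(r"[()]", "_", str(name)) for name in names]

result = clean_column_names(["(A)", "(B)", "C)"])
print(result)  # ['_A_', '_B_', 'C_']
```

With a DataFrame df this would be used as df.columns = clean_column_names(df.columns). An in-place alternative is df.rename(columns=lambda c: re.sub(r"[()]", "_", c), inplace=True).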
Answered By: JamesR