Python text replacement function
Question:
I’m using this function to clean up my columns. However, somehow I’m deleting numbers, which I don’t want to do. So for example here, when applied, I get: "standard_access_requested_application_rolegroup_ld_e"
Any help would be great. Thanks.
import re

def text_replacement(x):
    """
    This function formats the field names so that they are more SQL friendly
    """
    for key, value in custom_fields_dict.items():
        pattern = re.compile(key, re.IGNORECASE)
        x = pattern.sub(value, x).lower().replace('fields.','').replace(' ','_').replace('™','')
        x = re.sub(r"[()\[\]&^%$#@!-:'/]",'',x)
    return x

text_replacement("standard_access_requested_application:_'role/group': ]ld_10706(e)™")
The application of the function:
# Replace the columns in the dataframe
new_columns = []
for i in df.columns:
    new_columns.append(text_replacement(i))
df.columns = new_columns
Answers:
The !-: part of your pattern is read as a character range from ! (0x21) to : (0x3A), and that range includes the digits 0-9.
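To make the range concrete, here is a small sketch that lists every character covered by !-: inside a character class and checks that the digits are among them. The variable names are only illustrative.

```python
import re

# Inside a character class, "!-:" is a range from "!" (0x21) to ":" (0x3A).
chars_in_range = "".join(chr(c) for c in range(ord("!"), ord(":") + 1))
print(chars_in_range)  # !"#$%&'()*+,-./0123456789:

# The same range as a regex: every digit matches it.
unescaped = re.compile(r"[!-:]")
matched_digits = [d for d in "0123456789" if unescaped.fullmatch(d)]
print(matched_digits)
```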
If you put the escape character \ before !, - and :, this would work (strictly, only the hyphen needs escaping; escaping ! and : is harmless):
x = re.sub(r"[()\[\]&^%$#@\!\-\:'/]", '', x)
[1]: 'standard_access_requested_application_rolegroup ld_10706e'
I only used the regex pattern, but not the rest of your code, so your own result may vary, but the digits would be saved.
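For a result closer to the full function, here is a minimal sketch that combines the string replacements from the question with the escaped hyphen. The name clean_column_name is hypothetical, and the custom_fields_dict substitution loop is left out because that dictionary is not shown in the question.

```python
import re

# The hyphen is escaped, so it is a literal "-" rather than a range operator.
PUNCTUATION = re.compile(r"[()\[\]&^%$#@!\-:'/]")

def clean_column_name(name):
    """Hypothetical helper: same cleanup steps as the question, minus the dict loop."""
    name = name.lower().replace('fields.', '').replace(' ', '_').replace('™', '')
    return PUNCTUATION.sub('', name)

cleaned = clean_column_name("standard_access_requested_application:_'role/group': ]ld_10706(e)™")
print(cleaned)  # standard_access_requested_application_rolegroup_ld_10706e
```

Moving the hyphen to the very end of the character class, where it cannot form a range, should have the same effect as escaping it.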