How can I whitelist characters from a string in Python 3?

Question:

My question is quite simple: I am trying to strip any character that is not A-Z or 0-9 from a string.

Basically this is the process I am trying to do:

whitelist=['a',...'z', '0',...'9']

name = '_abcd!?123'

name.strip(whitelist)

print(name)

>>> abcd123

What’s important to know is that I can’t just print the valid characters in name. I need to actually use the variable in its changed state.

Asked By: NeverEndingCycle


Answers:

You can use re.sub and provide a pattern that exactly matches what you are trying to remove:

import re
result = re.sub('[^a-zA-Z0-9]', '', '_abcd!?123')

Output:

'abcd123'
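
Since the question asks for the variable itself to end up changed, here is a minimal sketch of the same pattern applied to name and assigned back. The helper name keep_alnum and the precompiled pattern are illustrative assumptions, not part of the original answer.

```python
import re

# Same character class as above, compiled once so it can be reused.
# The names _NOT_ALNUM and keep_alnum are hypothetical.
_NOT_ALNUM = re.compile(r'[^a-zA-Z0-9]')

def keep_alnum(text):
    """Return text with every character outside a-z, A-Z, 0-9 removed."""
    return _NOT_ALNUM.sub('', text)

name = '_abcd!?123'
name = keep_alnum(name)  # rebind the variable to the cleaned string
print(name)  # abcd123
```

Compiling the pattern is optional; it mainly helps readability when the same whitelist is applied in several places.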
Answered By: Ajax1234

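Use the string module with a generator expression:

import string
# Lowercase letters and digits; use string.ascii_letters to allow uppercase too.
whitelist = set(string.ascii_lowercase + string.digits)
name = '_abcd!?123'
name = ''.join(c for c in name if c in whitelist)  # name is now 'abcd123'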
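
If the whitelist needs to vary, the set-membership filter above could be wrapped in a small function. This is only a sketch: the function name keep_only and the choice of string.ascii_letters (upper and lower case) are assumptions made for illustration.

```python
import string

def keep_only(text, allowed):
    """Return text with every character not in allowed dropped."""
    allowed = frozenset(allowed)  # set lookup is O(1) per character
    return ''.join(c for c in text if c in allowed)

# Assumed whitelist: all ASCII letters plus digits.
whitelist = string.ascii_letters + string.digits
name = keep_only('_Abcd!?123', whitelist)
print(name)  # Abcd123
```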
Answered By: Boris Verkhovskiy

You can use a simple regex, putting the characters to remove inside the brackets:

import re
new_string = re.sub('[chars to remove]', '', old_string)

Please also note that strings are immutable. String operations return a new string, so you need to assign the result to a variable (a new one, or the same name again) for the change to take effect.
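
A short sketch of the immutability point, using the sample string from the question. The characters '_!?' are an assumption chosen to fit that sample. It also shows why name.strip(...) alone cannot do the job: str.strip expects a string of characters rather than a list, only trims the two ends, and returns a new string instead of modifying name.

```python
import re

name = '_abcd!?123'

# str.strip returns a new string and only trims the two ends;
# the original object is left untouched.
stripped = name.strip('_!?')
print(stripped)  # abcd!?123
print(name)      # _abcd!?123

# Rebinding the name to the result of re.sub is what "changes" the variable.
name = re.sub('[_!?]', '', name)
print(name)      # abcd123
```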

Answered By: Alec