How do I get rid of repeating special characters with regular expressions?
Question:
I want to remove every run of repeated dots while keeping the dots that appear on their own.
Sources:
(1) "a... b."
(2) "a....... b... c."
Results I want:
(1) "a b."
(2) "a b c."
Code:
import re
a = "a... b."
b = "a....... b... c."
result = re.sub("[^a-zA-Z0-9 \.{1}]", "", a)
print(result)
result = re.sub("[^a-zA-Z0-9 \.{1}]", "", b)
print(result)
result = re.sub("[^a-zA-Z0-9 ][\.{2,}]", "", a)
print(result)
result = re.sub("[^a-zA-Z0-9 ][\.{2,}]", "", b)
print(result)
This doesn't work.
How can I get the results I want?
Answers:
The code below does what is needed:
import re
# A raw string keeps the backslash intact; \.{2,} matches a run of two or more literal dots.
result = re.sub(r"\.{2,}", "", "a....b....c.d....e.")
print(result)
The result is:
abc.de.
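As a rough illustration of why the patterns in the question behave differently, the sketch below runs both attempts next to the `\.{2,}` pattern on the question's own inputs. The key point is that inside a character class `[...]`, `{1}` and `{2,}` are not quantifiers but plain literal characters. The variable names (`attempt_1`, `attempt_2`, `fixed`) are only labels chosen for this example.

```python
import re

samples = ["a... b.", "a....... b... c."]

# First attempt from the question: inside [...], "{1}" is not a quantifier.
# The class simply lists letters, digits, space, ".", "{", "1" and "}".
attempt_1 = re.compile(r"[^a-zA-Z0-9 \.{1}]")

# Second attempt: one character that is not a letter, digit or space,
# followed by ONE character from the set ".", "{", "2", ",", "}".
attempt_2 = re.compile(r"[^a-zA-Z0-9 ][\.{2,}]")

# Quantifier outside any character class: a run of two or more dots.
fixed = re.compile(r"\.{2,}")

for s in samples:
    print(repr(attempt_1.sub("", s)),
          repr(attempt_2.sub("", s)),
          repr(fixed.sub("", s)))
# 'a... b.' 'a. b.' 'a b.'
# 'a....... b... c.' 'a. b. c.' 'a b c.'
```

The first attempt leaves the strings unchanged because every character is in the allowed set, and the second removes dots in pairs, so an odd-length run leaves one dot behind.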
You can use
re.sub(r'\.{2,}|[^a-zA-Z0-9.\s]', '', text)
Details:
\.{2,} - two or more dots
| - or
[^a-zA-Z0-9.\s] - any character other than an ASCII letter, a digit, whitespace, or a dot.
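A minimal sketch of this pattern in use, assuming it is meant to read `\.{2,}|[^a-zA-Z0-9.\s]` (with `\.` as a literal dot and `\s` as the whitespace class, which is what the description implies). The helper name `clean` and the third sample string are hypothetical, added only for illustration.

```python
import re

# Assumed form of the pattern: a run of two or more dots, OR any single
# character that is not an ASCII letter, a digit, a dot or whitespace.
CLEANUP = re.compile(r"\.{2,}|[^a-zA-Z0-9.\s]")


def clean(text: str) -> str:
    """Remove dot runs of length >= 2 and any other disallowed character."""
    return CLEANUP.sub("", text)


print(clean("a... b."))           # a b.
print(clean("a....... b... c."))  # a b c.
print(clean("a!!.. b?"))          # a b
```

The dot-run alternative is listed first so that a run is consumed as a whole; a single dot matches neither alternative and is kept.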
This will work:
import re
a = "a... b."
b = "a....... b... c."
# \.{2,} matches any run of two or more dots; single dots are left alone.
result = re.sub(r"\.{2,}", "", a)
print(result)  # a b.
result = re.sub(r"\.{2,}", "", b)
print(result)  # a b c.