How do I get rid of repeating special characters with regular expressions?

Question:

I want to remove every run of repeated dots while keeping any dot that appears on its own.

Sources:

(1) "a... b."
(2) "a....... b... c."

Results I want:

(1) "a b."
(2) "a b c."

Code:

import re

a = "a... b."
b = "a....... b... c."

result = re.sub("[^a-zA-Z0-9 \.{1}]", "", a)
print(result)

result = re.sub("[^a-zA-Z0-9 \.{1}]", "", b)
print(result)

result = re.sub("[^a-zA-Z0-9 ][\.{2,}]", "", a)
print(result)

result = re.sub("[^a-zA-Z0-9 ][\.{2,}]", "", b)
print(result)

Neither attempt works.

How can I get the results I want?

Asked By: fringetos


Answers:

The code below does what is needed:

import re
# Raw string so the backslash reaches the regex engine unchanged
result = re.sub(r"\.{2,}", "", "a....b....c.d....e.")
print(result)

The result is:
abc.de.

Answered By: Prashant Gupta

You can use

re.sub(r'\.{2,}|[^a-zA-Z0-9.\s]', '', text)

See the regex demo.

Details:

  • \.{2,} – two or more dots
  • | – or
  • [^a-zA-Z0-9.\s] – any character other than an ASCII letter, a digit, whitespace, or a dot.
Answered By: Wiktor Stribiżew
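
As a quick check, the combined pattern from the answer above can be run against the inputs from the question. This is only a minimal sketch: the helper name `clean` and the third test string are made up for illustration and are not part of the original answer.

```python
import re

# Pattern from the answer above: a run of two or more dots, or any single
# character that is not an ASCII letter, digit, dot, or whitespace.
PATTERN = re.compile(r"\.{2,}|[^a-zA-Z0-9.\s]")

def clean(text):
    """Hypothetical helper: delete everything the pattern matches."""
    return PATTERN.sub("", text)

print(clean("a... b."))           # a b.
print(clean("a....... b... c."))  # a b c.
print(clean("a!!... b?."))        # a b.  (other punctuation is removed too)
```

Note that the second alternative also strips punctuation such as `!` and `?`, which goes beyond removing dots; if only dot runs should go, the first alternative alone is enough.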

This will work:

import re

a = "a... b."
b = "a....... b... c."

# \.{2,} matches a run of two or more literal dots; single dots are left alone
result = re.sub(r"\.{2,}", "", a)
print(result)

result = re.sub(r"\.{2,}", "", b)
print(result)
Answered By: Justin Edwards
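
For readers wondering why the attempts in the question fail, the likely cause is that the quantifier was written inside the square brackets. Inside a character class, `{`, `1`, `2`, `,` and `}` are ordinary literal characters, so `{1}` and `{2,}` do not count repetitions there. The sketch below compares the question's two patterns with the fixed one on the first input; the variable names are chosen only for illustration.

```python
import re

a = "a... b."

# Negated class: it excludes letters, digits, space, '.', '{', '1', '}'.
# Every character in the input is excluded, so nothing is removed.
first = re.sub(r"[^a-zA-Z0-9 \.{1}]", "", a)

# Two classes in a row: one non-alphanumeric, non-space character followed by
# one of '.', '{', '2', ',', '}'. This removes dots in pairs, not whole runs.
second = re.sub(r"[^a-zA-Z0-9 ][\.{2,}]", "", a)

# Quantifier outside the brackets: a run of two or more dots.
fixed = re.sub(r"\.{2,}", "", a)

print(first)   # a... b.
print(second)  # a. b.
print(fixed)   # a b.
```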