How to remove all punctuation from a string except in decimal numbers in Python?

Question:

Input is the code I'm using, Output is what I currently get, and Required Output is what I want.

Input:

import regex as re
keyword = 'Auto: tab suspender 2.0 pro'
keyword = re.sub(r'[^\w\s]', '', keyword)
words = re.findall(r'\w+', keyword)
print(keyword)
print(len(words))
words

Output:

Auto tab suspender 20 pro
5
['Auto', 'tab', 'suspender', '20', 'pro']

Required Output:

Auto tab suspender 2.0 pro
5
['Auto', 'tab', 'suspender', '2.0', 'pro']
Asked By: sonar


Answers:

I would use re.findall here:

import re

keyword = 'Auto: tab suspender 2.0 pro'
matches = re.findall(r'\d+(?:\.\d+)?|\w+', keyword)
print(matches)  # ['Auto', 'tab', 'suspender', '2.0', 'pro']

The regex pattern first tries to match an integer or a decimal number and, failing that, matches a word:

  • \d+ match an integer
  • (?:\.\d+)? optionally followed by a decimal part, making it a float
  • | OR
  • \w+ match a word
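
The question also asks for the cleaned string and the word count, which can be derived from the same matches. The following is a minimal sketch, not part of the original answer: the helper names tokenize and strip_punctuation are hypothetical, and the re.sub pattern is one possible alternative that assumes a '.' should survive only when it sits between two digits.

```python
import re

# Pattern from the answer: a number with an optional decimal part, else a word.
TOKEN_RE = re.compile(r'\d+(?:\.\d+)?|\w+')

def tokenize(text):
    """Return words and decimal numbers, skipping all other punctuation."""
    return TOKEN_RE.findall(text)

def strip_punctuation(text):
    """Alternative (assumption): delete punctuation in place, keeping a '.'
    only when it has a digit on both sides. Whitespace is left untouched."""
    return re.sub(r'(?<!\d)\.|\.(?!\d)|[^\w\s.]', '', text)

keyword = 'Auto: tab suspender 2.0 pro'
words = tokenize(keyword)

print(' '.join(words))  # Auto tab suspender 2.0 pro
print(len(words))       # 5
print(words)            # ['Auto', 'tab', 'suspender', '2.0', 'pro']
print(strip_punctuation(keyword))  # Auto tab suspender 2.0 pro
```

Joining the matches with a single space normalizes whitespace, whereas the re.sub variant preserves the original spacing, so which one fits depends on whether the spacing matters.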
Answered By: Tim Biegeleisen