Python – How to Remove Words That Start With a Number and Contain a Period

Question:

What is the best way to remove words in a string that start with numbers and contain periods in Python?

this_string = 'lorum3 ipsum 15.2.3.9.7 bar foo 1. v more text 46 2. here and even more text here v7.8.989'

If I use Regex:

re.sub(r'[0-9]*\.\w*', '', this_string)

The result will be:

'lorum3 ipsum  bar foo  v more text 46  here and even more text here v'

I expect the word v7.8.989 not to be removed, since it starts with a letter.

It would also be great if removing the words did not leave extra spaces behind. My regex above still leaves double spaces.

Asked By: Gedanggoreng


Answers:

You can use this regex to match the strings you want to remove:

(?:^|\s)[0-9]+\.[0-9.]*(?=\s|$)

It matches:

  • (?:^|\s) : beginning of string or whitespace
  • [0-9]+ : at least one digit
  • \. : a period
  • [0-9.]* : some number of digits and periods
  • (?=\s|$) : a lookahead to assert end of string or whitespace


You can then replace any matches with the empty string. In Python:

import re
this_string = 'lorum3 ipsum 15.2.3.9.7 bar foo 1. v more text 46 2. here and even more text here v7.8.989 and also 1.2.3c as well'
result = re.sub(r'(?:^|\s)[0-9]+\.[0-9.]*(?=\s|$)', '', this_string)
print(result)

Output:

lorum3 ipsum bar foo v more text 46 here and even more text here v7.8.989 and also 1.2.3c as well
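
As a quick check of how this pattern treats spacing, here is a minimal sketch with the `\s` and `\.` escapes written out. The helper name `remove_numbered_words` and the shortened sample string are illustrative assumptions, not part of the answer.

```python
import re

# The pattern from this answer, compiled once.
PATTERN = re.compile(r'(?:^|\s)[0-9]+\.[0-9.]*(?=\s|$)')

def remove_numbered_words(text):
    # Hypothetical helper: the whitespace before each removed word is part
    # of the match, so it disappears together with the word.
    return PATTERN.sub('', text)

sample = 'lorum3 ipsum 15.2.3.9.7 bar foo 1. v more text 46 2. here v7.8.989 and 1.2.3c'
cleaned = remove_numbered_words(sample)
print(cleaned)

# A removed word at the very start has no preceding whitespace to consume,
# so the space after it survives; str.lstrip() can tidy that up if needed.
leading = remove_numbered_words('1. intro text')
print(repr(leading))
```

Because only the whitespace before a match is consumed, words removed in the middle or at the end of the string leave no double spaces.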
Answered By: Nick

If you don’t want to use regex, you can also do it using simple string operations:

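# Keep every word unless it starts with a digit and contains a period;
# joining with ' ' avoids the trailing space that appending e + ' ' would leave.
res = ' '.join(e for e in this_string.split() if not (e[0] in '0123456789' and '.' in e))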
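
Two side effects of the split-based approach may matter, depending on the input: it also drops mixed tokens such as 1.2.3c, and it collapses any run of whitespace (tabs, double spaces) into a single space. A small sketch, where the helper name `drop_by_split` and the sample string are made up for illustration:

```python
def drop_by_split(text):
    # Hypothetical helper: split on any whitespace, then keep a word unless
    # it starts with an ASCII digit and contains a period.
    kept = [w for w in text.split() if not (w[0] in '0123456789' and '.' in w)]
    return ' '.join(kept)

# The sample deliberately contains a double space and a tab.
sample = 'foo 1.2.3c  bar\t2. baz'
collapsed = drop_by_split(sample)
print(collapsed)
```

If the original whitespace has to be preserved, one of the regex-based answers is probably the better fit.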
Answered By: mahesh

You can try this regex:

(^|\s)\d[^\s]*\.+[^\s]*

This also matches strings such as ‘7.a.0.1’, which contain letters in addition to digits and periods.
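
A minimal Python sketch of this pattern, assuming it is meant to be read with its backslashes as `(^|\s)\d[^\s]*\.+[^\s]*`. The sample string is made up for illustration.

```python
import re

# Assumed reading of the pattern: optional leading whitespace (or start of
# string), a digit, then any non-whitespace run that contains a period.
pattern = re.compile(r'(^|\s)\d[^\s]*\.+[^\s]*')

sample = 'foo 7.a.0.1 bar 46 v7.8 1.2.3c'
cleaned = pattern.sub('', sample)
print(cleaned)
```

Here 7.a.0.1 and 1.2.3c are both removed because they start with a digit and contain a period, while 46 (no period) and v7.8 (starts with a letter) are kept.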


Answered By: M..

If you can make use of a lookbehind, you can match the numbers and replace with an empty string:

(?<!\S)\d+\.[\d.]*(?!\S)

Explanation

  • (?<!\S) Assert a whitespace boundary to the left
  • \d+\.[\d.]* Match 1+ digits, then a dot followed by optional digits or dots
  • (?!\S) Assert a whitespace boundary to the right
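
The lookaround version can be tried in Python as follows. This is a sketch with the escapes written out as `(?<!\S)\d+\.[\d.]*(?!\S)`; the sample string is a shortened, illustrative variant of the one in the question.

```python
import re

# Whitespace boundaries on both sides are asserted, not consumed.
pattern = re.compile(r'(?<!\S)\d+\.[\d.]*(?!\S)')

sample = 'lorum3 ipsum 15.2.3.9.7 bar 1. v 46 v7.8.989'

# Only the words themselves are matched, without surrounding whitespace.
matches = pattern.findall(sample)
print(matches)

# Replacing the matches therefore leaves the spaces on both sides in place.
cleaned = pattern.sub('', sample)
print(repr(cleaned))
```

Since the lookarounds do not consume whitespace, each removed word leaves a double space behind in this form.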


If you want to match an optional leading whitespace char:

\s?(?<!\S)\d+\.[\d.]*(?!\S)
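
A short sketch of this variant in Python, with the escapes written out as `\s?(?<!\S)\d+\.[\d.]*(?!\S)`. The sample string is illustrative.

```python
import re

# The optional \s? consumes one whitespace character before the word, so
# the removal does not leave a double space behind.
pattern = re.compile(r'\s?(?<!\S)\d+\.[\d.]*(?!\S)')

sample = 'lorum3 ipsum 15.2.3.9.7 bar 1. v 46 v7.8.989'
cleaned = pattern.sub('', sample)
print(cleaned)
```

As with any pattern that only consumes the preceding whitespace, a removed word at the very start of the string would still leave one leading space.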


Answered By: The fourth bird