Use regex to remove a substring that runs from a given word through the following comma

Question:

I haven’t found any helpful Regex tools to help me figure this complicated pattern out.

I have the following string:

Myfirstname Mylastname, Department of Mydepartment, Mytitle, The University of Me; 4-1-1, Hong,Bunk, Tokyo 113-8655, Japan E-mail:[email protected], Tel:00-00-222-1171,  Fax:00-00-225-3386

I am trying to learn enough Regex patterns to remove the substrings one at a time:

E-mail:[email protected]

Tel:00-00-222-1171

Fax:00-00-225-3386

So I think the correct pattern would be to remove a given word (e.g., "E-mail", "Tel") and everything after it through the following comma.

Is this type of dynamic pattern possible in Regex?

I am performing the match in Python; however, I don’t think that matters much.

Also, I know the data string looks comma-separated, and it is. However, there is no guarantee that the order of those fields is preserved. That’s why I’m trying to use a Regex match.

Asked By: Brett


Answers:

How about this regex:

<YOUR_WORD>.*?(?=(,|($)))

Explanation:

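  • It matches the word given in the <YOUR_WORD> placeholder
  • It then matches any characters after that word, as few as possible (.*? is lazy)
  • The match stops just before whichever of these comes first; the (?=...) lookahead checks for it without including it in the match:
    • A comma
    • The end of the string ($ matches at the end of each line only when the multiline flag is set)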
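
Since the question mentions Python, here is a minimal sketch of how the pattern might be built for an arbitrary word. The helper name field_pattern, the shortened sample string, and the address me@example.com are assumptions made for illustration; they are not from the original post. The lookahead is written as (?=,|$), which behaves the same as (?=(,|($))) without the extra capture groups.

```python
import re

def field_pattern(word):
    """Build the pattern for one label (hypothetical helper, not from the post)."""
    # re.escape keeps any regex metacharacters in the label literal.
    # (?=,|$) stops the lazy match just before a comma or the end of the string.
    return re.compile(re.escape(word) + r".*?(?=,|$)")

# Shortened sample with a placeholder e-mail address (assumption).
text = "Tokyo 113-8655, Japan E-mail:me@example.com, Tel:00-00-222-1171,  Fax:00-00-225-3386"

tel = field_pattern("Tel").search(text)
fax = field_pattern("Fax").search(text)
print(tel.group(0))  # Tel:00-00-222-1171
print(fax.group(0))  # Fax:00-00-225-3386
```

Because the comma is only looked at and not consumed, it stays in the string if this pattern is used for substitution.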

So:

E-mail.*?(?=(,|($)))

Will result in:

E-mail:[email protected]

And

Fax.*?(?=(,|($)))

Will result in:

Fax:00-00-225-3386
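
To actually remove the fields, as the question asks, one option is a variation that consumes the separating comma instead of only looking ahead for it. The sketch below is an assumption-laden illustration, not the answerer's exact pattern: the function name remove_field, the shortened sample string, and the address me@example.com are all hypothetical.

```python
import re

def remove_field(text, word):
    """Remove `word` and everything after it through the next comma (hypothetical helper)."""
    # Unlike the lookahead version, (?:,\s*|$) consumes the comma and any
    # spaces after it, so no empty ", ," gap is left behind.
    pattern = re.escape(word) + r".*?(?:,\s*|$)"
    # Strip a separator left dangling when the removed field was the last one.
    return re.sub(pattern, "", text).rstrip(", ")

# Shortened sample with a placeholder e-mail address (assumption).
text = "Tokyo 113-8655, Japan E-mail:me@example.com, Tel:00-00-222-1171,  Fax:00-00-225-3386"

for label in ("E-mail", "Tel", "Fax"):
    text = remove_field(text, label)

print(text)  # Tokyo 113-8655, Japan
```

Because each label is removed independently, the order of the fields in the string does not matter. One edge case to keep in mind: the label is matched wherever it appears, including inside a longer word, so prefixing the pattern with \b may be worth considering if that can occur in the data.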

If there are edge cases this misses, I would like to know, along with whether they affect performance and whether handling them is necessary.

Answered By: no_hex