Regex: Provide match for beginning of a sentence ignoring new lines

Question:

string= "This is a sentence. Micky Mouse"

name= re.compile(r"\.?Micky Mouse")
name_match = name.search(string)
print(name_match)

I want to ensure that a match is only provided if "Micky Mouse" is at the beginning of a new sentence, i.e., only if it follows a dot ".".
However, there should also be a match irrespective of any new lines or spacing between "Micky Mouse" and the end of the previous sentence. So the following string should also produce a match: "This is a sentence. \nMicky Mouse"

Asked By: xxgaryxx

||

Answers:

The \s character class matches any whitespace character, including \n.

Something like the following should do the trick; \s* allows any amount of whitespace after the dot:

re.compile(r"\.\s*Micky Mouse")
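As a quick check of how the quantifier after \s changes the result, here is a minimal sketch using strings modelled on the question. The names samples, at_most_one, any_amount and results are only illustrative:

```python
import re

# Inputs modelled on the question: one space, space + newline, no whitespace.
samples = [
    "This is a sentence. Micky Mouse",
    "This is a sentence. \nMicky Mouse",
    "This is a sentence.Micky Mouse",
]

# \s? allows zero or one whitespace character after the dot.
at_most_one = re.compile(r"\.\s?Micky Mouse")
# \s* allows any number of whitespace characters after the dot.
any_amount = re.compile(r"\.\s*Micky Mouse")

results = [
    (bool(at_most_one.search(s)), bool(any_amount.search(s))) for s in samples
]

for sample, result in zip(samples, results):
    print(repr(sample), result)
```

Only the second sample tells the two apart: a space followed by a newline is two whitespace characters, so \s? does not match it while \s* does.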
Answered By: imbuedHope

In order to be at the beginning of a sentence, and ignore any whitespace differences after it, prepend the match target with (?:^|\.)\s*.

  • (?:...) -> a non-capturing group
  • ^|\. -> either the beginning of the string ^ or | a literal dot \.
  • \s* -> any amount of whitespace, including newlines, spaces, tabs, etc.

import re

string= """This is a sentence. Micky Mouse. 
           Micky Mouse again. No Micky Mouse match here."""

pattern = re.compile(r"(?:^|\.)\s*Micky Mouse")
name_match = re.finditer(pattern, string)
print([match.group(0) for match in name_match])

output:

['. Micky Mouse', '. \n           Micky Mouse']
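The matches above include the leading dot and whitespace. If only the name itself is wanted, one option is to wrap it in a capturing group and read group 1. This is a sketch of that variation, not part of the original answer; the sample text and the names text, pattern, names and starts are assumptions made for illustration:

```python
import re

# Hypothetical sample: name at string start, mid-sentence, and after a blank line.
text = "Micky Mouse first. Then Micky Mouse mid-sentence.\n\nMicky Mouse again."

# Same prefix as above, but the name is captured so the dot/whitespace can be ignored.
pattern = re.compile(r"(?:^|\.)\s*(Micky Mouse)")

matches = list(pattern.finditer(text))
names = [m.group(1) for m in matches]    # captured text without the prefix
starts = [m.start(1) for m in matches]   # index where each captured name begins

print(names)
print(starts)
```

The occurrence after "Then " is skipped because it follows neither the start of the string nor a dot.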
Answered By: Onno Rouast

You can match optional whitespace chars after the dot:

\.\s*Micky Mouse\b

The pattern matches:

  • \.\s* Match a dot and optional whitespace chars (that can also match a newline)
  • Micky Mouse\b Match literally followed by a word boundary
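To see what the trailing \b adds, here is a small sketch with made-up inputs (the samples list and the name with_boundary are illustrative only):

```python
import re

pattern = re.compile(r"\.\s*Micky Mouse\b")

samples = [
    "End. Micky Mouse",            # name at the very end of the string
    "End. Micky Mouses",           # name continues with another word character
    "End.\n\tMicky Mouse, hello",  # newline and tab before, comma after
]

with_boundary = [bool(pattern.search(s)) for s in samples]

for sample, matched in zip(samples, with_boundary):
    print(repr(sample), matched)
```

The second sample is rejected because "s" directly follows "Mouse", so there is no word boundary at that position.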

Answered By: The fourth bird