Regex – How to group multiple lines until line starts with a string?

Question:

I have a text file like the following which I am trying to create some regex for in Python:

CR INFO
CR INFO
Wed Aug 17

foo-bar name_10_Name-Child_test
foo-bar name_25_Name-out
foo-bar name_1000_Name-test_out

CR INFO
CR INFO
Wed Aug 17

foo-bar name_10_Name-Child_test
foo-bar name_25_Name-out
foo-bar name_1000_Name-test_out

Now I’m fairly new to regex so apologies if this is very simple.

I’m trying to capture the lines starting with foo-bar and group them together. So, for example, the 3 foo-bar lines go in one group, then the 3 below the next date go into another.

So far I have the following regex, (^foo-bar\s+[A-z0-9-]+), but it matches every foo-bar line as an individual group rather than putting all 3 in one group. The regex flags on regex101.com are gm.

How can I group the 3 lines together until it meets either the "CR" string, or a double new line?

Many thanks.

Asked By: knight

Answers:

You can use

^foo-bar\s+[A-Za-z0-9-].*(?:\n.+)*

Or, to make sure each subsequent line starts with foo-bar and whitespace:

^foo-bar\s+[A-Za-z0-9-].*(?:\nfoo-bar\s.*)*

See the regex demo / regex demo #2. Use it with re.M / re.MULTILINE to make sure ^ matches the start of any line.

Details:

  • ^ – start of a line
  • foo-bar – a literal string
  • \s+ – one or more whitespace characters
  • [A-Za-z0-9-] – an alphanumeric or hyphen
  • .* – the rest of the line
  • (?:\n.+)* – zero or more non-empty lines
  • (?:\nfoo-bar\s.*)* – zero or more non-empty lines that start with foo-bar and whitespace.
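
As a minimal sketch of how either pattern might be used in Python: the sample text is copied from the question, and the names text, loose_pattern, strict_pattern and grouped_lines are only illustrative. Each match spans one whole block of foo-bar lines, which is then split into its individual lines.

```python
import re

# Sample input copied from the question.
text = """CR INFO
CR INFO
Wed Aug 17

foo-bar name_10_Name-Child_test
foo-bar name_25_Name-out
foo-bar name_1000_Name-test_out

CR INFO
CR INFO
Wed Aug 17

foo-bar name_10_Name-Child_test
foo-bar name_25_Name-out
foo-bar name_1000_Name-test_out
"""

# First pattern: after the first foo-bar line, keep consuming any non-empty lines.
loose_pattern = re.compile(r"^foo-bar\s+[A-Za-z0-9-].*(?:\n.+)*", re.MULTILINE)

# Second pattern: only keep consuming lines that start with foo-bar and whitespace.
strict_pattern = re.compile(r"^foo-bar\s+[A-Za-z0-9-].*(?:\nfoo-bar\s.*)*", re.MULTILINE)


def grouped_lines(pattern, source):
    """Return one list of lines per contiguous block matched by pattern."""
    return [match.group(0).splitlines() for match in pattern.finditer(source)]


loose_groups = grouped_lines(loose_pattern, text)
strict_groups = grouped_lines(strict_pattern, text)

for number, lines in enumerate(strict_groups, start=1):
    print(f"group {number}: {lines}")
```

On this sample both patterns produce the same two groups of three lines. They would differ only if a foo-bar block were directly followed by a non-empty line that does not start with foo-bar: the first pattern would include that line, the second would stop before it.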

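Note that [A-z] matches more than just letters: the range also covers the six ASCII characters that sit between Z and a, namely [, \, ], ^, _ and `.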
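
A quick way to check which extra characters [A-z] admits is to test every ASCII character between Z and a against both character classes. This is only an illustrative sketch; the variable names are made up for the example.

```python
import re

# ASCII characters that sit strictly between "Z" and "a".
between = [chr(code) for code in range(ord("Z") + 1, ord("a"))]

wide_class = re.compile(r"[A-z]")       # the range used in the question
letter_class = re.compile(r"[A-Za-z]")  # letters only

matched_by_wide = [ch for ch in between if wide_class.fullmatch(ch)]
matched_by_letters = [ch for ch in between if letter_class.fullmatch(ch)]

print(matched_by_wide)     # the six non-letter characters accepted by [A-z]
print(matched_by_letters)  # empty: [A-Za-z] rejects all of them
```

The underscore is among those characters, which is why the [A-z0-9-]+ in the question happens to match names such as name_10_Name-Child_test in full, whereas [A-Za-z0-9-] does not accept underscores.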

Answered By: Wiktor Stribiżew