Regex – How to group multiple lines until line starts with a string?
Question:
I have a text file like the following which I am trying to create some regex for in Python:
CR INFO
CR INFO
Wed Aug 17
foo-bar name_10_Name-Child_test
foo-bar name_25_Name-out
foo-bar name_1000_Name-test_out
CR INFO
CR INFO
Wed Aug 17
foo-bar name_10_Name-Child_test
foo-bar name_25_Name-out
foo-bar name_1000_Name-test_out
Now I’m fairly new to regex so apologies if this is very simple.
I’m trying to capture the lines starting with foo-bar and group them together. For example, the 3 foo-bar lines go in one group, then the 3 below the date go into another.
So far I have the following regex: (^foo-bar\s+[A-z0-9-]+)
but that matches every foo-bar line as an individual group, rather than having 3 in one group. Regex flags on regex101.com are gm.
How can I group the 3 lines together until it meets either the "CR" string, or a double new line?
Many thanks.
Answers:
You can use
^foo-bar\s+[A-Za-z0-9-].*(?:\n.+)*
Or, to make sure each subsequent line starts with foo-bar and whitespace:
^foo-bar\s+[A-Za-z0-9-].*(?:\nfoo-bar\s.*)*
Use it with re.M / re.MULTILINE so that ^ matches the start of every line, not just the start of the string.
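As a rough sketch of how this could be used from Python, the snippet below applies the second (stricter) pattern with re.MULTILINE to the sample text from the question. The variable names (text, pattern, groups) are only illustrative, and splitting each match into a list of lines is one possible way to present a group, not something the answer prescribes.

```python
import re

# Sample text copied from the question.
text = """CR INFO
CR INFO
Wed Aug 17
foo-bar name_10_Name-Child_test
foo-bar name_25_Name-out
foo-bar name_1000_Name-test_out
CR INFO
CR INFO
Wed Aug 17
foo-bar name_10_Name-Child_test
foo-bar name_25_Name-out
foo-bar name_1000_Name-test_out
"""

# re.MULTILINE lets ^ match at the start of every line.
pattern = re.compile(
    r"^foo-bar\s+[A-Za-z0-9-].*(?:\nfoo-bar\s.*)*",
    re.MULTILINE,
)

# Each match is one block of consecutive foo-bar lines;
# splitlines() turns the block into a list of its lines.
groups = [m.group(0).splitlines() for m in pattern.finditer(text)]

for number, lines in enumerate(groups, start=1):
    print(number, lines)
```

With this input the pattern finds two blocks of three lines each, because every block is ended by the following CR INFO line or by the end of the text.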
Details:
^ – start of a line
foo-bar – a literal string
\s+ – one or more whitespace characters
[A-Za-z0-9-] – an ASCII letter, digit, or hyphen
.* – the rest of the line
(?:\n.+)* – zero or more non-empty lines
(?:\nfoo-bar\s.*)* – zero or more lines that start with foo-bar and a whitespace character
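To make the difference between the two trailing groups concrete, here is a small sketch on a made-up input (the lines a_1, b_2, c_3 are hypothetical, not from the question). The looser pattern keeps consuming any non-empty line, so it only stops at a blank line or the end of the text; the stricter one stops as soon as a line does not start with foo-bar.

```python
import re

# Hypothetical input: a blank line ends the first block, and a
# non-foo-bar line directly follows the second block.
text = (
    "foo-bar a_1\n"
    "foo-bar b_2\n"
    "\n"
    "foo-bar c_3\n"
    "CR INFO\n"
)

# Continues over any non-empty line.
loose = re.compile(r"^foo-bar\s+[A-Za-z0-9-].*(?:\n.+)*", re.M)
# Continues only over lines that start with foo-bar and whitespace.
strict = re.compile(r"^foo-bar\s+[A-Za-z0-9-].*(?:\nfoo-bar\s.*)*", re.M)

loose_blocks = [m.group(0) for m in loose.finditer(text)]
strict_blocks = [m.group(0) for m in strict.finditer(text)]

print(loose_blocks)
print(strict_blocks)
```

Here the loose pattern pulls the CR INFO line into its second block, while the strict pattern leaves it out. If blocks in the real file are not always separated by a blank line, the strict variant is probably the safer choice.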
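Note that [A-z] matches more than just letters: the range also covers the ASCII characters that sit between Z and a, namely [, \, ], ^, _ and `. Use [A-Za-z] instead.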
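A quick way to check this claim is to test every ASCII character against both character classes. This is only an illustrative check; the name extras is made up for the example.

```python
import re

# Collect the ASCII characters accepted by [A-z] but rejected by [A-Za-z].
extras = [
    chr(code)
    for code in range(128)
    if re.fullmatch(r"[A-z]", chr(code))
    and not re.fullmatch(r"[A-Za-z]", chr(code))
]

print(extras)  # ['[', '\\', ']', '^', '_', '`']
```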