Combining split with findall

Question

I’m splitting a string with some separator, but want the separator matches as well:

import re

s = "oren;moish30.4.200/-/v6.99.5/barbi"
print(re.split(r"\d+\.\d+\.\d+", s))
print(re.findall(r"\d+\.\d+\.\d+", s))

I can’t find an easy way to combine the 2 lists I get:

['oren;moish', '/-/v', '/barbi']
['30.4.200', '6.99.5']

Into the desired output:

['oren;moish', '30.4.200', '/-/v', '6.99.5', '/barbi']


Answer 1

Try this:

import re
s = "oren;moish30.4.200/-/v6.99.5/barbi"
print([x for y in re.findall(r"(?:([A-Za-z;/-]+)|(\d+\.\d+\.\d+))", s) for x in y if x])

Result:

['oren;moish', '30.4.200', '/-/v', '6.99.5', '/barbi']
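
A short sketch of what the comprehension is doing, assuming the same input string: with two capturing groups, re.findall yields one tuple per match, holding an empty string for the group that did not take part, and the comprehension flattens those tuples while dropping the empty entries. The names pairs and flat are only illustrative. Note that the character class only covers letters, ";", "/" and "-", so any other character in the input would be skipped silently.

```python
import re

s = "oren;moish30.4.200/-/v6.99.5/barbi"

# Two capturing groups: each match becomes a (text, version) tuple,
# with '' in the slot of the group that did not match.
pairs = re.findall(r"(?:([A-Za-z;/-]+)|(\d+\.\d+\.\d+))", s)
print(pairs)

# Flatten the tuples and drop the empty strings.
flat = [x for pair in pairs for x in pair if x]
print(flat)
```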

Answered By: user56700

Answer 2

Another solution:

import re

s = "oren;moish30.4.200/-/v6.99.5/barbi"

x = re.findall(r"\d+\.\d+\.\d+|.+?(?=\d+\.\d+\.\d+|\Z)", s)
print(x)

Prints:

['oren;moish', '30.4.200', '/-/v', '6.99.5', '/barbi']

Answer 3

From the re.split docs:

If capturing parentheses are used in pattern, then the text of all groups in the pattern are also returned as part of the resulting list.

So just wrap your regex in a capturing group:

print(re.split(r"(\d+\.\d+\.\d+)", s))
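
As a self-contained sketch of this approach (the helper name split_keep is hypothetical, not part of the re module): re.split also emits an empty string when a match sits at the very start or end of the input, so the helper filters those out, on the assumption that empty pieces are not wanted.

```python
import re

# The capturing group makes re.split return the separators too.
VERSION = re.compile(r"(\d+\.\d+\.\d+)")

def split_keep(text):
    """Hypothetical helper: split on version numbers and keep them in the result."""
    # Drop the empty strings produced when a match touches either end of the text.
    return [part for part in VERSION.split(text) if part]

s = "oren;moish30.4.200/-/v6.99.5/barbi"
print(split_keep(s))

# Without the filter, a leading match produces an empty first element.
print(VERSION.split("1.2.3abc"))
```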

Answered By: user2357112

Answer 4

You could use re.findall and a pattern to match:

\d+\.\d+\.\d+|\D+(?:\d(?!\d*\.\d+\.\d)\D*)*

Explanation

\d+\.\d+\.\d+ Match 1+ digits three times, separated by single dots
| Or
\D+ Match 1+ characters other than a digit
(?: Non-capturing group, repeated as a whole
- \d(?!\d*\.\d+\.\d) Match a single digit, asserting that what follows is not the rest of a digits-and-dots pattern
- \D* Match optional characters other than a digit
)* Close the non-capturing group and optionally repeat it
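
The breakdown above can also be written directly into the pattern with re.VERBOSE, which ignores whitespace and allows inline comments. This is only a sketch of the same pattern laid out for readability; the second input string is a made-up example showing that a lone digit stays inside a text chunk.

```python
import re

pattern = re.compile(
    r"""
    \d+\.\d+\.\d+          # digits, dot, digits, dot, digits
    |                      # or
    \D+                    # one or more non-digit characters
    (?:                    # non-capturing group, repeated as a whole
        \d                 # a single digit
        (?!\d*\.\d+\.\d)   # not followed by the rest of a digits-and-dots pattern
        \D*                # optional non-digit characters
    )*
    """,
    re.VERBOSE,
)

s = "oren;moish30.4.200/-/v6.99.5/barbi"
print(pattern.findall(s))

# Made-up input: the lone "1" is absorbed into the surrounding text chunk.
print(pattern.findall("abc1def2.3.4"))
```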

Example

import re

s = "oren;moish30.4.200/-/v6.99.5/barbi"
pattern = r"\d+\.\d+\.\d+|\D+(?:\d(?!\d*\.\d+\.\d)\D*)*"
print(re.findall(pattern, s))

Output

['oren;moish', '30.4.200', '/-/v', '6.99.5', '/barbi']
