Python Regex – A lookbehind assertion has to be fixed width

Question:

I want to extract a certain string from a path. The wanted string is always preceded by either 0_ASW or 10_BSW. Additionally, the sought string consists of only letters and numbers.

So for example from the following 3 paths I want to extract strings Mod2000, ModA and ModB:

C:\MyPath\0_ASW\Mod2000
C:\MyPath\10_BSW\ModA\SubDir
C:\MyPath\10_BSW\ModB

For that I have written a regex using Positive Lookbehind:

\\(?<=(0_ASW|10_BSW)\\)([A-Za-z0-9]+)

With this regex the 2nd group matches the sought string correctly, and I am able to compile the regex in .NET (C#) without any errors. However, once I try to compile it in Python I get the following regex error: A lookbehind assertion has to be fixed width

From my understanding, the two words in the positive lookbehind, i.e. 0_ASW and 10_BSW, ought to have a fixed length. The error is not clear to me because each word has a fixed length, 5 and 6 characters respectively. If I make the two strings equal in length, e.g. the 3-character strings ASW and BSW, the regex compiles without the above error.

\\(?<=(ASW|BSW)\\)([A-Za-z0-9]+)

How do I fix this regex so that it compiles in Python as well?

You can find the demos here:

https://regex101.com/r/qfwfJJ/1

https://regex101.com/r/zAVk5Z/1

Asked By: SimpleThings


Answers:

\\((0_ASW|10_BSW)\\)([A-Za-z0-9]+)

https://regex101.com/r/e7vH34/1

Answered By: tomasborrella

You could also use a non-capturing group:

\\(?:0_ASW|10_BSW)\\(\w+)

https://regex101.com/r/hYCRJf/1

If the regex matches, you’ll get the desired string in group(1).
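As a minimal sketch of how this could look in Python's built-in re module: the sample paths are reconstructed from the question, and the names PATHS, MODULE_RE and extract_module are made up for illustration. Note that \w also matches an underscore, which is slightly broader than the [A-Za-z0-9] class in the question.

```python
import re

# Sample paths as described in the question (raw strings keep the backslashes).
PATHS = [
    r"C:\MyPath\0_ASW\Mod2000",
    r"C:\MyPath\10_BSW\ModA\SubDir",
    r"C:\MyPath\10_BSW\ModB",
]

# Non-capturing group for the prefix, capturing group for the wanted name.
MODULE_RE = re.compile(r"\\(?:0_ASW|10_BSW)\\(\w+)")


def extract_module(path):
    """Return the name that follows \\0_ASW\\ or \\10_BSW\\, or None."""
    match = MODULE_RE.search(path)
    return match.group(1) if match else None


modules = [extract_module(p) for p in PATHS]
print(modules)  # ['Mod2000', 'ModA', 'ModB']
```

Because no lookbehind is involved, the differing lengths of 0_ASW and 10_BSW do not matter here.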

Answered By: Eric Duminil

You can split the lookbehind into two separate lookbehinds inside an alternation. In Python each lookbehind has to be fixed width, which the alternatives in your pattern are not.

\b(?:(?<=\\0_ASW\\)|(?<=\\10_BSW\\))[A-Za-z0-9]+

See a regex101 demo.
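The following sketch, using only the standard re module, illustrates the difference: the pattern from the question is rejected at compile time, while the version with one lookbehind per alternative compiles and matches. The helper name compiles and the constant names are hypothetical, and the sample paths are reconstructed from the question.

```python
import re

# Pattern from the question: one lookbehind whose alternatives differ in length.
QUESTION_PATTERN = r"\\(?<=(0_ASW|10_BSW)\\)([A-Za-z0-9]+)"

# One fixed-width lookbehind per alternative, combined with an alternation.
SPLIT_PATTERN = r"\b(?:(?<=\\0_ASW\\)|(?<=\\10_BSW\\))[A-Za-z0-9]+"


def compiles(pattern):
    """Return True if the re module accepts the pattern, False otherwise."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


paths = [
    r"C:\MyPath\0_ASW\Mod2000",
    r"C:\MyPath\10_BSW\ModA\SubDir",
    r"C:\MyPath\10_BSW\ModB",
]

# The whole match is the wanted string, so group(0) is enough.
results = [re.search(SPLIT_PATTERN, p).group(0) for p in paths]
print(compiles(QUESTION_PATTERN), compiles(SPLIT_PATTERN), results)
```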


If you can make use of the PyPI regex module, you can match the prefix first and then use \K to forget what has been matched so far:

\\(?:0_ASW|10_BSW)\\\K[A-Za-z0-9]+

See another regex101 demo.
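A rough sketch of the \K variant, assuming the third-party regex package is installed (pip install regex); it is not part of the standard library, so the import is guarded. The names K_PATTERN and extract_with_k are hypothetical.

```python
try:
    import regex  # third-party package from PyPI, not the built-in re module
except ImportError:
    regex = None

# \K discards everything matched so far, so only the name remains in the match.
K_PATTERN = r"\\(?:0_ASW|10_BSW)\\\K[A-Za-z0-9]+"


def extract_with_k(path):
    """Return the name after the prefix using \\K, or None if nothing matches."""
    if regex is None:
        raise RuntimeError("the third-party 'regex' module is not installed")
    match = regex.search(K_PATTERN, path)
    return match.group(0) if match else None


if regex is not None:
    print(extract_with_k(r"C:\MyPath\10_BSW\ModA\SubDir"))
```

With this approach the wanted string is the whole match, so no capturing group is needed.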

Answered By: The fourth bird