Which regular expression do I have to implement to extract text between two lines containing a string and an arbitrary number of digits?
Question:
This is the code I have:
import re
text = 'LIBRO 1\ndsfsdf\nasdas\nfgfghf\nLIBRO 21\nhghj\nghjhjk\nghjhk\nLIBRO 333'
result = re.findall(r'(?<=LIBRO \d+\n)(.*?)(?=\nLIBRO)', text, re.DOTALL)
print(result)
And this is the error I get:
re.error: look-behind requires fixed-width pattern
The desired result is:
['dsfsdf\nasdas\nfgfghf', 'hghj\nghjhjk\nghjhk']
Answers:
You could use re.split instead of re.findall, removing the empty entries from the results, as there would otherwise be an entry for what comes before the first LIBRO line:
result = [s.strip() for s in re.split(r'(?m)^LIBRO \d+$', text) if s]
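For context, the error occurs because Python's built-in re module only accepts look-behind patterns that match a fixed number of characters, and \d+ can match any number of digits. If you would rather keep findall, one possible alternative is to match the header line as ordinary pattern text and capture only the body, so that no look-behind is needed. The following is a minimal sketch, assuming the sections are separated by newline characters; the names SECTION, extract_sections and sample are made up for illustration and are not part of the answer above.

```python
import re

# Match a 'LIBRO <digits>' header line, then lazily capture everything up to
# (but not including) the newline that precedes the next header line.
# The header is consumed as ordinary pattern text, so no look-behind is needed.
SECTION = re.compile(r'^LIBRO \d+\n(.*?)(?=\nLIBRO \d+$)', re.DOTALL | re.MULTILINE)

def extract_sections(text):
    """Return the text found between consecutive 'LIBRO <digits>' lines."""
    # With exactly one capturing group, findall returns the captured strings.
    return SECTION.findall(text)

# Sample input, assuming newline-separated lines.
sample = 'LIBRO 1\ndsfsdf\nasdas\nfgfghf\nLIBRO 21\nhghj\nghjhjk\nghjhk\nLIBRO 333'
print(extract_sections(sample))
```

Because the look-ahead does not consume the next header, that header is still available as the start of the following match. Note that, unlike the split approach, this sketch ignores any text that comes after the last LIBRO line.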