Catastrophic backtracking error with any single character or number?
Question:
I know the title is not as objective as it should be. I don't understand why the error below occurs with the Python flavor on the regex101 website.
To explain what I'm trying to do: I have to match any number after "item", followed by everything up to "consumo estimado".
Regex:
^item\s*(\d{0,})(.*?)consumo
Example text:
ITEM 1 – AGULHA DE PUNÇÃO
Agulha de punção 18 ga x 70 mm
Consumo Estimado Anual: 284
Ampla Participação
ITEM 2 – CATETER ANGIOGRAFICO PIGTAIL
Cateter angiográfico diagnóstico pigtail 5f x 100 cm
Consumo Estimado Anual: 210
Ampla Participação
ITEM 3 – Próteses Vasculares Dracon Reta 80 Cm
PROTESES VASCULARES ANELADA – Enxerto vascular reto constituído
em politetrafluoretileno (PTFE) extrudado e expandido construído com
suporte externo anelado que aumentam a resistência mecânica.
Tamanho
aproximado 8mm (diâmetro) x 70 -80 cm (comprimento)
Consumo Estimado Anual: 34
Ampla Participação
But after entering the word "consumo" followed by a space, I can't add anything else without getting "catastrophic backtracking".
Example regexes with the error:
^item\s*(\d{0,})(.*?)consumo e
^item\s*(\d{0,})(.*?)consumo 1
The solution was to use .*? to capture everything between "consumo" and "estimado", which worked properly.
^item\s*(\d{0,})(.*?)consumo.*?estimado
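For reference, the workaround can be checked outside regex101 with a minimal Python sketch. The flags (case-insensitive, multiline, dot matches newline) are an assumption about the regex101 settings, which are not stated here, and the text is a shortened copy of ITEM 3 from the example.

```python
import re

# Assumed flags: the regex101 settings are not given in the post.
FLAGS = re.IGNORECASE | re.MULTILINE | re.DOTALL

# Shortened copy of ITEM 3 from the example text.
TEXT = (
    "ITEM 3 – Próteses Vasculares Dracon Reta 80 Cm\n"
    "aproximado 8mm (diâmetro) x 70 -80 cm (comprimento)\n"
    "Consumo Estimado Anual: 34\n"
    "Ampla Participação\n"
)

# The workaround pattern, with its backslashes written out.
workaround = re.compile(r"^item\s*(\d{0,})(.*?)consumo.*?estimado", FLAGS)

match = workaround.search(TEXT)
number = match.group(1)               # the digits right after "ITEM"
description = match.group(2).strip()  # everything up to "Consumo"

print(number)
print(description)
```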
Why is this error occurring? I couldn’t find any explanation for it.
I already have the solution for the problem, but I just wanna know why the error happened.
https://regex101.com/r/uqm7ra/1
Edit 1:
As suggested, I have added the link to the current saved regex with the problem.
Edit 2:
As suggested, I also have tried to follow the "meta" when asking for anything here in Stack Overflow. Thanks for the advice!
I hope the question is better now.
Answers:
\d{0,} looks iffy: the regex engine will retry with fewer and fewer digits, which can be catastrophic. Anchor it with (\D.*?)?consumo to prevent that.
Also, if you want a number, you mean {1,} (or the more idiomatic and brief +; similarly, {0,} is customarily written *).
^item\s*(\d+)(\D.*?)?consumo
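As a rough illustration of this suggestion, here is a minimal Python sketch. The flags are an assumption about the regex101 setup, which the question does not state, and the "ITEM – sem número" line is a made-up input used only to show the difference between * and +.

```python
import re

# Assumed flags: ignore case, ^ matches at each line start, . matches newlines.
FLAGS = re.IGNORECASE | re.MULTILINE | re.DOTALL

# First two items from the question's example text.
SAMPLE = """ITEM 1 – AGULHA DE PUNÇÃO
Agulha de punção 18 ga x 70 mm
Consumo Estimado Anual: 284
Ampla Participação
ITEM 2 – CATETER ANGIOGRAFICO PIGTAIL
Cateter angiográfico diagnóstico pigtail 5f x 100 cm
Consumo Estimado Anual: 210
Ampla Participação
"""

# Suggested pattern: at least one digit, then an optional part that must
# start with a non-digit, so the digits cannot be split between the groups.
suggested = re.compile(r"^item\s*(\d+)(\D.*?)?consumo", FLAGS)
item_numbers = [m.group(1) for m in suggested.finditer(SAMPLE)]

# \d* (same as \d{0,}) accepts zero digits; \d+ (same as \d{1,}) does not.
no_number = "ITEM – sem número"  # hypothetical line without a number
star_match = re.match(r"item\s*(\d*)", no_number, FLAGS)
plus_match = re.match(r"item\s*(\d+)", no_number, FLAGS)

print(item_numbers)
print(repr(star_match.group(1)), plus_match)
```

With \d*, the number group can silently come back empty, whereas \d+ refuses the line outright, which is usually what you want when a number is required.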