Your regex runs into catastrophic backtracking because you have nested quantifiers (`([...]+)*`). Since your regex requires the string to end in `/` (which fails on your example), the regex engine tries every possible way of dividing the string between the repetitions in the vain hope of finding a matching combination. That’s where it gets stuck.
To illustrate, let’s assume `"A*BCD"` as the input to your regex and see what happens:
- `(\w+)` matches `A`. Good.
- `\*` matches `*`. Yay.
- `[\w\s]+` matches `BCD`. OK.
- `/` fails to match (no characters left to match). OK, let’s back up one character.
- `/` fails to match `D`. Hum. Let’s back up some more.
- `[\w\s]+` matches `BC`, and the repeated `[\w\s]+` matches `D`.
- `/` fails to match. Back up.
- `/` fails to match `D`. Back up some more.
- `[\w\s]+` matches `B`, and the repeated `[\w\s]+` matches `CD`.
- `/` fails to match. Back up again.
- `/` fails to match `D`. Back up some more, again.
- How about `[\w\s]+` matches `B`, repeated `[\w\s]+` matches `C`, repeated `[\w\s]+` matches `D`? No? Let’s try something else.
- `[\w\s]+` matches `BC`. Let’s stop here and see what happens.
- Darn, `/` still doesn’t match `D`.
- `[\w\s]+` matches `B`.
- Still no luck. `/` doesn’t match `C`.
- Hey, the whole group is optional `(...)*`.
- Nope, `/` still doesn’t match `B`.
- OK, I give up.
Now that was a string of just three letters after the `*`. Yours had about 30, and the number of combinations to try roughly doubles with every added character, which would keep your computer busy until the end of days.
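To put a rough number on that, here is a small sketch that counts how many ways a run of `n` characters can be divided among repeated `[\w\s]+` matches. The function name `count_splits` is made up for illustration, and this counts only the full-length divisions (like `BCD`, `BC|D`, `B|CD`, `B|C|D` in the walk-through), not the extra attempts with shorter prefixes, so the real engine does somewhat more work than this.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_splits(n: int) -> int:
    """Count the ways to cut n characters into consecutive non-empty chunks."""
    if n == 0:
        return 1  # nothing left to divide: exactly one way to finish
    # The first chunk takes k characters; the remaining n - k are divided recursively.
    return sum(count_splits(n - k) for k in range(1, n + 1))


if __name__ == "__main__":
    for n in (3, 10, 20, 30):
        print(n, count_splits(n))
```

For three characters this gives 4, matching the four divisions tried above; the count doubles with each extra character, so 30 characters already yield `2**29` divisions.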
I suppose what you’re trying to do is to get the strings before/after `*`, in which case, use:
pattern = r"(\w+)\*([\w\s]+)$"
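A minimal usage sketch of that pattern follows. The helper name `split_on_star` and the sample inputs are assumptions for illustration, not taken from your code. Because the nested quantifier is gone, a non-matching input fails after a handful of backtracking steps instead of hanging.

```python
import re

pattern = r"(\w+)\*([\w\s]+)$"


def split_on_star(text):
    """Return (before, after) around the '*' or None if the pattern does not match."""
    match = re.match(pattern, text)
    return match.groups() if match else None


if __name__ == "__main__":
    print(split_on_star("A*BCD EFG"))            # matches: two captured groups
    print(split_on_star("A*" + "B" * 30 + "/"))  # trailing '/' is not allowed: no match
```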