in reply to Re: pattern match hangs on malformed UTF-8 input
in thread pattern match hangs on malformed UTF-8 input

I can confirm that this hangs for me under perl-5.6.1, but runs fine under 5.8.0.

The problem is a bug in the regexp optimiser, but I'm not sure off the top of my head which bug report it corresponds to: #4541 is a possibility. (You can browse the bugs database or look up specific bugs at http://rt.perl.org/perlbug.)

Working around it doesn't appear to be easy: the best I could come up with was a convoluted attempt to convince the regexp engine that it doesn't know what class the first matched character can be:

    s/(?=\d)\D*\d //;

which succeeds, and shouldn't be a lot slower than the original pattern would have been without optimisation.

Hugo