
It's also possible to create a regexp that crashes Perl.

Under some conditions involving Unicode in the pattern, there was a buffer overflow when the string being matched against the pattern was not upgraded correctly.

There were patches for that overflow, and most operating systems should have been updated by now.

But it demonstrates that regexes are still a bit more fragile than normal scalars, and thus you should be extra careful.

The only reliable way around regexes that take exponential time is to restrict the search time, and kill the process if it doesn't stop.

Re^3: how to restrict a regexp?
by tfoertsch (Beadle) on Mar 17, 2008 at 12:15 UTC
    Thanks for your answers. I have now wrapped the regexp in a block with "no re 'eval'" at the start. It is then further wrapped in a nested eval block with an alarm set to avoid the exponential time issue.

      This came up here quite a while back, and at that time someone pointed out that alarm/sleep would not work in this context: they were suspended during the regex execution. I do not know if this is still the case, as the regex engine has had a pretty major overhaul since then and the internal details were beyond me at the time.

      It is then further wrapped in a nested eval block with an alarm set to avoid the exponential time issue.

      I was going to say that won't work because the signal won't be processed until after the match or substitution operator is done when using safe signals. Turns out it does.

      $ date; perl -e'"aaaaaaaaaaaaaaaaaaaaaa" =~ /a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?[b]/;'; date
      Mon Mar 17 16:44:46 PDT 2008
      Mon Mar 17 16:44:54 PDT 2008
      $ date; perl -e'alarm(1); "aaaaaaaaaaaaaaaaaaaaaa" =~ /a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?a?[b]/;'; date
      Mon Mar 17 16:45:02 PDT 2008
      Alarm clock
      Mon Mar 17 16:45:03 PDT 2008

      Make sure to test this with your Perl. Use any equal number of "a"s and "a?"s as long as Perl takes more than 1 sec to execute the match. (You can Ctrl-C if it takes too long.)
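
Note that the one-liner above installs no ALRM handler, so the alarm's default action ends the whole process (hence "Alarm clock"). If the rest of the program has to survive, one option is to run the match in a child process and let the alarm kill only that child. The following is a rough sketch under that assumption, not a tested recipe: match_with_timeout and its status strings are names made up for illustration, and it needs a platform with a real fork.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: runs the match in a child process.
# Returns 'match', 'no match', 'error' (pattern failed to compile)
# or 'timeout' (child was killed by a signal, e.g. SIGALRM).
sub match_with_timeout {
    my ($string, $pattern, $seconds) = @_;

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: make sure no inherited handler is in place, so the
        # default action (terminate the process) applies.
        $SIG{ALRM} = 'DEFAULT';
        alarm($seconds);

        no re 'eval';    # the default, stated explicitly
        my $matched = eval { $string =~ /$pattern/ ? 1 : 0 };

        # Exit status: 0 = match, 1 = no match, 2 = pattern error.
        POSIX::_exit(defined $matched ? ($matched ? 0 : 1) : 2);
    }

    # Parent: wait for the child and decode its status.
    waitpid($pid, 0);
    return 'timeout' if $? & 127;    # low 7 bits hold the killing signal
    my $code = $? >> 8;
    return $code == 0 ? 'match'
         : $code == 1 ? 'no match'
         :              'error';
}

# The pattern from the one-liner above; whether it actually times out
# depends on the Perl version and the machine.
my $n = 22;
print match_with_timeout('a' x $n, ('a?' x $n) . '[b]', 1), "\n";
```

The child reports its result through the exit status only, which keeps the sketch short; returning captured groups would need a pipe or similar.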

      I have now wrapped the regexp in a block with "no re 'eval'" at start.

      Just to be clear, no re 'eval'; is the default so it's not needed, but it sure is a good practice to include it here.
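
To see what that default protects against, here is a small sketch. The pattern string is invented for illustration; it tries to smuggle a code block in through an interpolated variable, and Perl refuses to compile it at run time.

```perl
use strict;
use warnings;

# Hypothetical untrusted input: a pattern that tries to embed Perl code.
my $user_pattern = '(?{ print "code ran\n" })x';

my $matched = eval {
    no re 'eval';    # the default, stated explicitly
    "x" =~ /$user_pattern/ ? 1 : 0;
};
my $err = $@;

# With the default in effect the match dies inside the eval,
# so $matched stays undefined and $err holds the reason.
print defined $matched
    ? "pattern was accepted\n"
    : "pattern was rejected: $err";
```

The rejection only goes away if use re 'eval'; is in scope, which is exactly what you do not want for patterns that come from users.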