Otogi has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to time out a regex that takes over a second to complete, but the alarm never seems to go off. I am fairly sure that some of the regexes, when run against large files, would definitely take over a second. Can anyone give me a particular regex and string that would take a long time to match, so I can test the timeout, or tell me what I might be doing wrong? Here is the part of the code in question. I am using Perl 5.8.7. Thank you.
eval {
    local $SIG{ALRM} = sub { $signal = 'alarm'; die 'alarm'; };
    alarm(1);
    @a = ($input =~ /$regex/xg);
    alarm(0);
};
if ($@ =~ /alarm/ || $signal eq 'alarm') {
    $error = "regex match timed out";
    print "$error\n";
    die;
}

Update: The regex example given by Hue-Bond sets off the alarm, but the examples given by Ieronim and wfsp do not. Does anyone know why that is the case?

Replies are listed 'Best First'.
Re: Timeout alarm for regex
by Hue-Bond (Priest) on Jul 27, 2006 at 16:53 UTC

    Take into account that signal behaviour changed in Perl 5.7.3. From then on, signals are deferred and aren't delivered to the program until perl decides it's safe to do so. If you want to get signals immediately, you'll have to set the environment variable PERL_SIGNALS to the value "unsafe". You can read more about this in perlipc, "Deferred Signals (Safe Signals)".

    Update: There's a slow regex in perlre. Try this code with and without the environment variable PERL_SIGNALS:

    $SIG{'ALRM'} = sub { die "alrm received"; };
    alarm 2;
    'aaaaaaaaaaaa' =~ /((a{0,5}){0,5}){0,5}[c]/;
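An in-process alternative to setting PERL_SIGNALS is to install the handler through POSIX::sigaction, whose handlers are delivered immediately unless the SigAction object's safe flag is set. The following is only a minimal sketch under that assumption: match_with_timeout is a hypothetical helper name, not something from the original post, and interrupting the regex engine in mid-match is "unsafe" in exactly the sense perlipc warns about, so it is probably best kept to test scripts.

```perl
use strict;
use warnings;
use POSIX qw(SIGALRM);

# Hypothetical helper: returns the list of matches, or dies with
# "regex timeout\n" if the match runs longer than $seconds.
sub match_with_timeout {
    my ($input, $regex, $seconds) = @_;
    my @matches;

    # A handler installed via POSIX::sigaction is not deferred
    # (the SigAction "safe" flag is left off on purpose).
    my $action = POSIX::SigAction->new(sub { die "regex timeout\n" });
    my $old    = POSIX::SigAction->new;
    POSIX::sigaction(SIGALRM, $action, $old);

    my $ok = eval {
        alarm $seconds;
        @matches = ($input =~ /$regex/g);
        alarm 0;
        1;
    };
    my $err = $@;

    alarm 0;                              # make sure no alarm is left pending
    POSIX::sigaction(SIGALRM, $old);      # restore the previous handler
    die $err unless $ok;
    return @matches;
}
```

Calling it inside an eval, for example eval { @a = match_with_timeout($input, $regex, 1) }, lets the caller check $@ for "regex timeout" much like the code in the question does.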

    --
    David Serrano

      The alarm does go off for the example you gave me. However, a regex given by Ieronim below:
      my $re = ('a*' x80).'(b|c)'; my $str = "a" x 80; $str =~ /$re/;
      never sets off the alarm. Any idea why that is the case? Thanks.

        Works fine here:

        $ perl
        $SIG{'ALRM'} = sub { die "alrm received"; };
        alarm 2;
        my $re = ('a*' x80).'(b|c)';
        my $str = "a" x 80;
        $str =~ /$re/;
        __END__
        ... runs forever ...
        $ PERL_SIGNALS=unsafe perl
        $SIG{'ALRM'} = sub { die "alrm received"; };
        alarm 2;
        my $re = ('a*' x80).'(b|c)';
        my $str = "a" x 80;
        $str =~ /$re/;
        __END__
        alrm received at - line 1.

        --
        David Serrano

Re: Timeout alarm for regex
by wfsp (Abbot) on Jul 27, 2006 at 17:09 UTC
    Can anyone give me a particular regex against a string that would take a long time to complete...
    From Programming Perl:

    $_ = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab";
    /a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*a*[b]/;
    "... If you remove the "b" from the string the pattern will probably run for many, many years before failing. Many, many millennia. Actually, billions and billions of years.*"

    later

    "* Actually, it's more in the order of septillions and septillions. We don't know how long it would take. We didn't care to wait around watching it not fail. In any event, your computer is likely to crash before the heat death of the universe, and this regular expression takes longer than either of those."

    Any help? :-)

    Apologies to O'Reilly for any infringements.

Re: Timeout alarm for regex
by starbolin (Hermit) on Jul 28, 2006 at 03:18 UTC

    From perlipc for v5.8.5:

    As the Perl interpreter only looks at the signal flags when it is about to execute a new opcode, if a signal arrives during a long-running opcode (e.g. a regular expression operation on a very large string), the signal will not be seen until the operation completes.

    Also, think about what you are asking the OS to do: first you tell the OS to interrupt you in x seconds, then you hog all the CPU cycles so nothing else gets done. On a preemptive multitasker this will work, but performance can still suffer.
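One way around the long-running-opcode problem that keeps signals deferred is to run the match in a child process and let the parent do the timing. The parent then sits in waitpid rather than inside the regex engine, so its alarm handler can run on time. A rough sketch, assuming fork is available on the platform; run_in_child_with_timeout is a hypothetical name, and this version only reports whether the work finished, since getting the matches back would need a pipe:

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: runs $code in a forked child and returns 1 if it
# finished within $seconds, or 0 if the child had to be killed.
sub run_in_child_with_timeout {
    my ($code, $seconds) = @_;

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: do the potentially slow work, then exit without
        # running END blocks or flushing the parent's buffers again.
        $code->();
        POSIX::_exit(0);
    }

    # Parent: wait for the child, but give up when the alarm fires.
    my $finished = eval {
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm $seconds;
        waitpid($pid, 0);
        alarm 0;
        1;
    };
    alarm 0;

    unless ($finished) {
        kill 'KILL', $pid;    # stop the runaway match
        waitpid($pid, 0);     # reap the child so no zombie is left
        return 0;
    }
    return 1;
}
```

A call such as run_in_child_with_timeout(sub { $str =~ /$re/ }, 1) could then be used to check whether a given pattern finishes within a second.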


    s//----->\t/;$~="JAPH";s//\r<$~~/;{s|~$~-|-~$~|||s |-$~~|$~~-|||s,<$~~,<~$~,,s,~$~>,$~~>,, $|=1,select$,,$,,$,,1e-1;print;redo}
Re: Timeout alarm for regex
by Ieronim (Friar) on Jul 27, 2006 at 17:07 UTC
    A particular never-ending regex:
    my $re = ('a*' x80).'(b|c)'; my $str = "a" x 80; $str =~ /$re/;

         s;;Just-me-not-h-Ni-m-P-Ni-lm-I-ar-O-Ni;;tr?IerONim-?HAcker ?d;print