Note that running user defined regexes is HORRIBLY UNSAFE as the user may embed any perl code he wishes in the regex.

Not true, at least by default. Perl refuses to compile a (?{ ... }) code block that reaches the pattern through run-time interpolation unless you explicitly enable it with use re 'eval'. Think of it as tainting for regexps.

The following script shows this:

    #! /usr/local/bin/perl -w

    use strict;

    my $re = shift || '.';
    $re = qr/$re/;

    while( <DATA> ) {
        print if /$re/;
    }

    __DATA__
    Owing to changes in recent perls (5.8+ I believe), signals no longer
    interrupt a single opcode's execution. A regex is a single opcode, so
    the alarm never interrupts it. One solution, as mentioned above,
    is to use unsafe signals, although I am unsure if it is merely an
    ENV variable or a compile option. As the name says, these are
    potentially unsafe as a signal may interrupt an opcode that isn't
    interruptible and thus crash perl, but this is a very rare case.

When run, the above produces the following output:

    % ./extreg '\bs.*ls\b'
    Owing to changes in recent perls (5.8+ I believe), signals no longer
    is to use unsafe signals, although I am unsure if it is merely an

    % ./extreg '(?{system "rm -rf *"})'
    Eval-group not allowed at runtime, use re 'eval' in regex m/(?{system "rm -rf *"})/ at ./extreg line 6.

Perl may be crazy at times, but it is not insane. But yeah, you are right though, it does make me nervous.
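The "Eval-group not allowed" failure is an ordinary exception raised when the pattern is compiled, so a program that accepts patterns from users can trap it instead of dying. Here is a minimal sketch of that idea; compile_user_regex is a hypothetical helper name, not something from the script above, and it only works as intended when use re 'eval' is not in scope.

```perl
use strict;
use warnings;

# Hypothetical helper: compile an untrusted pattern.
# Returns (qr object, undef) on success, (undef, error text) on failure.
# Assumes "use re 'eval'" is NOT in effect in this scope.
sub compile_user_regex {
    my ($pattern) = @_;
    my $re = eval { qr/$pattern/ };    # compile errors die; eval traps them
    return (undef, $@) unless defined $re;
    return ($re, undef);
}

for my $pattern ('\bs.*ls\b', '(?{ print "gotcha" })', '(unbalanced') {
    my ($re, $err) = compile_user_regex($pattern);
    print defined $re ? "accepted: $pattern\n" : "rejected: $pattern\n";
}
```

This rejects code blocks and plain syntax errors alike, but it says nothing about how long an accepted pattern will take to run.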

- another intruder with the mooring in the heart of the Perl

Re^3: Losing control of large regular expressions
by itub (Priest) on Jan 12, 2005 at 18:34 UTC
    It is true that Perl protects you by default against arbitrary code execution in regular expressions. However, it does not protect you against denial of service, because a regular expression may be crafted not to finish before the heat death of the universe. To give a simple example, based on perlre, the following takes over 1 min on my machine, and the execution time increases exponentially with string length:

    perl -le 'print scalar "12345678901234" =~ /((.{0,5}){0,5}){0,5}[\0]/'
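Since an alarm in the same process may not interrupt a running match, one way to bound the damage is to run the match in a child process and kill the child when time runs out. The following is only a sketch under some assumptions: match_with_timeout is a hypothetical helper, it relies on a real fork (so a Unix-like system), and it pays the cost of one fork per match. Whether the perlre-style pattern actually runs long depends on the perl version, since newer regex optimizers may reject it quickly.

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical helper: returns 1 (match), 0 (no match), or undef if the
# child was killed (timeout) or otherwise died from a signal.
sub match_with_timeout {
    my ($string, $re, $timeout) = @_;

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: report the result through the exit status.
        # _exit skips END blocks and avoids flushing inherited buffers.
        POSIX::_exit($string =~ $re ? 0 : 1);
    }

    # Parent: waitpid is interruptible, so the alarm handler gets to run.
    local $SIG{ALRM} = sub { kill 'KILL', $pid };
    alarm $timeout;
    waitpid($pid, 0);
    alarm 0;

    return undef if $? & 127;           # child was terminated by a signal
    return ($? >> 8) == 0 ? 1 : 0;
}

my $result = match_with_timeout(
    "12345678901234", qr/((.{0,5}){0,5}){0,5}[\0]/, 2);
print defined $result ? "finished: $result\n" : "timed out\n";
```

The parent never runs the untrusted pattern itself, so its own signal handling stays safe; the price is that captures such as $1 are not available to the caller.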