richard5mith has asked for the wisdom of the Perl Monks concerning the following question:

I'm writing a little language in Perl (http://www.bearscript.com/docs/) and am adding support for doing regular expressions within the code.

The one I'm concerned about is the replace function, which is the equivalent of s///. This is because I want the user to be able to do backreferences, which as far as I can tell, based on another Perlmonks seeker of wisdom, is only possible using the /ee modifier.

This is essentially what I've got at the moment. This very simply converts a UK date to an ISO date.

    $date = "10-09-2004";
    print &replace($date, q"(\d+)-(\d+)-(\d+)", q"$3-$2-$1"), "\n";

    sub replace {
        my $value     = shift;
        my $this      = shift;
        my $with      = shift;
        my $modifiers = shift;

        $this = &pattern($this, $modifiers);

        if ($modifiers =~ /g/) {
            $value =~ s/$this/qq{qq{$with}}/gee;
        }
        else {
            $value =~ s/$this/qq{qq{$with}}/ee;
        }

        return $value;
    }

    sub pattern {
        my $pattern   = shift;
        my $modifiers = shift;

        if ($modifiers =~ /[^gism]/) {
            die "Only m, i, s and g are valid pattern modifiers.";
        }

        $modifiers =~ s/g//;

        return qr"(?$modifiers)$pattern";
    }

Now bear in mind that the first two lines there are essentially generated by my interpreter (well, it generates code that is equivalent to that), so it's always a q"" construct when passing the parameters to the functions (which are hand-written and not generated by the interpreter). So nothing is interpolated before it's passed through to the function.

My question is one of security. Since I'm essentially eval'ing the value of $with, what could they possibly include within the third parameter of the call to &replace that could do something nasty? I'm aware the user could peek at the Perl variables they normally don't see, but that's not an issue. And when I've tried to include calls to functions and other things within that parameter, I've not managed to get any unwanted side-effects. But maybe I'm just not creative enough. Am I missing something here, is this safe?

Replies are listed 'Best First'.
Re: Security with /ee modifier
by PodMaster (Abbot) on Sep 26, 2004 at 15:17 UTC
    The user can run arbitrary code (just like eval $tainted_data). If you want to allow that, I suggest you use the Safe module (example RegexLab (a wxPerl version)).
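For illustration, here is a minimal sketch of what the Safe suggestion could look like. It is not taken from RegexLab; the variable names and the file path are made up, and it assumes the compartment's default opcode mask is acceptable for the little language.

```perl
use strict;
use warnings;
use Safe;

# A compartment using Safe's default opcode mask
# (file system and process operations are not permitted).
my $cpt = Safe->new;

# Harmless expressions are evaluated normally.
my $sum = $cpt->reval('2 + 3');
print "sum: $sum\n";

# An operation outside the mask is refused when the code is compiled;
# reval returns undef and the reason is left in $@.
# The path is hypothetical and does not need to exist.
my $result = $cpt->reval('unlink "/nonexistent/demo-file"');
my $error  = $@;
print "refused: $error" if $error;
```

Safe restricts which operations may be compiled, not how long they run, so user code that loops forever would still hang inside the compartment.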

    If all you want to allow is backreferences, all you need is 1 eval, something like (untested):

        my %backrefs;
        my( @backrefs ) = $with =~ /\$(\d+)/g;
        if ($modifiers =~ /g/) {
            $value =~ s/$this/
                no strict 'refs';
                @backrefs{@backrefs} = map { ${$_} } @backrefs;
                my $ret = $with;
                $ret =~ s'\$(\d+)'$backrefs{$1}'g;
                undef %backrefs;
                $ret;
            /ge;
        } else {
            ...
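A runnable variation on the same idea, offered as a sketch rather than a drop-in for the code above: the user's template is never evaluated as Perl, and only $1, $2, ... are expanded. The helper names expand_template and replace_backrefs are invented for this example, and it assumes a group that did not take part in the match should expand to an empty string.

```perl
use strict;
use warnings;

# Hypothetical helper: expand $1, $2, ... in a template without any eval.
# $groups is an array ref of captured groups ($groups->[0] holds $1).
sub expand_template {
    my ($template, $groups) = @_;
    (my $out = $template) =~ s{\$(\d+)}{
        my $n = $1;
        ($n >= 1 && defined $groups->[$n - 1]) ? $groups->[$n - 1] : '';
    }ge;
    return $out;
}

# Hypothetical replacement for the replace() routine: a single /e, and the
# only code it runs is ours. Everything else in the template stays literal.
sub replace_backrefs {
    my ($value, $re, $template, $global) = @_;

    my $expand = sub {
        no strict 'refs';                       # symbolic access to $1, $2, ...
        my @groups = map { ${$_} } 1 .. $#+;    # copy captures before matching again
        return expand_template($template, \@groups);
    };

    if ($global) { $value =~ s/$re/$expand->()/ge }
    else         { $value =~ s/$re/$expand->()/e }

    return $value;
}

print replace_backrefs('10-09-2004', qr/(\d+)-(\d+)-(\d+)/, '$3-$2-$1'), "\n";
# prints 2004-09-10
```

Because the template is only scanned for a dollar sign followed by digits, quote punctuation or function calls in it come out as plain text.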

    MJD says "you can't just make shit up and expect the computer to know what you mean, retardo!"
    I run a Win32 PPM repository for perl 5.6.x and 5.8.x -- I take requests (README).
    ** The third rule of perl club is a statement of fact: pod is sexy.

      Interesting, I like that approach.

      Although any time I put arbitrary code in my example, it doesn't get run, even though I expected it to. Whatever I put, other than variable names, just gets printed (other function calls, built-in function names, anything).

        ...Whatever I put...
        It has to be Perl code that compiles, like

            $with = q|}; warn "\n# Hi" while 1; q{|;

        or

            $with = q|${warn "\n# hi" while 1}|;
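A harmless way to see the same effect, as a small sketch: the value of $with below is a made-up example of user input that hides an expression inside an array interpolation. The first evaluation builds the string qq{@{[ 6 * 7 ]}}, and the second evaluation interpolates it, which runs the expression.

```perl
use strict;
use warnings;

# Hypothetical user input: an expression wrapped in @{[ ... ]}.
my $with = '@{[ 6 * 7 ]}';

my $value = 'foobar';
$value =~ s/(foo)(bar)/qq{qq{$with}}/ee;

# The multiplication ran during the substitution.
print "$value\n";    # prints 42
```

Anything that compiles can stand in for 6 * 7, which is why bare function names only get printed while interpolated expressions get executed.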


Re: Security with /ee modifier
by Zaxo (Archbishop) on Sep 26, 2004 at 15:02 UTC

    I think some nasties could be concocted with unlink or rename.

    I'm not sure what you're doing with the nested quotations, s/$this/qq{qq{$with}}/gee. It seems like $with is all that's needed there to do what you want.

    This strikes me as a dangerous thing, not easily untainted.

    Added: Ah, now I see, the nested quotes isolate the replacement expression from actual execution. Nice!

    $ perl -e'$with=q(print "baz");$_="foobar";s/(foo)(bar)/qq(qq($with))/ee;print'
    print "baz"$
    That's much less dangerous than I thought at first.

    Added again - Quote punctuation can be inserted with dire effects,

    perl -e'$with=q/$2$1);print "baz";(/;$_="foobar";s/(foo)(bar)/qq(qq($with))/ee;print'
    baz$
    Uh-oh!

    After Compline,
    Zaxo

      Unlink just gets printed.

      $with on its own doesn't work. The output is 1985 (2004 minus 9 minus 10). One qq{} does the same. Two qq{}s give the correct output.

      As taken from another perlmonk's post.
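The three cases described here can be reproduced side by side. This is only a sketch with made-up variable names: with a bare $with or a single qq{}, the second evaluation sees the expression $3-$2-$1 and does the subtraction, while with two qq{} layers it sees qq{$3-$2-$1} and merely interpolates.

```perl
use strict;
use warnings;

my $with = '$3-$2-$1';
my $re   = qr/(\d+)-(\d+)-(\d+)/;

# Second eval runs the code  $3-$2-$1  (a subtraction).
(my $bare = '10-09-2004') =~ s/$re/$with/ee;

# qq{$with} yields the same string, so the second eval subtracts again.
(my $one = '10-09-2004') =~ s/$re/qq{$with}/ee;

# Second eval runs the code  qq{$3-$2-$1}  (string interpolation only).
(my $two = '10-09-2004') =~ s/$re/qq{qq{$with}}/ee;

print "$bare $one $two\n";    # prints 1985 1985 2004-09-10
```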

Re: Security with /ee modifier
by nobull (Friar) on Sep 26, 2004 at 20:28 UTC
    Others have answered the OP's original question about security but can I please ask people to stop promulgating this...
    s/$this/qq{qq{$with}}/eeg;
    It's wrong (even assuming $with can be trusted not to be malicious). It breaks even if $with contains an innocent closing brace. If you want to do this it's more resilient with a here doc.
        s{$this}{
            chop( my $r = eval "<<_END_\n$with\n_END_" );
            die $@ if $@;
            $r;
        }eg;
      Which *is* more resilient, but not bulletproof, as I've pointed out. It breaks if $with contains "\n_END_\n" or starts with "_END_\n".
        It can be made safer by escaping this case.
            print <<"_END_";
            It works!
            \_END_
            It really does.
            _END_
        which prints:
        It works!
        _END_
        It really does.
        

        It works because apparently you can escape underscores with a backslash, and still get just an underscore. If you don't trust this Perl feature (I can't say I've seen it documented anywhere), you might feel safer using something else as a delimiter, something that actually starts with a \W character, like "*END*".

            print <<"*END*";
            It works!
            \*END*
            It really does.
            *END*

        There isn't even a need to try and find something uniqueish. A plain "*" will do. The complete code can then become:

            $with =~ s/^\*$/\\*/mg;
            s{$this}{
                my $r = eval qq[<<"*"\n$with\n*\n];
                die $@ if $@;
                chop $r;
                $r;
            }eg;

      What you really should use is a &qquote function a la Data::Dumper. Such a function makes sure that the delimiter isn't in the input string.

      ihb

      Read argumentation in its context!