Pirax has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

In the perldoc for the re pragma, the section on use re 'taint' states:

This feature is useful when regexp operations on tainted data aren't meant to extract safe substrings, but to perform other transformations.
and additionally:
...values returned by the m// operator in list context...

So operation like this:
    my $var = $ARGV[0];    ## $var is tainted and has value 'aabb'
    {
        use re 'taint';
        $var =~ m/(aa)(bb)/;
    }
does not remove the taint flag from $1 and $2. Everything as expected.

But similar operation:

    my $var = $ARGV[0];    ## $var is tainted and has value 'aabb'
    {
        use re 'taint';
        $var =~ s/(aa)(bb)/$1/;
    }
removes the taint flag from both $1 and $2.

Shouldn't this produce the same result? I know the documentation says "values returned by the m// operator" but...

Or maybe there is another way for s/// operator to behave like m// with use re 'taint'?

Re: use re 'taint' with s/// operator
by BrowserUk (Patriarch) on Nov 19, 2010 at 13:29 UTC
    Or maybe there is another way for s/// operator to behave like m// with use re 'taint'?

    The idea of use re 'taint'; is to allow you to break up a tainted string into smaller pieces that will subsequently require being untainted separately. It allows the process of validation to be done safely in separate chunks.

    It doesn't make much sense to replace bits of a tainted string with other bits and continue to consider it tainted. It would mean what? That you didn't know what you'd replaced things with?

    I think if you explained why you're asking the question, you're more likely to get an answer that addresses the real problem here.
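    The split-then-validate workflow described above can be sketched roughly as follows. This is only an illustration: the 'user:host' input format, the variable names and the validation patterns are assumptions made up for the example, and a zero-length slice of $^X is used as a convenient source of taint. Run it as perl -T script.pl; without -T nothing is tainted and tainted() reports false for everything.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Assumption for the demo: appending a zero-length slice of $^X yields a
# tainted copy of a fixed string when running under -T.
my $input = 'alice:example.org' . substr($^X, 0, 0);

my ($user, $host);
{
    # Inside this scope, captures taken from tainted data stay tainted,
    # so splitting the string does not silently "validate" the pieces.
    use re 'taint';
    ($user, $host) = $input =~ /^([^:]+):(.+)$/;
}
printf "after split:      user tainted=%d, host tainted=%d\n",
    (tainted($user) ? 1 : 0), (tainted($host) ? 1 : 0);

# Outside the pragma's scope a capture is the usual way to obtain an
# untainted value, so each piece is validated on its own.
my ($clean_user) = $user =~ /^(\w+)$/;
my ($clean_host) = $host =~ /^([\w.-]+)$/;
printf "after validation: user tainted=%d, host tainted=%d\n",
    (tainted($clean_user) ? 1 : 0), (tainted($clean_host) ? 1 : 0);
```

    The point of the design is that the coarse split and the per-piece validation live in different lexical scopes, so only the second step can produce untainted values.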


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Well, it's more of a philosophical question than a real-life situation... Why would I like to "replace bits of a tainted string with other bits and continue to consider it tainted"? A simple example:
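      sub get_file_and_args {
          my ($path) = @_;
          my (@info, @args);
          while (1) {
              # Stop as soon as the current path exists on disk.
              last if (@info = stat($path));
              # Otherwise cut the trailing "/..." piece off $path;
              # stop when there is nothing left to cut.
              last if $path !~ s{^(/.+)(/+.*)}{$1};
              push(@args, $2);
          }
          return ($path, \@args);
      }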
      where $path = '/path/to/a///file/with///few//args'; and is tainted.

      Later on I want to force checking of both $path and @args values (by keeping them tainted) because I can't really be sure who is using them and how. I'm not discussing whether the same result can be achieved in some other, 'better' or more elegant way, because the answer is 'yes, of course!' - I just want to show that there might be a reason "to replace bits of a tainted string with other bits and continue to consider it tainted".
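      One possible way to get the m//-style behaviour here is to avoid s/// and use m// in list context under use re 'taint', since the documentation quoted in the question says the values returned that way remain tainted. The sketch below is an assumption-laden illustration, not a drop-in replacement: split_trailing_piece is a hypothetical helper name, the stat loop is left out, and the taint source is a zero-length slice of $^X. Run it as perl -T script.pl.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Hypothetical helper: peel the trailing "/..." piece off a path with m//
# in list context instead of s///. Under "use re 'taint'" the returned
# captures keep the taint of the input.
sub split_trailing_piece {
    my ($path) = @_;
    use re 'taint';
    my ($head, $tail) = $path =~ m{^(/.+)(/+.*)};
    return ($head, $tail);    # both undef when the pattern does not match
}

# Assumption for the demo: a zero-length slice of $^X is tainted under -T.
my $path = '/path/to/a///file/with///few//args' . substr($^X, 0, 0);

my ($head, $tail) = split_trailing_piece($path);
print "head: $head\n";
print "tail: $tail\n";
printf "head tainted=%d, tail tainted=%d\n",
    (tainted($head) ? 1 : 0), (tainted($tail) ? 1 : 0);
```

      Inside the original loop, $head would be assigned back to $path and $tail pushed onto @args.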

        Well, it's more of a philosophical question than a real-life situation...

        The only answer to that is, there is a difference in philosophy between you and the author of the pragma :)


      The idea of use re 'taint'; is to allow you to break up a tainted string into smaller pieces that will subsequently require being untainted separately. It allows the process of validation to be done safely in separate chunks.
      Really? In Perl land, there isn't such a thing as "untainting". Short of some XS code removing the flag, once a value is tainted, it remains tainted. "Untainting" variables just means assigning an untainted value to it.

      use re 'taint'; just makes sure that regexp derived values ($1, etc) are tainted as well.

      It doesn't make much sense to replace bits of a tainted string with other bits and continue to consider it tainted.
      I think it makes a lot of sense. If I replace bits of a tainted string, there are still bits that are tainted. Why shouldn't it still be tainted? After all, if I replace part of a tainted string with substr, the result is still tainted.

      Note that if one does

      $var =~ s/(.*)/$1/;
      the taintedness of $var does not change, regardless of whether use re 'taint'; is in effect or not. And so it should.
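      A small script along these lines can be used to check that claim. It is only a sketch: the %still_tainted hash and the taint source (a zero-length slice of $^X) are choices made for this example. Run it as perl -T script.pl; without -T tainted() reports false everywhere.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Assumption for the demo: a zero-length slice of $^X is tainted under -T.
my $var = 'aabb' . substr($^X, 0, 0);

my %still_tainted;
{
    use re 'taint';
    (my $copy = $var) =~ s/(.*)/$1/;    # substitute on a tainted copy
    $still_tainted{with_pragma} = tainted($copy) ? 1 : 0;
}
{
    no re 'taint';
    (my $copy = $var) =~ s/(.*)/$1/;
    $still_tainted{without_pragma} = tainted($copy) ? 1 : 0;
}

# Report whether the substituted string is still tainted in each scope.
for my $mode (sort keys %still_tainted) {
    print "$mode: target tainted after s/// = $still_tainted{$mode}\n";
}
```

      Note that this looks at the target of the substitution, not at $1, which is what the original question is about.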
        there isn't such a thing as "untainting".

        Really?

        "Untainting" variables just means assigning an untainted value to it.

        There you go, you just defined it.

        Hint. It's the value that is tainted, not the variable.
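        The value-versus-variable distinction can be illustrated with a few lines. This is a minimal sketch with made-up variable names, again using a zero-length slice of $^X as the taint source, and it needs perl -T to show anything interesting.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Assumption for the demo: a zero-length slice of $^X is tainted under -T.
my $var = 'abc' . substr($^X, 0, 0);
my $before = tainted($var) ? 1 : 0;    # the variable currently holds a tainted value

$var = 'a literal from the program itself';
my $after = tainted($var) ? 1 : 0;     # same variable, now holding an untainted value

print "before reassignment: tainted=$before\n";
print "after reassignment:  tainted=$after\n";
```

        Nothing was "cleaned" here; the variable simply received a new value that did not come from outside the program.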

Re: use re 'taint' with s/// operator
by Anonymous Monk on Nov 19, 2010 at 13:25 UTC
    Where are you checking the taint flag?

    #!/usr/bin/perl -T --

    use strict;
    use warnings;
    use Scalar::Util qw' tainted ';

    {
        use re qw' taint ';
        print '$^X tainted ', tainted($^X), "\n";
        print '$1 tainted ', tainted($1), "\n";
        print '"$1" tainted ', tainted("$1"), "\n";
        print '($^X=~/(.)/) tainted ', tainted($^X=~/(.)/), "\n";
        {print '(my($x)=$^X=~/(.)/) tainted ', tainted(my($x)=$^X=~/(.)/), "\n";}
        print '$1 tainted ', tainted($1), "\n";
        print '"$1" tainted ', tainted("$1"), "\n";
        {print '(my($x)=$^X=~s/(.)/$1/) tainted ', tainted(my($x)=$^X=~s/(.)/$1/), "\n";}
        print '$1 tainted ', tainted($1), "\n";
        print '"$1" tainted ', tainted("$1"), "\n";
    }
    print "\n", '='x33,"\n";
    {
        no re qw' taint ';
        print '$^X tainted ', tainted($^X), "\n";
        print '$1 tainted ', tainted($1), "\n";
        print '"$1" tainted ', tainted("$1"), "\n";
        print '($^X=~/(.)/) tainted ', tainted($^X=~/(.)/), "\n";
        {print '(my($x)=$^X=~/(.)/) tainted ', tainted(my($x)=$^X=~/(.)/), "\n";}
        print '$1 tainted ', tainted($1), "\n";
        print '"$1" tainted ', tainted("$1"), "\n";
        {print '(my($x)=$^X=~s/(.)/$1/) tainted ', tainted(my($x)=$^X=~s/(.)/$1/), "\n";}
        print '$1 tainted ', tainted($1), "\n";
        print '"$1" tainted ', tainted("$1"), "\n";
    }
    __END__
    $ perl -T re.taint.pl
    $^X tainted 1
    $1 tainted 0
    Use of uninitialized value $1 in string at re.taint.pl line 11.
    "$1" tainted 0
    ($^X=~/(.)/) tainted 0
    (my($x)=$^X=~/(.)/) tainted 1
    $1 tainted 0
    "$1" tainted 1
    (my($x)=$^X=~s/(.)/$1/) tainted 1
    $1 tainted 0
    "$1" tainted 1

    =================================
    $^X tainted 1
    $1 tainted 1
    Use of uninitialized value $1 in string at re.taint.pl line 25.
    "$1" tainted 1
    ($^X=~/(.)/) tainted 0
    (my($x)=$^X=~/(.)/) tainted 0
    $1 tainted 1
    "$1" tainted 0
    (my($x)=$^X=~s/(.)/$1/) tainted 0
    $1 tainted 0
    "$1" tainted 0