in reply to Re: pack() untaints data : bug or undocumented Perl 5.10 feature?
in thread pack() untaints data : bug or undocumented Perl 5.10 feature?

Perlsec starts off with enthusiastic claims as to its protections, but the text contains many caveats. Winnowing down to the grist, the following data sources are stated to be marked tainted.

All command line arguments, environment variables, locale information (see perllocale), results of certain system calls ("readdir()", "readlink()", the variable of "shmread()", the messages returned by "msgrcv()", the password, gcos and shell fields returned by the "getpwxxx()" calls), and all file input are marked as "tainted". Italics are mine.
This would seem to say that the results of all other system calls are not tainted.
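
For anyone who wants to check that list against their own perl, here is a minimal sketch using tainted() from the core Scalar::Util module. The helper name taint_label and the choice of samples are mine, not anything from perlsec, and the script has to be started with perl -T and one argument to show anything interesting.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Return "tainted" or "clean" for a single value (hypothetical helper).
sub taint_label {
    my ($value) = @_;
    return tainted($value) ? 'tainted' : 'clean';
}

warn "Taint mode is off; every sample will report clean.\n" unless ${^TAINT};

# A few values from sources perlsec lists, plus two it does not list.
my %samples = (
    'literal string'   => 'hello',
    'first argument'   => (defined $ARGV[0]   ? $ARGV[0]   : ''),
    'PATH environment' => (defined $ENV{PATH} ? $ENV{PATH} : ''),
    'time()'           => time(),
);

printf "%-18s %s\n", $_, taint_label($samples{$_}) for sort keys %samples;
```

Run it as, say, perl -T taint_sources.pl foo (the file name is only an example) and compare the report with the perlsec list quoted above.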

The following statement would seem to confuse some:

If an expression contains tainted data, any subexpression may be considered tainted, even if the value of the subexpression is not itself affected by the tainted data.
I could not find where in perldoc the term expression is defined. I found the definition for a statement, but a statement is generally not considered to be an expression. An expression is generally considered to be a sequence of identifiers and operators, although in some languages an expression may include functions. This would seem to leave the exact requirement for pack() or unpack() in an unclear state. I note that it does not say that a statement will, in itself, taint data. This would seem to be reserved for the expressed system calls above.
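
Since the documentation leaves this open, the practical approach is to ask the interpreter. The following is only a sketch: the describe helper is a name I made up, and I deliberately give no expected output for the pack and unpack lines because that is exactly what differs between Perl versions in this thread. Concatenation with a tainted value is the one case perlsec itself shows as tainted.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Format one report line: a label plus "tainted" or "clean" (hypothetical helper).
sub describe {
    my ($label, $value) = @_;
    return sprintf '%-24s %s', $label, tainted($value) ? 'tainted' : 'clean';
}

warn "Taint mode is off; run with perl -T to see any taint.\n" unless ${^TAINT};

# Under -T, anything taken from @ARGV starts out tainted.
my $input = defined $ARGV[0] ? $ARGV[0] : '';

my $joined   = $input . 'bar';        # expression containing tainted data
my $packed   = pack 'a*', $input;     # result depends on the perl version
my $unpacked = unpack 'A*', $input;   # result depends on the perl version

print "$_\n" for (
    describe('input',                $input),
    describe('input . "bar"',        $joined),
    describe('pack "a*", input',     $packed),
    describe('unpack "A*", input',   $unpacked),
);
```

Running this under both 5.8 and 5.10 with the same argument should show whether your installation has the behaviour being discussed.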

In a more specific vein, ambrus writes:

Not really related, but note that this command doesn't raise an insecure dependency error: perl -wTe '() = unpack $ARGV[0], 1e9;' p

Not sure what you are thinking here, but it should segfault: regardless of where the 'p' pattern comes from, you are asking unpack to dereference an arbitrary memory address. As far as I know, Perl does not (yet) promise to protect you from out-of-bounds memory accesses.

From perldoc -f unpack:

The "p" and "P" formats should be used with care. Since Perl has no way of checking whether the value passed to "unpack()" corresponds to a valid memory location, passing a pointer value that's not known to be valid is likely to have disastrous consequences.

With regard to the use of ARGV as a parameter to unpack, I would think that feeding an argument into unpack without validation could easily lead to bad consequences whether or not the data is tainted, as clearly demonstrated by your example. The taint checks are included as a bookkeeping aid and not to prevent the programmer from hanging himself.
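
To make the validation point concrete, here is one way it might look. This is a sketch under my own assumptions: the function name safe_template and the particular whitelist of format letters are illustrative, not a recommendation for any real program. The point is only that "p" and "P" never get through, and that the regex capture is what untaints the template under -T.

```perl
use strict;
use warnings;

# Hypothetical whitelist: a few string and integer formats with optional
# counts. Anything else, including "p" and "P", is rejected.
sub safe_template {
    my ($template) = @_;
    return unless defined $template;
    # Returning the capture ($1) rather than the original string is what
    # untaints the value when running under -T.
    return $template =~ /\A((?:[aAZnNvVCc](?:\d+|\*)?\s*)+)\z/ ? $1 : undef;
}

my $template = safe_template($ARGV[0]);
if (defined $template) {
    my @fields = unpack $template, 'example data';
    print "unpacked: @fields\n";
}
else {
    print "refusing to use that template\n";
}
```

With this in place, passing p on the command line is refused before unpack ever sees it, with or without -T.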


s//----->\t/;$~="JAPH";s//\r<$~~/;{s|~$~-|-~$~|||s |-$~~|$~~-|||s,<$~~,<~$~,,s,~$~>,$~~>,, $|=1,select$,,$,,$,,1e-1;print;redo}

Re^3: pack() untaints data : bug or undocumented Perl 5.10 feature?
by BrowserUk (Patriarch) on Apr 06, 2008 at 05:59 UTC

    Seems to be a storm in a teacup to me too. Given that I can untaint input using

    echo print 'Some piece of nasty code'|perl -Tne"m[(.+)];$_=$1;eval;"
    Some piece of nasty code

    I don't see any additional risk by allowing my $untainted = unpack 'A*', $tainted;.

    Nor any lesser risk by not allowing it.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      In an abstract world where we didn't have a prior version, you'd be right. People would just know to untaint before using pack() or unpack() instead of after, and they'd do that.

      As it is, perlsec says that only by using a regex or by making the scalar a hash key can you untaint it. The behavior of pack "a*", $val and pack "A*", $val used to leave the value tainted. Now it does not.

      There is probably code somewhere that takes advantage of that fact, and under 5.10.0 that code is now less secure.
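
      For reference, the two untainting routes mentioned above can be sketched like this. The variable names are mine and the \w* pattern is just an example of a deliberately narrow match; run it with perl -T and an argument to see the difference.

      ```perl
      use strict;
      use warnings;
      use Scalar::Util qw(tainted);

      warn "Taint mode is off; run with perl -T to see any taint.\n" unless ${^TAINT};

      # Under -T, anything taken from @ARGV starts out tainted.
      my $input = defined $ARGV[0] ? $ARGV[0] : '';

      # Route 1: a regex capture. $captured is undef if the match fails.
      my ($captured) = $input =~ /\A(\w*)\z/;

      # Route 2: hash keys are never tainted, so a round trip through a
      # hash key yields a clean copy of the same string.
      my %seen = ($input => 1);
      my ($key) = keys %seen;

      printf "%-10s %s\n", 'input',    tainted($input)    ? 'tainted' : 'clean';
      printf "%-10s %s\n", 'captured', tainted($captured) ? 'tainted' : 'clean';
      printf "%-10s %s\n", 'hash key', tainted($key)      ? 'tainted' : 'clean';
      ```

      Note that the hash-key route does no validation at all, which is why the regex route is the one people usually mean when they talk about untainting.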

        There is probably code somewhere that takes advantage of that fact, and under 5.10.0 that code is now less secure.

        So, someone, somewhere might be unpacking a tainted string and relying upon the result's continued tainted status to prevent them from ... ?

        I agree that departure from the documented behaviour is not good, but I would be a little leery about labelling it as a "security risk".

