Those criteria for when untainting is done and when not look random to me.
Could it be as simple as whether the regex engine is used internally, during the parsing of the unpack template?
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
Perlsec starts off with enthusiastic claims as to its protections, but the text contains many caveats. Winnowing down to the grist, the following data sources are stated to be marked tainted: All command line arguments, environment variables, locale information (see perllocale), results of certain system calls ("readdir()", "readlink()", the variable of "shmread()", the messages returned by "msgrcv()", the password, gcos and shell fields returned by the "getpwxxx()" calls), and all file input are marked as "tainted". Italics are mine.
This would seem to say that all other system calls are not tainted.
The following statement would seem to confuse some: If an expression contains tainted data, any subexpression may be considered tainted, even if the value of the subexpression is not itself affected by the tainted data. I could not find where in perldoc the term expression is defined. I found the definition of a statement, but a statement is generally not considered to be an expression. An expression is generally considered to be a sequence of identifiers and operators, although in some languages an expression may include function calls. This would seem to leave the exact requirement for pack() or unpack() unclear. I note that it does not say that a statement will, in itself, taint data. That would seem to be reserved for the system calls listed above.
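To make the quoted perlsec sentence concrete, here is a minimal sketch of taint carrying through an expression whose value does not depend on the tainted data. It assumes the script is started with -T (the file name in the comment is made up), and it uses tainted() from Scalar::Util to inspect values. The substr($x, 0, 0) trick is the one perlsec itself uses in its is_tainted() example.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Intended to be run as:  perl -T taint_expr.pl some-argument
# (taint_expr.pl is a hypothetical file name). Without -T nothing is tainted.
my $taint_mode = ${^TAINT} ? 1 : 0;

# Command line arguments and %ENV values are tainted sources per perlsec.
my $input = @ARGV ? $ARGV[0] : ($ENV{PATH} // '');
my $clean = 'constant';

# Mixing tainted and untainted data yields a tainted result.
my $mixed = $clean . $input;

# This is always the empty string, whatever $input contains, yet it
# still carries the taint of $input.
my $empty = substr($input, 0, 0);

printf "taint mode: %s\n", $taint_mode ? 'on' : 'off';
printf "%-8s %s\n", '$clean', tainted($clean) ? 'tainted' : 'not tainted';
printf "%-8s %s\n", '$mixed', tainted($mixed) ? 'tainted' : 'not tainted';
printf "%-8s %s\n", '$empty', tainted($empty) ? 'tainted' : 'not tainted';
```

Run without -T, every line reports "not tainted", which is a quick way to confirm the mode the script is actually in.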
In a more specific vein, ambrus writes:
Not really related, but note that this command doesn't raise an insecure dependency error:
perl -wTe '() = unpack $ARGV[0], 1e9;' p
Not sure what you are thinking here, but it should segfault regardless of where the 'p' pattern comes from: you are asking unpack to dereference an arbitrary memory address. As far as I know, perl does not (yet) promise to protect you from out-of-bounds memory accesses.
From perldoc -f unpack:
The "p" and "P" formats should be used with care. Since Perl has no way of checking whether the value passed to "unpack()" corresponds to a valid memory location, passing a pointer value that's not known to be valid is likely to have disastrous consequences.
In regards to the use of ARGV as a parameter to unpack, I would think that feeding an argument into unpack without validation could easily lead to bad consequences whether or not the program runs under taint mode, as clearly demonstrated by your example. The taint checks are included as a bookkeeping aid, not to prevent the programmer from hanging himself.
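As a rough sketch of what such validation might look like, the helper below accepts only a small allowlist of template letters and rejects everything else, including "p" and "P". The name safe_template and the particular allowlist are assumptions made for illustration, not anything from perlsec or perlfunc. Because the accepted template is returned via a regex capture, it is also untainted under -T in the way perlsec describes.

```perl
use strict;
use warnings;

# Hypothetical helper: return the template if it uses only allowlisted
# letters (A, a, N, n, C) with an optional count or '*', else undef.
sub safe_template {
    my ($template) = @_;
    return unless defined $template;
    if ($template =~ /\A((?:[AaNnC](?:\d+|\*)?\s*)+)\z/) {
        return $1;    # regex capture, so untainted under -T
    }
    return;           # anything else, including p and P, is refused
}

my $data = "abc\x01\x02";
for my $candidate ('A3 n', 'p', 'P4') {
    my $template = safe_template($candidate);
    if (defined $template) {
        my @fields = unpack $template, $data;
        print "'$candidate' accepted: @fields\n";
    }
    else {
        print "'$candidate' rejected\n";
    }
}
```

An allowlist seems the safer design here than a denylist of "p" and "P", since it does not have to anticipate every dangerous or future template letter.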
s//----->\t/;$~="JAPH";s//\r<$~~/;{s|~$~-|-~$~|||s
|-$~~|$~~-|||s,<$~~,<~$~,,s,~$~>,$~~>,,
$|=1,select$,,$,,$,,1e-1;print;redo}
echo print 'Some piece of nasty code'|perl -Tne"m[(.+)];$_=$1;eval;"
Some piece of nasty code
I don't see any additional risk by allowing my $untainted = unpack 'A*', $tainted;.
Nor any lesser risk by not allowing it.
In an abstract world where we didn't have a prior version, you'd be right. People would just know to untaint before using pack() or unpack() instead of after, and they'd do that.
As it is, perlsec says that only by using a regex or by making the scalar a hash key can you untaint it. Both pack "a*", $val and pack "A*", $val used to leave the value tainted. Now they do not.
There is probably code somewhere that takes advantage of that fact, and under 5.10.0 that code is now less secure.
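Anyone wanting to check what their own perl does could use a small probe along these lines. It is only a sketch: it assumes it is run with -T (the file name in the comment is invented) and reports what tainted() from Scalar::Util says after each route. The pack and unpack lines are the version-dependent ones discussed in this thread, so no particular result is claimed for them.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Intended to be run as:  perl -T taint_probe.pl some-argument
# (taint_probe.pl is a hypothetical file name).
my $tainted = @ARGV ? $ARGV[0] : ($ENV{PATH} // '');

my @report;
push @report, [ 'original value', tainted($tainted) ? 1 : 0 ];

# The two routes perlsec documents: a regex capture and a hash key.
my ($via_regex) = $tainted =~ /\A(.*)\z/s;
push @report, [ 'regex capture', tainted($via_regex) ? 1 : 0 ];

my %seen = ($tainted => 1);
my ($via_key) = keys %seen;
push @report, [ 'hash key', tainted($via_key) ? 1 : 0 ];

# The routes under discussion; results may differ between perl versions.
my $via_unpack = unpack 'A*', $tainted;
push @report, [ q{unpack 'A*'}, tainted($via_unpack) ? 1 : 0 ];

my $via_pack = pack 'a*', $tainted;
push @report, [ q{pack 'a*'}, tainted($via_pack) ? 1 : 0 ];

printf "perl %vd, taint mode %s\n", $^V, ${^TAINT} ? 'on' : 'off';
printf "%-16s %s\n", $_->[0], $_->[1] ? 'tainted' : 'not tainted'
    for @report;
```

Running the same probe under an older perl and under 5.10.0 would show whether the behavior described above really changed on a given build.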
I'm not sure of the correct behavior, but the implementation and the docs sure don't agree at the moment. One or the other needs to change.
If it's the docs that change, they should include a caveat that old code tested under older versions now has different semantics. That's almost never a good thing to see in your docs.