Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

I am getting some strange errors from URI::Escape and assume that I must be doing something stupid. I am getting the following errors in Apache's log file:

    Use of uninitialized value in subroutine entry at /usr/local/lib/perl5/site_perl/5.6.1/URI/Escape.pm line 140.
    [Fri Apr 12 14:16:18 2002] [error] Can't use string ("") as a subroutine ref while "strict refs" in use at /usr/local/lib/perl5/site_perl/5.6.1/URI/Escape.pm line 140.
    Possible unintended interpolation of @aol in string at (eval 187) line 1.
    [Fri Apr 12 14:28:22 2002] [error] uri_escape: Global symbol "@aol" requires explicit package name at (eval 187) line 1.
     at /usr/local/apache/perl/some.cgi line 179
    Possible unintended interpolation of @aol in string at (eval 185) line 1.

The "@aol" appears nowhere in my code - it is part of an email address that some user is submitting. My code compiles and runs fine and uses both warnings and use strict - it works most of the time, but these errors sometimes occur.
Here are the relevant sections of my code which use URI::Escape (the param call is to Apache::Request)

    # a bunch of lines like the following to get POST'd data:
    my $email      = uri_escape($apr->param('email'))||'';
    my $customerid = uri_escape($apr->param('customerID'))||'';

    # then get a ref to a hash of all posted values
    # it's redundant, i know, but it serves a purpose
    my $values = $apr->param;

    # later, to construct a redir, this hack:
    my $base_url = "http://foo.com/somecgi?";
    my $middle = join('&', map { $_ . "=" . uri_escape($values->{$_}) } keys %$values);
    my $tail = "&FilePath=$cleanfilename&transfer_source=http";
    my $redir = $base . $middle . $tail;

    # sub to construct a log string or submitted info
    # anonymous to avoid warnings under mod_perl
    my $err_log = sub {
        my $up = shift;
        my @vars = (
            $up, $clientb, $clientbv, $clientos, $trckingnum,
            $customerid, $xferid, $cgi, $customerid, $contactname,
            $email, $phone, 'htttp'
        );
        my $string = join(',', map {"\"" . $_ . "\""} map {uri_unescape($_)} @vars);
        $string .= "\012";
        return $string;
    };

It seems that somehow the strings that are being passed to URI::Escape, which eval's a regex in an anonymous subroutine, are being seen as code, specifically the "@" in email addresses, and causing an error - but only some of the time.
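
The kind of failure described above can be reproduced in isolation. The following is a minimal sketch of the general mechanism only, assuming a helper that pastes a caller-supplied character class into source text and string-evals it; `build_escaper` is a hypothetical name and this is not URI::Escape's actual source. Under `use strict`, an `@aol` inside the generated regex is parsed as an array and the eval fails to compile.

```perl
use strict;
use warnings;

# Hypothetical helper: builds a substitution sub by pasting $class into
# source code and string-eval'ing it. Illustration only.
sub build_escaper {
    my ($class) = @_;
    # Silence the compile-time "Possible unintended interpolation" warning
    # so that only the fatal error is reported through $@.
    local $SIG{__WARN__} = sub { };
    my $code = eval "sub { \$_[0] =~ s/([$class])/sprintf('%%%02X', ord \$1)/ge; \$_[0] }";
    return ($code, $@);
}

# A character class: compiles and works as intended.
my ($ok_sub, $ok_err) = build_escaper('^A-Za-z0-9');
my $text = 'a b';
$ok_sub->($text);
print "$text\n";

# A form value that ended up where the character class belongs.
my ($bad_sub, $bad_err) = build_escaper('user@aol.com');
print "eval failed: $bad_err" if $bad_err;
```

If a submitted value such as an email address reaches the position of the character class, the error text refers to a variable that appears nowhere in the script itself, which matches the symptom in the log.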

Environment is Apache 1.3.24 / mod_perl 1.26
Apache::Request 1.0
URI::Escape 1.18

Thanks for your help.

Edit kudra, 2002-04-16 Added readmore

Re: incorrect use of URI::Escape?
by Juerd (Abbot) on Apr 13, 2002 at 17:23 UTC

    It seems that somewhere you pass two arguments to uri_escape. This I can't confirm, because I have no idea how Apache::Request's param function works. Normally, I'd read the manual and tell you, or read the source if the manual lacks information about the return type. However, Apache::Request is not pure perl and its documentation doesn't say clearly _how_ it returns values.

    As long as you have this problem, it's probably better to take this script down, because if someone has an email address of foo@{[ `some nasty thing like rm -rf /` ]}, you're not going to like the results.

    An alternative to URI::Escape's uri_escape is:

    sub alt_uri_escape {
        my ($text) = @_;
        $text =~ s/([^A-Za-z0-9\-_.!~*'()])/sprintf "%%%02x", ord $1/ge;
        return $text;
    }

    Please note that this differs in two ways. It doesn't take an optional second parameter to define your own range, and it doesn't use a hash lookup, which would be a bit more efficient than evaluating sprintf for every match.

    - Yes, I reinvent wheels.
    - Spam: Visit eurotraQ.
    

      You can use a closure for efficiency like this:

      {
          # use a closure to give the sub private, persistent storage
          my %escapes;
          sub uri_escape {
              my ($text) = @_;
              # the first time the sub is called %escapes is empty,
              # so we build a char->hex map. because we use a closure
              # %escapes survives from one call of this function
              # to the next, so we only build the map once
              %escapes || do { $escapes{chr $_} = sprintf("%%%02X", $_) for 0..255 };
              return undef unless defined $text;
              $text =~ s/([^A-Za-z0-9\-_.!~*'()])/$escapes{$1}/g;
              return $text;
          }
      }

      Update

      Updated per Juerd's comments.

      cheers

      tachyon

      s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

        [^;\/?:@&=+\$,A-Za-z0-9\-_.!~*'()]

        You probably do not want to exclude ;, &, = and +.

        Although they don't have to be encoded according to the RFC, ; and & separate key/value pairs in a query string, = separates key and value within a pair, and + often stands for a space (chr(32)) in query strings. Because query strings are a very common reason for using URI encoding, I think it's unwise not to encode these characters.

        The characters I mentioned are part of the "reserved" characters, and since version 1.16, URI::Escape does encode them (not encoding them causes a LOT of trouble in many situations).
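
To make the point about reserved characters concrete, here is a small sketch. `escape_value`, `keys_seen` and the sample value are hypothetical names made up for illustration; the escaper is a stand-in that encodes everything outside the unreserved set, and the "parser" is a deliberately naive split on & and =.

```perl
use strict;
use warnings;

# Stand-in escaper for illustration: encodes everything outside the
# unreserved set, so ; & = + are all percent-encoded.
sub escape_value {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9\-_.!~*'()])/sprintf "%%%02X", ord $1/ge;
    return $text;
}

# Naive split on & and = to mimic what a receiving parser does;
# returns the key names it believes it saw.
sub keys_seen {
    my ($query) = @_;
    return map { (split /=/, $_, 2)[0] } split /&/, $query;
}

# Hypothetical form value containing query-string delimiters.
my $company = 'Smith & Sons=1+1';

my $raw     = "company=$company&id=7";
my $escaped = 'company=' . escape_value($company) . '&id=7';

print join('|', keys_seen($raw)), "\n";      # company| Sons|id
print join('|', keys_seen($escaped)), "\n";  # company|id
```

With the raw value, the & inside the data is taken as a pair separator and a bogus key appears; with the escaped value the receiver sees exactly the two intended keys.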

        - Yes, I reinvent wheels.
        - Spam: Visit eurotraQ.
        

•Re: incorrect use of URI::Escape?
by merlyn (Sage) on Apr 13, 2002 at 17:25 UTC
    You're calling uri_escape in a dangerous way!
    uri_escape($values->{$_})
    If the values are more than one, you get multiple parameters passed. It's this second parameter that's causing fits, and can be a security leak!

    Change that to:

    uri_escape(scalar $values->{$_})
    (I think).
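
The general Perl behaviour behind this advice can be sketched as follows. This shows a function call in an argument list, as in the `uri_escape($apr->param('email'))` lines of the question; `fake_param` and `count_args` are hypothetical stand-ins, and whether Apache::Request's param returns several values in list context is an assumption here, not something verified against that module.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a param()-style accessor: returns every
# submitted value in list context, only the first in scalar context.
my %submitted = (email => ['a@example.com', 'b@example.com']);

sub fake_param {
    my ($name) = @_;
    my @values = @{ $submitted{$name} || [] };
    return wantarray ? @values : $values[0];
}

# Hypothetical callee that just reports how many arguments arrived.
sub count_args { return scalar @_ }

# Arguments are evaluated in list context, so both values are passed.
print count_args(fake_param('email')), "\n";          # 2

# scalar() forces scalar context, so exactly one argument is passed.
print count_args(scalar fake_param('email')), "\n";   # 1
```

If the callee treats a second argument specially, as uri_escape does with its optional pattern, the extra value is no longer just data.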

    -- Randal L. Schwartz, Perl hacker

      Hi merlyn

      I am not a great fan of URI::Escape because (as is noted in the pod) it is much slower than rolling your own (40-700% says the pod) and it can also be called in a dangerous way. eval() always scares me in code because of what you can do if you pass an appropriate value into it. Worse, since it is open source, anyone can see just how to do it.

      use URI::Escape;
      uri_escape(1, 'hacker])//; warn "Running arbitrary code!"; s/([hacker');

      I don't quite see how calling $values->{$_} could return a list but....

      cheers

      tachyon

      s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

      Thanks merlyn, that seems to work.

      I am familiar with and agree with your position of not using home-grown code whose function is already performed by widely-used modules. I don't like, however, the way that URI::Escape evals whatever I pass it, and I was thinking of using URI::Escape's internal regex myself directly on the values to give me more control, not to mention a modest performance benefit.

      # build a char->hex map
      for (0..255) {
          $escapes{chr($_)} = sprintf("%%%02X", $_);
      }
      $text =~ s/([^A-Za-z0-9\-_.!~*'()])/$escapes{$1}/g;

      Do you think this would be a mistake?

      uri_escape($values->{$_})
      If the values are more than one, you get multiple parameters passed. It's this second parameter that's causing fits, and can be a security leak!

      $values->{$_} is a scalar, as hash values can only be scalars. A scalar can be a reference, a number, a string or undef, but not multiple values without dereferencing. Or that's how I have always understood scalars. I don't think there's much point in explicitly putting a scalar in scalar context with the scalar operator.

      Could you please give me, if possible, an example of a scalar that returns a two-element list?
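
The claim about hash values can be checked with a short sketch for a plain Perl hash. `count_args` is a hypothetical helper, and whether the object returned by `$apr->param` behaves exactly like a plain hash reference is an assumption that this sketch does not test.

```perl
use strict;
use warnings;

my $values = {
    email => 'user@example.com',
    multi => ['first', 'second'],   # an array reference is still one scalar
};

# Hypothetical helper that reports how many arguments it received.
sub count_args { return scalar @_ }

print count_args($values->{email}), "\n";         # 1
print count_args($values->{multi}), "\n";         # 1 (the reference itself)
print count_args(@{ $values->{multi} }), "\n";    # 2 (explicit dereference)
```

For a plain hash, a single element lookup contributes exactly one argument even in list context; more than one only appears after an explicit dereference.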

      - Yes, I reinvent wheels.
      - Spam: Visit eurotraQ.