in reply to incorrect use of URI::Escape?

It seems that somewhere you pass two arguments to uri_escape. This I can't confirm, because I have no idea how Apache::Request's param function works. Normally, I'd read the manual and tell you, or read the source if the manual lacks information about the return type. However, Apache::Request is not pure Perl, and its documentation doesn't say clearly _how_ it returns values.

As long as you have this problem, it's probably better to take this script down, because if someone has an email address of foo@{[ `some nasty thing like rm -rf /` ]}, you're not going to like the results.

An alternative to URI::Escape's uri_escape is:

    sub alt_uri_escape {
        my ($text) = @_;
        $text =~ s/([^A-Za-z0-9\-_.!~*'()])/sprintf "%%%02x", ord $1/ge;
        return $text;
    }

Please note that this differs from uri_escape in two ways. It doesn't take an optional second parameter to define your own range, and it doesn't use a hash lookup. The hash lookup that uri_escape uses is a bit more efficient than evaluating sprintf for every match, as this version does.
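
As a quick check of what alt_uri_escape produces, here is a minimal sketch that runs it on two inputs (both made up for illustration). Note that "%02x" yields lowercase hex digits; the hex digits in a percent-escape are case-insensitive, so %3d and %3D decode to the same character.

```perl
use strict;
use warnings;

# Same sub as above, repeated so this snippet runs on its own.
sub alt_uri_escape {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9\-_.!~*'()])/sprintf "%%%02x", ord $1/ge;
    return $text;
}

# Hypothetical inputs, made up for illustration.
print alt_uri_escape('foo@example.com'), "\n";   # foo%40example.com
print alt_uri_escape('a b&c=d'), "\n";           # a%20b%26c%3dd
```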

- Yes, I reinvent wheels.
- Spam: Visit eurotraQ.

Re: Re: incorrect use of URI::Escape?
by tachyon (Chancellor) on Apr 13, 2002 at 18:13 UTC

    You can use a closure for efficiency like this:

    {
        # Use a closure to give the sub private, persistent storage.
        my %escapes;

        sub uri_escape {
            my ($text) = @_;

            # The first time the sub is called, %escapes is empty, so we
            # build a char->hex map. Because we use a closure, %escapes
            # survives from one call of this function to the next, so we
            # only build the map once.
            %escapes || do { $escapes{chr $_} = sprintf("%%%02X", $_) for 0 .. 255 };

            return undef unless defined $text;
            $text =~ s/([^A-Za-z0-9\-_.!~*'()])/$escapes{$1}/g;
            return $text;
        }
    }
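
    Whether the lookup table is actually faster on your machine is easy to measure. The following is only a sketch: the sub names escape_sprintf and escape_lookup and the sample string are made up for illustration, and the core Benchmark module does the timing. No results are quoted here because they depend on the Perl version and the input.

    ```perl
    use strict;
    use warnings;
    use Benchmark qw(cmpthese);

    # Variant 1: format each unsafe character with sprintf on every match.
    sub escape_sprintf {
        my ($text) = @_;
        $text =~ s/([^A-Za-z0-9\-_.!~*'()])/sprintf "%%%02X", ord $1/ge;
        return $text;
    }

    # Variant 2: build the char->hex table once, then only look up.
    my %escapes;
    $escapes{chr $_} = sprintf("%%%02X", $_) for 0 .. 255;

    sub escape_lookup {
        my ($text) = @_;
        $text =~ s/([^A-Za-z0-9\-_.!~*'()])/$escapes{$1}/g;
        return $text;
    }

    # Hypothetical sample input, repeated to give the regex some work.
    my $sample = 'name=J. R. "Bob" Dobbs & co; mail=bob@example.com' x 20;

    # Run each variant for at least one CPU second and print a comparison.
    cmpthese(-1, {
        'sprintf' => sub { escape_sprintf($sample) },
        'lookup'  => sub { escape_lookup($sample) },
    });
    ```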

    Update

    Updated per Juerd's comments.

    cheers

    tachyon

    s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

      [^;\/?:@&=+\$,A-Za-z0-9\-_.!~*'()]

      You probably do not want to exclude ;, &, = and +.

      Although they don't have to be encoded according to the RFC, ; and & separate key/value pairs in a query string, = separates key and value within a key/value pair, and + often stands for a space (chr(32)) in query strings. Because query strings are a very common reason for using URI encoding, I think it's unwise not to encode these characters.

      The characters I mentioned are part of the "reserved" characters, and since version 1.16, URI::Escape does encode them (not encoding them causes a LOT of trouble in many situations).
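
      To make the problem concrete, here is a minimal sketch. The sub names escape_lenient and escape_strict and the sample value are made up for illustration; the first uses the character class quoted above, the second the narrower one. The value is then put into a query string and split into pairs the way a typical parser would.

      ```perl
      use strict;
      use warnings;

      # Leaves the reserved characters ; / ? : @ & = + $ , unencoded.
      sub escape_lenient {
          my ($text) = @_;
          $text =~ s/([^;\/?:\@&=+\$,A-Za-z0-9\-_.!~*'()])/sprintf "%%%02X", ord $1/ge;
          return $text;
      }

      # Encodes the reserved characters as well.
      sub escape_strict {
          my ($text) = @_;
          $text =~ s/([^A-Za-z0-9\-_.!~*'()])/sprintf "%%%02X", ord $1/ge;
          return $text;
      }

      # Hypothetical value that is meant to be ONE parameter.
      my $value = 'rock&roll=1+1';

      for my $escape (\&escape_lenient, \&escape_strict) {
          my $query = 'q=' . $escape->($value);
          my @pairs = split /[&;]/, $query;   # naive key/value pair split
          print "$query -> ", scalar(@pairs), " pair(s)\n";
      }
      # q=rock&roll=1+1 -> 2 pair(s)
      # q=rock%26roll%3D1%2B1 -> 1 pair(s)
      ```

      With the lenient class the single value is read back as two parameters, and the + would typically be decoded as a space.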

      - Yes, I reinvent wheels.
      - Spam: Visit eurotraQ.
      

        The code I offered is exactly the same as uri_escape() simply because, if you are going to suggest an alternative, it seems logical to KISS. I agree with you that URI::Escape is not optimal, and I always roll my own. Like you, I do not exclude ; & + and = (also ?).

        I think a good working knowledge of URI encodings is important for anyone who works with CGI.

        cheers

        tachyon

        s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print