vit has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,
I am using $ENV{QUERY_STRING} in my CGI code. When I click on the link with, e.g. "men's_cloth", it returns
men%27s_cloth

In order to match the name in my file I need to decode it to its normal form. Am I right to use this decoder?
$q_string =~ s/\%([A-Fa-f0-9]{2})/pack('C', hex($1))/seg;
I checked it works in all my cases, but you may have some objections.

Re: Decoding part of URL
by moritz (Cardinal) on Aug 03, 2009 at 16:04 UTC
    This will not produce a decoded text string for non-latin1 chars in the URLs.

    I'd recommend one of the various CGI modules out there, they are usually very well tested and provide such functionality.
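
To illustrate the point about non-latin1 characters: the substitution in the question turns each %XX escape into a single byte, so a character that was sent as several UTF-8 bytes stays several bytes until it is explicitly decoded. The following is a minimal sketch, assuming the browser sent UTF-8; the helper name decode_query_component is hypothetical and not from the original post. Encode is a core module.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: percent-decode, then turn the bytes into characters.
sub decode_query_component {
    my ($raw) = @_;
    my $bytes = $raw;
    # Step 1: same idea as the one-liner in the question, %XX -> one raw byte.
    $bytes =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    # Step 2 (assumption: the client sent UTF-8): bytes -> Perl characters.
    return decode('UTF-8', $bytes);
}

my $ascii = decode_query_component('men%27s_cloth');
my $greek = decode_query_component('%CE%B1%CE%B2');   # four bytes, two letters

print "$ascii\n";
print length($greek), " characters after decoding\n";
```

For pure ASCII input such as men%27s_cloth, step 2 changes nothing, which is why the one-liner appears to work in every ASCII test case.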

Re: Decoding part of URL
by Utilitarian (Vicar) on Aug 03, 2009 at 16:05 UTC
    Hi vit, this will decode URL-encoded characters; however, have you considered the CGI module?
    use CGI;
    with or without qw/:standard/, or CGI::Pretty if you want to output formatted HTML.
    Now param('men\'s_cloth'); returns the associated value if that string is a parameter name; if it is a value, it will be available under the appropriate parameter.
    BTW, is taint checking (-T) on? You should be suspicious of quote characters returned to your script.
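
A small sketch of the CGI approach, assuming the link looks like script.cgi?category=men%27s_cloth. The parameter name "category" is hypothetical, since the original post does not show the full link. It also assumes CGI.pm is installed (it has not shipped with core Perl since 5.22).

```perl
use strict;
use warnings;
use CGI;

# Passing a query string to new() makes CGI parse that string instead of
# reading the environment, so this runs outside a web server. In a real
# CGI script you would normally call CGI->new with no arguments.
my $q = CGI->new('category=men%27s_cloth');

# param() returns the value with the percent escapes already decoded.
my $category = $q->param('category');
print "$category\n";
```

The value still comes from the client, so it should be validated before it is used to look anything up in a file.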
Re: Decoding part of URL
by halfcountplus (Hermit) on Aug 03, 2009 at 19:04 UTC
    You are right -- all those codes do correspond to the ASCII table. Very clever!! That is, yes, it will work in all cases where the characters are part of the ASCII table, which, as moritz indicates, is not all characters.

    "use CGI" is generally the way to go tho. I've used this before, when I didn't want to use a module:
    sub CGI_convert {
        my @lut33  = ('!', '"', '#', '$', '%', '&', "'", '(', ')', '*', '+', ',', '-', '.', '/');
        my @lut58  = (':', ';', '<', '=', '>', '?', '@');
        my @lut91  = ('[', '\\', ']', '^', '_', '`');
        my @lut123 = ('{', '|', '}', '~');
        my $flag = 0;
        my $string = pop;
        if ($string =~ /^%/) { $flag=1 }
        my @ray = split /%/,$string;
        foreach my $e (@ray) {
            unless ($flag) { $flag=1; next };
            my $dsym = hex(substr($e,0,2));
            my $symbol = undef;
            if (($dsym<48) && ($dsym>32)) { $dsym-=33; $symbol=$lut33[$dsym]; }
            if (($dsym<65) && ($dsym>57)) { $dsym-=58; $symbol=$lut58[$dsym]; }
            if (($dsym<97) && ($dsym>90)) { $dsym-=91; $symbol=$lut91[$dsym]; }
            if (($dsym<127) && ($dsym>122)) { $dsym-=123; $symbol=$lut123[$dsym]; }
            unless (defined $symbol) { next };
            substr($e,0,2)=$symbol;
        }
        return join "",@ray;
    }
    But your formulation is considerably less long-winded. Nice.