in reply to Cleaning up CGI input Data

I wasn't able to get the canonical de-cgi regex to work; it seems to strip out any encoded characters. This is what I'm currently using:

s/%(..)/pack('c',hex($1))/ge;

Any thoughts on the difference between the two?

The Difference
by chromatic (Archbishop) on Mar 15, 2000 at 02:46 UTC
    The anonymous regex could be better written as: $data =~ s/%([0-9a-fA-F]{2})/pack('c', hex($1))/eg; I believe.

    The difference is that his version first translates all + characters into spaces. His also matches only two valid hexadecimal digits following a percent sign, while your regex matches any two characters (other than a newline) following a percent sign. It's probably better to be more specific. Still, btrott is right: using CGI.pm is the way to go.
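
    To make the comparison concrete, here is a minimal sketch of both approaches side by side. The sub names decode_cgi_value and decode_loose are hypothetical, chosen only for illustration, and the strict version uses chr(hex($1)) in place of pack; treat it as one way to write the idea, not as the exact code from the parent thread.

```perl
use strict;
use warnings;

# Hypothetical helper: strict decoding of one URL-encoded value.
sub decode_cgi_value {
    my ($data) = @_;
    $data =~ tr/+/ /;                              # '+' stands for a space in form data
    $data =~ s/%([0-9a-fA-F]{2})/chr(hex($1))/eg;  # only %XX with two hex digits
    return $data;
}

# Hypothetical helper: the looser pattern from the question, for comparison.
sub decode_loose {
    my ($data) = @_;
    no warnings 'digit';                           # hex() warns on non-hex input
    $data =~ s/%(..)/pack('c', hex($1))/ge;        # any two characters after '%'
    return $data;
}

print decode_cgi_value('a%20b+c%3Dd'), "\n";       # prints "a b c=d"
print decode_cgi_value('100%zz'), "\n";            # prints "100%zz" (left untouched)
```

    With the loose pattern, an input such as '100%zz' is still rewritten: hex('zz') evaluates to 0, so the "%zz" is replaced by a NUL byte instead of being left alone. Note also that the + translation happens before the %XX step, so an encoded %2B still comes out as a literal plus sign.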