First of all, it's generally a bad idea to parse CGI yourself; use the CGI module via

use CGI;
my $req = CGI->new();
print $req->header();
print "<H1>Hello ";
print $req->param('name'), "</H1>";

But your question isn't so much about CGI as it is about regular expressions. Let's take a look at each line:

5. $value =~ tr/+/ /;

The tr operator translates one character at a time into another character. Here, it replaces every + with a space (the browser did the reverse before the parameter was sent to you).
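A minimal sketch of line 5 on its own, assuming a made-up sample value (the string is not from the original script). It also shows that tr returns the number of characters it translated:

```perl
use strict;
use warnings;

# Hypothetical sample value, as it might arrive in a query string.
my $value = 'Just+another+Perl+hacker';

# tr/// works in place and returns how many characters it translated.
my $count = ( $value =~ tr/+/ / );

print "$value\n";    # Just another Perl hacker
print "$count\n";    # 3
```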

6. $value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex ($1))/eg;

Here, all other URL-encoded characters are decoded. Characters that are unsuitable for URLs, such as \0, newlines and characters with a special meaning in a URL, get encoded as %xx, where xx is the hexadecimal value of the character. The regular expression replaces every percent sign that is followed by two hexadecimal digits (characters from the set A-Fa-f0-9) with the character whose value those digits give. %00 is the encoding for the character \0; a newline has the value 10 and is encoded as %0A.
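As a rough illustration, line 6 can be wrapped in a small helper and tried on a few inputs. The name decode_percent and the sample strings are assumptions made for this sketch. The last example also hints at why line 5 has to run before line 6: a literal plus sign travels as %2B and must not be turned into a space.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping line 6 of the script.
sub decode_percent {
    my ($value) = @_;
    # /e evaluates the replacement as code, /g repeats for every match.
    $value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
    return $value;
}

print decode_percent('a%3Db%26c'), "\n";    # a=b&c
print decode_percent('100%'), "\n";         # 100% (no two hex digits follow)

# Lines 5 and 6 in the original order: first + to space, then %xx.
my $value = 'C%2B%2B+rocks';
$value =~ tr/+/ /;
$value = decode_percent($value);
print "$value\n";                           # C++ rocks
```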

7. $value =~ s/\s/ /g;

Here, every whitespace character is converted to a blank. This is not necessarily a good idea and might come as a surprise: if you sent arbitrary URL-encoded characters through the RE in line 6, such as a tab (value 9, sent as %09), each one has now been replaced by a space.
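A small sketch of what line 7 does to a decoded value that contains a tab and a newline. The sample string is made up, and the narrower variant at the end is only one possible alternative, not something the original script does:

```perl
use strict;
use warnings;

# Hypothetical value after decoding: contains a tab and a newline.
my $value = "one\ttwo\nthree";

# Line 7: every whitespace character becomes a single blank.
( my $flat = $value ) =~ s/\s/ /g;
print "$flat\n";    # one two three

# Assumed alternative: replace only line breaks and keep tabs intact.
( my $kept = $value ) =~ s/[\r\n]/ /g;
print $kept eq "one\ttwo three" ? "tab kept\n" : "tab lost\n";
```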

8. $value =~ s/<([^>]|\n)*>//g;

Here, it looks like the processor is trying to strip all HTML tags from the values. The regular expression matches an opening bracket <, followed by any number of characters that are not a closing bracket >, followed by the closing bracket >. The |\n alternative is redundant, because [^>] already matches a newline, and after line 7 no newlines are left in the value anyway.
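The following sketch runs the regular expression from line 8 next to a simplified form without the \n alternative, on made-up sample strings. It also shows one way this kind of stripping can surprise: a comparison such as 1 < 2 and 3 > 2 looks like a tag to the regex.

```perl
use strict;
use warnings;

# Hypothetical sample values, not from the original script.
my $html = 'Hello <b>World</b>!';
my $math = '1 < 2 and 3 > 2';

# The regex from line 8 ...
( my $with_alt = $html ) =~ s/<([^>]|\n)*>//g;
# ... and a simplified form; [^>] already matches a newline.
( my $plain = $html ) =~ s/<[^>]*>//g;

print "$with_alt\n";    # Hello World!
print "$plain\n";       # Hello World!

# Everything from the first < to the next > is removed, tag or not.
( my $stripped = $math ) =~ s/<([^>]|\n)*>//g;
print "[$stripped]\n";  # [1  2]
```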

9. $value =~ s/<//g;

Now, to be extra sure, all opening brackets are removed as well.

10. $value =~ s/>//g;

As are all closing brackets.
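Lines 9 and 10 catch brackets that line 8 could not pair up. A brief sketch with an invented sample value; the tr form at the end is an equivalent single-pass spelling, shown only for comparison and not taken from the original script:

```perl
use strict;
use warnings;

# Hypothetical value where > comes before <, so line 8 finds no <...> pair.
my $value = 'x > y < z';

( my $two_steps = $value ) =~ s/<//g;    # line 9
$two_steps =~ s/>//g;                    # line 10

# Assumed alternative: delete both characters in one pass with tr///d.
( my $one_step = $value ) =~ tr/<>//d;

print "[$two_steps]\n";    # [x  y  z]
print "[$one_step]\n";     # [x  y  z]
```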

11. $FORM{$name} = $value;

And here, the %FORM hash is populated with the $name => $value pair. If you don't know about hashes, they are also called associative arrays, dictionaries or lookup tables, and if none of these words makes sense, they are like arrays, except that the index is not a number but a string.
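Putting lines 5 to 11 together might look roughly like this. The helper name clean_value and the sample pairs are assumptions for the sketch; how the original script splits the request into names and values is not shown here, so the pairs are simply hard-coded.

```perl
use strict;
use warnings;

# Lines 5 to 10 wrapped in a helper (the name is made up for this sketch).
sub clean_value {
    my ($value) = @_;
    $value =~ tr/+/ /;                                              # line 5
    $value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;    # line 6
    $value =~ s/\s/ /g;                                             # line 7
    $value =~ s/<([^>]|\n)*>//g;                                    # line 8
    $value =~ s/<//g;                                               # line 9
    $value =~ s/>//g;                                               # line 10
    return $value;
}

# Hypothetical name/value pairs, assumed to be already split on & and =.
my @pairs = (
    [ name    => 'Larry+%3Cb%3EWall%3C%2Fb%3E' ],
    [ comment => 'line+one%0Aline+two' ],
);

my %FORM;
for my $pair (@pairs) {
    my ( $name, $value ) = @$pair;
    $FORM{$name} = clean_value($value);                             # line 11
}

# A hash is indexed by string; keys come back in no particular order.
for my $name ( sort keys %FORM ) {
    print "$name => $FORM{$name}\n";
}
# comment => line one line two
# name => Larry Wall
```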

Except for the HTML stripping, which might or might not be what you wanted, CGI.pm does the decoding of CGI parameters already and is certainly worth a look.
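For comparison, here is a small sketch of CGI.pm doing the decoding. It assumes the CGI module is installed (it is no longer bundled with Perl since 5.22) and feeds it a made-up query string directly instead of a real request. The value comes back decoded, but the HTML is still in it:

```perl
use strict;
use warnings;
use CGI;

# Hypothetical query string; CGI->new accepts one for testing purposes.
my $req = CGI->new('name=Larry+%3Cb%3EWall%3C%2Fb%3E');

# Scalar assignment, so param() returns a single value.
my $name = $req->param('name');

print "$name\n";    # Larry <b>Wall</b>
```

If the value ends up in an HTML page, it still needs to be stripped or escaped before printing.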

perl -MHTTP::Daemon -MHTTP::Response -MLWP::Simple -e ' ; # The $d = new HTTP::Daemon and fork and getprint $d->url and exit;#spider ($c = $d->accept())->get_request(); $c->send_response( new #in the HTTP::Response(200,$_,$_,qq(Just another Perl hacker\n))); ' # web

In reply to Re: Explanation of regexps for obtaining POST params by Corion
in thread Explanation of regexps for obtaining POST params by r_ibsen
