I stumbled across a bit of code with the following loop in it:
foreach (@pairs) { s/\s*(.*?)\s*/$1/; . . . }
That regex sent up a warning signal in my brain. Looking at it, I assumed the author was trying, albeit in a broken way, to strip out leading and trailing spaces. To further investigate its broken-ness, I wrote a small snippet:
push @strings, 'nospace';
push @strings, 'trailingspace ';
push @strings, ' leadingspace';
push @strings, 'internal space';
push @strings, ' surroundedbyspace ';
push @strings, ' spaces every where ';

for my $wtf (@strings) {
    print "Before: '$wtf'\n";
    $wtf =~ s/\s*(.*?)\s*/$1/;
    print "After: '$wtf'\n\n";
}

Which output:

Before: 'nospace'
After: 'nospace'

Before: 'trailingspace '
After: 'trailingspace '

Before: ' leadingspace'
After: 'leadingspace'

Before: 'internal space'
After: 'internal space'

Before: ' surroundedbyspace '
After: 'surroundedbyspace '

Before: ' spaces every where '
After: 'spaces every where '
Which is, basically, what I expected. The regex only strips leading whitespace. The thing is, this code is located in CGI::Cookie, in the raw_fetch subroutine (view the source here). Here is the code in the subroutine:
# Fetch a list of cookies from the environment or the incoming headers and
# return as a hash. The cookie values are not unescaped or altered in any way.
sub raw_fetch {
    my $class = shift;
    my $raw_cookie = get_raw_cookie(@_) or return;
    my %results;
    my($key,$value);

    my(@pairs) = split("; ?",$raw_cookie);
    foreach (@pairs) {
        s/\s*(.*?)\s*/$1/;
        if (/^([^=]+)=(.*)/) {
            $key = $1;
            $value = $2;
        }
        else {
            $key = $_;
            $value = '';
        }
        $results{$key} = $value;
    }
    return \%results unless wantarray;
    return %results;
}
So, is this regex doing something that I am missing, or is it a broken regex that was placed in a seldom-used subroutine and that no one has bothered to correct?
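For comparison, here is a minimal sketch of how I read the behavior, assuming the intent really was to strip whitespace from both ends. The pattern is unanchored, so after the first \s* consumes any leading whitespace, the non-greedy (.*?) is satisfied by the empty string and the final \s* matches zero characters because the next character is not whitespace. The names trim and as_written are my own hypothetical helpers for illustration, not anything in CGI::Cookie.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of CGI::Cookie): strip whitespace from both ends.
sub trim {
    my ($s) = @_;
    $s =~ s/^\s+//;    # anchored at the start: leading whitespace
    $s =~ s/\s+$//;    # anchored at the end: trailing whitespace
    return $s;
}

# The substitution exactly as it appears in raw_fetch, wrapped for comparison.
sub as_written {
    my ($s) = @_;
    $s =~ s/\s*(.*?)\s*/$1/;
    return $s;
}

for my $str ('trailingspace ', ' surroundedbyspace ') {
    printf "as written: '%s'  trimmed: '%s'\n", as_written($str), trim($str);
}
```

Two anchored substitutions are the conventional way to do this; whether raw_fetch actually needs the trailing side stripped is a separate question that I have not verified.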
Can anyone shed some light...

enoch

update: Fixed a typo.

In reply to Weird Regex in CGI::Cookie by enoch
