I stumbled across a bit of code with the following loop in it:
foreach (@pairs) {
    s/\s*(.*?)\s*/$1/;
    .
    .
    .
}
That regex sent up a warning signal in my brain.
- Match 0 or more whitespace characters greedily, followed by
- Capture into $1 any character (other than a newline) 0 or more times non-greedily, followed by
- Match 0 or more whitespace characters greedily
- Replace everything matched with what was captured in $1
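To make those steps concrete, here is a minimal sketch that runs the same pattern as a plain match and prints what it consumed. The sample string and the variable names are my own assumptions for illustration, not anything from the original code.

```perl
use strict;
use warnings;

my $sample = '  padded  ';    # hypothetical test string, two spaces on each side

# Run the same pattern as a plain match so we can inspect what it consumed.
$sample =~ /\s*(.*?)\s*/ or die "pattern should always match";

# @- and @+ hold the start and end offsets of the last successful match.
my $whole    = substr($sample, $-[0], $+[0] - $-[0]);    # text the whole pattern matched
my $captured = $1;                                        # text the lazy group captured

print "whole match: '$whole'\n";
print "capture:     '$captured'\n";
```

Since nothing anchors the pattern to the end of the string, the non-greedy group is free to match nothing at all, and the final \s* then matches zero characters right after the leading whitespace.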
Looking at it, I assumed the author was trying, albeit in a broken way, to strip out leading and trailing spaces. To further investigate its broken-ness, I wrote a small snippet:
push @strings, 'nospace';
push @strings, 'trailingspace ';
push @strings, ' leadingspace';
push @strings, 'internal space';
push @strings, ' surroundedbyspace ';
push @strings, ' spaces every where ';
for my $wtf (@strings)
{
    print "Before: '$wtf'\n";
    $wtf =~ s/\s*(.*?)\s*/$1/;
    print "After: '$wtf'\n\n";
}
Which output:
Before: 'nospace'
After: 'nospace'
Before: 'trailingspace '
After: 'trailingspace '
Before: ' leadingspace'
After: 'leadingspace'
Before: 'internal space'
After: 'internal space'
Before: ' surroundedbyspace '
After: 'surroundedbyspace '
Before: ' spaces every where '
After: 'spaces every where '
Which is, basically, what I expected. The regex is only replacing leading spaces... The thing is, this code is located in CGI::Cookie in the raw_fetch subroutine (view the source here). Here is the code in the subroutine:
# Fetch a list of cookies from the environment or the incoming headers and
# return as a hash. The cookie values are not unescaped or altered in any way.
sub raw_fetch {
    my $class = shift;
    my $raw_cookie = get_raw_cookie(@_) or return;
    my %results;
    my($key,$value);
    my(@pairs) = split("; ?",$raw_cookie);
    foreach (@pairs) {
        s/\s*(.*?)\s*/$1/;
        if (/^([^=]+)=(.*)/) {
            $key = $1;
            $value = $2;
        }
        else {
            $key = $_;
            $value = '';
        }
        $results{$key} = $value;
    }
    return \%results unless wantarray;
    return %results;
}
So, is this regex doing something that I am missing, or is it a broken regex that was placed in a seldom-used subroutine that no one has bothered correcting?
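For comparison, this is roughly what I would have expected to see there if the intent really was to strip both ends. It is only a sketch under that assumption: trim is a hypothetical helper of my own, not something that exists in CGI::Cookie, and the sample pairs are made up.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of CGI::Cookie): strip whitespace from both ends.
sub trim {
    my ($text) = @_;
    $text =~ s/^\s+//;    # anchored at the start: removes leading whitespace
    $text =~ s/\s+$//;    # anchored at the end: removes trailing whitespace
    return $text;
}

# Made-up cookie pairs, just to exercise the helper.
for my $pair (' name=value ', 'a=b c ') {
    printf "'%s' -> '%s'\n", $pair, trim($pair);
}
```

Two anchored substitutions leave interior whitespace alone and do not depend on how a non-greedy group happens to backtrack.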
Can anyone shed some light...
enoch
update: Fixed a typo.