Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Can someone explain this seemingly odd regexp behavior? When a string does not match the pattern in the code below, I would expect to get nothing in return. And that is what happens, with the sole and annoying exception of when "\x0A" is the string to match. This is my test case:
use strict;
use Data::Dump qw(dump);

my @chars = ("A", "H", "\x0A");
my @found;

foreach my $c (@chars) {
    @found = $c =~ /^([A-G])*$/;
    dump @found;
}

__END__
"A"
()
undef
As you can see, the string "A" matches the regular expression, and the character "A" is being stuffed into the array.
"H" doesn't match, and nothing is stuffed.
Then comes "\x0A", it doesn't match, but undef is being stuffed into my unsuspecting array.

The behaviour is the same on Win32 (ActiveState v5.8.8 build 819) as well as Solaris (v5.8.x).

I guess it has something to do with "\x0A" being newline on *nix (and half a newline on Win32).
But still - is this to be considered a bug, or what?
Is it already well known and accepted?
How would you guys code around it?

A bit confounded,
/L

Re: I wan't take undef for an answer
by FunkyMonk (Bishop) on Aug 04, 2007 at 11:27 UTC
    perlre states:

    $ Match the end of the line (or before newline at the end)
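    On *nix, \x0A is the same as \n. On Windows, \r\n is translated to \n by the underlying C libraries, so \n works on both platforms. Remember that [A-G]* will match zero or more occurrences of A-G. Against "\x0A" the group therefore matches zero times, $ matches just before the trailing newline, and the match succeeds with $1 left undefined, which is the undef you see. I'd probably change * to +.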
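
A minimal sketch of the point above, comparing * with + on the strings from the question plus the empty string. The helper names captures_star and captures_plus are made up for illustration; they only wrap the list-context match from the original test case.

```perl
use strict;
use warnings;

# Hypothetical helpers: each returns what the match yields in list context.
# A failed match yields an empty list; a successful one yields ($1).
sub captures_star { my ($s) = @_; my @found = $s =~ /^([A-G])*$/; return @found; }
sub captures_plus { my ($s) = @_; my @found = $s =~ /^([A-G])+$/; return @found; }

# Each case is [printable label, actual string].
for my $case (["A", "A"], ["H", "H"], ['\x0A', "\x0A"], ['empty', ""]) {
    my ($label, $s) = @$case;
    my @star = captures_star($s);
    my @plus = captures_plus($s);
    printf "%-6s  *: %d element(s)   +: %d element(s)\n",
        $label, scalar @star, scalar @plus;
}
```

With *, both "\x0A" and the empty string match with zero repetitions of the group, so the list holds a single undef. With +, at least one letter is required and both cases return an empty list.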

Re: I wan't take undef for an answer
by GrandFather (Saint) on Aug 04, 2007 at 11:42 UTC

    Try matching the physical end of the string using \z: /^([A-G])*\z/;.
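
A small sketch of how the \z variant behaves on the strings from the question, assuming the same list-context assignment as in the original test. The helper name captures_z is invented for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: list-context result of the \z variant.
# \z matches only at the very end of the string, never before a trailing newline.
sub captures_z { my ($s) = @_; my @found = $s =~ /^([A-G])*\z/; return @found; }

# Each case is [printable label, actual string].
for my $case (["A", "A"], ["H", "H"], ['\x0A', "\x0A"], ['A\x0A', "A\x0A"], ['empty', ""]) {
    my ($label, $s) = @$case;
    my @found = captures_z($s);
    printf "%-6s  %d element(s)\n", $label, scalar @found;
}
```

Note that the empty string still matches here, because * allows zero repetitions and \z is satisfied at position 0, so the result is again a single undef. If that case matters, combining \z with + may be worth considering.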


    DWIM is Perl's answer to Gödel
Re: I wan't take undef for an answer
by Anonymous Monk on Aug 04, 2007 at 12:11 UTC
    Thank you FunkyMonk for providing an explanation, and GrandFather for a clever solution!

    A bit more enlightened (and a lot happier),
    /L