Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks.

I have a text file containing only one string, 'pl' (no quotes, just those two letters). I don't know why the following regex is matching:

p.{2,6}


A 'p' followed by two to six (inclusive) characters?

Thanks.

Re: Confusing regex match.
by daxim (Curate) on Oct 22, 2013 at 17:55 UTC
Re: Confusing regex match.
by AnomalousMonk (Archbishop) on Oct 22, 2013 at 18:07 UTC

    An extraneous character was my thought also. Since the data is being read from a file, the usual suspect is a newline (though in that case the regex would need the /s modifier so that "dot matches all"), or perhaps some kind of "end-of-text" character. Did you do a hex dump of the file yet?

    >perl -wMstrict -le
    "my $s = qq{pl\n};
     ;;
     print 'match sans /s' if $s =~ /p.{2,6}/;
     print 'match with /s' if $s =~ /p.{2,6}/s;
    "
    match with /s
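The hex dump suggested above can also be done from Perl itself. This is only a minimal sketch: `hex_bytes` is a hypothetical helper name, and the two sample strings are assumptions standing in for the real file contents.

```perl
use strict;
use warnings;

# Hypothetical helper: render every character of a string as two hex digits,
# so that invisible characters such as CR (0d) and LF (0a) become visible.
sub hex_bytes {
    my ($s) = @_;
    return join ' ', map { sprintf '%02x', ord } split //, $s;
}

# Assumed sample data: 'pl' with a CRLF ending and with a plain LF ending.
print hex_bytes("pl\r\n"), "\n";   # prints: 70 6c 0d 0a
print hex_bytes("pl\n"),   "\n";   # prints: 70 6c 0a
```

Any byte after `70 6c` other than a single `0a` is a candidate for the extra character that lets `.{2,6}` succeed.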
Re: Confusing regex match.
by graff (Chancellor) on Oct 23, 2013 at 01:31 UTC
    My best guess is that the file contains CRLF line termination, and you're running the perl script on a non-CRLF system (unix, linux, macosx). Observe:
    perl -e 'print "pl\r\n"' > /tmp/j.txt
    perl -e 'open(I,"/tmp/j.txt");$/=undef;$_=<I>; print ". matches CR on $^O\n" if(/p.{2,6}/)'
    For me, that yields ". matches CR on darwin".
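If CRLF line endings do turn out to be the cause, one possible workaround is to strip the line terminator before matching. The sketch below assumes the data looks like the test file above; `matches` is a hypothetical helper wrapping the regex from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: 1 if the string matches the regex from the question, else 0.
sub matches {
    my ($line) = @_;
    return $line =~ /p.{2,6}/ ? 1 : 0;
}

my $raw = "pl\r\n";                  # assumed file contents, CRLF terminated
(my $clean = $raw) =~ s/\r?\n\z//;   # remove a trailing LF or CRLF

print 'raw:   ', (matches($raw)   ? 'match' : 'no match'), "\n";   # raw:   match
print 'clean: ', (matches($clean) ? 'match' : 'no match'), "\n";   # clean: no match
```

In the raw string the dot consumes 'l' and the carriage return, which satisfies the minimum of two characters; once the terminator is stripped, only 'l' is left after the 'p' and the match fails as expected.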
Re: Confusing regex match.
by kcott (Archbishop) on Oct 23, 2013 at 06:55 UTC

    I can reproduce what you describe. Firstly, two text files which only contain the string 'pl'.

    $ cat junk
    pl
    $ cat junk2
    pl

    One matches your regex, the other doesn't:

    $ perl -nE 'say +(/p.{2,6}/) ? "match" : "no match"' junk
    match
    $ perl -nE 'say +(/p.{2,6}/) ? "match" : "no match"' junk2
    no match

    A closer look inside the files explains why.

    $ cat -vet junk
    pl^M$
    $ cat -vet junk2
    pl$

    [In case you're unfamiliar with the command, cat -vet displays carriage returns as "^M" and newlines as "$".]

    -- Ken
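Where `cat -vet` is not available, its output for this case can be roughly approximated in Perl. This is a sketch only: `show_invisibles` is a hypothetical helper, and it covers just tabs, carriage returns and newlines rather than everything `cat -v` handles.

```perl
use strict;
use warnings;

# Hypothetical helper: mark tabs as ^I, carriage returns as ^M,
# and put a '$' before each newline, similar to what cat -vet shows.
sub show_invisibles {
    my ($s) = @_;
    $s =~ s/\t/^I/g;
    $s =~ s/\r/^M/g;
    $s =~ s/\n/\$\n/g;
    return $s;
}

# Assumed contents of the two files from the reply above.
print show_invisibles("pl\r\n");   # prints: pl^M$
print show_invisibles("pl\n");     # prints: pl$
```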