Saladino has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks, I'm trying to match a version number from an incoming network connection; the working string is 'Version: 0\r\n'.
I had a hard time matching \r\n in the same regex, so I stripped them first with:
$line =~ s/\n|\r//g;
And this apparently works.
Later I tried:
my($version) = $line =~ m/^Version:\s(\d+)$/;
Which looks like it should match version number, but it doesn't.
However this one matches ok (with the extra space):
my($version) = $line =~ m/^Version:(\s\d+)$/;
And I don't understand why moving the \s from inside the selection to outside causes the regexp to fail.

Replies are listed 'Best First'.
Re: Can't match my regex
by Marshall (Canon) on Sep 04, 2009 at 23:00 UTC
    In my testing, I was unable to reproduce this difference in behavior between \s(\d+)$ and (\s\d+)$. I am using Perl 5.10.

    Perl does a good job of hiding this \r\n ugliness from the user. In your working string with single quotes, you are actually getting a literal backslash followed by the letter r, and so on.

     my ($version) = $line =~ m/^Version:\s(\d+)\\r\\n$/; does match 'Version: 0\r\n'. Things change if you use double quotes.

    I'm not quite sure, but I suspect that your test case string isn't doing what you expect. When Perl does a normal sort of a read, it will get rid of the \r (return) character. There is no need for you to do that yourself (except cases where you are reading binary bytes, etc).

    Anyway I don't think your \r test case is a good one as you will never see that \r. The $end of string anchor will work fine on Windows or Unix.
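
The quoting point above can be checked directly. This is a minimal sketch, not taken from the thread; the variable names are illustrative. It compares the same text in single and double quotes and matches each form with a pattern written for it.

```perl
use strict;
use warnings;

# Single quotes: \r and \n stay as literal backslash-letter pairs.
my $single = 'Version: 0\r\n';
# Double quotes: \r and \n become the CR and LF control characters.
my $double = "Version: 0\r\n";

printf "single-quoted length: %d\n", length $single;
printf "double-quoted length: %d\n", length $double;

# Matches the literal backslashes in the single-quoted string.
my ($v_single) = $single =~ m/^Version:\s(\d+)\\r\\n$/;

# Matches a real CR; $ then matches just before the final LF.
my ($v_double) = $double =~ m/^Version:\s(\d+)\r$/;

print "from single-quoted: $v_single\n";
print "from double-quoted: $v_double\n";
```

The single-quoted string is two characters longer because each escape is stored as two characters instead of one.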

Re: Can't match my regex
by ww (Archbishop) on Sep 05, 2009 at 00:47 UTC

    Second Update: Aaargh!
    Wrong, wrong, wrong (except the genuflection)!

    On the other hand (but with a bow and ++ to Marshall's remarks regarding \r, \n and the "$end of string anchor"), I do see an issue (Update: this replaces the stricken text "see the same issue"; I read the OP incorrectly), both with perl v5.8.7 under Linux (Ubuntu) and with perl v5.8.8 (Build 819) under W2K.

    The issue is with the parenthesized ($version). See context.

    Simple minded testing

    #!/usr/bin/perl
    use strict;
    use warnings;
    #793602

    my $line  = "Version: 0\r\n";
    my $line1 = "Version: 0\r\n";
    my $line2 = "Version: 0\r\n";
    my $line3 = "Version: 0\r\n";

    $line =~ s/\n|\r//g;
    print "Line: |$line| ... \n";

    $line1 =~ /^Version:\s(\d)$/;
    print "Line1: -|$line1|- ... \n";

    $line2 =~ /^Version:(\s\d+)$/;
    print "Line2 with \\s inside capture: --|$line2|-- ... \n";

    my ($version) = $line3 =~ /^Version:\s(\d)$/;
    print "Line3 (paren'ed \$version): ---|$version|--- ... \n";

    =head OUTPUT
    Line: |Version: 0| ...
    Line1: -|Version: 0
    |- ...
    Line2 with \s inside capture: --|Version: 0
    |-- ...
    Use of uninitialized value in concatenation (.) or string at 793602.pl line 22.
    Line3 (paren'ed $version): ---||--- ...
    =cut
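
One likely reading of the test above is that $line3 still ends in \r\n when it is matched. The $ anchor can match just before a final \n, but not before a \r. The sketch below assumes the \r really does reach the regex (for example on a raw socket read). The $lenient pattern is a hypothetical variant, not something proposed in the thread.

```perl
use strict;
use warnings;

my $strict  = qr/^Version:\s(\d+)$/;      # pattern from the thread
my $lenient = qr/^Version:\s(\d+)\r?$/;   # hypothetical variant allowing a CR

my @cases = (
    [ 'no line ending' => "Version: 0"     ],
    [ 'LF only'        => "Version: 0\n"   ],
    [ 'CR LF'          => "Version: 0\r\n" ],
);

my %result;
for my $case (@cases) {
    my ($label, $text) = @$case;
    # In list context a failed match returns an empty list, leaving undef.
    my ($s) = $text =~ $strict;
    my ($l) = $text =~ $lenient;
    $result{$label} = [ $s, $l ];
    printf "%-15s strict=%s lenient=%s\n", $label,
        defined $s ? "'$s'" : 'no match',
        defined $l ? "'$l'" : 'no match';
}
```

Only the CR LF case should fail with the strict pattern, since nothing in it can consume the \r that sits between the digit and the final \n.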
      Well, I'm seeing this behaviour with perl 5.10.0 in Debian
Re: Can't match my regex
by Melly (Chaplain) on Sep 07, 2009 at 09:44 UTC

    It works for me - can you run the following from the command-line and confirm that you don't get '0' printed?

    perl -e "$l='Version: 0';($v)=$l=~/^Version:\s(\d)$/;print $v;"

    Apart from that, are you testing $version for truth at some point? If so, it will be false ('0'), whereas ' 0' is not false... e.g.

    perl -e "$l='Version: 0';($v)=$l=~/^Version:\s(\d)$/;print $v if $v;"

    will not print the version.
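
A small sketch of that truthiness trap and two ways around it, assuming the line has already been stripped of \r\n. The variable names are illustrative only.

```perl
use strict;
use warnings;

my $line = 'Version: 0';
my ($version) = $line =~ /^Version:\s(\d+)$/;

# '0' is false in boolean context, so a plain truth test hides the match.
my $seen_by_truth = $version ? 1 : 0;

# defined() distinguishes "matched 0" from "did not match".
my $seen_by_defined = defined $version ? 1 : 0;

print "truth test:   ", ($seen_by_truth   ? "version $version" : "nothing"), "\n";
print "defined test: ", ($seen_by_defined ? "version $version" : "nothing"), "\n";

# A list assignment in scalar context yields the number of values on the
# right-hand side, so this asks "did it match?" rather than "is it true?".
my $matched = 0;
if ( my ($v) = $line =~ /^Version:\s(\d+)$/ ) {
    $matched = 1;
    print "list-assignment test: version $v\n";
}
```

Either the defined check or the list-assignment test works for version 0; which one reads better depends on whether the captured value is needed outside the if block.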

    map{$a=1-$_/10;map{$d=$a;$e=$b=$_/20-2;map{($d,$e)=(2*$d*$e+$a,$e**2 -$d**2+$b);$c=$d**2+$e**2>4?$d=8:_}1..50;print$c}0..59;print$/}0..20
    Tom Melly, pm (at) cursingmaggot (stop) co (stop) uk
Re: Can't match my regex
by Anonymous Monk on Sep 08, 2009 at 14:58 UTC
    "And I don't understand why moving the \s from inside the selection to outside causes the regexp to fail."

    Because you're anchoring to the end of the string, and \r is matching as a whitespace character after the version number (and is in the value returned to $version).

    Why are you anchoring to the end at all?

    /^Version:\s*(\d+)/ seems adequate to me.

    -Greg
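
As a rough illustration of the unanchored pattern suggested above, the sketch below applies it to a line that still carries its \r\n. The second input, "Version: 12abc", is a made-up example that shows the trade-off of dropping the end anchor.

```perl
use strict;
use warnings;

my $line = "Version: 0\r\n";   # CR LF deliberately left in place

# No end anchor, so whatever follows the digits is ignored.
my ($version) = $line =~ /^Version:\s*(\d+)/;
print defined $version ? "version: $version\n" : "no match\n";

# Trade-off: trailing junk after the digits is accepted as well.
my ($loose) = "Version: 12abc" =~ /^Version:\s*(\d+)/;
print defined $loose ? "loose match captured: $loose\n" : "no match\n";
```

If trailing junk should be rejected while still tolerating either line ending, ending the pattern with \s*$ instead of removing the anchor is one option.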