in reply to regexp need a match characters and spaces

I see the problem too: the \s+ is scooping up the newline character. Can this be done with a negative look-behind or something like that?

Luca

Re^2: regexp need a match characters and spaces
by johngg (Canon) on Mar 23, 2006 at 14:48 UTC
    You could try the following. It seems to cope both with spaces between the b and its newline and with a b followed immediately by a newline. It does this by making the spaces optional and by using a character class that, unlike \s, excludes newline.

    First, set up some strings to try and the regular expression.

    #!/usr/local/bin/perl
    #
    use strict;
    use warnings;

    our @tryThese = (
        "a 10\nb a2 s2\nc 30",
        "a 10\nb\nc 30",
        "a 10\nb \nc 30",
        "a 10\nb a2 s2 \nc 30");
    our $c;
    our $rxAfterB = qr{(?m)^b[\x20\x09]*([^\n]*)};

    Loop over @tryThese, printing out what we are testing, doing the match, then showing the result.

    foreach my $a (@tryThese)
    {
        print "\n\$a contains ...\n";
        {
            local $" = "<--\n";
            print "@{[split /\n/, $a]}\n";
        }
        ($c) = $a =~ /$rxAfterB/;
        print "c is -->$c<--\n\n";
    }

    When run, this produces:

    $a contains ...
    a 10<--
    b a2 s2<--
    c 30
    c is -->a2 s2<--

    $a contains ...
    a 10<--
    b<--
    c 30
    c is --><--

    $a contains ...
    a 10<--
    b <--
    c 30
    c is --><--

    $a contains ...
    a 10<--
    b a2 s2 <--
    c 30
    c is -->a2 s2 <--

    I hope this is of use.

    Cheers,

    JohnGG

    Update: The slashes around the compiled regular expression in the line ($c) = $a =~ /$rxAfterB/; are superfluous. The line should read

    ($c) = $a =~ $rxAfterB;
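
For anyone wondering whether the two forms really behave the same, here is a minimal sketch that runs both against one of the test strings. The names $rx, $text, $direct and $wrapped are made up for this illustration; only the pattern itself is taken from the post above.

```perl
use strict;
use warnings;

# Same pattern as $rxAfterB above, compiled once with qr{}.
my $rx   = qr{(?m)^b[\x20\x09]*([^\n]*)};
my $text = "a 10\nb a2 s2 \nc 30";

# A qr// object can be used as the right-hand side of =~ directly ...
my ($direct) = $text =~ $rx;

# ... or interpolated into a match operator; here the result is the same.
my ($wrapped) = $text =~ /$rx/;

print "direct:  -->$direct<--\n";    # direct:  -->a2 s2 <--
print "wrapped: -->$wrapped<--\n";   # wrapped: -->a2 s2 <--
```

The slashes only become necessary when the compiled pattern is combined with more pattern text or with modifiers on the match itself.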

    JohnGG

Re^2: regexp need a match characters and spaces
by graff (Chancellor) on Mar 23, 2006 at 23:27 UTC
    I think the solution given by codeacrobat is the best one, but if you wanted to insist on working with the one scalar value containing multiple lines, maybe something like this:
    $a = "\na 10\nb a2 s2\nc 30";
    ( $c ) = $a =~ /^b[ \t]+(.*)/m;
    print "$c\n";
    That is, simply make sure you don't include "\n" among the kinds of white-space that can follow "b" in order to yield a match.
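
To make the difference concrete, here is a small sketch on a string where "b" is followed directly by a newline. It is only an illustration: the variable names are hypothetical, I have used * rather than + so that the bare "b" line still matches, and the last variant is one possible answer to the look-around question (a negative look-ahead rather than a look-behind), not something proposed elsewhere in this thread.

```perl
use strict;
use warnings;

# Hypothetical sample: nothing follows "b" on its line.
my $text = "a 10\nb\nc 30";

# \s+ also matches "\n", so the capture spills onto the next line.
my ($greedy) = $text =~ /^b\s+(.*)/m;

# [ \t]* cannot cross the newline, so the capture stays on the "b" line.
my ($safe) = $text =~ /^b[ \t]*(.*)/m;

# Look-ahead variant: any whitespace character except a newline.
my ($lookahead) = $text =~ /^b(?:(?!\n)\s)*(.*)/m;

print "greedy:    -->$greedy<--\n";      # greedy:    -->c 30<--
print "safe:      -->$safe<--\n";        # safe:      --><--
print "lookahead: -->$lookahead<--\n";   # lookahead: --><--
```

The character class is the easier of the two to read; the look-ahead form is mainly of interest if you also want to allow other non-newline whitespace such as form feed.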

    (BTW, you want everything on the line that starts with "b", and the "m" modifier on the regex does not affect the behavior of "." -- it still will not match "\n", so the question mark and dollar sign in the OP version -- (.*?)$ -- are redundant here.)
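
A quick way to check the point about "." is to run the same capture with and without the extra pieces. This is a sketch with made-up variable names; the /s case is added only for contrast, since that is the modifier that does change what "." matches.

```perl
use strict;
use warnings;

my $text = "a 10\nb a2 s2\nc 30";

# /m alone: "." still stops at the newline.
my ($plain) = $text =~ /^b[ \t]+(.*)/m;

# Non-greedy capture anchored with $, as in the OP version: same result.
my ($lazy) = $text =~ /^b[ \t]+(.*?)$/m;

# Adding /s lets "." match "\n", so the capture runs to the end of the string.
my ($dotall) = $text =~ /^b[ \t]+(.*)/ms;

print "plain:  -->$plain<--\n";   # plain:  -->a2 s2<--
print "lazy:   -->$lazy<--\n";    # lazy:   -->a2 s2<--
print "dotall: -->$dotall<--\n";  # prints "a2 s2", a newline, then "c 30"
```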