in reply to Why multiline regex doesn't work?

#!/usr/bin/perl
use strict;
use warnings;
use v5.20;

my $s = <<'ENDSTR';
aaa : AAA
bbb : BBB
ccc : CCC
ENDSTR

my $m = 'bbb';
my $a = $1 if $s =~ s/^$m *: (.*?)$/$1/rsm;
my $b = $1 if $s =~ s/^$m *: (.*)$/$1/rm;
print "a: $a\n";
print "b: $b\n";

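You see, $ only works as an anchor at the end of the pattern or just before a | (or a closing parenthesis). All the m modifier does is let ^ and $ match at the beginning/end of every line instead of only at the absolute beginning and absolute end of the string. Anywhere else, $ gets confused with a variable. Imagine ~/(.)$./ versus ~/(.)\$./ : in the first, $. is interpolated as the input line number variable; in the second, \$ is a literal dollar sign followed by any character.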
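To make the difference concrete, here is a minimal sketch (not part of the original post). It reads one line from an in-memory filehandle only so that $. has a defined value; the strings and variable names are made up for illustration.

```perl
use strict;
use warnings;

# Read one line so that $. (the input line number) holds 1.
open my $fh, '<', \"first line\n" or die "cannot open in-memory file: $!";
my $line = <$fh>;

# "$." is interpolated as the variable, so this pattern is really /(.)1/.
my $interpolated = "x1" =~ /(.)$./ ? 1 : 0;

# Escaping the dollar gives a literal "$" followed by "any character".
my $literal = 'x$y' =~ /(.)\$./ ? 1 : 0;

# At the very end of the pattern, "$" is the end-of-line anchor (with /m).
my ($captured) = "aaa : AAA\nbbb : BBB\n" =~ /^bbb *: (.*)$/m;

print "interpolated: $interpolated\n";
print "literal: $literal\n";
print "captured: $captured\n";
```

All three tests succeed, and the last one captures BBB, because only there is $ in a position where Perl treats it as an anchor.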

Re^2: Why multiline regex doesn't work?
by nbd (Novice) on Jun 09, 2015 at 00:19 UTC
    I was guided by this part of perldoc:

    - m modifier (//m): Treat string as a set of multiple lines. '.' matches any character except "\n" . ^ and $ are able to match at the start or end of any line within the string.

    - both s and m modifiers (//sm): Treat string as a single long line, but detect multiple lines. '.' matches any character, even "\n" . ^ and $ , however, are able to match at the start or end of any line within the string.

    Does the correction in the code you made mean that Perl processes the multiline string line by line and not as a single string?

    UPDATE: I see that Perl processes the string as a whole. At first sight the code just looked like awk-style line-by-line pattern matching. Thanks.

      You should also enable warnings (and strictures; see strict), especially if you are a Perl novice. Consider your first regex with warnings enabled:

      c:\@Work\Perl\monks>perl -le "use warnings; use strict; ;; my $s = qq{aaa : AAA\n} . qq{bbb : BBB\n} . qq{ccc : CCC\n} ; print qq{[[$s]]}; ;; my $m = 'bbb'; ;; my $t = $s =~ s/.*^$m *: (.*?)$.*/$1/rsm ; ;; print qq{[[$t]]}; "
      [[aaa : AAA
      bbb : BBB
      ccc : CCC
      ]]
      Use of uninitialized value $. in regexp compilation at -e line 1.
      [[BBB
      ccc : CCC
      ]]

      The Use of uninitialized value $. in regexp compilation... message gives you a clue about what is happening.

      If the $ is unambiguously a regex metacharacter:

      c:\@Work\Perl\monks>perl -le "use warnings; use strict; ;; my $s = qq{aaa : AAA\n} . qq{bbb : BBB\n} . qq{ccc : CCC\n} ; print qq{[[$s]]}; ;; my $m = 'bbb'; ;; my $t = $s =~ s/.*^$m *: (.*?)$(?:.*)/$1/rsm ; ;; print qq{[[$t]]}; "
      [[aaa : AAA
      bbb : BBB
      ccc : CCC
      ]]
      [[BBB]]

      You have your intended output for this regex.
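Two other ways to keep the $ unambiguous, offered as a sketch rather than as the only fix: wrap the anchor in its own group, or skip the substitution and capture the value with a plain match. The variable names $t1 and $t2 and the use of \Q...\E are additions for illustration, not part of the original code.

```perl
use strict;
use warnings;

my $s = "aaa : AAA\nbbb : BBB\nccc : CCC\n";
my $m = 'bbb';

# "$)" is not interpolated in a pattern, so (?:$) is always the anchor,
# no matter what follows the group.
my $t1 = $s =~ s/.*^$m *: (.*?)(?:$).*/$1/rsm;

# A plain match in list context returns the capture directly; the "$"
# sits at the end of the pattern, so it is an anchor here as well.
# \Q...\E quotes any regex metacharacters in $m (assumed to be wanted).
my ($t2) = $s =~ /^\Q$m\E *: (.*)$/m;

print "[[$t1]]\n";
print "[[$t2]]\n";
```

Both lines print [[BBB]] for this input. The /r modifier needs Perl 5.14 or later.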


      Give a man a fish:  <%-(-(-(-<

      You really should try to work with simpler examples before you make things complicated:

      use strict;
      use warnings;
      use Data::Dumper;

      my ($str,@match);

      $str = "\nfoo\nbar\nbaz\n";

      @match = $str =~ /(foo.*bar)/;    # nope!
      print Dumper \@match;

      @match = $str =~ /(foo.*bar)/m;   # nope!
      print Dumper \@match;

      @match = $str =~ /(foo.*bar)/s;   # this one!
      print Dumper \@match;

      $str = "\nfoo bar\nfoo baz\n";

      @match = $str =~ /^(foo bar)/;    # nope!
      print Dumper \@match;

      @match = $str =~ /^(foo bar)/s;   # nope!
      print Dumper \@match;

      @match = $str =~ /^(foo bar)/m;   # this one!
      print Dumper \@match;

      The first set of matches illustrates a case when the 's' modifier gets the match and the second set of matches illustrates a case when the 'm' modifier gets the match. Hope this helps!

      jeffa

      L-LL-L--L-LL-L--L-LL-L--
      -R--R-RR-R--R-RR-R--R-RR
      B--B--B--B--B--B--B--B--
      H---H---H---H---H---H---
      (the triplet paradiddle with high-hat)
      

      See perlvar#$.
      rxrx and http://perldoc.perl.org/re.html#%27debug%27-mode and other regex tools
      The "anchor" misnomer in regexes (string location assertion)
      Why \n matches but not $^?
      Disabling regexp optimizations?

      ^ matches after newline (or beginning of string). $ matches before newline (or end of string)

      $ perl -MData::Dump -Mre=debug -le " dd( $_=qq{a\n\nb} ); s{^$}{boop}m; dd( $_ ); "
      Compiling REx "^$"
      Final program:
         1: MBOL (2)
         2: MEOL (3)
         3: END (0)
      anchored ""$ at 0 anchored(MBOL) minlen 0
      "a\n\nb"
      Matching REx "^$" against "a%n%nb"
         0 <> <a%n%nb>             |  1:MBOL(2)
         0 <> <a%n%nb>             |  2:MEOL(3)
                                        failed...
         2 <a%n> <%nb>             |  1:MBOL(2)
         2 <a%n> <%nb>             |  2:MEOL(3)
         2 <a%n> <%nb>             |  3:END(0)
      Match successful!
      "a\nboop\nb"
      Freeing REx: "^$"

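      Trying to match a newline after the end of a line with $\n won't work: Perl interpolates $\ (the output record separator, which -l sets to "\n"), so the regex actually compiled is ^\nn, as the debug output shows.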

      $ perl -MData::Dump -Mre=debug -le " dd( $_=qq{a\n\nb} ); s{^$\n}{boop}m; dd( $_ ); "
      "a\n\nb"
      Compiling REx "^%nn"
      Final program:
         1: MBOL (2)
         2: EXACT <\nn> (4)
         4: END (0)
      anchored "%nn" at 0 (checking anchored) anchored(MBOL) minlen 2
      Guessing start of match in sv for REx "^%nn" against "a%n%nb"
      Did not find anchored substr "%nn"...
      Match rejected by optimizer
      "a\n\nb"
      Freeing REx: "^%nn"

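      But matching an OPTIONAL newline works (the compiled regex is ^\nn?, so $\ is still interpolated and the match succeeds on the interpolated newline):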

      $ perl -MData::Dump -Mre=debug -le " dd( $_=qq{a\n\nb} ); s{^$\n?}{boop}ms; dd( $_ ); "
      "a\n\nb"
      Compiling REx "^%nn?"
      Final program:
         1: MBOL (2)
         2: EXACT <\n> (4)
         4: CURLY {0,1} (8)
         6:   EXACT <n> (0)
         8: END (0)
      anchored "%n" at 0 (checking anchored) anchored(MBOL) minlen 1
      Guessing start of match in sv for REx "^%nn?" against "a%n%nb"
      Found anchored substr "%n" at offset 1...
      Found /^/m, restarting lookup for check-string at offset 2...
      Found anchored substr "%n" at offset 2...
      Position at offset 2 does not contradict /^/m...
      Guessed: match at offset 2
      Matching REx "^%nn?" against "%nb"
         2 <a%n> <%nb>             |  1:MBOL(2)
         2 <a%n> <%nb>             |  2:EXACT <\n>(4)
         3 <a%n%n> <b>             |  4:CURLY {0,1}(8)
                                        EXACT <n> can match 0 times out of 1...
         3 <a%n%n> <b>             |  8:  END(0)
      Match successful!
      "a\nboopb"
      Freeing REx: "^%nn?"
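If the goal of the $\n attempts above is to replace an empty line together with its newline, one way to avoid the $\ interpolation is sketched below. It is a standalone script rather than a one-liner, and the variable names are chosen here for illustration.

```perl
use strict;
use warnings;

my $text = "a\n\nb";

# Under /m, "^" followed directly by "\n" is an empty line plus its newline.
my $plain = $text =~ s{^\n}{boop}mr;

# Grouping the anchor keeps "$" from pairing with the backslash as "$\".
my $grouped = $text =~ s{^(?:$)\n}{boop}mr;

# Show the results with the newlines made visible.
for my $result ($plain, $grouped) {
    (my $shown = $result) =~ s/\n/\\n/g;
    print "$shown\n";
}
```

Both substitutions give "a\nboopb", the same result as the last debug run, without depending on what $\ happens to contain.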