http://qs1969.pair.com?node_id=544589


in reply to Re: Passing variables into regular expressions
in thread Passing variables into regular expressions

I could be misunderstanding something but I thought you had to use the s flag to get the regular expression to match across newlines. The following script

    use strict;
    use warnings;

    my $str = "ab12c\nde34f\ngh56i\njk78l";

    my @digits = $str =~ /(\d+)/g;
    print "\@digits -- @digits\n\n";

    my ($span_d) = $str =~ /(\d\d.*?\d\d)/;
    print "\$span_d -- $span_d\n\n";

    my ($span_m) = $str =~ /(\d\d.*?\d\d)/m;
    print "\$span_m -- $span_m\n\n";

    my ($span_s) = $str =~ /(\d\d.*?\d\d)/s;
    print "\$span_s -- $span_s\n\n";

produces

    @digits -- 12 34 56 78

    Use of uninitialized value in concatenation (.) or string at reSorM line 12.
    $span_d --

    Use of uninitialized value in concatenation (.) or string at reSorM line 15.
    $span_m --

    $span_s -- 12c
    de34

The m flag doesn't seem to do the trick. Have I missed something?

Cheers,

JohnGG

Re^3: Passing variables into regular expressions
by Tanalis (Curate) on Apr 20, 2006 at 13:32 UTC
    Interesting.

    The docs indicate that the s flag causes the input string to be treated as if it's a single line. The m flag, on the other hand, causes the string to be treated as if it's multi-line.

    It seems that .*? can't cross the newline character when using the m flag:

    # doesn't match
    my ($span_m) = $str =~ /(\d\d.*?\d\d)/m;

    # matches
    my ($span_m) = $str =~ /(\d\d.*?\n.*?\d\d)/m;

    # so does
    my ($span_m) = $str =~ /(\d\d\w+\s\w+\d\d)/m;

    Treating the \n as whitespace (or explicitly naming it in the regex) seems to solve the problem. Any ideas why that'd be the case?
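
A minimal sketch of why the \s version works, assuming a stock Perl 5 with no unusual settings: \s is a character class that includes the newline whatever flags are in force, whereas "." only accepts a newline under /s. The variable names here are made up for illustration.

```perl
use strict;
use warnings;

my $nl = "\n";

# \s includes newline regardless of any regex flags.
my $ws_matches  = $nl =~ /\s/ ? 1 : 0;

# With no flags, "." matches any character except newline.
my $dot_default = $nl =~ /./  ? 1 : 0;

# /m only changes ^ and $, so "." still refuses the newline.
my $dot_m       = $nl =~ /./m ? 1 : 0;

# /s is the flag that lets "." match a newline.
my $dot_s       = $nl =~ /./s ? 1 : 0;

print "\\s: $ws_matches  .: $dot_default  ./m: $dot_m  ./s: $dot_s\n";
# prints: \s: 1  .: 0  ./m: 0  ./s: 1
```
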
      Looking at the Camel book, 3rd edn., table 5-1 on page 150, the entry for /s says "Let . match newline ... " which sort of implies that /m doesn't. So it is the treatment of the "." metacharacter that changes between the two. This with no modifying flag also matches

      ($span_d) = $str =~ /(\d\d\w+\s\w+\d\d)/;

      This might imply that the default behaviour of m/.../ with no modifying flag is the same as m/.../m. I will delve into the documentation when I get a chance.

      Cheers,

      Johngg

      Update:

      This passage is in the "perlre" manual page

      ...

      m   Treat string as multiple lines. That is, change "^" and "$"
          from matching the start or end of the string to matching the
          start or end of any line anywhere within the string.

      s   Treat string as single line. That is, change "." to match any
          character whatsoever, even a newline, which normally it would
          not match.

          The "/s" and "/m" modifiers both override the $* setting. That
          is, no matter what $* contains, "/s" without "/m" will force
          "^" to match only at the beginning of the string and "$" to
          match only at the end (or just before a newline at the end) of
          the string. Together, as /ms, they let the "." match any
          character whatsoever, while still allowing "^" and "$" to
          match, respectively, just after and just before newlines
          within the string.

      ...

      perl v5.8.4                Last change: 2004-01-17
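
The perlre passage can be checked against the test string from the top of the thread. This is only a sketch: the string is the original one, but the patterns and array names are invented here to show /m affecting the anchors, /s affecting ".", and /ms doing both.

```perl
use strict;
use warnings;

my $str = "ab12c\nde34f\ngh56i\njk78l";

# No flags: ^ anchors only at the very start of the string.
my @starts_default = $str =~ /^(\w\w)/g;

# /m: ^ also matches just after each embedded newline.
my @starts_m = $str =~ /^(\w\w)/mg;

# /s: "." may cross a newline, so the non-greedy span reaches line two.
my ($span_s) = $str =~ /(\d\d.*?\d\d)/s;

# /ms: "." crosses newlines while ^ and $ still anchor at line boundaries,
# so each match covers a pair of lines.
my @pairs_ms = $str =~ /^(\w\w\d\d.+?\d\d\w)$/msg;

print "default starts: @starts_default\n";   # ab
print "/m starts:      @starts_m\n";         # ab de gh jk
print "/ms pairs:      ", scalar(@pairs_ms), "\n";   # 2
```

Here @pairs_ms holds "ab12c\nde34f" and "gh56i\njk78l", and $span_s is "12c\nde34", the same result as the /s case in the original script.
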