http://qs1969.pair.com?node_id=544599


in reply to Re^2: Passing variables into regular expressions
in thread Passing variables into regular expressions

Interesting.

The docs indicate that the s flag causes the input string to be treated as if it's a single line. The m flag, on the other hand, causes the string to be treated as if it's multi-line.

It seems that .*? can't cross the newline character when using the m flag:

# doesn't match
my ($span_m) = $str =~ /(\d\d.*?\d\d)/m;

# matches
my ($span_m) = $str =~ /(\d\d.*?\n.*?\d\d)/m;

# so does
my ($span_m) = $str =~ /(\d\d\w+\s\w+\d\d)/m;

Treating the \n as whitespace (or explicitly naming it in the regex) seems to solve the problem. Any ideas why that'd be the case?
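
For anyone who wants to reproduce this, here is a minimal sketch. The contents of $str are an assumption (the string itself isn't shown here), and the variable names are purely illustrative.

```perl
use strict;
use warnings;

# Hypothetical input: two digits, a word, a newline, a word, two digits.
my $str = "12abc\ndef34";

# "." stops at the newline, so the two digit pairs are never joined.
my ($dot_m) = $str =~ /(\d\d.*?\d\d)/m;

# Naming the newline explicitly lets the match cross the line break.
my ($explicit) = $str =~ /(\d\d.*?\n.*?\d\d)/m;

# "\s" matches "\n" as well, so this also crosses the line break.
my ($via_space) = $str =~ /(\d\d\w+\s\w+\d\d)/m;

for my $result ( [ 'dot only'    => $dot_m ],
                 [ 'explicit \n' => $explicit ],
                 [ 'via \s'      => $via_space ] ) {
    my ($label, $value) = @$result;
    print "$label: ", ( defined $value ? "matched" : "no match" ), "\n";
}
```

With this sample string, only the first pattern fails to match.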

Re^4: Passing variables into regular expressions
by johngg (Canon) on Apr 20, 2006 at 14:08 UTC
    Looking at the Camel book (3rd edn., table 5-1 on page 150), the entry for /s says "Let . match newline ...", which implies that /m doesn't. So it is the treatment of the "." metacharacter that changes between the two. This, with no modifying flag, also matches:

    ($span_d) = $str =~ /(\d\d\w+\s\w+\d\d)/;

    This might imply that the default behaviour of m/.../ with no modifying flag is the same as m/.../m. I will delve into the documentation when I get a chance.

    Cheers,

    Johngg

    Update:

    This passage is in the "perlre" manual page

    ...

    m   Treat string as multiple lines. That is, change "^" and "$" from matching the start or end of the string to matching the start or end of any line anywhere within the string.

    s   Treat string as single line. That is, change "." to match any character whatsoever, even a newline, which normally it would not match.

    The "/s" and "/m" modifiers both override the $* setting. That is, no matter what $* contains, "/s" without "/m" will force "^" to match only at the beginning of the string and "$" to match only at the end (or just before a newline at the end) of the string. Together, as /ms, they let the "." match any character whatsoever, while still allowing "^" and "$" to match, respectively, just after and just before newlines within the string.

    ...

    perl v5.8.4    Last change: 2004-01-17
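
The distinction in that passage can be checked with a short sketch: /m affects only "^" and "$", /s affects only ".", and /ms gives both. The sample text and variable names below are assumptions chosen for illustration.

```perl
use strict;
use warnings;

# Hypothetical two-line sample with no trailing newline.
my $text = "alpha one\nbeta two";

# Without /m, "^" and "$" anchor only to the string as a whole.
my @first_plain = $text =~ /^(\w+)/g;    # first word of the string
my @last_plain  = $text =~ /(\w+)$/g;    # last word of the string

# With /m, "^" and "$" anchor to every line inside the string.
my @first_m = $text =~ /^(\w+)/mg;       # first word of each line
my @last_m  = $text =~ /(\w+)$/mg;       # last word of each line

# "." refuses the newline unless /s is given; /m makes no difference here.
my $dot_plain = $text =~ /one.beta/   ? 1 : 0;
my $dot_m     = $text =~ /one.beta/m  ? 1 : 0;
my $dot_s     = $text =~ /one.beta/s  ? 1 : 0;

# This pattern needs both: "." must eat the newline (/s) and
# "^" must match just after it (/m).
my $both_s  = $text =~ /one.^beta/s  ? 1 : 0;
my $both_m  = $text =~ /one.^beta/m  ? 1 : 0;
my $both_ms = $text =~ /one.^beta/ms ? 1 : 0;

print "first words: plain=@first_plain; /m=@first_m\n";
print "last words:  plain=@last_plain; /m=@last_m\n";
print "one.beta:    plain=$dot_plain /m=$dot_m /s=$dot_s\n";
print "one.^beta:   /s=$both_s /m=$both_m /ms=$both_ms\n";
```

This also bears on the guess that no flag behaves like /m: the two agree on how "." treats a newline, but differ on where "^" and "$" can match.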