in reply to multiline regex: heredoc vs. reading file

Because you've only read ONE line from the file.

Jeff japhy Pinyan, P.L., P.M., P.O.D, X.S.: Perl, regex, and perl hacker
How can we ever be the sold short or the cheated, we who for every service have long ago been overpaid? ~~ Meister Eckhart

Re^2: multiline regex: heredoc vs. reading file
by ikegami (Patriarch) on Jan 25, 2006 at 17:55 UTC

    Let's elaborate. <FILE> in scalar context will only read one line. By default, that means it will only read until (and including) the next \n. How can a line match \w+\n\n\w+ if a line can't contain \n other than at the end?

    The fix would be to read the whole file in at once, as follows:

    my $text;
    {
        open(my $test_fh, '<', 'testfile')
            or die "Unable to open testfile: $!\n";
        local $/;  # Read to end of file.
        $text = <$test_fh>;
    }

    if ($text =~ /\w+\n\n\w+/) {
        print "reading file test:\n$text\nmatches.\n";
    }
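To see the difference between the two ways of reading, here is a minimal sketch that runs the same regex against line-by-line reads and against a slurped read. It assumes an in-memory filehandle in place of the real testfile; the sample data and the sub names `any_line_matches` and `slurp_matches` are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the contents of 'testfile'.
my $data = "foo\n\nbar\n";

# Read one line at a time: each read stops right after the next "\n",
# so no single line can ever contain "\n\n" followed by more text.
sub any_line_matches {
    my ($content) = @_;
    open(my $fh, '<', \$content)
        or die "Unable to open in-memory file: $!\n";
    while (my $line = <$fh>) {
        return 1 if $line =~ /\w+\n\n\w+/;
    }
    return 0;
}

# Slurp: with $/ undefined, a single read returns the whole file.
sub slurp_matches {
    my ($content) = @_;
    open(my $fh, '<', \$content)
        or die "Unable to open in-memory file: $!\n";
    local $/;
    my $text = <$fh>;
    return $text =~ /\w+\n\n\w+/ ? 1 : 0;
}

print "line by line: ", any_line_matches($data), "\n";
print "slurped:      ", slurp_matches($data), "\n";
```

With this sample data the line-by-line version finds nothing, while the slurped version matches.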

    Note: The m modifier on your regexp is useless since you don't use ^ or $. The s modifier on your regexp is useless since you don't use . (dot).

    Update: If you want to find all matches, use the following:

    ...
    while ($text =~ /\w+\n\n\w+/g) {
        print "reading file test:\n$text\nmatches.\n";
    }
Re^2: multiline regex: heredoc vs. reading file
by bowei_99 (Friar) on Jan 25, 2006 at 17:59 UTC
    One line? From page 147 of 'Programming Perl' -

    /m Let ^ and $ match next to embedded \n.
    /s Let . match newline and ignore deprecated $* variable.

    Wouldn't that mean it would look for multiple lines?

      /m changes where ^ and $ can match; /s changes what . can match. Since you don't have any of ^, $, or . in your regex, the flags do nothing.
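A small sketch of what each flag changes, using a hypothetical three-line string rather than the original test file. The variable names are made up for illustration; each result is stored as 1 (match) or 0 (no match).

```perl
use strict;
use warnings;

# Hypothetical sample text: two words separated by a blank line.
my $text = "foo\n\nbar\n";

# /m: ^ may match just after an embedded newline, $ just before one.
my $no_m   = ($text =~ /^bar$/)  ? 1 : 0;
my $with_m = ($text =~ /^bar$/m) ? 1 : 0;

# /s: . may match a newline.
my $no_s   = ($text =~ /foo.+bar/)  ? 1 : 0;
my $with_s = ($text =~ /foo.+bar/s) ? 1 : 0;

# A pattern without ^, $, or . behaves the same with or without the flags.
my $plain   = ($text =~ /\w+\n\n\w+/)   ? 1 : 0;
my $flagged = ($text =~ /\w+\n\n\w+/ms) ? 1 : 0;

print "/^bar\$/      without m: $no_m, with m: $with_m\n";
print "/foo.+bar/   without s: $no_s, with s: $with_s\n";
print "/\\w+\\n\\n\\w+/ plain: $plain, with ms: $flagged\n";
```

Only the first two patterns change behavior when a flag is added; the third matches either way because it contains none of the metacharacters the flags affect.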

      The problem is that your regex can only match text that spans multiple lines, but you are testing each line of the file against it individually, so of course none of them match.

      The problem is not with the regexp. The problem is that $_ only contains one line. See my earlier post in this discussion for more details.
        OK, so I changed the while loop to contain the following, and still get the same thing -
        $_ .= <TEST>;
        if (m{ \w+\n .+ \w+ }msx) {
            print "reading file test: The line \n$_\nmatches.\n";
        }
        I would think that
        a) $_ .= <TEST>; would be equivalent to your $text = <$test_fh>; line, and
        b) replacing \n with .+ utilizes the /s,

        correct?