Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I have a file containing the following content:

Line1
Line2
Line3

XYZ: Blah

I'm trying to get:

Line1
Line2
Line3

Why does the following test case not work?

printf "Line1\nLine2\nLine3\n\nXYZ: Blah\n" | perl -p -e "s|\nXYZ.+\n||m"

Re: multi-line RE question
by moritz (Cardinal) on Aug 02, 2011 at 11:35 UTC
    perl -p reads its input line by line, so each line contains at most one \n, and only at its very end. Your regex requires two \n characters, so it can never match. For it to match, you have to change the first \n to ^.

    But it seems what you really want is something like grep -v ^Blah or grep ^Line. See grep.

      But it should also remove the empty last line.
Re: multi-line RE question
by Eliya (Vicar) on Aug 02, 2011 at 11:47 UTC

    The regex option m here does not do what you think. First, it's not required to match/replace multiple newlines in a string; it only changes where ^ and $ are allowed to match. Secondly, it has no influence on the behavior of the readline (<>) in the implicit while loop behind the command line option -p. In other words, all you ever have in $_ is one line of text...

    One way around the problem would be to set the input record separator to undef to slurp in the whole file at once

    $ printf "Line1\nLine2\nLine3\n\nXYZ: Blah\n" | perl -p -e "$/=undef; s|\nXYZ.+\n||"
    Line1
    Line2
    Line3

    (You can also say BEGIN { $/=undef } ... if you want to have it execute only once at the expense of a few more keystrokes.)

    Whether that is a good solution depends on the circumstances, e.g. how large the input file is, etc.  In case it's huge, you might want to read it in single line mode, and have previous lines accumulate in a buffer up until the search pattern is found. In this case, you can trim the trailing newline in the buffer before printing it out.

    $ printf "Line1\nLine2\nLine3\n\nXYZ: Blah\n" | perl -n -e '$buf.=$_; if ($buf=~s|\nXYZ.+\n||) {print $buf; $buf=""}'
    Line1
    Line2
    Line3

      There is a simpler solution to setting $/ to undef as described in perlrun:

      $ printf "Line1\nLine2\nLine3\n\nXYZ: Blah\n" | perl -0777pe "s|\nXYZ.+\n||"
      Line1
      Line2
      Line3
        Wow, thanks. I'll try to understand the magic...
      Thank you! That solves my problem.
Re: multi-line RE question
by ww (Archbishop) on Aug 02, 2011 at 11:37 UTC
    Perhaps you intended to stick to Perl... but what you posted doesn't seem to do so, as the printf seems to be a *nix-ish shell command.

    Try, instead, this (and excuse the windows shell syntax and avoidance of file ops):

    C:\_wo>perl -e "my $line = \"Line1\nLine2\nLine3\n\nXYZ: Blah\n\"; $line =~ s|\nXYZ.+\n||m; print $line;"
    Line1
    Line2
    Line3
    C:\_wo>

    OTOH, if I misunderstood your question, please correct me.