in reply to Why does my Perl regex substitution for linebreak fail?

You say you want to remove a single linebreak before all lines starting with the string "= = = =", but your snippet would remove two linebreaks ("\n\n" is replaced with nothing). Just curious about that.

Anyway, I think others have already given good ideas. Here's another one that doesn't require holding the entire file in memory at once (unless, of course, the file contains no instance of "\n====" at all, in which case the whole file is read as a single record):

    #!/usr/bin/perl
    use strict;
    use warnings;

    $/ = "\n====";
    while (<>) {
        s/\n====$/====/;
        print;
    }

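Setting the INPUT_RECORD_SEPARATOR ($/, see perlvar) like that makes things very simple: each read returns everything up to and including the next "\n====", so the line break to remove always sits at the end of the record. If the file happens to have CRLF line termination, you may need to set $/ to "\r\n====" (and include "\r" in the s/// as well).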

(Updated upon realizing that a CRLF file would just need a modified s///; the original $/ setting above would still work fine, because "\r\n====" still ends in "\n====". Oops! I also just noticed that ikegami already posted this idea, as I should have known he would!)
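
For what it's worth, here is a rough sketch of the CRLF-tolerant variant described in the update. It is only an illustration under a few assumptions: the helper name join_before_marker is made up for this example, and it reads from an in-memory string rather than a real file so that it can be run as-is. The only real change from the script above is the optional "\r" in the s///.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): removes the line
# break in front of every "====" while reading record by record.
sub join_before_marker {
    my ($text) = @_;

    # Read the string through an in-memory filehandle so that $/ applies.
    open my $in, '<', \$text or die "Cannot open in-memory file: $!";

    local $/ = "\n====";    # each record ends right after "\n===="
    my $out = '';
    while (my $record = <$in>) {
        # \r? also swallows the carriage return of a CRLF line ending.
        $record =~ s/\r?\n====\z/====/;
        $out .= $record;
    }
    close $in;
    return $out;
}

# One LF-terminated and one CRLF-terminated line before a marker.
print join_before_marker("a\n====b\r\n====c\n");    # prints "a====b====c"
```

The last record ("c\n") contains no marker, so the substitution simply leaves it alone.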

Re^2: Why does my Perl regex substitution for linebreak fail?
by pat_mc (Pilgrim) on Mar 06, 2008 at 09:15 UTC
    Yes, graff, you are right in observing that my regex contains two linebreaks, in contrast to what I actually intended to do. The curious thing is that the regex performs as expected when it should match one linebreak but not when it contains two linebreaks. In that case it appears to do NOTHING at all, although the file definitely does contain several consecutive linebreak-only lines. I am still puzzled and am starting to believe the issue is not due to the Perl side of things but rather an I/O or even a Linux problem. Any conjectures on this one? Thanks again - Pat
      I'm not sure I follow what you are describing there. The best thing to do is to present a minimal script and data set that still (even after what you've learned) produces results that you consider to be unexpected, and point out how it differs from what you would expect.
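
      As an illustration of what such a minimal script might look like, here is a sketch with made-up sample data. It assumes (and this is only a guess, since the actual code is not shown here) that the input is being read one line at a time. In that case each record holds at most one "\n", so a pattern containing "\n\n" can never match, whereas the same pattern does match when the whole text is in one string.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data: two empty lines in front of the "====" line.
my $data = "first\n\n\n====\nsecond\n";

# Line by line: every record ends at the first "\n", so /\n\n/ never matches.
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my $line_hits = 0;
while (my $line = <$fh>) {
    $line_hits++ if $line =~ /\n\n/;
}
close $fh;

# Whole text at once: consecutive newlines are visible to the regex.
my $slurp_hits = () = $data =~ /\n\n/g;

print "line-by-line matches: $line_hits\n";
print "whole-text matches:   $slurp_hits\n";
```

      If a script along these lines behaves differently from your real one, the difference between the two is the thing to post.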