g_speran has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks, let's say I have a file like the following whose contents I read through a file handle.

    line1
    line2

    line4

    line6

Using Perl substitution (s/ / /g;), how can I make the contents become like the following?

    line1
    line2
    line4
    line6

I have tried $_ =~ s/\n\n+/\n/g; but that is not working for me. Below is the code I am working with:

    open(FILE, "<$MODFILE") || die "File not found";
    my @lines = <FILE>;
    close(FILE);
    my @newlines;
    foreach(@lines) {
        $_ =~ s/\n\n+/\n/g;
        push(@newlines,$_);
    }
TIA....

Replies are listed 'Best First'.
Re: replace multiple newline characters
by Marshall (Canon) on May 15, 2018 at 04:25 UTC
    Since you have the input lines as an array, easiest would be to use grep.
        my @newlines = grep{/\S/}@lines; #move only lines that have
                                         #a non whitespace char to the left
    or
        @lines = grep{/\S/}@lines;
    Update:
    Normally when reading text files, I read them line by line instead of creating an in-memory array.

    For input I normally allow any number of "unseeable" whitespace chars. A blank line often may have an extra space in it that you can't see, especially if the input file is something that a human might edit.

        while (<FH>)   # update: was: while (my $line = <FH>)
                       # don't really need "my line" here
        {
            next unless /\S/;     #skip blank lines
            next if /^\s*#/;      #skip comments, or whatever...
            # blah, blah (process the line here)
        }
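
A self-contained sketch of the line-by-line filter above, in case anyone wants to run it without a real file. The sample data and the in-memory filehandle are my assumptions, not part of the original reply; note that one of the "blank" lines contains a space and a tab.

```perl
use strict;
use warnings;

# Made-up sample data standing in for the file.
# The fifth line looks blank but holds a space and a tab.
my $data = "line1\nline2\n\nline4\n \t\n# a comment\nline6\n";

# Open an in-memory filehandle so the sketch needs no file on disk.
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my @kept;
while (<$fh>) {
    next unless /\S/;    # skip lines with no non-whitespace character
    next if /^\s*#/;     # skip comment lines
    push @kept, $_;      # keep everything else, newline included
}
close $fh;

print @kept;
```

Only line1, line2, line4 and line6 should survive, since /\S/ rejects the whitespace-only line as well as the empty one.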
Re: replace multiple newline characters
by AnomalousMonk (Archbishop) on May 15, 2018 at 03:43 UTC

    Since you're already reading the entire input at once, read it all into a scalar with something like
        my $input = do { local $/;  <FILE>; };
    and then

    c:\@Work\Perl\monks>perl -wMstrict -le
    "my $input = qq{line1\nline2\n\nline4\n\n\nline7\nline8\n};
     print qq{[[$input]]};
     ;;
     $input =~ s{ ^ \s* \n }{}xmsg;
     print qq{<<$input>>};
    "
    [[line1
    line2

    line4


    line7
    line8
    ]]
    <<line1
    line2
    line4
    line7
    line8
    >>

    Update 1: Or, if you need to read all the lines of the file as separate lines and keep them that way, try
        my @lines = grep !m{ \A \s* \Z }xms, <FILE>;

    c:\@Work\Perl\monks>perl -wMstrict -le
    "my @lines = (
       qq{line1\n}, qq{line2\n}, qq{\n}, qq{line4\n},
       qq{ \t\n}, qq{\t \t \n}, qq{line7\n}, qq{line8\n},
       );
     print qq{[[@lines]]};
     ;;
     @lines = grep !m{ \A \s* \Z }xms, @lines;
     print qq{<<@lines>>};
    "
    [[line1
     line2

     line4


     line7
     line8
    ]]
    <<line1
     line2
     line4
     line7
     line8
    >>

    Update 2: Slight formatting fix to code in Update 1; no functional change.

    Update 3: Rather than using
        !m{ \A \s* \Z }xms
    to grep lines in the code in Update 1, I think I'd prefer
        m{ \S }xms
    as used by Marshall here; don't know why I didn't use it in the first place. Note that no logical negation is needed.


    Give a man a fish:  <%-{-{-{-<

Re: replace multiple newline characters
by haukex (Archbishop) on May 15, 2018 at 04:30 UTC

    The problem with the code you showed is that my @lines = <FILE>; reads the lines into the array, with each separate line in its own array entry, so the regex never sees more than one \n. One solution is to do what AnomalousMonk has already shown, by reading the entire file into one string. Here is one more way that reads the entire file into memory, using the "paragraph mode" supported by $/:

        use warnings;
        use strict;
        use Data::Dump;

        open my $fh, '<', 'test.txt' or die $!;
        my @lines = do { local $/=''; <$fh> };
        chomp @lines;
        close $fh;
        dd @lines;

        __END__

        ("line1\nline2\n", "line4\n", "line7\nline8\n", "line9\n")

    But you don't necessarily need to read the entire file into memory first, you can also do this operation line-by-line, as you're reading the file. For example:

        open my $fh, '<', 'test.txt' or die $!;
        while (my $line = <$fh>) {
            print $line if length $line && $line ne $/;
            # - OR -
            push @newlines, $line if length $line && $line ne $/;
            # - OR -
            chomp($line);
            next unless length $line;
            print $line, "\n";
            # etc.
        }
        close $fh;
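
To make the point at the top of this reply concrete, here is a small sketch comparing the substitution applied per line with the same substitution applied to one slurped string. The sample text and the variable names are made up for illustration; split /^/ is used only to mimic what <FILE> returns in list context.

```perl
use strict;
use warnings;

# Made-up sample text with single and double blank lines.
my $text = "line1\nline2\n\nline4\n\n\nline6\n";

# Case 1: one array element per line, as <FILE> in list context returns them.
my @lines = split /^/, $text;
s/\n\n+/\n/g for @lines;      # never matches: each element holds exactly one "\n"
my $per_line = join '', @lines;

# Case 2: the whole text in one scalar, so runs of "\n" are visible to the regex.
(my $slurped = $text) =~ s/\n\n+/\n/g;

print $per_line eq $text ? "per-line: unchanged\n" : "per-line: changed\n";
print $slurped;
```

The per-line version leaves the text untouched, while the slurped version collapses every run of newlines to one.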
Re: replace multiple newline characters
by morgon (Priest) on May 15, 2018 at 07:58 UTC
    Provided your input-file is small enough to be read into memory, you can do this with a one-liner:
    perl -i.old -0777 -pe 's/\n+/\n/sg' input.txt
    The original version of the file is kept as input.txt.old.
      Provided your input-file is small enough to be read into memory

      Here's one without that constraint:

      perl -i.old -ne 'print if /./;' input.txt
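
One difference between the filters in this thread that may matter for hand-edited files, shown as a sketch with made-up sample lines: /./ keeps a line that contains only spaces or tabs, because . matches any character except a newline, while /\S/ drops it.

```perl
use strict;
use warnings;

# Made-up sample: one empty line and one whitespace-only line.
my @lines = ("line1\n", "\n", " \t\n", "line4\n");

my @dot_kept      = grep { /./  } @lines;   # drops only truly empty lines
my @nonspace_kept = grep { /\S/ } @lines;   # also drops whitespace-only lines

printf "/./ keeps %d lines, /\\S/ keeps %d lines\n",
    scalar @dot_kept, scalar @nonspace_kept;
```

If whitespace-only lines should go too, swapping /./ for /\S/ in the one-liner is probably all that is needed.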
Re: replace multiple newline characters
by thanos1983 (Parson) on May 15, 2018 at 09:37 UTC

    Hello g_speran,

    Fellow Monks have answered your question with multiple great answers. Just to add another minor answer using one of my favorite modules, IO::All:

        #!/usr/bin/perl
        use strict;
        use IO::All;
        use warnings;
        use Data::Dumper qw(Dumper);

        my @lines = io('in.txt')->chomp->slurp; # Chomp as you slurp
        print Dumper \@lines;

        @lines = grep { $_ ne '' } @lines; # Skip empty elements
        print Dumper \@lines;

        __END__

        $ perl test.pl
        $VAR1 = [
                  'line1',
                  'line2',
                  '',
                  'line4',
                  '',
                  'line6'
                ];
        $VAR1 = [
                  'line1',
                  'line2',
                  'line4',
                  'line6'
                ];

    But why not read the file line by line, which is more efficient and makes it easy to keep only the lines that contain something? Sample code below:

        #!/usr/bin/perl
        use strict;
        use warnings;
        use Data::Dumper qw(Dumper);

        my @lines;
        while (<>) {
            chomp;
            next if /^\s*$/; # skip blank lines;
            # next if /^\s*#/; # skip comments
            push @lines, $_;
        } continue {
            close ARGV if eof;  # Not eof()!
        }

        print Dumper \@lines;

        __END__

        $ perl test.pl in.txt
        $VAR1 = [
                  'line1',
                  'line2',
                  'line4',
                  'line6'
                ];

    I prefer to read my files from the command line with the <> operator, as it gives you the ability to read multiple files one after the other; closing ARGV at each eof resets the line counter $. for the next file. For example: perl test.pl in_1.txt in_2.txt etc... etc...
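
A runnable sketch of the multi-file case, assuming two throwaway temp files as stand-ins for in_1.txt and in_2.txt (the file contents and the @report array are made up for illustration). It records $. next to each kept line so the effect of close ARGV if eof is visible.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create two temporary input files (hypothetical stand-ins for in_1.txt, in_2.txt).
my @files;
for my $content ("a1\n\na3\n", "b1\nb2\n") {
    my ($fh, $name) = tempfile(UNLINK => 1);
    print {$fh} $content;
    close $fh or die "Cannot close $name: $!";
    push @files, $name;
}

@ARGV = @files;    # what the shell would normally pass on the command line

my @report;
while (<>) {
    chomp;
    next if /^\s*$/;                          # skip blank lines
    push @report, sprintf '%d:%s', $., $_;    # $. is the current line number
} continue {
    close ARGV if eof;                        # reset $. at the end of each file
}

print "$_\n" for @report;
```

With the close in place the numbering restarts at 1 for the second file; without it, $. would keep counting across files.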

    Minor note: a similar question has been asked before in the Monastery (Removing empty string elements from an Array). Remember to search and read as much as possible about your problem; reading and trying alternative approaches will really help you. :)

    Hope this helps, BR.

    Seeking for Perl wisdom...on the process of learning...not there...yet!