in reply to foreach skipping elements

You could stick spaces into those 'empty' bars by swapping with something like this: while($line =~ s/\|\|/\| \|/){}

E.g., where $fileName names a text file containing your string:

    #! /usr/bin/perl -w
    use strict;

    my $fileName = $ARGV[0];
    my @old;
    my @new;

    open (my $file, "<", $fileName);
    while(<$file>) {
        my $line = $_;
        while($line =~ s/\|\|/\| \|/){}
        #print STDERR "OLD: $_\tNEW: $line";
        @old = split(/\|/, $_);
        @new = split(/\|/, $line);
    }

    for(@old) {
        print STDERR "OLD: $_\n";
    }
    print STDERR "\n";
    for(@new) {
        print STDERR "NEW: $_\n";
    }

Re^2: foreach skipping elements
by kcott (Archbishop) on Jul 20, 2013 at 04:53 UTC
    "You could stick spaces into those 'empty' bars by swapping with something like this: while($line =~ s/\|\|/\| \|/){}"

    Perhaps I'm missing your intent, but that seems like an unnecessarily complicated way to get a repeat substitution when the 'g' modifier is provided for that task:

        $ perl -Mstrict -Mwarnings -E '
            my $x = q{1|2|3|||6|7||||};
            {
                my $line = $x;
                while($line =~ s/\|\|/\| \|/){}
                say $line;
            }
            {
                my $line = $x;
                $line =~ s/\|(?=\|)/| /g;
                say $line;
            }
        '
        1|2|3| | |6|7| | | |
        1|2|3| | |6|7| | | |

    Using the while loop instead of the 'g' modifier is also slower:

        $ perl -Mstrict -Mwarnings -E '
            use Benchmark qw{cmpthese};
            my $x = q{1|2|3|||6|7||||};
            cmpthese(1e6 => {
                while_no_repeat => sub {
                    my $line = $x;
                    while($line =~ s/\|\|/\| \|/){}
                },
                just_repeat => sub {
                    my $line = $x;
                    $line =~ s/\|(?=\|)/| /g;
                }
            });
        '
                            Rate while_no_repeat     just_repeat
        while_no_repeat 158479/s              --            -21%
        just_repeat     201613/s             27%              --

    [Aside: the pipe (|) character is not special in the "replacement" part of s/pattern/replacement/ so there's no need to escape it.]
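To see why the lookahead matters, here is a minimal sketch using the same sample string (the variable names $plain and $lookahead are only illustrative). A plain s/\|\|/| |/g consumes both bars of every match, so the second bar can never start the next match and some adjacent bars are left alone. That is presumably why the original needed a loop. The lookahead version consumes only the first bar and merely inspects the second.

```perl
use strict;
use warnings;

my $x = q{1|2|3|||6|7||||};

# Plain /g: each match consumes two bars, so overlapping pairs are skipped.
(my $plain = $x) =~ s/\|\|/| |/g;

# Lookahead: only the first bar is consumed; the following bar is just checked.
(my $lookahead = $x) =~ s/\|(?=\|)/| /g;

print "plain:     $plain\n";
print "lookahead: $lookahead\n";
```

In the plain result an odd run of bars still contains a '||' pair, while the lookahead result has a space between every pair of bars.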

    -- Ken

      Fair point, thanks for comparing them!

      Michael

Re^2: foreach skipping elements
by rjt (Curate) on Jul 20, 2013 at 03:49 UTC

    But your program will miss the last empty field. In 1|2||3|45|||6|, for example, your program stops too early:

        .
        .
        NEW: 45
        NEW: 
        NEW: 
        NEW: 6

    (One empty NEW: line should follow, since separators separate fields, rather than terminate them.)

    If that were the only issue, you could just append a single '|' to the input string before the rest of your logic.

    The second problem, though, is that your regex converts empty fields to a single space ' '. True, you could easily replace them, but if the field was ' ' in the first place, the value will be destroyed. That may not be a problem depending on the data, but to fix this, and the previous issue, you can make a couple of changes:

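        $line = " $line";     # pad the first field too, or substr would eat its first character
        $line =~ s!\|!| !g;   # pad every field that follows a separator
        .
        .
        @new = map { substr $_, 1 } split /\|/, $line;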
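As a self-contained sketch of that padding idea (the sample string and the name $padded are made up for illustration), every field, including the first, gets one leading space before the split, and the space is stripped again afterwards. Because no field is empty at split time, nothing is dropped, and a field that really was a single space survives intact.

```perl
use strict;
use warnings;

# Third field is empty, fourth is a genuine single space, last is empty.
my $line = '1|2|| |45|';

# Pad every field with one leading space, including the first one.
(my $padded = " $line") =~ s/\|/| /g;

# No field is empty now, so split keeps them all; then remove the padding.
my @new = map { substr $_, 1 } split /\|/, $padded;

print "[$_]\n" for @new;
```

This assumes the line has already been chomped; otherwise the trailing newline ends up in the last field.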

    Or, as the OP suggested he was trying to avoid, he could just add a non-blank field to the end. In that case the entire code reduces to:

        my @new = split /\|/, $line . '|sentinel';
        $#new--; # Remove sentinel

    But of course, split /\|/, $line, -1; still gets my vote. :-)
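For comparison, a short sketch of the difference the limit argument makes, using the example string from earlier in this reply (the array names are illustrative). Without a limit, split discards trailing empty fields; with a negative limit it keeps them.

```perl
use strict;
use warnings;

my $line = '1|2||3|45|||6|';

my @default = split /\|/, $line;       # trailing empty field is dropped
my @all     = split /\|/, $line, -1;   # negative limit keeps trailing empties

printf "default: %d fields, limit -1: %d fields\n",
    scalar @default, scalar @all;
```

Interior empty fields are kept in both cases; only the trailing ones differ.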

      Two good points, thanks for pointing those out!

      Michael

Re^2: foreach skipping elements
by tiggyboo (Initiate) on Jul 19, 2013 at 18:10 UTC
    Thanks for the speedy suggestions everyone. I'm off and running, able to do some *real* damage now :-)

    Al