in reply to Change the behavior of Perl's IRS

the last chunk isn't read at all

Eh? That shouldn't be. Perl returns whatever follows the last $/ as a final record, even when no separator terminates it.
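A quick way to check this is a minimal sketch like the one below. It assumes an in-memory filehandle and made-up sample text, and `read_chunks` is a hypothetical helper name used only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: read every chunk from $text using $sep as the
# input record separator, and return the chunks unmodified.
sub read_chunks {
    my ($text, $sep) = @_;
    open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";
    local $/ = $sep;        # restored automatically when the sub returns
    my @chunks = <$fh>;     # list context: read all records at once
    close $fh;
    return @chunks;
}

# Sample text (an assumption): the last chunk has no trailing separator.
my @chunks = read_chunks("a\nSEP\nb\nSEP\nc\n", "SEP\n");
print scalar(@chunks), " chunks\n";
print "last chunk: $chunks[-1]";
```

With this sample input the final chunk, "c\n", comes back as the third record even though no separator follows it.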

hope that there's a nice and clean way [...] without any ugly corrective code.

I don't know if you'll consider the following "nice and clean", but at least it contains no "ugly corrective code" (or any corrective code at all). It employs a single-line lookahead.

(Update: I've replaced the code I had here originally with a version that hides the guts in an iterator. It's longer, but the usage is much simpler.)

Usage:

    my $rec_reader = make_rec_reader('myrecordsep');
    while (my $rec = $rec_reader->($fh)) {
        print("Record\n");
        print("======\n");
        print "$_\n" for @$rec;
        print("\n");
    }

Guts:

    sub make_rec_reader {
        my ($sep) = @_;
        my $first = 1;
        my $line;    # one-line lookahead, kept between calls

        return sub {
            my ($fh) = @_;

            # Skip what's before first record.
            if ($first) {
                $first = 0;
                for (;;) {
                    $line = <$fh>;
                    last if not defined $line;
                    chomp($line);
                    last if $line eq $sep;
                }
            }

            # No more input: signal the end to the caller.
            return if not defined $line;

            # Collect lines until the next separator or end of file.
            my @rec;
            for (;;) {
                push @rec, $line;
                $line = <$fh>;
                last if not defined $line;
                chomp($line);
                last if $line eq $sep;
            }

            return \@rec;
        };
    }

Re^2: Change the behavior of Perl's IRS
by LighthouseJ (Sexton) on Jul 14, 2007 at 19:59 UTC
    But see, that's precisely what I'm trying to avoid: anything beyond the absolute minimum of code, which I'd like to think Perl strives for. All I want to do is write something like the following and have it work properly.
        {
            $/ = 'myrecordsep';
            while (<DATA>) {
                # do the actual work on text here
            }
        }
        __DATA__
        myrecordsep
        field1=item1
        field2=item2
        myrecordsep
        ...
    I want Perl to read a chunk at a time following that model, that's absolutely all I'm looking for. Like I mentioned before, I've written different scripts that utilized different methods but I'm exploring this particular avenue. I appreciate the attention to the problem though.
    "The three principal virtues of a programmer are Laziness, Impatience, and Hubris. See the Camel Book for why." -- `man perl`

      It's not complicated; just discard the first separator:

          #! perl
          use strict;

          {
              $/ = "myrecordsep\n";
              scalar <DATA>;    ##discard the first;
              while (<DATA>) {
                  chomp;
                  print "'$_'\n";
              }
          }
          __DATA__
          myrecordsep
          field1=item1
          field2=item2
          myrecordsep
          field1=item1
          field2=item2
          myrecordsep
          field1=item1
          field2=item2
          myrecordsep
          field1=item1
          field2=item2

      Produces:

          C:\test>junk2
          'field1=item1
          field2=item2
          '
          'field1=item1
          field2=item2
          '
          'field1=item1
          field2=item2
          '
          'field1=item1
          field2=item2
          '
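The first read is discarded because the data begins with a separator: the first chunk Perl returns is the separator alone, which chomp reduces to an empty string. A possible variation, sketched here under assumptions (an in-memory filehandle, made-up field values, and a hypothetical helper named `read_records`), skips empty chunks instead of special-casing the first read:

```perl
use strict;
use warnings;

# Hypothetical helper: return the non-empty records found in $text,
# where each record is introduced by a line consisting of $sep.
sub read_records {
    my ($text, $sep) = @_;
    open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";
    local $/ = "$sep\n";              # $/ is a literal string, not a pattern
    my @records;
    while (my $chunk = <$fh>) {
        chomp $chunk;                 # removes a trailing $/, if present
        next unless length $chunk;    # chunk before the first separator is empty
        push @records, $chunk;
    }
    close $fh;
    return @records;
}

# Illustrative data only; the field values are invented for this sketch.
my $data = "myrecordsep\nfield1=item1\nfield2=item2\n"
         . "myrecordsep\nfield1=item3\nfield2=item4\n";
print "'$_'\n" for read_records($data, 'myrecordsep');
```

This also tolerates input that does not start with a separator, at the cost of silently dropping any genuinely empty record.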

      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

      But see, that's precisely what I'm trying to avoid: anything beyond the absolute minimum of code, which I'd like to think Perl strives for.

      True. This is often done by placing reusable code in modules. I wrote the solution to be reusable so you could place it in a module. All that's left is two lines:

          my $rec_reader = make_rec_reader('myrecordsep');
          while (my $rec = $rec_reader->($fh)) {
              ...
          }
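One way the packaging could look, as a rough sketch: the package name `RecReader` and the in-memory sample data are assumptions for illustration, and the package is declared inline so the example runs as a single file. In practice it would more likely live in its own `.pm` file and be loaded with `use`.

```perl
use strict;
use warnings;

# Hypothetical module, shown inline; normally it would be its own file.
package RecReader;

# Return an iterator yielding one record (an array ref of chomped lines)
# per call. Each record begins with the separator line itself.
sub make_rec_reader {
    my ($sep) = @_;
    my $first = 1;
    my $line;                           # one-line lookahead buffer
    return sub {
        my ($fh) = @_;
        if ($first) {                   # skip anything before the first record
            $first = 0;
            for (;;) {
                $line = <$fh>;
                last if not defined $line;
                chomp $line;
                last if $line eq $sep;
            }
        }
        return if not defined $line;    # end of input
        my @rec;
        for (;;) {
            push @rec, $line;
            $line = <$fh>;
            last if not defined $line;
            chomp $line;
            last if $line eq $sep;
        }
        return \@rec;
    };
}

package main;

# Illustrative input (an assumption): one junk line, then two records.
my $data = "junk\nmyrecordsep\nfield1=item1\nmyrecordsep\nfield1=item2\n";
open my $fh, '<', \$data or die "Cannot open in-memory handle: $!";

my $rec_reader = RecReader::make_rec_reader('myrecordsep');
while (my $rec = $rec_reader->($fh)) {
    print join(' | ', @$rec), "\n";
}
```

The calling script then contains only the two lines that create the reader and loop over it; the iterator expects `$/` to be left at its default of a newline.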