Re: reading several lines in a gulp
by moritz (Cardinal) on Apr 27, 2011 at 11:37 UTC
my @lines;
push @lines, scalar <$file> for 1..10;
No need for a new built-in for something that can easily be achieved with existing primitives and isn't used all that often.
Perl buffers the data it reads from the file, so it shouldn't be much less efficient than slurping a small file in list context.
If the file is not a multiple of 10 lines, does that push extra undefs into @lines?
Yes, I didn't think of that. Maybe this would be better:
sub gulp {
    my ($file, $count) = @_;
    my @lines;
    for (1 .. $count) {
        last if eof $file;    # check before reading, so an exhausted file pushes no undef
        push @lines, scalar <$file>;
    }
    return @lines;
}
Why is scalar used in that? What's the effect? I'm staring cross-eyed at this very useful solution; can someone decompress it for me?
Thanks ahead of time to all you experts for your great solutions...
--Ray
push evaluates its arguments in list context, so that would slurp the whole file. scalar forces the <$filehandle> operator (which calls readline under the hood) to read only one line.
Why is scalar used in that? What's the effect?
push imposes list context (you can push more than one element in one statement), and <$file> would read the whole file at once in list context (as pointed out in the OP).
In other words, without scalar, the whole file would be read in the first iteration, which would kind of defeat the purpose of the exercise...
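The context difference is easy to see in isolation. Here is a minimal demonstration, using an in-memory filehandle (a core Perl feature) in place of a real file; the file contents are of course made up:

```perl
use strict;
use warnings;

my $data = "one\ntwo\nthree\n";
open my $fh, '<', \$data or die $!;   # in-memory filehandle

my $first = <$fh>;    # scalar context: reads exactly one line
my @rest  = <$fh>;    # list context: slurps all remaining lines

print $first;                 # one
print scalar(@rest), "\n";    # 2
```

Since push supplies list context to everything after the array, an unguarded `<$file>` inside a push behaves like the `@rest` case above.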
Re: reading several lines in a gulp
by Anonymous Monk on Apr 27, 2011 at 11:46 UTC
@lines = map scalar <$file>, 1..10;
@lines = map { eof($file) ? () : scalar <$file> } 1..10;
(Note: with the block form of map there must be no comma between the block and the list.)
In that case, the defined-or operator imposes scalar context:
@lines = map { <$file> // () } 1 .. 10;
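A quick runnable check of that idiom (using an in-memory filehandle to stand in for a real file): once the file runs out, `<$file>` returns undef, the `// ()` turns that into an empty list, and no trailing undefs land in @lines.

```perl
use strict;
use warnings;

my $data = "a\nb\nc\n";               # only 3 lines, but we ask for 10
open my $fh, '<', \$data or die $!;

my @lines = map { <$fh> // () } 1 .. 10;
print scalar(@lines), "\n";           # 3
```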
Re: reading several lines in a gulp
by anonymized user 468275 (Curate) on Apr 28, 2011 at 16:37 UTC
If it's Unix, then for maximum performance I'd seek to successive (4096-byte) buffered I/O page boundaries on each outer iteration, transfer the complete lines they contain to an array, process those in an inner iteration, and carry over the last incomplete line (one with no \n) to the next I/O page iteration.
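A sketch of that idea (names, block size, and the handler callback are illustrative, not an established API): read the file in fixed-size blocks, split each block into complete lines, and carry any trailing partial line over to the next block.

```perl
use strict;
use warnings;

sub process_in_blocks {
    my ($fh, $handler, $blocksize) = @_;
    $blocksize ||= 4096;
    my $carry = '';
    while (read($fh, my $buf, $blocksize)) {
        $buf = $carry . $buf;
        my @lines = split /(?<=\n)/, $buf;   # split but keep the newlines
        # if the block ended mid-line, save the fragment for the next block
        $carry = $lines[-1] !~ /\n\z/ ? pop @lines : '';
        $handler->($_) for @lines;           # inner iteration over whole lines
    }
    $handler->($carry) if length $carry;     # last line had no trailing \n
}
```

For example, `process_in_blocks($fh, sub { print $_[0] })` prints the file line by line while only ever holding one block (plus at most one partial line) in memory.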
Yes, but even where such an optimisation would be reasonable, Perl won't reorganise your code to minimise memory usage when you read a whole file into an array, nor does it provide hooks to run your own code on each iteration of such an optimisation.
Re: reading several lines in a gulp
by JavaFan (Canon) on May 01, 2011 at 14:27 UTC
my $several = ...;
my @lines;
push @lines, $_ while @lines < $several && defined($_ = <$file>);
Re: reading several lines in a gulp
by LanX (Saint) on Apr 29, 2011 at 17:31 UTC
my $count = 3;
my @gulp;
while (<DATA>) {
    push @gulp, $_;                    # collect first,
    next if $. % $count and !eof;      # then flush every $count lines or at eof
    print @gulp;
    print "-----\n";
    @gulp = ();
}
__DATA__
a
b
c
d
e
f
g
h
I'm sure there are more elegant solutions...
sub readlines {
    my ($fh, $count) = @_;
    my @gulp;
    push @gulp, scalar <$fh>
        while $count-- and !eof $fh;
    return @gulp;
}
while ( my @lines = readlines(\*DATA, 3) ) {
    print @lines, "----\n";
}
__DATA__
a
b
c
d
e
prints
a
b
c
----
d
e
----
alternative iterator:
sub readlines {
    my ($fh, $count) = @_;
    my @gulp;
    while (<$fh>) {
        push @gulp, $_;
        last unless --$count;
    }
    return @gulp;
}
UPDATE:
Handling of edge cases like missing $count parameter could be added to the iterators:
$count=1 unless $count;
Maybe $count<=0 should be handled differently...
0 => no iteration
-1 => slurp whole file
-2 .. => warning
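One possible sketch of those edge cases, folded into the iterator above (the meaning of the negative counts follows the convention proposed here, nothing standard):

```perl
use strict;
use warnings;

sub readlines {
    my ($fh, $count) = @_;
    $count = 1 unless defined $count;      # missing count: read one line
    return ()    if $count == 0;           #  0 => no iteration
    return <$fh> if $count == -1;          # -1 => slurp the whole file
    if ($count < -1) {                     # -2 and below => warning
        warn "readlines: invalid count $count";
        return ();
    }
    my @gulp;
    push @gulp, scalar <$fh> while $count-- and !eof $fh;
    return @gulp;
}
```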