in reply to Re: read X number of lines?
in thread read X number of lines?

Just for something kind of different, you can play fun games with the input record separator and end up with something like
sub IRS_chunky {
    my ($buf, @lines);
    my $leftover = "";       # partial line carried over between reads
    local $/ = \10240;       # read records of at most 10240 bytes
    open(FILE, "totalfoo") or die "Can't open totalfoo: $!";
    while ( $buf = <FILE> ) {
        $buf = $leftover . $buf;
        @lines = split(/\n/, $buf);
        # if the record ended mid-line, save the partial line for the next read
        $leftover = ($buf !~ /\n$/) ? pop @lines : "";
        foreach (@lines) {
            # process one complete line here
        }
    }
    # note: a file with no trailing newline leaves its last line in $leftover
    close(FILE);
}
which is comparable, on my machine, to chunky.

My real question, though, is what kind of machines are you running this on? My benchmark results are completely different from yours. Having run this three times on a Sun Ultra 5, on a file that is approximately 500,000 lines and 24MB in size read from my local IDE drive, my results were pretty consistently like this:

perl test_read.pl
Benchmark: timing 10 iterations of Chunky IRS, chunk, linebyline...
Chunky IRS: 41 wallclock secs (31.31 usr +  3.71 sys = 35.02 CPU)
     chunk: 40 wallclock secs (31.00 usr +  3.81 sys = 34.81 CPU)
linebyline: 27 wallclock secs (17.67 usr +  2.47 sys = 20.14 CPU)
The code I used was identical to that in the earlier post, with the subroutine I wrote added. Using perl-5.6 generated the same basic results, plus or minus 1 for each stat. I am now somewhat confused. Is this a difference in the way Solaris uses its buffers? What platform/OS were the original tests run on?
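
For reference, numbers in that format come from the standard Benchmark module. A minimal harness along these lines would reproduce it; chunk() and linebyline() are assumed here to be the subroutines from the earlier post, and IRS_chunky() is the one above:

    use Benchmark qw(timethese);

    # time 10 iterations of each reader and print the comparison
    timethese(10, {
        'Chunky IRS' => \&IRS_chunky,
        'chunk'      => \&chunk,
        'linebyline' => \&linebyline,
    });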

mikfire

RE: RE: Re: read X number of lines?
by ZZamboni (Curate) on May 26, 2000 at 17:17 UTC
    I'm missing something here. What does assigning $/=\10240 mean? Thanks,

    --ZZamboni

      Something new and twisted they added in perl 5.005. To quote perldoc perlvar:
      Setting $/ to a reference to an integer, scalar containing an integer, or scalar that's convertible to an integer will attempt to read records instead of lines, with the maximum record size being the referenced integer. So this:

          $/ = \32768; # or \"32768", or \$var_containing_32768
          open(FILE, $myfile);
          $_ = <FILE>;

      will read a record of no more than 32768 bytes from FILE. If you're not reading from a record-oriented file (or your OS doesn't have record-oriented files), then you'll likely get a full chunk of data with every read. If a record is larger than the record size you've set, you'll get the record back in pieces.
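
      To see it in action, here's a minimal sketch (the filename is made up) that reads a file in records of at most 8192 bytes and reports how much each read returned:

          local $/ = \8192;    # records of at most 8192 bytes
          open(FILE, "some_big_file") or die "Can't open some_big_file: $!";
          while ( my $rec = <FILE> ) {
              print "read ", length($rec), " bytes\n";
          }
          close(FILE);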

      mikfire