
Nope.

This code suffers from the same bug as the other code.

Notice that he said perl 5.6 on Win2k.

Updated

Also tested other solutions, with 5.6 on win98, this yielded the best performance.

Please show your benchmarks.

use strict;
use warnings;
use Benchmark 'cmpthese';

my %subs = (
            list_io =>
            sub {
                # Second time round this will take
                # a loooooong time.
                my $file=$0;
                my @line;
                open(my $fh, '<', $file) or die "Cannot open $file: $!";
                @line= <$fh>;
                close($fh);
                return \@line;
            },

            split_slurp =>
            sub {
                my $file=$0;
                my @line;
                local $/;
                open(my $fh, '<', $file) or die "Cannot open $file: $!";
                @line=split /\n/,<$fh>;
                close($fh);
                return \@line;
            },

            while_io =>
            sub {
                my $file=$0;
                my @line;
                open(my $fh, '<', $file) or die "Cannot open $file: $!";
                push @line, $_ while <$fh>;
                close($fh);
                return \@line;
            },
          );

cmpthese -5,\%subs;

__END__

Benchmark: running list_io, split_slurp, while_io, each for at least 5 CPU seconds...
    list_io:  5 wallclock secs ( 3.89 usr +  1.34 sys =  5.24 CPU) @ 1721.98/s (n=9018)
split_slurp:  4 wallclock secs ( 3.02 usr +  2.21 sys =  5.23 CPU) @ 2725.38/s (n=14251)
   while_io:  7 wallclock secs ( 3.64 usr +  1.38 sys =  5.02 CPU) @ 1629.26/s (n=8174)
              Rate    while_io     list_io split_slurp
while_io    1629/s          --         -5%        -40%
list_io     1722/s          6%          --        -37%
split_slurp 2725/s         67%         58%          --

Don't be confused into thinking that local $/, reading the file in, and then splitting would improve performance.

In light of the evidence I think you will have to reconsider.

(As I will explain, whether you use scalar context has not much to do with performance.

Depends what you mean. Reading in smaller chunks at a time reduces memory overhead and can thus have a significant effect on run time.

On the other hand, keep in mind that split is not free: it has to walk through the whole string. As anyone with a C background would know, string operations hurt performance a lot, especially this kind of operation that goes through the string from head to toe.)

You would think so at first glance. As I said, the evidence contradicts you.

Perhaps it's due to perl being able to allocate one buffer, sysread the lot, and then walk the string. It may in fact be that this is more efficient than reading whatever the standard size buffer is for PerlIO, scanning it for newlines, then reading another buffer... (assuming, of course, that memory is available).
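To make that guess concrete, here is a minimal user-level sketch of "one buffer, read the lot, then walk the string". It only illustrates the idea and makes no claim about what perl does internally. The name slurp_then_split is made up for this example, and the temporary file is just a stand-in so the snippet runs on its own.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper (not from the thread): pull the whole file into
# one scalar with sysread, then split that scalar on newlines.
sub slurp_then_split {
    my ($file) = @_;
    open(my $fh, '<', $file) or die "Cannot open $file: $!";
    my $size = -s $fh;          # size of the file in bytes
    my $buf  = '';
    my $got  = 0;
    # sysread may return fewer bytes than asked for, so loop until done.
    while ($got < $size) {
        my $n = sysread($fh, $buf, $size - $got, $got);
        die "sysread on $file failed: $!" unless defined $n;
        last if $n == 0;        # end of file
        $got += $n;
    }
    close($fh);
    return [ split /\n/, $buf ];
}

# Stand-in input file so the example is self-contained.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp_fh} "alpha\nbeta\ngamma\n";
close($tmp_fh) or die "Cannot close $tmp_name: $!";

my $lines = slurp_then_split($tmp_name);
print scalar(@$lines), " lines\n";
```

Note that split /\n/ strips the newlines and drops trailing empty fields, so the result is not byte-for-byte what a list-context read of the same file would return.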

First layer, the physical reading layer, Perl would read in block by block, doesn't matter whether your code requires a scalar context or array context. This makes sense, it is optimized for Perl's own performance.

By "block by block" presumably you mean buffer by buffer.

Second layer, the layer between Perl and your program. Perl would present the data in the right context, as you required. This layer doesn't involve physical devices, and is much less related to performance than the first layer does.

You have the return type part of context mixed up with the actions that the context causes to happen. In list context the IO operator does something different to what it does in scalar context. In list context it causes the entire file to eventually be read into memory and sliced up into chunks as specified by $/.

In scalar context it reads enough to provide a single chunk at a time. If the chunks are small it may read more chunks into memory than one, but it doesn't necessarily load them all.
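A small sketch of that difference, using the same handle in both contexts. The in-memory filehandle is an assumption made only to keep the example self-contained; it needs Perl 5.8 or later, so on 5.6 you would open a real file instead.

```perl
use strict;
use warnings;

my $data = "one\ntwo\nthree\n";

# In-memory filehandle (Perl 5.8+), standing in for a real file.
open(my $fh, '<', \$data) or die "Cannot open in-memory handle: $!";

my $first = <$fh>;    # scalar context: returns just the next record
my @rest  = <$fh>;    # list context: returns every remaining record
close($fh);

print "first: $first";
print "remaining records: ", scalar(@rest), "\n";
```

The scalar read hands back "one\n" and leaves the handle positioned at the second record; the list read then consumes everything that is left, which here is two records.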

I will agree that I was surprised by these results myself. But you can't just say that something works the way it does, and that it should be faster, because you think so. Unless you have pored over the PerlIO code and benchmarked the issue in question rather exhaustively, you have no way to know how fast something is going to run in Perl.

--- demerphq
my friends call me, usually because I'm late....

In reply to Re: Re: opening a file in a subroutine by demerphq
in thread opening a file in a subroutine by abhishes
