PerlMonks  

Most efficient way to load file contents into scalar?

by smallwalrus (Initiate)
on Apr 24, 2009 at 09:28 UTC ( [id://759763]=perlquestion )

smallwalrus has asked for the wisdom of the Perl Monks concerning the following question:

Hi fellow monks, I've been experimenting with a piece of code that slurps the contents of a file into a scalar, and found that it runs really slowly. To illustrate: copying the file takes approximately 20 seconds, whereas loading the entire file into a scalar takes 10 minutes. The procedure I used was fairly basic, though:

    do { local $/; <$fh>; }

Is there a faster way to load a file's contents into a scalar?

Replies are listed 'Best First'.
Re: Most efficient way to load file contents into scalar?
by moritz (Cardinal) on Apr 24, 2009 at 09:31 UTC
    File::Slurp uses sysread, thus bypassing any IO layers, which can be a performance benefit.

      A direct sysread call seems to be >2x faster:

      $ ls -sh bigfile.txt
      2.8G bigfile.txt

      $ cat direct_sysread.pl
      use Fcntl qw(:DEFAULT);
      use Symbol;
      my $file = shift @ARGV;
      my $fh = gensym;
      sysopen $fh, $file, O_RDONLY;
      my $content;
      sysread $fh, $content, -s $file;

      $ cat file_slurp.pl
      use File::Slurp;
      my $file = shift @ARGV;
      open my $fh, "<", $file or die $!;
      my $content = read_file ($file);

      $ time direct_sysread.pl bigfile.txt
      real 0m1.785s
      user 0m0.008s
      sys 0m1.776s

      $ time file_slurp.pl bigfile.txt
      real 0m4.994s
      user 0m0.924s
      sys 0m4.052s

      citromatik
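
One caveat with the single-call approach above: sysread is allowed to return fewer bytes than requested, so a loop that keeps reading until end of file is the safer pattern. A minimal sketch, assuming a hypothetical helper named slurp_sysread and an arbitrary 1 MiB chunk size (neither comes from the thread):

```perl
use strict;
use warnings;
use Fcntl qw(O_RDONLY);

# Hypothetical helper (not from the thread): read a whole file with sysread,
# looping because one sysread call may return fewer bytes than requested.
sub slurp_sysread {
    my ($path, $chunk) = @_;
    $chunk ||= 1 << 20;    # 1 MiB per call; an arbitrary choice

    sysopen(my $fh, $path, O_RDONLY) or die "Cannot open $path: $!";

    my $content = '';
    while (1) {
        # OFFSET = current length, so each read appends to $content.
        my $n = sysread($fh, $content, $chunk, length($content));
        die "Read error on $path: $!" unless defined $n;
        last if $n == 0;    # 0 bytes read means end of file
    }
    close $fh;
    return $content;
}
```

It would be called as my $content = slurp_sysread($file); whether it matches the timings quoted above on a multi-gigabyte file has not been measured here.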

Re: Most efficient way to load file contents into scalar?
by betterworld (Curate) on Apr 24, 2009 at 10:13 UTC

    You could try Sys::Mmap::Simple. It doesn't really load the file, but for some applications, the end result is the same, and it should be much faster for big files.

      On Linux you have mmap, which is a system function built just for that purpose. Of course, there are wrappers in Perl that do that for you, like the one mentioned above.
      MMAP(2)          Linux Programmer's Manual          MMAP(2)

      NAME
          mmap, munmap - map or unmap files or devices into memory

      DESCRIPTION
          mmap() creates a new mapping in the virtual address space of the calling process. The starting address for the new mapping is specified in addr. The length argument specifies the length of the mapping.
        What I was planning to do is to load a particular file into memory, and then open that scalar as a file handle so as to do repeated file operations on it (ideally for performance improvements).
        But as it is turning out, that doesn't seem to work... is there a better way?
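
For the "open that scalar as a file handle" part of the plan, Perl can open a reference to a scalar as an in-memory filehandle. A minimal sketch, in which $content is a small stand-in string rather than data actually slurped from the poster's file:

```perl
use strict;
use warnings;

# Stand-in for data that was already slurped from a file.
my $content = "alpha\nbeta\ngamma\n";

# Opening a reference to a scalar yields an in-memory filehandle.
open(my $mem, '<', \$content) or die "Cannot open in-memory handle: $!";

# First pass: read every line.
my @first_pass = <$mem>;

# Rewind and read again; no disk access is involved this time.
seek($mem, 0, 0) or die "seek failed: $!";
my $first_line = <$mem>;
chomp $first_line;

close $mem;
print scalar(@first_pass), " lines; first line is '$first_line'\n";
```

This only pays off if the file fits comfortably in memory. For a file of several gigabytes, the slurp itself remains the expensive step.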
Re: Most efficient way to load file contents into scalar?
by AnomalousMonk (Archbishop) on Apr 24, 2009 at 10:00 UTC
    If copying the file (Copying how? With the OS? With a Perl read/print while-loop?) takes 20 seconds, this implies the file is lotsa gigabytes in size.

    If this is the case, you're lucky it only takes 10 minutes to read it into a scalar! Symptom to look for: merciless disk thrashing while the file is read to the scalar.

    Solution: don't do that! Process the file in pieces.
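
"Process the file in pieces" could look something like the following. This is only a sketch: the helper name process_in_chunks, the callback interface, and the 8 MiB default are assumptions made for illustration, not something proposed in the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: read a file in fixed-size chunks and pass each chunk
# to a callback, so only one chunk is held in memory at a time.
sub process_in_chunks {
    my ($path, $callback, $chunk_size) = @_;
    $chunk_size ||= 8 * 1024 * 1024;    # 8 MiB; an arbitrary default

    open(my $fh, '<:raw', $path) or die "Cannot open $path: $!";

    my $total = 0;
    while (1) {
        my $n = read($fh, my $buf, $chunk_size);
        die "Read error on $path: $!" unless defined $n;
        last if $n == 0;                # end of file
        $total += $n;
        $callback->($buf);
    }
    close $fh;
    return $total;                      # number of bytes processed
}
```

A caller might count newlines with process_in_chunks($file, sub { $lines += ($_[0] =~ tr/\n//) }). Note that chunk boundaries fall in the middle of lines, so record-oriented processing has to carry the partial line over to the next chunk, or simply read line by line with while (<$fh>).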

Re: Most efficient way to load file contents into scalar?
by Anonymous Monk on Apr 24, 2009 at 09:56 UTC
    It is probably too big. You could try pre-sizing the scalar:

    my $size = -s FILE;
    my $giant = ' ' x $size;
    print "Read ", read(FILE, $giant, $size, 0), " of $size bytes";

    but if it's really big, it won't make any difference.
Re: Most efficient way to load file contents into scalar?
by targetsmart (Curate) on Apr 24, 2009 at 12:39 UTC
    The node title could be 'other/alternate ways of loading a file into a scalar',
    because loading a file into a scalar is fine when the file is small. If the file size is in gigabytes, processing the entire file as one slurp into a scalar is meaningless and not an efficient technique.

    Vivek
    -- In accordance with the prarabdha of each, the One whose function it is to ordain makes each to act. What will not happen will never happen, whatever effort one may put forth. And what will happen will not fail to happen, however much one may seek to prevent it. This is certain. The part of wisdom therefore is to stay quiet.
