
Re: Most efficient way to load file contents into scalar?

by betterworld (Curate)
on Apr 24, 2009 at 10:13 UTC (#759777)

in reply to Most efficient way to load file contents into scalar?

You could try Sys::Mmap::Simple. It doesn't really load the file, but for some applications, the end result is the same, and it should be much faster for big files.

Replies are listed 'Best First'.
Re^2: Most efficient way to load file contents into scalar?
by spx2 (Deacon) on Apr 24, 2009 at 15:00 UTC
    On Linux you have mmap, a system call built for just that purpose. Of course, there are Perl wrappers that do that for you, like the one mentioned above.
    MMAP(2)                Linux Programmer's Manual                MMAP(2)

    NAME
        mmap, munmap - map or unmap files or devices into memory

    DESCRIPTION
        mmap() creates a new mapping in the virtual address space of
        the calling process. The starting address for the new mapping
        is specified in addr. The length argument specifies the length
        of the mapping.
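To make the man page excerpt concrete, here is a minimal sketch of reading a file through a memory mapping. It is written in Python rather than Perl only because Python's standard `mmap` module wraps the same system call; it is not the Sys::Mmap::Simple API. The helper names `count_lines_mmap` and `make_sample_file` are made up for illustration.

```python
import mmap
import os
import tempfile


def count_lines_mmap(path):
    """Count newline bytes in a file through a read-only memory mapping."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file. Mapping an empty file raises ValueError.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            count = 0
            pos = m.find(b"\n")
            while pos != -1:
                count += 1
                pos = m.find(b"\n", pos + 1)
            return count


def make_sample_file(lines=1000):
    """Write a small throwaway file (hypothetical test data) and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        for i in range(lines):
            f.write(b"line %d\n" % i)
    return path


if __name__ == "__main__":
    sample = make_sample_file()
    try:
        print(count_lines_mmap(sample))
    finally:
        os.remove(sample)
```

Nothing is copied into a program-level buffer up front here: the kernel pages the file in as the search touches it, which is why betterworld says it "doesn't really load the file".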
      What I was planning to do is to load a particular file into memory, and then open that scalar as a file handle so as to do repeated file operations on it (ideally for performance improvements).
      But the way it seems to be working out, it doesn't seem to work... is there a better way?
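The plan described above (slurp the file once, then treat the in-memory copy as a file handle) can be sketched as follows. In Perl the usual way to do this is to open a handle on a scalar reference, `open my $fh, '<', \$contents`; the sketch below shows the same idea with Python's `io.BytesIO`. The function names `load_as_memory_file` and `grep` are hypothetical and exist only for this illustration.

```python
import io
import os
import tempfile


def load_as_memory_file(path):
    """Read the whole file once and wrap the bytes in a file-like object."""
    with open(path, "rb") as f:
        data = f.read()
    return io.BytesIO(data)


def grep(mem_file, needle):
    """Rewind the in-memory file and return every line containing needle."""
    mem_file.seek(0)
    return [line for line in mem_file if needle in line]


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"alpha\nbeta\ngamma\n")
    try:
        mem = load_as_memory_file(path)
        # Repeated passes reuse the same buffer; the disk is read only once.
        print(grep(mem, b"et"))
        print(grep(mem, b"mm"))
    finally:
        os.remove(path)
```

Whether this beats simply re-reading the file depends on how much of it the OS already has cached, so it is worth measuring before committing to it.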
        You haven't told us what this file is and what you are doing with it. I'm guessing that you are doing repeated searches through this huge thing. It sounds like you don't have enough RAM on your system. It could be that the easiest answer for you is to spend $50 on more RAM!

        For example, I have an app that searches a data set of 700 files. The first exhaustive search thru all 700 files takes about 7-8 seconds. The second time I search thru those same 700 files, it takes way less than 1 second! Wow! My system has plenty enough ram for the data in all those files and my OS can use a whole bunch of the RAM for its own use. What happens is that on the first search, the OS winds up caching all of the files in memory. The second search is 10x+ faster because the OS knows that those files haven't changed and it has a cached copy of them.

        In my app the searches are driven by interactive user input, so I can't bunch them all up and do them at the same time. Of course there are all sorts of fancy database things that I could do. But there is no need! A typical user works with this program for a couple of hours at a time. Actually nobody has even noticed that the first search is slow! They just remember how well it's been working for the last 30 minutes! Anyway, I can deliver sub-second response time to a user request, and that is way fast enough! It takes the user way longer than that to figure out what the next question is gonna be! No need to optimize further! I would add that this app gives a "progress report" as it searches and delivers partial results as they become available, and the user can abort the operation if needed. Even if search n+1 took a few seconds, this is OK, as the user is starting to think about the results that are flowing in.

        Anyway I have very simple code that just gets faster the more times it is used. Perfect. Don't do something yourself that the OS can also do very, very well and with no coding effort on your part!
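A rough way to see the caching effect described above on your own machine is to time two consecutive full reads of the same file. This is only a sketch: `timed_read` is a made-up helper, and the numbers depend entirely on the OS, free RAM, and whether the file was already cached. A file that was just written is normally cached already, so both passes will look fast; point it at a large file you have not touched recently to see a difference.

```python
import os
import sys
import tempfile
import time


def timed_read(path, chunk_size=1 << 20):
    """Read the file front to back; return (bytes_read, elapsed_seconds)."""
    start = time.perf_counter()
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.perf_counter() - start


if __name__ == "__main__":
    if len(sys.argv) > 1:
        target, cleanup = sys.argv[1], False
    else:
        # No path given: fall back to a small throwaway file (already cached).
        fd, target = tempfile.mkstemp()
        with os.fdopen(fd, "wb") as f:
            f.write(b"x" * (4 << 20))
        cleanup = True
    try:
        for label in ("first pass", "second pass"):
            size, seconds = timed_read(target)
            print("%s: %d bytes in %.4f s" % (label, size, seconds))
    finally:
        if cleanup:
            os.remove(target)
```

Note that the program does nothing to get the speedup; the second pass is faster only because the OS kept the pages around.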

        Perl alone won't help you understand how this works. In particular, the Mmap module that betterworld suggested is useful, but not without knowing how it works. If you're serious about this, I suggest you take the APUE book, flip to page 487, and read up on how mmap actually works (pages 487 and 488). Perl is not always the best context for learning some ideas; in this case it is just a wrapper around a concept which does not belong to it.
