in reply to Perl always reads in 4K chunks and writes in 1K chunks... Loads of IO!

You can slurp the file in one read and split it yourself:

#! perl -slw
use strict;

my $file = 'test.txt';

open DF, '<:raw', $file or die "$file : $!";

my @test = split "\n", do{ local $/ = \ -s( $file ); <DF> };

close DF;

However, if you are reading this file frequently (like every time a web page is hit, as your example suggests), then you are probably worrying about the wrong thing. After the first read, the file will be held in the file system cache, so on the second and subsequent reads the 4K chunks will be coming from cache rather than disk. You can demonstrate this to yourself if you have a disk activity LED on your machine: run the above script and you should see the disk hit for a sustained period the first time; on subsequent runs you may see a brief access but no sustained hit.

Equally, whilst you may see many 1K calls to the system write API, these writes will frequently be cached in RAM and flushed to disk asynchronously as the demands on the cache dictate. For example, the system may decide to write chunks out when the disk head is already in approximately the right position following disk activity by other processes. If you attempt to optimise the writing done by your process, you can interfere with the dynamics of the overall system, which may actually result in slower throughput. The best way to ensure optimal IO for your process, and throughput for the system as a whole, is to increase the proportion of your RAM devoted to the system cache.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Re^2: Perl always reads in 4K chunks and writes in 1K chunks... Loads of IO!
by NeilF (Sexton) on Jan 01, 2006 at 19:55 UTC
    BrowserUk, thanks... Two questions/comments regarding your post.

    Wouldn't your code mean the lines are stripped of the line feeds they originally had? i.e. when you came to write the array out it would no longer have the line feeds, and you'd have to add them back to every line?

    The area I'm looking at is where I'm posting a new message in a forum: the script reads the forum file in, manipulates the lines and then writes it back out. So this code is not used for general browsing, just when updating.


    I'll have a play with your example and see what the outcome is... You reckon it reads it in one(ish) hit and not in horrible 4K blocks?

      The problem is, it is quite likely that your ISP is measuring your IO in terms of bytes read and written rather than the number of reads and writes, so reducing the latter is unlikely to satisfy them.

      Also, once you have read the entire file, there is no need to re-write the whole thing just to add a new line. If you open the file for reading and writing, then after you have read it the file pointer will be perfectly placed to append any new line to the end. That reduces your writes to 1 per new addition. If there is no new addition (the user is just refreshing), then you'll have no writes at all.
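      A minimal, untested sketch of that read-then-append idea (the filename and the new message are just placeholders):

      #! perl -slw
      use strict;

      my $file = 'forum.txt';             ## hypothetical filename
      my $newpost = 'the new message';    ## hypothetical new line

      open FH, '+<', $file or die "$file : $!";   ## read/write, no truncation
      my @lines = <FH>;                   ## slurp the lines; pointer is now at eof
      ## ...inspect or manipulate @lines here...
      print FH $newpost;                  ## one write; -l supplies the newline
      close FH;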

      Also, you presumably do not redisplay the entire forum each time, but rather only the last 20 or so lines?

      If this is so, then you should not bother to re-read the entire file each time, but rather use File::ReadBackwards to fetch just those lines you intend to display. Having done that, you can use seek FH, 0, 2 to reposition the pointer to the eof and then append new lines without having to re-write the entire file each time.

      Using this method, you can fix the total overhead per invocation to (say) 20 reads and 0 or 1 writes. You'll need to deploy locking, but from your code above you seem to be already familiar with that.
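
      Something along these lines (untested; the filename, line count and message are placeholders, and the locking is the usual flock idiom):

      #! perl -slw
      use strict;
      use Fcntl ':flock';
      use File::ReadBackwards;

      my $file = 'forum.txt';    ## hypothetical filename

      ## fetch just the last 20 lines for display
      my $bw = File::ReadBackwards->new( $file ) or die "$file : $!";
      my @last20;
      while( defined( my $line = $bw->readline ) ) {
          unshift @last20, $line;
          last if @last20 == 20;
      }
      $bw->close;

      ## append the new post without re-writing the file
      open FH, '>>', $file or die "$file : $!";
      flock FH, LOCK_EX or die "flock : $!";
      seek FH, 0, 2;                        ## 2 == SEEK_END; belt and braces in append mode
      print FH 'the new message';           ## -l supplies the newline
      close FH;                             ## releases the lock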


        Thanks... Interesting stuff! They do seem to talk specifically about IO processes! Is File::ReadBackwards a standard library/module? i.e. will it exist on sites?

      I just realised I completely ignored one of your questions.

      Wouldn't your code mean the lines are stripped of the line feeds they originally had?

      Yes, as I coded it the newlines would be removed. This would effectively do a free chomp @test;. I don't see this as a problem as it would cost very little to replace them when writing the lines out again.

      However, if you want them left in place, then you could use the following split instead.

      #! perl -slw
      use strict;

      my $file = 'test.txt';

      open DF, '<:raw', $file or die "$file : $!";

      my @test = split /(?<=\n)/, do{ local $/ = \ -s( $file ); <DF> };

      close DF;

      All that said, if you are only appending to the end of the file, why read the file at all? Have you heard of opening a file for append?
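
      For example, a minimal sketch of append mode (same placeholder filename as above):

      #! perl -slw
      use strict;

      my $file = 'test.txt';
      ## '>>' opens for append; every write goes to the current end of file
      open DF, '>>', $file or die "$file : $!";
      print DF 'a new line for the forum';    ## -l supplies the newline
      close DF;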

