SadEmperor has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to finish a script for sharing huge files (maybe 3 GB) on a local network using a remote drive under MSWin32. I found some helpful code at File copy progress., and in that code the following statements puzzle me:
my $blksize = int ($filesize / 10);
my $num_read = sysread(SRC, $buffer, $blksize);
How should I decide the block size ($blksize) to get the best performance? Is it sensitive to the file system? If I use my script under Linux, should I change the value?
Thanks very much !

Replies are listed 'Best First'.
Re: How to decide the size of block in file transferring?
by GrandFather (Saint) on Nov 18, 2008 at 06:17 UTC

    Actually you would be better off using File::Copy. Figuring out an optimum block size is tricky: it depends on factors within the operating and file systems that you probably don't have easy access to, that are likely highly volatile, and that are certainly not portable.


    Perl reduces RSI - it saves typing

      There is simple logic for choosing the block size in File::Copy:
      Unless it is given as a parameter, the block size is chosen according to the size of the copied file, up to 2MB. If the file is shorter than 512 bytes, the size is set to 1024.

      So if the OP uses File::Copy, he will never know whether the optimal size is more than 2MB. :)
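
The rule described above can be sketched in a few lines. This is only an illustration of the described behaviour, not File::Copy's actual source; the helper name default_buffer_size and the variable $TOO_BIG are made up for the example.

```perl
use strict;
use warnings;

# Assumed cap, taken from the "up to 2MB" description above.
my $TOO_BIG = 2 * 1024 * 1024;

# Hypothetical helper: pick a buffer size from the file size when the
# caller did not supply one.
sub default_buffer_size {
    my ($file_size) = @_;
    my $size = $file_size || 0;
    $size = 1024     if $size < 512;        # tiny or empty file
    $size = $TOO_BIG if $size > $TOO_BIG;   # never exceed the cap
    return $size;
}

printf "%d\n", default_buffer_size(3 * 1024**3);   # a 3 GB file hits the cap
```

To try sizes other than the default, File::Copy's copy accepts an optional third argument giving the buffer size, e.g. copy($src, $dst, 64 * 1024).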

        Optimal size is likely to be an order of magnitude smaller than 2 MB. Beyond a few tens of KB any gains are likely to be very small. At some point swapping will become an issue with a dramatic decrease in throughput. Allowing a 2 MB buffer seems fairly generous given the current state of technology and even just a few years ago would have been ridiculously large.



        There is no progress callback, but File::Copy does provide syscopy, which uses an OS-provided file copy on Win32, VMS and OS/2 and falls back to the normal copy on *nix.
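
A minimal sketch of calling syscopy follows. The wrapper name os_copy is hypothetical; syscopy is not exported by default, so it is called by its full name here.

```perl
use strict;
use warnings;
use File::Copy ();

# Hypothetical wrapper: copy $src to $dst with the OS-level copy where
# File::Copy provides one, and return the size of the result.
sub os_copy {
    my ($src, $dst) = @_;
    File::Copy::syscopy($src, $dst)
        or die "syscopy $src -> $dst failed: $!";
    return -s $dst;
}

# Usage (only runs when two file names are given on the command line).
if (@ARGV == 2) {
    my $bytes = os_copy(@ARGV);
    print "copied $bytes bytes\n";
}
```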


Re: How to decide the size of block in file transferring?
by BrowserUk (Patriarch) on Nov 18, 2008 at 06:31 UTC

    On Windows (and probably all filesystems), I'd strongly suggest that you avoid such arbitrary math, which will result in weird block sizes, and instead opt for some multiple of the filesystem's inherent read size, which is generally 4096 bytes (under NTFS). I've also found that throughput gains tend to tail off rapidly as the block size grows, with 64KB usually seeming to give the best read/write performance.

    You should also ensure that you read and write the files in binary mode (':raw'), even if they are text files, as there is a substantial overhead in CRLF translation, which is redundant for disk-to-disk transfers.

    The only merit I see in the /10 strategy is that it makes the progress calculations simple. And that seems no good reason at all to suffer the cost of partial block reads and writes.
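
A rough sketch of that advice: a fixed 64KB block (a multiple of 4096), ':raw' handles, and progress computed from bytes copied rather than from the block count. The name copy_with_progress and its callback signature are invented for the example; error handling is kept minimal.

```perl
use strict;
use warnings;

# Hypothetical helper: copy $src to $dst in fixed-size blocks and call
# $progress->($bytes_done, $total) after each block.
sub copy_with_progress {
    my ($src, $dst, $progress, $blksize) = @_;
    $blksize ||= 64 * 1024;    # 64KB default, a multiple of 4096

    open my $in,  '<:raw', $src or die "Cannot read $src: $!";
    open my $out, '>:raw', $dst or die "Cannot write $dst: $!";

    my $total = -s $in;
    my $done  = 0;
    while (1) {
        my $read = sysread($in, my $buffer, $blksize);
        die "Read error on $src: $!" unless defined $read;
        last if $read == 0;    # end of file

        # syswrite may write less than asked, so loop until the block is out.
        my $offset = 0;
        while ($offset < $read) {
            my $written = syswrite($out, $buffer, $read - $offset, $offset);
            die "Write error on $dst: $!" unless defined $written;
            $offset += $written;
        }

        $done += $read;
        $progress->($done, $total) if $progress;
    }
    close $in;
    close $out or die "Cannot close $dst: $!";
    return $done;
}
```

A caller could pass sub { printf "\r%3d%%", 100 * $_[0] / ($_[1] || 1) } as the callback to get a percentage display without tying the block size to the file size.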

    Also, if this were purely for Windows, then I'd use CopyFileEx, which provides a callback for progress monitoring and display, and is likely far more efficient than anything you could write at the Perl level. But you seem to be looking for portability?


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Yes, the script is running under Windows now, but it may be used under Linux in the future.
      I'll try CopyFileEx, Thanks a lot :)
Re: How to decide the size of block in file transferring?
by ccn (Vicar) on Nov 18, 2008 at 06:16 UTC
    It is easy to find out. Just run an experiment!
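
One way such an experiment might look, as a sketch only: read the same file once per candidate block size and time each pass. The helper name time_block_sizes is hypothetical, and the candidate sizes in the usage part are arbitrary.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical helper: read $path once per block size and return a hash
# of (block size => seconds elapsed).
sub time_block_sizes {
    my ($path, @sizes) = @_;
    my %elapsed;
    for my $blksize (@sizes) {
        open my $in, '<:raw', $path or die "Cannot read $path: $!";
        my $start = time();
        while (1) {
            my $read = sysread($in, my $buffer, $blksize);
            die "Read error on $path: $!" unless defined $read;
            last if $read == 0;
        }
        $elapsed{$blksize} = time() - $start;
        close $in;
    }
    return %elapsed;
}

# Usage (only runs when a file name is given on the command line).
if (@ARGV) {
    my %t = time_block_sizes($ARGV[0], map { $_ * 1024 } 4, 16, 64, 256, 1024);
    printf "%8d bytes: %.3f s\n", $_, $t{$_} for sort { $a <=> $b } keys %t;
}
```

Bear in mind that the first pass warms the operating system's file cache, so later passes may look faster than they really are; testing with a file larger than RAM, or on the actual remote drive, gives more honest numbers.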