smanicka has asked for the wisdom of the Perl Monks concerning the following question:

#!/usr/bin/perl
# copy a file
$in = "postcodes";
$out = "pc.bak";
open (IN,$in);
open (OUT,">$out");
print OUT $buffer while (read (IN,$buffer,65536));
What does the 65536 indicate here? I got this off some site and am trying to understand how it works, since I need to write a similar function to copy files in binary mode. Update: I actually want to make this code copy files in binary mode. Any ideas to help me on my mission?
#!usr/bin/perl
use strict;
use File::Find;
use File::Copy;

my @location=("C:\\Documents and Settings\\smanicka\\Desktop\\JAVA","C:\\Documents and Settings\\smanicka\\Desktop\\excel files-new");
my $new_location="C:\\moved_files";

foreach my $location(@location){
    find(\&force_move,$location);
}

sub force_move(){
    my $file=$_;
    print "\n $file";
    copy($file,$new_location) or warn "$!" ;
}

print "I am done!!!!";
sleep(2);
File::Copy says that the file will be opened in binary mode where applicable. The issue is that the font files I am trying to copy are often not copied properly, and the copied files are smaller than the originals.

Replies are listed 'Best First'.
Re: bin mode file copy
by Corion (Patriarch) on Mar 16, 2009 at 15:56 UTC
Re: bin mode file copy
by Bloodnok (Vicar) on Mar 16, 2009 at 16:09 UTC
    Both Corion & kennethk, quite correctly, refer you to read.

    However, IMO, what the aforementioned doc doesn't explicitly tell you is that (if LENGTH is specified) the call to read() should attempt to read no more than LENGTH bytes before returning, i.e. if the number of bytes successfully read from the file is less than LENGTH, the call is still deemed successful (it always returns the number of bytes actually read, which is stated in the doc).

    Update

    Wording modified to be (what I think is) in line with ikegami's observation.

    A user level that continues to overstate my experience :-))

      This isn't the first time someone suggested adding the word "max", but it would be wrong to do so.

      If someone were to complete "I attempted to read a max of 10,000 chars, but", I would expect to hear "I read 15,000." A failed attempt is a failure to match the "max" constraint (going over).

      It would probably be useful for the docs to specify that falling short of the goal isn't considered an error. And that read will actually wait for the desired number of chars to arrive (absent eof or error). But it's incorrect to use "max" as you've used it.
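      The behaviour discussed above can be checked with a small sketch. It assumes a 10-byte scratch file created with File::Temp (the file contents and variable names are made up for illustration) and asks read() for far more than the file holds. Falling short is not an error: read() returns the count it actually got, 0 at end of file, and undef only on a real error.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a 10-byte scratch file (hypothetical test data).
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
binmode($tmp_fh);
print {$tmp_fh} "0123456789";
close($tmp_fh) or die "close failed: $!";

open(my $in, '<', $tmp_name) or die "unable to open $tmp_name: $!";
binmode($in);

# Ask for far more than the file holds: read() returns what it got.
my $first = read($in, my $buffer, 65536);
die "read failed: $!" unless defined $first;
print "first read returned $first\n";     # prints 10

# At end of file read() returns 0, which is what ends a while loop.
my $second = read($in, $buffer, 65536);
die "read failed: $!" unless defined $second;
print "second read returned $second\n";   # prints 0

close($in);
```

      This is why the one-line `print OUT $buffer while (read (IN,$buffer,65536))` loop in the original question terminates: the final, shorter chunk is still returned as a true value, and the loop stops on the 0 that follows.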

Re: bin mode file copy
by ikegami (Patriarch) on Mar 16, 2009 at 20:40 UTC

    Not many people will notice the fact that you added an entirely new question to your post.

    I doubt it's a problem with File::Copy. Even as far back as Perl 5.6.0 / File::Copy 2.03 / 8 years ago, copy used binmode whenever its arguments are file names (as they are in the snippet you posted).

    Could you tell us which version of Perl and File::Copy you are using, and could you post the first 10 lines of the output of fc /b srcfile dstfile?

Re: bin mode file copy
by kennethk (Abbot) on Mar 16, 2009 at 15:58 UTC
    Per the documentation for read, the 65536 is the LENGTH argument:
    Attempts to read LENGTH characters of data into variable SCALAR from the specified FILEHANDLE.
Re: bin mode file copy
by locked_user sundialsvc4 (Abbot) on Mar 16, 2009 at 16:44 UTC
    Although 65536 is a number familiar to anyone in our binary-centered world, here it may well be an arbitrary choice. It still seems uncomfortably "C-like" to me. I'll bet this entire logic can be replaced by calls to existing, known-good CPAN code... therefore, do so. When in Perl, do as the monks do, etc...
      Don't even need CPAN. File::Copy is part of Perl.
Re: bin mode file copy
by Marshall (Canon) on Mar 16, 2009 at 19:25 UTC
    First, if you open a file, it will be opened in text mode by default. That means that certain bytes, such as \n (newline), get special treatment; on Windows, for instance, line endings are translated between CRLF and LF.

    If the file you are going to copy is just a bunch of bits, then you should do something like this so that no bytes get any special treatment:

    $out = "output_path";
    open (OUTBIN, '>',"$out") || die "unable to open $out";
    binmode(OUTBIN) || die "unable to set binmode $out";
    The above says that I'm opening the $out file and I'm just gonna send bits to it! This next part does the copy..
    open(INBIN, "<$x") || die "unable to open $x";
    binmode(INBIN) || die "unable to set binmode $x";
    while (read(INBIN, my $buff, 8 * 2**10)) {
        print OUTBIN $buff;
    }
    close(INBIN) || die "unable to close inbin";
    close(OUTBIN) || die "unable to close outbin";
    The above says: open filepath $x, then set it for binary read. Each read will be 8*2**10 or 8 * 1024= 8192 bytes. Perl will help out here as it keeps track of the number of bytes in $buff. If the $buff has less than what you expect for the maximum read, this is no problem!
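    For comparison, here is a minimal sketch of the same copy wrapped in a subroutine, assuming lexical filehandles and three-argument open. The name copy_binary is hypothetical, not part of any module, and the sketch also checks read() for undef so that a read error is not mistaken for end of file.

```perl
use strict;
use warnings;

# copy_binary is a hypothetical helper name, not part of any module.
sub copy_binary {
    my ($src, $dst) = @_;

    open(my $in,  '<', $src) or die "unable to open $src: $!";
    open(my $out, '>', $dst) or die "unable to open $dst: $!";
    binmode($in)  or die "unable to set binmode on $src: $!";
    binmode($out) or die "unable to set binmode on $dst: $!";

    my $total = 0;
    while (1) {
        # 8 * 2**10 = 8192 bytes requested per read, as in the post above.
        my $got = read($in, my $buffer, 8 * 2**10);
        die "read error on $src: $!" unless defined $got;
        last if $got == 0;    # end of file
        print {$out} $buffer or die "write error on $dst: $!";
        $total += $got;
    }

    close($in)  or die "unable to close $src: $!";
    close($out) or die "unable to close $dst: $!";
    return $total;            # number of bytes copied
}
```

    A call such as `copy_binary("postcodes", "pc.bak")` (the file names from the original question) would return the number of bytes copied, which can be compared against `-s "postcodes"` to confirm nothing was lost.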

    2**10=1024 is a "magic number" for the hardware.
    A typical Unix system will read from the hard drive in increments of 4x that or 4096 bytes. Here the buffer size is twice that or 8192 bytes. It is counter-intuitive, but increasing the buffer size can actually slow things down if you have a smart disk system.

    The important part is to set BINMODE. And for the read, you will have to specify a size that should be in increments of 1024 (a magic number).

      A typical Unix system will read from the hard drive in increments of 4x that or 4096 bytes. Here the buffer size is twice that or 8192 bytes. It is counter-intuitive, but increasing the buffer size can actually slow things down if you have a smart disk system.

      With respect to disk reads, it doesn't really matter what size you specify with Perl's read(). Perl uses its own internal buffer anyway, which is 4k (hardcoded in the Perl source, i.e. not configurable, except by recompiling perl). In case you don't believe me, do an strace on your sample code, and you'll see that the underlying system calls always return 4096, independently of whether you specify 1k, 2k, 8k, or whatever size with read().

      But you could use sysread(), which is actually implemented in terms of the read(2) system call, and thus does pass through the size you request. Whether the latter actually maps to disk block read requests is still another story, though... (depends on the OS).

      You might also want to read "4k read buffer is too small".
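      The sysread() route mentioned above could look roughly like the sketch below. The name copy_sysread and the default chunk size are assumptions made for illustration; the only built-ins used are open, sysread, syswrite and close. Because syswrite() may write fewer bytes than requested, the sketch loops until the whole chunk is out.

```perl
use strict;
use warnings;

# copy_sysread is a hypothetical helper name used only for this sketch.
sub copy_sysread {
    my ($src, $dst, $chunk) = @_;
    $chunk ||= 65536;    # request size per sysread(); an arbitrary choice

    # The :raw layer gives the same effect as calling binmode() afterwards.
    open(my $in,  '<:raw', $src) or die "unable to open $src: $!";
    open(my $out, '>:raw', $dst) or die "unable to open $dst: $!";

    while (1) {
        my $got = sysread($in, my $buffer, $chunk);
        die "sysread error on $src: $!" unless defined $got;
        last if $got == 0;    # end of file

        # syswrite() may write fewer bytes than asked, so loop until done.
        my $offset = 0;
        while ($offset < $got) {
            my $wrote = syswrite($out, $buffer, $got - $offset, $offset);
            die "syswrite error on $dst: $!" unless defined $wrote;
            $offset += $wrote;
        }
    }

    close($in)  or die "unable to close $src: $!";
    close($out) or die "unable to close $dst: $!";
    return 1;
}
```

      Whether this is actually faster than buffered read() depends on the OS and the disk, so it is worth measuring before switching. Mixing sysread() with buffered calls such as read() or print on the same handle is best avoided.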

Re: bin mode file copy
by Marshall (Canon) on Mar 19, 2009 at 00:58 UTC
    I think I see the problem now. The sub force_move() is only going to get file names, not full path names. I.e. $_ is just going to be a file name, not a full path name in force_move(). So the copy will fail without regard to binmode or not. As you descend through the directory structure, you will have to keep track of the current file path.

    Another point: a directory is a special type of binary file and cannot be copied with a simple copy.

    if (!(-d $filepath) && (-B $filepath)) {
        # true if not a directory and otherwise a binary file
    }
    Also in Windows, it is not necessary to use \\, use / for the separators in file path. Perl will convert this the right way for Windows and it is easier to read.

    So I would suggest getting this force_move thing working with full file paths: copy(full_src_path, full_dest_path).
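    One way to work with full paths is sketched below, under a few assumptions: copy_tree_flat is a hypothetical name, and the sketch uses the no_chdir option of File::Find together with $File::Find::name. By default find() changes into each directory before calling the callback, which is why $_ holds only a bare file name there; with no_chdir set, the callback sees the full path instead.

```perl
use strict;
use warnings;
use File::Find;
use File::Copy;
use File::Spec;

# copy_tree_flat is a hypothetical helper; it copies every plain file
# found under the source directories into one destination directory.
sub copy_tree_flat {
    my ($dest_dir, @src_dirs) = @_;
    my $copied = 0;

    find({
        no_chdir => 1,    # do not chdir, so paths stay valid as given
        wanted   => sub {
            my $src_path = $File::Find::name;    # full path of the entry
            return if -d $src_path;              # skip directories
            my (undef, undef, $name) = File::Spec->splitpath($src_path);
            my $dest_path = File::Spec->catfile($dest_dir, $name);
            if (copy($src_path, $dest_path)) {   # copy(source, destination)
                $copied++;
            } else {
                warn "could not copy $src_path: $!";
            }
        },
    }, @src_dirs);

    return $copied;    # number of files copied successfully
}
```

    Called as `copy_tree_flat("C:/moved_files", @location)`, this would mirror the original script. Note that, like the original, it flattens everything into one directory, so two files with the same name in different source directories would overwrite each other.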