Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi all, I'm using the following code to move all the files from dir1 to dir2...
perl -MFile::Find -MFile::Copy -we ' $dir1="/home/user/dir1"; $dir2="/home/user/dir2"; find(\&wanted, $dir1); sub wanted { File::Copy::move( "$File::Find::dir/$_" , $dir2 ); }'

... which works fine, but I wanted to optimize it by handling _as many_ files as possible in the 'wanted' function, instead of moving the files on a _one by one_ basis as the above code does ( as I understand it ). Something similar to this shell code, which moves the files all together in _one shot_ or at least in big chunks ...,

mv $( find . -type f ) /home/user/dir2/

Obviously the above code fails for file names with spaces in them ..., so I thought of using Perl, but I can't seem to find the right approach to pass in _all_ of the files. Below is just an illustration of my idea:

File::Copy::move( "$File::Find::dir/*" , $dir2 ),  # a * instead of current file $_

Is this possible ? Thanks.

Replies are listed 'Best First'.
Re: Optimizing files' moving with File::Find & File::Copy modules
by holli (Abbot) on Nov 17, 2009 at 11:37 UTC
    We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. (Donald Knuth)
    Though this statement is not applicable in all situations it certainly is in this one.

    Why do you think that "optimiz[ing] it by having _as many_ files as possible in the 'wanted' function" would affect the performance? The only thing here you could call (minimal) "overhead" is the repeated calls to "wanted", but that's just the price you pay for using File::Find.

    Why do you think that the OS would "move(...) the files all together in _one shot_ or at least in big chunks"? It moves them file by file.

    Update: If you perceive performance differences between your Perl version and a shell-based equivalent, that may be because File::Copy, whenever it cannot simply rename the file (for example when source and target are on different filesystems), moves a file by
    1. opening the target for writing
    2. reading from the source and writing to the target
    3. unlinking/deleting the source
    Depending on OS and filesystem this is less than optimal, but it works the same way on every OS.

    If that's no concern you could push the files found by "wanted()" into an array (or better yet use File::Find::Rule, which doesn't have this clunky callback interface) and feed that to the system's "mv", using proper quoting.
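The array-plus-"mv" idea above could be sketched roughly as follows. This is only an illustration: the sub names collect_files and move_in_chunks and the $chunk_size parameter are made up for this example, and it relies on the list form of system(), which bypasses the shell so that no quoting is needed at all.

```perl
use strict;
use warnings;
use File::Find;

# Collect the full path of every regular file below $src_dir.
sub collect_files {
    my ($src_dir) = @_;
    my @files;
    find(sub { push @files, $File::Find::name if -f $_ }, $src_dir);
    return @files;
}

# Move the files into $dst_dir using as few "mv" processes as possible.
# The list form of system() never invokes a shell, so names containing
# spaces or quotes need no escaping. $chunk_size is an arbitrary
# (hypothetical) cap that keeps each call below the OS limit on
# argument length.
sub move_in_chunks {
    my ($dst_dir, $chunk_size, @files) = @_;
    my $calls = 0;
    while (my @chunk = splice @files, 0, $chunk_size) {
        # "--" ends option parsing, so names starting with "-" are safe.
        system('mv', '--', @chunk, $dst_dir) == 0
            or die "mv failed with status $?";
        $calls++;
    }
    return $calls;
}
```

A call might look like move_in_chunks($dir2, 500, collect_files($dir1)). As with the original one-liner, files that share a basename in different subdirectories would overwrite each other in the target directory.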


    holli

    You can lead your users to water, but alas, you cannot drown them.
      My guess is that the OP wants to cut down on the number of processes - which many monks will tell you is "bad".

      Personally, if I were to move within the file system, I would call rename (and use an external find to find the files); if I were to move from one filesystem to the other, I would realize the entire process is likely to be disk I/O bound, so I wouldn't care about the process calls.
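The two cases described above (same filesystem versus different filesystems) could be made explicit with a small helper like the one below. It is a sketch only: move_one is a made-up name, and since the File::Copy documentation describes move() as renaming the file when possible, the explicit rename here mainly serves to show which path was taken.

```perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Copy ();

# Move $src into the directory $dst_dir. Returns 'rename' when the
# cheap rename path worked and 'copy' when the fallback was needed.
sub move_one {
    my ($src, $dst_dir) = @_;
    my $target = "$dst_dir/" . basename($src);

    # Same filesystem: rename only rewrites directory entries,
    # no file data is read or written.
    return 'rename' if rename $src, $target;

    # Otherwise (typically a different filesystem) let File::Copy
    # copy the data and remove the original.
    File::Copy::move($src, $target)
        or die "cannot move $src to $target: $!";
    return 'copy';
}
```

In the 'copy' case the run time is dominated by reading and writing the file contents, which is the disk I/O bound situation mentioned above.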

        What processes? Neither File::Find nor File::Copy spawns any external processes.


        holli

Re: Optimizing files' moving with File::Find & File::Copy modules
by JavaFan (Canon) on Nov 17, 2009 at 02:14 UTC
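    If $dir1 doesn't contain any directories, you could just do:
    system "mv $dir1/* $dir2";
    which does work even if there are filenames with spaces in them, because the shell expands the glob itself (the directory names in $dir1 and $dir2 must still be free of spaces and shell metacharacters).

    If $dir1 contains directories, you could do: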

    system "find $dir1 -type f -exec mv {} $dir2 ';'";
    Note that File::Copy::move doesn't allow moving more than one file at a time.
      Thanks for your suggestions.

      I could have used system "find ...", but I'd be facing the same issues ( special chars, whitespaces in filenames ), and also I'd like to avoid find ... -exec .. as this will process the files one by one. I need find, cause I need a recursive selective check to be done on all files.

      >> Note that File::Copy::move doesn't allow moving more than one file at a time.<<

      Hmmm ... that's bad news ... :(

      My whole point is optimization, so I wanted to have _all_ the files ( with potentially bad characters in their names ) available to the 'mv' command at once, but I guess that's not possible, and doing that with shell only is a real pain.

      Thank you anyway.
        system "find $dir1 -type f -exec mv -t $dir2 '{}' '+'";
        Spaces and special characters in the file names aren't a problem, as find doesn't invoke a shell to run mv. And by using a + instead of a ;, the -exec will behave like xargs, passing many file names to each mv call.

        You may need GNU implementations of find and mv; I do not know whether mv -t is a POSIX requirement, nor whether find -exec command {} + is. But I doubt it's much of a problem to get those GNU tools running on a platform that supports perl.
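One caveat with the string form used above: because the command contains quote characters, Perl hands it to a shell, so $dir1 and $dir2 themselves must not contain spaces or metacharacters. A possible way around that is the list form of system(), sketched below. The sub names find_mv_command and move_tree_files are made up for this example, and "mv -t" is assumed to be the GNU mv option.

```perl
use strict;
use warnings;

# Build the argument list for:
#   find SRC -type f -exec mv -t DST -- {} +
# "mv -t" is assumed to be GNU mv; "--" ends option parsing for mv.
sub find_mv_command {
    my ($src_dir, $dst_dir) = @_;
    return ('find', $src_dir, '-type', 'f',
            '-exec', 'mv', '-t', $dst_dir, '--', '{}', '+');
}

# Run the command without any shell: each list element reaches find
# as exactly one argument, whatever characters it contains.
sub move_tree_files {
    my ($src_dir, $dst_dir) = @_;
    system(find_mv_command($src_dir, $dst_dir)) == 0
        or die "find/mv failed with status $?";
    return 1;
}
```

With this form a call such as move_tree_files($dir1, $dir2) should keep working when either directory name contains spaces, since nothing is ever re-parsed by a shell.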