nimdokk has asked for the wisdom of the Perl Monks concerning the following question:

My team handles automated file transfers within the company and to/from external companies. We have been asked to see how long it would take us to move a few hundred DVDs' worth of data from its current location on jukeboxes to a SAN. We are trying two different methods: one uses the third-party software we already have for automating data movement, and the other uses Perl. The data itself is kept in an array of about 1000 directories, each of which contains a variable number of files that need to be moved. When we do this, we need to make sure we maintain the directory layout.

I had done some tests a while back benchmarking how long File::Copy took to move large files compared with our third-party software. File::Copy normally came out on top, which is part of the reason we are looking at trying this with Perl.

My question is this: has anyone set up something like this before, and does anyone have thoughts on other ways to do it? What I have set up is one script that takes a parameter indicating which sub-directory structure needs to be copied. It then loops through the subdirectories, copying all the files sequentially to the new location and creating the new directories as it goes. I looked at File::Xcopy, but I'm not sure it would help us much, and since I've never used it before, I don't know whether its copies would be optimized the way they seem to be with File::Copy.
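
A minimal sketch of that approach, assuming hypothetical /jukebox and /san mount points (the real roots and parameter handling would differ):

    #!/usr/bin/perl
    use strict;
    use warnings;
    use File::Find;
    use File::Copy;
    use File::Path qw(mkpath);
    use File::Spec;

    # The subtree to migrate is passed as a parameter; the roots below are assumptions.
    my $subtree  = shift @ARGV or die "Usage: $0 <subdirectory>\n";
    my $src_root = "/jukebox/$subtree";
    my $dst_root = "/san/$subtree";

    find(sub {
        my $src = $File::Find::name;
        (my $rel = $src) =~ s{^\Q$src_root\E/?}{};      # path relative to the source root
        my $dst = File::Spec->catfile($dst_root, $rel);

        if (-d $src) {
            mkpath($dst) unless -d $dst;                # recreate the directory layout
        }
        else {
            copy($src, $dst) or warn "Failed to copy $src: $!\n";
        }
    }, $src_root);

Note that File::Copy's copy() does not preserve ownership, permissions, or timestamps, so a real migration script may need to restore those separately (e.g. with chmod and utime).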

Thoughts would be appreciated.

Thanks.

Replies are listed 'Best First'.
Re: Data Copying
by dragonchild (Archbishop) on Jan 18, 2005 at 14:59 UTC
    rsync? cp -r? your standard backup software? (Backing up can be used as a very efficient copying tool ...)

    Also, why does it matter how long computers take? This is the kind of job weekends were designed for ...

    Being right does not endow the right to be rude; politeness costs nothing.
    Being unknowing is not the same as being stupid.
    Expressing a contrary opinion, whether to the individual or the group, is more often a sign of deeper thought than of cantankerous belligerence.
    Do not mistake your goals as the only goals; your opinion as the only opinion; your confidence as correctness. Saying you know better is not the same as explaining you know better.

Re: Data Copying
by Anonymous Monk on Jan 18, 2005 at 15:39 UTC
    I wouldn't dream of rolling my own stuff in this scenario. And especially not something that's based on the broken and limited functionality of 'File::Copy' (which, among other things, doesn't respect permission bits, and only does file-to-file copies). There are a handful of tools that are excellent, and are optimized to do exactly this task: copying (large volumes of) data from one location to another.

    My preference would be to use rsync (using ssh for remote login if necessary). It does everything you list as requirements, and it's very efficient at resuming a sync that was aborted halfway (it won't transfer file blocks that are identical on both sides).
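
    A minimal sketch of driving rsync from Perl for a local jukebox-to-SAN copy, assuming hypothetical /jukebox and /san roots (a remote copy would use a user@host:path destination instead):

        #!/usr/bin/perl
        use strict;
        use warnings;

        # Assumed source and destination roots -- adjust to the real mount points.
        my $src_root = '/jukebox/archive';
        my $dst_root = '/san/archive';

        # -a preserves permissions, ownership, timestamps, and the directory layout;
        # the trailing slash on the source copies its contents into the destination.
        my @cmd = ('rsync', '-a', "$src_root/", "$dst_root/");
        system(@cmd) == 0
            or die 'rsync failed: exit status ' . ($? >> 8) . "\n";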

    Reuse code. It's good for you.

Re: Data Copying
by TedPride (Priest) on Jan 18, 2005 at 15:25 UTC
    Some more background information might be helpful. Since the main limiting factors are going to be the speed of your DVD drive and the speed of your hard drive, I doubt the interface between the two is going to matter a whole lot. I'd personally just use Perl for the overall program (looping through the DVDs and selecting the directory on each to copy) and call the system's copy command to do the actual recursive copying. Why reinvent the wheel?
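
    A rough sketch of that division of labour, assuming a hypothetical /jukebox mount for the source directories and /san for the destination:

        #!/usr/bin/perl
        use strict;
        use warnings;

        # Assumed list of top-level directories to migrate.
        my @dirs = glob('/jukebox/*');

        for my $dir (@dirs) {
            (my $name = $dir) =~ s{.*/}{};    # last path component
            # Let the system's cp do the recursive copy; -p preserves modes and timestamps.
            system('cp', '-rp', $dir, "/san/$name") == 0
                or warn "Copy of $dir failed: exit status " . ($? >> 8) . "\n";
        }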