Re: Copying a large file (6Gigs) across the network and deleting it at source location
by Abigail-II (Bishop) on Feb 04, 2004 at 23:18 UTC
Copying a file of 6Gb means you have to write 6Gb of data. That's going to take a long time. Instead of copying and removing, people tend to 'move' a file instead. That's fast when it's on the same filesystem, and on most modern OSes, it falls back to copy and delete if the data has to be moved to a different filesystem.
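If the job does have to stay inside a Perl script, File::Copy's move() gives you the same behaviour - a cheap rename on the same filesystem, copy-then-delete otherwise. A minimal sketch, with hypothetical paths:

use strict;
use warnings;
use File::Copy qw(move);

# Hypothetical paths; move() renames when source and destination are on the
# same filesystem, and falls back to copy-then-delete when they are not.
my $src  = 'M:/Directory/backup.BAK';
my $dest = 'I:/Directory0/Directory1/Directory2/backup.BAK';

move( $src, $dest ) or die "Can't move $src to $dest: $!";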
But I don't really understand your question. You can't really speed up the process - at least not by using different statements in your program (you might be able to tune your OS so that copying huge files goes faster). I don't know why you are considering a timer, and I've no idea what you mean by "copying until EOF to delete the file once it finished copying".
I would do the thing you're doing from the command line,
and skip the Perl part:
find M:/Directory -name '*.BAK' \
-exec mv {} 'I:/(Directory0)/(Directory1)/(Directory2)/{}' \;
Abigail
Re: Copying a large file (6Gigs) across the network and deleting it at source location
by allolex (Curate) on Feb 04, 2004 at 23:15 UTC
Sorry for stating this so flatly, but you should really do a checksum on the original and compare it to your copy before unlink()ing the original. Have a look at Digest::MD5, which seems to be very popular. You could also call the *NIX command 'cksum' (which probably has been ported to Windows, or at least has an equivalent) and get very similar results.
Also, when handling errors, try 'die' instead of 'print', so Perl will return the right error level: unlink( "$_" ) or die "Couldn't delete file: $_\n"; If you add an 'or die' to the copy operation, it will only attempt to delete the original if the copy is successful.
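A minimal sketch of that copy-verify-delete sequence (the file names are hypothetical), using File::Copy together with Digest::MD5:

use strict;
use warnings;
use File::Copy qw(copy);
use Digest::MD5;

# Hypothetical source and destination paths.
my $src  = 'M:/Directory/backup.BAK';
my $dest = 'I:/Directory0/Directory1/Directory2/backup.BAK';

sub md5_of {
    my ($file) = @_;
    open my $fh, '<:raw', $file or die "Can't read $file: $!";
    return Digest::MD5->new->addfile($fh)->hexdigest;
}

copy( $src, $dest ) or die "Couldn't copy $src: $!";

# Only remove the original once source and copy checksum identically.
md5_of($src) eq md5_of($dest)
    or die "Checksum mismatch - leaving $src in place\n";
unlink $src or die "Couldn't delete file: $src: $!";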
Doing a checksum will effectively double the transfer time, because the files have to be read back from the remote location - and a network filesystem copy is fairly reliable anyway, since the OS already does some error recovery.
Re: Copying a large file (6Gigs) across the network and deleting it at source location
by Roger (Parson) on Feb 05, 2004 at 00:40 UTC
Just an idea - how about using secure copy (scp) to copy the files across the network recursively instead?
scp -CBvrp srcpath user@host:destpath
     |||||
     ||||+-- Preserves time stamps
     |||+--- Recursively copies entire directories
     ||+---- Verbose mode, useful for logging
     |+----- Batch mode
     +------ Enables compression across the network
and then...
del /S *.BAK # recursively delete BAK files
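If you want the delete step to depend on the copy actually succeeding, a minimal Perl sketch along those lines (paths, user, and host are hypothetical, and passwordless ssh keys are assumed):

use strict;
use warnings;
use File::Find;

# Hypothetical paths and host; dies if scp reports a failure.
system( 'scp', '-CBvrp', 'M:/Directory', 'user@host:/backup/' ) == 0
    or die "scp failed: $?";

# Remove the .BAK files only after scp has reported success.
find(
    sub {
        return unless /\.BAK$/i;
        unlink $_ or warn "Couldn't delete $File::Find::name: $!\n";
    },
    'M:/Directory'
);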
This assumes that the extra CPU time needed to encrypt/decrypt and compress/decompress 6Gb will still let the transfer happen at the same rate or faster. It may or may not, but if his network can't transfer a raw 6Gb any faster than he has stated, I'd assume the CPUs (and memory) plugged into such a network may be an even tighter bottleneck.
Re: Copying a large file (6Gigs) across the network and deleting it at source location
by dws (Chancellor) on Feb 05, 2004 at 05:10 UTC
"It takes about one hour and half to copy across the network."
You don't mention how far apart the source and destination are, or whether they're attended by people (as opposed to running in a dark colo somewhere). If the servers are close, there's an option we often forget: use removable drives. It takes considerably less than an hour and a half to copy 6Gb of data at IDE (or SCSI) speeds, remove the drive, and walk it across the room to the backup box.
Or, if the source and destination are far apart and "latency" isn't critical, shipping a removable drive via FedEx can still yield reasonable bandwidth.
It might be an option to consider.
Re: Copying a large file (6Gigs) across the network and deleting it at source location
by ctilmes (Vicar) on Feb 05, 2004 at 11:59 UTC
rsync has a "--remove-source-files" option to delete the source files once they have been copied (note that "--delete-after" does something different: it removes extraneous files on the receiving side after the transfer).
Depending on the nature of your file (does every single byte change every time you need to copy it?), rsync can also improve efficiency by only transferring the parts that changed. It can also compress the data over ssh, similar to the scp approach already mentioned, if that helps your transfer.
(Or it could be much less efficient...in which case don't use it.)
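A minimal sketch of driving that from Perl (paths and host are hypothetical, and --remove-source-files needs a reasonably recent rsync):

use strict;
use warnings;

# Hypothetical source and destination; -z compresses, -e ssh tunnels the
# transfer, and --remove-source-files deletes each source file once it has
# been transferred successfully.
my @cmd = ( 'rsync', '-avz', '-e', 'ssh', '--remove-source-files',
            'M:/Directory/', 'user@host:/backup/' );
system(@cmd) == 0 or die "rsync failed: $?";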
Re: Copying a large file (6Gigs) across the network and deleting it at source location
by zentara (Cardinal) on Feb 05, 2004 at 15:32 UTC
On a file that big, I would be tempted to split the file on the remote machine into, say, 60 pieces of 100 meg each (or even 600 10-meg files). Take md5sums of all the pieces, and send the list to your local machine. Then download them one at a time (or even a few in parallel if your bandwidth permits), and as each one arrives, if its md5sum matches, delete that piece from the remote machine. After all the files have arrived and are verified, cat them back together. Do a lot of testing of this method first. :-) But it would give you some protection against the network connection hanging and leaving you with a partial file. It may also speed up your transfer, with parallel file transfers.
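A minimal sketch of the splitting-and-checksumming half of that idea (file name and chunk size are hypothetical); the pieces and the MD5 list would then be transferred, verified with Digest::MD5 on the local side, and reassembled:

use strict;
use warnings;
use Digest::MD5;

my $src   = 'huge.BAK';          # hypothetical 6Gb source file
my $chunk = 100 * 1024 * 1024;   # 100 meg pieces

open my $in,   '<:raw', $src       or die "Can't read $src: $!";
open my $sums, '>',     "$src.md5" or die "Can't write MD5 list: $!";

my $n = 0;
while ( read( $in, my $buf, $chunk ) ) {
    my $piece = sprintf '%s.%03d', $src, $n++;
    open my $out, '>:raw', $piece or die "Can't write $piece: $!";
    print {$out} $buf;
    close $out or die "Close failed on $piece: $!";
    # Record the checksum of this piece so the receiver can verify it.
    print {$sums} Digest::MD5->new->add($buf)->hexdigest, "  $piece\n";
}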
Re: Copying a large file (6Gigs) across the network and deleting it at source location
by rchiav (Deacon) on Feb 05, 2004 at 15:51 UTC
robocopy is well suited for this. I've used it for a lot of large (read: 20-50 gig) data migrations and it's worked fairly well. It has a switch to retry on errors, and it can recover from network glitches. You can also copy security info. It's in the NT resource kit, and I believe it comes with XP.
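A minimal sketch of driving that from Perl (paths and retry settings are hypothetical); /MOV deletes the source files after a successful copy, /Z makes the copy restartable, and /R and /W control retries on errors:

use strict;
use warnings;

# Hypothetical paths; robocopy exit codes of 8 or higher indicate failure.
my @cmd = ( 'robocopy', 'M:\\Directory',
            'I:\\Directory0\\Directory1\\Directory2', '*.BAK',
            '/MOV', '/Z', '/R:5', '/W:30' );
my $rc = system(@cmd) >> 8;
die "robocopy reported a failure (exit code $rc)\n" if $rc >= 8;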