in reply to is Net::FTP reliable?

I transfer around 600 MB per day between servers using Net::FTP. Mind you, I haven't inspected each line looking for garbled fields :) but I have sufficient faith in TCP to believe that the data are being transferred correctly. The only problems I have ever encountered came after modifying firewall rules, when all of a sudden nothing goes through. That is easy enough to spot programmatically, and the error you get back is sufficient to help you debug the problem.

At one point, I was sufficiently paranoid to produce an MD5 digest of the file (so as to transmit foo and foo.md5). On the other end I ran md5 on the received file to confirm that the digest of foo was the same as foo.md5. Of course, in a hostile environment, you have to consider the possibility that if someone can tamper with foo, they could also rewrite foo.md5 to make it match. This way lies madness. But if you're just worried about garbled data, you don't have to go to all that fuss.
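
For what it's worth, the foo / foo.md5 check can be sketched roughly as follows. This is only a minimal sketch using the core Digest::MD5 module; the sub names (md5_of_file, write_md5_sidecar, verify_md5_sidecar) are made up for illustration and are not part of any library.

```perl
use strict;
use warnings;
use Digest::MD5;

# Return the hex MD5 digest of a file, read in binary mode.
sub md5_of_file {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;
    my $digest = Digest::MD5->new->addfile($fh)->hexdigest;
    close $fh;
    return $digest;
}

# Sender side: write foo.md5 next to foo (hypothetical helper).
sub write_md5_sidecar {
    my ($path) = @_;
    my $sidecar = "$path.md5";
    open my $out, '>', $sidecar or die "Cannot write $sidecar: $!";
    print {$out} md5_of_file($path), "\n";
    close $out or die "Cannot close $sidecar: $!";
    return $sidecar;
}

# Receiver side: recompute the digest of foo and compare it with foo.md5.
# Returns 1 on a match, 0 otherwise.
sub verify_md5_sidecar {
    my ($path) = @_;
    open my $in, '<', "$path.md5" or die "Cannot open $path.md5: $!";
    my $line = <$in>;
    close $in;
    $line = '' unless defined $line;
    my ($expected) = $line =~ /^([0-9a-fA-F]{32})/;
    return 0 unless defined $expected;
    return lc($expected) eq md5_of_file($path) ? 1 : 0;
}
```

You would call write_md5_sidecar('foo') before sending both files, and verify_md5_sidecar('foo') once both have arrived. As noted above, this only guards against garbling, not against someone who can rewrite both files.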

Just check all your error codes, and make sure you have sufficient disk space on hand to receive the file (the only other real source of error). In other words, yes, do check the size of the files on each side... but beware if you are transferring in ASCII mode between systems that use different line endings: the sizes will not be the same.

A better way of dealing with this is to transfer in binary anyway, then, once you have received the file, check that the sizes are the same, and then munge the newlines with something like a unix2dos filter.
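
Something along these lines, as a rough sketch rather than a finished tool. It uses only Net::FTP methods that I know exist (binary, get, size, message); the sub names fetch_and_check and unix2dos_file are invented here, and the host and login in the usage comment are placeholders. It also assumes the server answers the SIZE command, which not every server does.

```perl
use strict;
use warnings;
use Net::FTP;

# Fetch $remote into $local in binary mode, then confirm the byte counts match.
# $ftp is an already logged-in Net::FTP object. Dies on any failure.
sub fetch_and_check {
    my ($ftp, $remote, $local) = @_;
    defined $ftp->binary
        or die "Cannot switch to binary mode: " . $ftp->message;
    $ftp->get($remote, $local)
        or die "get $remote failed: " . $ftp->message;
    my $remote_size = $ftp->size($remote);
    defined $remote_size
        or die "Server did not report a size for $remote";
    my $local_size = (-s $local) || 0;
    $remote_size == $local_size
        or die "Size mismatch for $remote: remote $remote_size, local $local_size";
    return $local_size;
}

# Poor man's unix2dos: copy $src to $dst, turning bare LF into CRLF.
# Lines that already end in CRLF are left alone.
sub unix2dos_file {
    my ($src, $dst) = @_;
    open my $in,  '<:raw', $src or die "Cannot open $src: $!";
    open my $out, '>:raw', $dst or die "Cannot write $dst: $!";
    while (my $line = <$in>) {
        $line =~ s/\r?\n\z/\r\n/;
        print {$out} $line;
    }
    close $in;
    close $out or die "Cannot close $dst: $!";
    return $dst;
}

# Usage (placeholder host and credentials):
#   my $ftp = Net::FTP->new('ftp.example.com') or die "Cannot connect: $@";
#   $ftp->login('user', 'secret') or die "Login failed: " . $ftp->message;
#   fetch_and_check($ftp, 'foo', 'foo');
#   $ftp->quit;
#   unix2dos_file('foo', 'foo.dos');
```

The size comparison happens before the newline munging on purpose: after the conversion the two sizes are not expected to match any more.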


print@_{sort keys %_},$/if%_=split//,'= & *a?b:e\f/h^h!j+n,o@o;r$s-t%t#u'

Re: Re: is Net::FTP reliable? (yes)
by Elian (Parson) on Nov 25, 2002 at 15:50 UTC
    I have worked at places where bad hardware has caused undetected errors in TCP file transmissions, because of the way that IP handles its checksums. IP checksums each packet as it's transferred from machine to machine, and that checksum is recreated at each hop. If something corrupts the packet data on a router, it may well stay corrupt and undetected on its way to the final destination.

    (We ultimately found the problem because DECNet transfers did do end-to-end checksumming: when transfers went through this particular router, the DECNet stuff would fail but the IP stuff wouldn't. The IP data was just silently corrupted.)

    Moral of the story: If the data's truly important, do an end-to-end checksum of the files, just to be sure. You rarely need this sort of check, but if you're moving financial or medical data, you'd probably best do it to be extra sure.