in reply to Re: Storing program settings and state
in thread Storing program settings and state

No, my situation is that I can't copy a whole file in one go. Suppose you have a 7Gb file and a link that goes down after 20 seconds. So I want to treat it like a bunch of small files, with a configurable chunk size. I also intend to make it portable, not just Windows-only (obviously Unix-only is not of interest to me).

To copy a bunch of small files, I would just use the command line copy source dest /u and it would continue where it left off, whole-file wise. The resource kit program robocopy does something similar: it keeps trying until it works. Verifying is the same issue, since it requires reading the source again.

My niche is different: the files are so large (relative to the link's reliability) that it would never be able to copy a whole file.

Re: Re: Re: Storing program settings and state
by Mr_Person (Hermit) on Jul 09, 2003 at 18:42 UTC
    Actually rsync is intended for exactly that. Not only does it copy the file in chunks, when the file on the other end changes and you need to recopy it, it only copies the chunks that have changed. From the rsync features page:
    rsync uses the "rsync algorithm" which provides a very fast method for bringing remote files into sync. It does this by sending just the differences in the files across the link, without requiring that both sets of files are present at one of the ends of the link beforehand.
Re: Re: Re: Storing program settings and state
by sauoq (Abbot) on Jul 09, 2003 at 18:45 UTC
    Suppose you have a 7Gb file and a link that goes down after 20 seconds.

    I'd look into getting the link fixed.

    obviously Unix-only is not of interest to me

    I'm sure rsync will run on Windows, at least in a cygwin environment. (I wouldn't be surprised if there were a native port.)

    Finally, I agree with Corion's assessment that all of the needed information should be in the files themselves. There should be no reason to write metadata elsewhere.

    -sauoq
    "My two cents aren't worth a dime.";
    
      After starting into the rsync paper, I see why y'all are not seeing my problem. This is not two peers that can compare notes. It is ONE computer and a remote storage system. Reading the file to compare it is just as problematic as copying it!

      In particular, I'm trying to get files off a Maxstore FireWire enclosure. It transfers for a while and then has to be power cycled.

        Hmm. Sounds like that remote storage system is horribly broken. You still shouldn't necessarily need a metadata file storing how many bytes you've received: just look at the file you've got, count how many bytes it has, and skip that many in the source. This is probably error-prone, but hopefully better than nothing. In any case, I assume you are able to tell the remote store to skip some number of bytes before starting the transfer back up, right? If not... I'm not sure how you would accomplish this.
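
A minimal sketch of that "count the bytes you have and skip that many" idea, in Python rather than Perl. The function name `resume_copy` and the default chunk size are made up for illustration, and it assumes the source is seekable like an ordinary file. Because a write may have been cut short when the link dropped, it distrusts the last partial chunk and backs up to the previous chunk boundary before continuing.

```python
import os

def resume_copy(src_path, dst_path, chunk_size=1 << 20):
    """Continue copying src_path to dst_path from wherever dst_path ends.

    Returns the total number of bytes in dst_path when the source is exhausted.
    If the link drops, the read raises OSError; just call the function again.
    """
    # Open for update if a partial copy exists, otherwise start a new file.
    mode = "r+b" if os.path.exists(dst_path) else "wb"
    with open(dst_path, mode) as dst:
        dst.seek(0, os.SEEK_END)
        done = dst.tell()
        # Distrust a partially written final chunk: back up to a chunk boundary.
        done -= done % chunk_size
        dst.truncate(done)
        dst.seek(done)
        with open(src_path, "rb") as src:
            src.seek(done)  # skip what we already have
            while True:
                chunk = src.read(chunk_size)
                if not chunk:
                    break
                dst.write(chunk)
                # Push each chunk to disk so a crash loses at most one chunk.
                dst.flush()
                os.fsync(dst.fileno())
                done += len(chunk)
    return done
```

Wrapping the call in a loop that catches `OSError` and waits for the device to come back would give the "keep trying until it works" behaviour. Note that this only resumes; it does nothing to verify the bytes already copied.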

Re: Re: Re: Storing program settings and state
by revdiablo (Prior) on Jul 09, 2003 at 20:47 UTC

    If you don't end up simply using rsync as suggested (which I highly recommend; rsync is very good), you should at least read about the rsync diff algorithm. It's fairly simple (I even made a test implementation in Perl, long long ago), and quite effective. They put a lot of thought into solving the problem of efficiently transferring large files over a flaky link, and the solution they came up with is really nice. If you're not going to use their program, you might as well use their algorithm. :)

    Update: Here's a link to a paper describing the rsync algorithm, for your reference. This is just one copy of many... a quick Google search will show you more.

    Update 2: Looking around at the rsync docs, I stumbled upon Andrew Tridgell's PhD Thesis. Chapters 3, 4, and 5 discuss rsync in great detail. Very interesting reading, if I do say so myself. :)
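
For a taste of why the algorithm is cheap, here is a rough sketch of the weak "rolling" checksum as I understand it from the rsync technical report, written in Python for illustration. The function names are my own, and real rsync pairs this with a strong per-block hash, which is omitted here. The point is that the checksum of a window shifted by one byte can be updated in constant time instead of being recomputed from scratch.

```python
M = 1 << 16  # both halves of the checksum are kept modulo 2**16

def weak_checksum(block):
    """Compute the two 16-bit halves (a, b) of the weak checksum of a block."""
    n = len(block)
    a = sum(block) % M
    # Each byte is weighted by its distance from the end of the block.
    b = sum((n - i) * x for i, x in enumerate(block)) % M
    return a, b

def roll(a, b, out_byte, in_byte, block_len):
    """Slide the window one byte: drop out_byte at the front, add in_byte at the back."""
    a = (a - out_byte + in_byte) % M
    b = (b - block_len * out_byte + a) % M
    return a, b

def combined(a, b):
    """Pack the two halves into a single 32-bit value."""
    return a + (b << 16)
```

Usage would look something like: compute `weak_checksum(data[0:n])` once, then call `roll(a, b, data[i], data[i + n], n)` for each following offset. Keep in mind that this helps when both ends can compute checksums; with a single computer reading from a dumb storage device, the source still has to be read in full.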