John M. Dlugosz has asked for the wisdom of the Perl Monks concerning the following question:

I'm designing a program to copy large files over not-reliable-enough connections. That is, it will read as much as it can before the connection vanishes, and record the state of how far it got. When run again (after re-establishing the connection) it will seek to the correct spot and continue. The same process can make a second or nth pass to validate, as well as copy to a destination.

So, I need to store some simple values in a file and read it again later.

What's a good "settings" module? My needs are simple, but I want labeled, human-readable values, not just a colon-separated list, and I may add more values later. Windows .ini files are not portable if you use the built-in OS functions to access them.

I figure it's worth obtaining and learning a good one, even if it's overkill here, so I can use it in future projects as well.
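
For concreteness, here is roughly the sort of thing I would hand-roll if no module fits: a plain key=value state file, read back into a hash. The file name and keys below are just illustrative.

    use strict;
    use warnings;

    my $state_file = 'copy.state';    # illustrative name

    sub save_state {
        my %state = @_;
        open my $fh, '>', $state_file or die "Can't write $state_file: $!";
        print $fh "$_=$state{$_}\n" for sort keys %state;
        close $fh or die "close: $!";
    }

    sub load_state {
        my %state;
        open my $fh, '<', $state_file or return;    # no state yet
        while (my $line = <$fh>) {
            chomp $line;
            next if $line =~ /^\s*(?:#|$)/;         # skip comments and blanks
            my ($key, $value) = split /=/, $line, 2;
            $state{$key} = $value;
        }
        return %state;
    }

    save_state(source => 'big.iso', offset => 1_048_576, pass => 1);
    my %state = load_state();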

—John

Replies are listed 'Best First'.
Re: Storing program settings and state
by Corion (Patriarch) on Jul 09, 2003 at 17:48 UTC

    First of all, when copying / verifying files, all the persistent data you need are the files themselves - you verify each file, and if it is invalid or does not exist, you (re) download it. rsync is a very good program that does exactly that, so this wheel has already been invented :-) I don't know if you, as a mere user, can install and run it or if it needs to be run as root though...
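
    If rsync is available, a minimal sketch of driving it from Perl over a flaky link might look like this. The host, paths, and retry interval are placeholders; --partial tells rsync to keep a partially transferred file, so the next attempt can pick up from it instead of starting over.

        use strict;
        use warnings;

        # Placeholders: adjust source, destination, and retry interval.
        my @cmd = ('rsync', '-av', '--partial',
                   'user@remote:/data/big.iso', '/local/');

        until (system(@cmd) == 0) {
            die "failed to run rsync: $!\n" if $? == -1;
            warn "rsync exited with status ", $? >> 8, "; retrying in 30s\n";
            sleep 30;
        }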

perl -MHTTP::Daemon -MHTTP::Response -MLWP::Simple -e ' ; # The
$d = new HTTP::Daemon and fork and getprint $d->url and exit;#spider
($c = $d->accept())->get_request(); $c->send_response( new #in the
HTTP::Response(200,$_,$_,qq(Just another Perl hacker\n))); '    # web
      No, my situation is that I can't copy a whole file in one go. Suppose you have a 7Gb file and a link that goes down after 20 seconds. So, I want to treat it like a bunch of small files (configurable chunk size). I also intend to make it portable, not just Windows-only (obviously Unix-only is not of interest to me).

      To copy a bunch of small files, I would just use the command line copy source dest /u and it would continue where it left off, whole-file wise. The resource kit program robocopy does something similar: it keeps trying until it works. Verifying is the same issue, since it requires reading the source again.

      My niche is different: the files are so large (relative to the link's reliability) that it would never be able to copy a whole file.
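
      As a sketch of what I mean (the file names and the one-line offset file are made up, and the chunk size would be configurable): seek both files to the saved offset, copy a chunk at a time, and record how far we got after each chunk, so a dropped link costs at most one chunk on the next run.

          use strict;
          use warnings;
          use IO::Handle;

          my ($src, $dst) = ('big.iso', 'big.iso.part');   # made-up names
          my $offset_file = "$dst.offset";
          my $chunk_size  = 1024 * 1024;                   # configurable

          # Read the saved offset, if any; start at 0 otherwise.
          my $offset = 0;
          if (open my $fh, '<', $offset_file) {
              my $line = <$fh>;
              $offset = $1 if defined $line and $line =~ /(\d+)/;
          }

          open my $in, '<:raw', $src or die "Can't read $src: $!";
          open my $out, (-e $dst ? '+<:raw' : '>:raw'), $dst
              or die "Can't open $dst: $!";

          seek $in,  $offset, 0 or die "seek $src: $!";
          seek $out, $offset, 0 or die "seek $dst: $!";

          while (my $n = read($in, my $buf, $chunk_size)) {
              print {$out} $buf or die "write $dst: $!";
              $out->flush       or die "flush $dst: $!";

              # Record progress only after the chunk is safely written.
              $offset += $n;
              open my $fh, '>', $offset_file or die "Can't save offset: $!";
              print $fh "$offset\n";
              close $fh;
          }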

        Actually rsync is intended for exactly that. Not only does it copy the file in chunks, when the file on the other end changes and you need to recopy it, it only copies the chunks that have changed. From the rsync features page:
        rsync uses the "rsync algorithm" which provides a very fast method for bringing remote files into sync. It does this by sending just the differences in the files across the link, without requiring that both sets of files are present at one of the ends of the link beforehand.
        Suppose you have a 7Gb file and a link that goes down after 20 seconds.

        I'd look into getting the link fixed.

        obviously Unix-only is not of interest to me

        I'm sure rsync will run on Windows, at least in a cygwin environment. (I wouldn't be surprised if there were a native port.)

        Finally, I agree with Corion's assessment that all of the needed information should be in the files themselves. There should be no reason to write metadata elsewhere.

        -sauoq
        "My two cents aren't worth a dime.";
        

        If you don't end up simply using rsync as suggested (which I highly recommend. rsync is very good), you should at least read about the rsync diff algorithm. It's fairly simple (I even made a test implementation in Perl, long long ago), and quite effective. They put a lot of thought into solving the problem of efficiently transferring large files over a flaky link. The solution they came up with is really nice. If you're not going to use their program, you might as well use their algorithm. :)
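
        To give a flavor of it, here is a toy version of the weak rolling checksum from the paper (the block size and data below are arbitrary, and the real algorithm pairs this with a strong checksum such as MD4). The point is the recurrence: sliding the window one byte forward updates the sum in O(1) instead of rehashing the whole block.

            use strict;
            use warnings;

            my $M = 1 << 16;

            # Weak checksum of one block, computed from scratch.
            sub weak_sum {
                my @bytes = unpack 'C*', shift;
                my ($a, $b) = (0, 0);
                for my $i (0 .. $#bytes) {
                    $a += $bytes[$i];
                    $b += (@bytes - $i) * $bytes[$i];
                }
                return ($a % $M, $b % $M);
            }

            # Roll the window: drop $old off the front, add $new at the end.
            sub roll {
                my ($a, $b, $old, $new, $len) = @_;
                $a = ($a - $old + $new) % $M;
                $b = ($b - $len * $old + $a) % $M;
                return ($a, $b);
            }

            my $data = 'the quick brown fox jumps over the lazy dog';
            my $len  = 8;
            my ($a, $b) = weak_sum(substr $data, 0, $len);

            for my $k (1 .. length($data) - $len) {
                ($a, $b) = roll($a, $b,
                                ord(substr $data, $k - 1,        1),
                                ord(substr $data, $k + $len - 1, 1),
                                $len);
                # ($a, $b) now equals weak_sum(substr $data, $k, $len)
            }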

        Update: For your reference, the paper describing the rsync algorithm is "The rsync algorithm" by Andrew Tridgell and Paul Mackerras. Copies of it are mirrored in many places; a quick Google search will turn one up.

        Update 2: Looking around at the rsync docs, I stumbled upon Andrew Tridgell's PhD Thesis. Chapters 3, 4, and 5 discuss rsync in great detail. Very interesting reading, if I do say so myself. :)

Re: Storing program settings and state
by hardburn (Abbot) on Jul 09, 2003 at 17:52 UTC

    Possibilities:

    • XML config file (lots of bloat in this solution)
    • Data::Dumper to print a configuration hash to a file, then read it back later (OK, so it's kludgy, but it'll work; see the sketch after this list)
    • Config::IniFiles (I think you can use this outside a Win32 box)
    • Tie::Config
    • Just about anything else in the Config:: namespace.
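
    A sketch of the Data::Dumper route mentioned above (the file name is just an example): dump the hash to a file as Perl source, then pull it back in with do(), which returns the file's last expression.

        use strict;
        use warnings;
        use Data::Dumper;

        my %config = (source => 'big.iso', offset => 1_048_576, pass => 1);

        # Save: write the hash out as Perl source ($VAR1 = { ... };).
        open my $fh, '>', 'settings.pl' or die "Can't write settings: $!";
        print $fh Dumper(\%config);
        close $fh;

        # Load: do() compiles the file and returns the hashref
        # assigned to $VAR1.
        my $loaded = do './settings.pl' or die "Can't load settings: $!";
        print "resuming at offset $loaded->{offset}\n";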

    ----
    I wanted to explore how Perl's closures can be manipulated, and ended up creating an object system by accident.
    -- Schemer

    Note: All code is untested, unless otherwise stated

Re: Storing program settings and state
by ajdelore (Pilgrim) on Jul 09, 2003 at 19:29 UTC
    We used to have something for this, back in the day. It was called ZModem. <g>

    </ajdelore>

Re: Storing program settings and state
by bm (Hermit) on Jul 10, 2003 at 14:57 UTC
    I would have thought that use'ing Storable is worth a look.

    It looks flexible enough for your current (and most likely future) needs.
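
    A sketch of the round trip (the file name is just an example). One caveat: Storable writes a binary file, not the human-readable format you asked for, but it handles nested data for free.

        use strict;
        use warnings;
        use Storable qw(store retrieve);

        my %state = (source => 'big.iso', offset => 1_048_576);

        store \%state, 'copy.state';             # serialize to disk
        my $restored = retrieve('copy.state');   # get a hashref back
        print "offset was $restored->{offset}\n";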