glwtta has asked for the wisdom of the Perl Monks concerning the following question:

I'm in need of the Monks' wisdom yet again,

It seems I need a system that will do the following (I would think this is rather common): at scheduled times, check certain online resources; if there are changes, then download, parse, index and/or insert the data into a database, either alongside the old versions or superseding them.

So basically, what I need is a system that will easily allow me to schedule tasks that consist of (usually) three arbitrary scripts: check for new data / prepare data / integrate data. This would need to have some rudimentary dependency checking, and very basic reporting (and failure notification).

I simply do not have the time to write this myself, and so far all I could find in online searches are large systems that seem to be targeted at scheduling high volumes of tasks. They just look too "big" and scary, usually involve running separate daemons, and generally seem to be overkill for what I need.

Any suggestions for a system which is simple and, well, doesn't include a kitchen sink? In an ideal world, I would want something that has a single web interface and allows me to schedule the same tasks on different machines (with different parameters), but that's certainly not a requirement to get something functional.

I know this is probably fairly common, but I don't think I can articulate what I need well enough to search for it successfully, without the involvement of knowledgeable individuals such as yourselves :)

Replies are listed 'Best First'.
Re: cron plus?
by LameNerd (Hermit) on Apr 29, 2003 at 17:54 UTC
    Have cron run a perl script that uses LWP and DBI.
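    A minimal sketch of that idea, assuming an MD5 digest of the fetched page as the "did it change?" test; the URL, database DSN, state file, and table name are all invented for illustration:

```perl
#!/usr/bin/perl
# Hypothetical cron job: fetch a resource, skip the expensive work when
# nothing changed, otherwise store the new version via DBI.
# The URL, DSN, state file, and table name are all placeholders.
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Has the fetched body changed since the digest we recorded last run?
sub changed {
    my ($body, $last_digest) = @_;
    return md5_hex($body) ne ($last_digest // '');
}

sub run {
    require LWP::Simple;   # loaded lazily so the helper stays testable offline
    require DBI;

    my $url  = 'http://example.com/data.csv';            # placeholder resource
    my $body = LWP::Simple::get($url);
    defined $body or die "fetch failed: $url\n";

    my $state = '/var/tmp/feed.md5';                     # digest from last run
    my $last;
    if (-e $state) {
        open my $fh, '<', $state or die $!;
        $last = <$fh>;
    }
    return 0 unless changed($body, $last);               # nothing new, stop here

    my $dbh = DBI->connect('dbi:SQLite:dbname=feeds.db', '', '',
                           { RaiseError => 1 });
    $dbh->do('INSERT INTO raw_pages (url, body) VALUES (?, ?)',
             undef, $url, $body);

    open my $fh, '>', $state or die $!;
    print {$fh} md5_hex($body);
    return 0;
}

exit run() if @ARGV && $ARGV[0] eq '--fetch';   # cron would run: script.pl --fetch
```

    The "prepare" and "integrate" steps would slot in between the change check and the insert.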
Re: cron plus?
by pzbagel (Chaplain) on Apr 29, 2003 at 20:16 UTC

    I know this is PerlMonks, but if you are adamant about not writing this thing in its entirety, why not cron + a short shell script running wget? wget offers many switches that can make your life easier, such as checking timestamps and file sizes before downloading. And if you can do it via SSH + RSA keys, there is rsync, which is even better at downloading only new or modified files. Another possibility I can think of is CFEngine. It can download and execute files across countless systems. Hopefully its file-transfer problems have been fixed since I last used it.
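    For concreteness, a hypothetical crontab entry along those lines (URL, paths, and schedule are all invented); wget's -N downloads only when the server copy is newer than the local file:

```shell
# m h dom mon dow  command -- fetch nightly at 02:00, process only if wget succeeds
0 2 * * *  wget -q -N -P /var/spool/feeds http://example.com/data.csv && /usr/local/bin/process_feed /var/spool/feeds/data.csv
```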

    As far as slicing-n-dicing(tm) your data into your database goes: unless you already have a program to do this, you will probably need to write a script yourself. Use DBI, or if you are unfamiliar with it and in a hurry, just have your script dump the required SQL to a file or pipe it to the sql client directly.
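    As a sketch of the dump-the-SQL route (the table name is made up, and the CSV handling is deliberately naive; real data would want Text::CSV and DBI's quote()):

```perl
#!/usr/bin/perl
# Turn simple comma-separated lines into INSERT statements on stdout,
# ready to be piped straight into an sql client.
# 'prices' is a placeholder table name; the split/quoting here is naive.
use strict;
use warnings;

sub csv_to_inserts {
    my ($table, @lines) = @_;
    my @sql;
    for my $line (@lines) {
        chomp $line;
        my @fields = map { (my $f = $_) =~ s/'/''/g; "'$f'" }   # double up quotes
                     split /,/, $line, -1;
        push @sql, "INSERT INTO $table VALUES (" . join(', ', @fields) . ");";
    }
    return @sql;
}

print "$_\n" for csv_to_inserts('prices', <STDIN>);
```

    Then something like `perl csv2sql.pl < data.csv | psql feeds` (or whichever sql client you use) handles the insert step.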

    Bottom line is, and you basically said it yourself: the systems out there are overkill for your needs because software tends to be generalized to meet the needs of many. Since your needs are narrow and specialized, it's tough to find a system that fits, and you either have to wrangle one of these "big" systems to your needs or put together your own.

    Good Luck

Re: cron plus?
by Fletch (Bishop) on Apr 29, 2003 at 18:31 UTC

    Cron, rsync'd crontabs, NFS-shared directories, and machine-specific configs named `uname -n`.cfg could probably go quite a long way toward your requirements.
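    A sketch of how those pieces could fit together; every path, hostname, and variable here is invented for illustration:

```shell
# Shared crontab fragment kept on NFS and rsync'd into /etc/cron.d on each
# host, e.g.:  rsync -a /nfs/jobs/crontab.shared /etc/cron.d/feeds
#
# Each entry sources a per-machine config named after `uname -n`, so the
# same crontab runs everywhere with host-specific parameters:
30 3 * * *  feeds  . /nfs/jobs/conf/$(uname -n).cfg && /nfs/jobs/bin/check_feed "$FEED_URL" "$FEED_DB"
```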

      Hm, both of those suggestions just involve me doing it :)

      The problem isn't that I don't know how I would accomplish this - I have a very good idea of how I would do this, I just can't take on such a project right now; so I was wondering if something along these lines already exists out there...

        My point was that the existing tools can be made to do what you want; it's just a matter of setting things up the right way. It's not really a matter of coding, rather just configuring what's there (and you're going to have to do that with any hypothetical über-cron anyhow).
