in reply to Persistent timed events

it'd be good if I could avoid being dependant on an RDBMS

"Premature optimisation"! ;-) Under the principles of YAGNI, you could go with a filesystem approach. But I'm not really a big believer in YAGNI. I'd go with the RDBMS because I've found that applications like this keep growing with every new PHB in my management chain (if your corporation is large enough to have a chain of management). And RDBMSs are much more flexible - a quick ALTER TABLE, and suddenly you can store more metadata about each task. Auditing requirements? No big deal - just add the appropriate metadata, and create a second table where you move "completed" tasks (rather than just deleting them).

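To make the ALTER TABLE and "completed" table idea concrete, here is a minimal sketch using Python's sqlite3 module as a stand-in for whatever RDBMS you end up with. The table names (tasks, completed_tasks), the columns, and the complete_task helper are all my own assumptions for illustration, not anything from your design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical starting schema: just enough to schedule a task.
conn.execute(
    "CREATE TABLE tasks ("
    " id INTEGER PRIMARY KEY,"
    " run_at TEXT NOT NULL,"
    " command TEXT NOT NULL)"
)
conn.execute(
    "INSERT INTO tasks (run_at, command) VALUES (?, ?)",
    ("2005-11-03 20:00:00", "send_report"),
)

# A new requirement arrives: record who asked for each task.
conn.execute("ALTER TABLE tasks ADD COLUMN requested_by TEXT")
conn.execute("UPDATE tasks SET requested_by = ? WHERE id = ?", ("phb", 1))

# Auditing: finished tasks are moved to a second table, not deleted outright.
conn.execute(
    "CREATE TABLE completed_tasks ("
    " id INTEGER PRIMARY KEY,"
    " run_at TEXT,"
    " command TEXT,"
    " requested_by TEXT,"
    " completed_at TEXT)"
)


def complete_task(conn, task_id, completed_at):
    """Copy a task into completed_tasks and remove it from tasks in one transaction."""
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute(
            "INSERT INTO completed_tasks"
            " (id, run_at, command, requested_by, completed_at)"
            " SELECT id, run_at, command, requested_by, ?"
            " FROM tasks WHERE id = ?",
            (completed_at, task_id),
        )
        conn.execute("DELETE FROM tasks WHERE id = ?", (task_id,))


complete_task(conn, 1, "2005-11-03 20:04:00")
```

Doing the copy and the delete in one transaction means a crash in between cannot lose the task or leave it in both tables.
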
If you go with the filesystem approach, you'll either have a real headache on your hands, or you'll rewrite the whole thing to use a database backend. Either one means you have no flexibility.

Oh, and I'd reverse that comparison in your select statement - I think you mean "where date <= now()" ;-)
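
Spelled out, the corrected comparison might look something like this. It is only a sketch against an assumed tasks table with a run_at column (your real column is apparently called date), and it passes "now" in as a parameter rather than calling now(), purely so the example is repeatable.

```python
import sqlite3


def due_tasks(conn, now):
    """Return (id, command) for every task whose scheduled time has arrived."""
    return conn.execute(
        "SELECT id, command FROM tasks WHERE run_at <= ? ORDER BY run_at",
        (now,),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tasks (id INTEGER PRIMARY KEY, run_at TEXT, command TEXT)"
)
conn.executemany(
    "INSERT INTO tasks (run_at, command) VALUES (?, ?)",
    [
        ("2005-11-03 19:00:00", "past_job"),
        ("2005-11-03 21:00:00", "future_job"),
    ],
)

# Timestamps in 'YYYY-MM-DD HH:MM:SS' form sort correctly as plain strings.
due = due_tasks(conn, "2005-11-03 20:00:00")
```

With the comparison the other way round (run_at >= now) you would pick up only the tasks that are not yet due, which is the opposite of what a scheduler wants.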

As for concurrency issues - I think you will have those no matter which solution you go with. For starters, one way to deal with them is to mark the row you're working on somehow. With a filesystem, you could rename the file or lock it, but that may not tell you which process is working on it. With a database, you could store metadata recording which host and process ID are working on the task. Then you can have a workload manager program run periodically on each host, querying for any tasks that are supposedly being worked on by that host, checking the process ID, and then pinging the worker - if there is no such PID, or the worker is hung, you can kill the worker and reset the item in the database so that the next available worker will try it again.
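
A rough sketch of that claim-and-reset scheme, again with sqlite3 standing in for the real database. The column names (worker_host, worker_pid) and the functions claim_next, pid_alive and reset_dead_workers are all hypothetical, and it covers only the "no such PID" case, not pinging a hung worker. The liveness check uses signal 0 via os.kill, which is a POSIX idiom, so treat that part as Unix-only.

```python
import os
import socket
import sqlite3


def claim_next(conn, host, pid):
    """Mark one unclaimed task as owned by (host, pid); return its id or None."""
    with conn:
        row = conn.execute(
            "SELECT id FROM tasks WHERE worker_host IS NULL ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        # The "AND worker_host IS NULL" guard means that if another worker
        # claimed the row first, this UPDATE touches zero rows.
        cur = conn.execute(
            "UPDATE tasks SET worker_host = ?, worker_pid = ?"
            " WHERE id = ? AND worker_host IS NULL",
            (host, pid, row[0]),
        )
        return row[0] if cur.rowcount == 1 else None


def pid_alive(pid):
    """POSIX-only check: signal 0 tests for existence without sending anything."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def reset_dead_workers(conn, host, is_alive=pid_alive):
    """Release tasks claimed on this host whose worker process has gone away."""
    rows = conn.execute(
        "SELECT id, worker_pid FROM tasks WHERE worker_host = ?", (host,)
    ).fetchall()
    reset = []
    with conn:
        for task_id, pid in rows:
            if not is_alive(pid):
                conn.execute(
                    "UPDATE tasks SET worker_host = NULL, worker_pid = NULL"
                    " WHERE id = ? AND worker_pid = ?",
                    (task_id, pid),
                )
                reset.append(task_id)
    return reset


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tasks ("
    " id INTEGER PRIMARY KEY,"
    " command TEXT,"
    " worker_host TEXT,"
    " worker_pid INTEGER)"
)
conn.executemany("INSERT INTO tasks (command) VALUES (?)", [("job_a",), ("job_b",)])
conn.commit()

host = socket.gethostname()
first = claim_next(conn, host, os.getpid())
```

The workload manager would call reset_dead_workers(conn, host) from cron or a loop on each host. The is_alive parameter is there so that a smarter check (a heartbeat timestamp, a ping over a socket) can be swapped in once hung workers become a problem.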

Yes, this is called feature creep. But those features do creep, and sometimes it's because you realise that your design didn't take certain requirements into account. The first time a worker hangs, you'll realise you need some watcher to notice this and handle the situation. When things start taking too long, someone will want to add a second machine to share the workload (NFS isn't exactly a good filesystem for handling locking and other atomic operations). Where I agree with YAGNI is that you don't build all of these in until you do need them. But you do make your architecture flexible enough to allow you to do what you will eventually need to.

Re^2: Persistent timed events
by tirwhan (Abbot) on Nov 03, 2005 at 20:04 UTC

    Thanks a bunch, lots of really valuable advice here. Great stuff about the workload and concurrency management. I think I'm going to go with a filesystem-based approach to begin with, but your node has reminded me to make the system decoupled enough that I can later swap out the queuing stuff for a different system, if necessary (which is a good idea in any case of course). So I agree 100% with your last two sentences.


    Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it. -- Brian W. Kernighan