Database Synchronization

Dr. Mu has asked for the wisdom of the Perl Monks concerning the following question:

My first major Perl program, written several years ago, is an online database query system for a 12K race's runner registration. There is one master flat-file database that exists offline and which is manipulated by several offline scripts, including inactive participant pruning and incoming registration data entry. The data gets to the online system by uploading the entire zipped database to a Perl CGI script which extracts the info needed for online queries and indexes it along several dimensions for rapid access. The abbreviated dataset is stored online in a format similar to the master file.

Until now, all changes to the master database have had to occur offline, principally through the data entry script. This can be inconvenient, for example, when someone calls during the registration period and asks that their address or the spelling of their name be changed, since it means contacting the data entry person to update the master database. Consequently, these requests too often fall through the cracks.

I'm contemplating transferring the master database to the online server, so multiple entities can update it as needed, either directly through online transactions for small changes, or by synchronizing clones of the master database for larger changes. Due to the possibility of synchronization conflicts, this could be non-trivial to program from scratch.

I know there are commercial and freeware database systems (like MySQL) that would be of great benefit in this endeavor. Unfortunately my hosting service charges quite a bit extra for an account with MySQL access, so I'd like to consider other alternatives first. They are very good about installing requested modules, though, and even allow private libraries for those modules without universal appeal. Hence (after this long-winded intro) my question: Does anyone have experience with any Perl modules which provide record-level or (preferably) field-level synchronization support for flat-file databases? If so, are there any that you would recommend?

Many thanks!

Comment on Database Synchronization

Replies are listed 'Best First'.

Re: Database Synchronization
by PodMaster (Abbot) on Dec 03, 2002 at 08:29 UTC

BerkeleyDB

Start with the Berkeley DB Tutorial and Reference Guide, especially with What is Berkeley DB? and What Berkeley DB is not, and convince yourself it's what you need, then get it ;)

update: A always, BerkleyDB's simpler cousin DB_File is also available ;)(and this might be interesting as well)

MJD says you can't just make shit up and expect the computer to know what you mean, retardo!
** The Third rule of perl club is a statement of fact: pod is sexy.

[reply]

Re: Re: Database Synchronization

by Dr. Mu (Hermit) on Dec 03, 2002 at 18:28 UTC

Berkeley DB replication groups consist of some number of independently configured database environments. There is a single master database environment and one or more client database environments. Master environments support both database reads and writes; client environments support only database reads.

What I was hoping for was a module which automated the synchronization bookkeeping with hooks to the main program for dealing with conflicts. One that also includes journaling for reversing changes would be a plus.

[reply]

Re: Re: Re: Database Synchronization

by PodMaster (Abbot) on Dec 04, 2002 at 10:33 UTC

Here's what I see happening. Your data entry person has a read-only copy of the database, and any changes she makes do not get recorded in the database, but a file which she then uploads to your server, and which then hopefully gets incorporated into the master database.

You can have a special program on the server to control this, where the data-entry person can chose to which of the pending updates get entered into the master.

You can build your own journaling system around that, or you can build one from the logs.

I'd like to see what you come up with ;)

MJD says you can't just make shit up and expect the computer to know what you mean, retardo!
** The Third rule of perl club is a statement of fact: pod is sexy.

[reply]