rob_au has asked for the wisdom of the Perl Monks concerning the following question:

I am currently writing a module which inherits from Tie::Hash and uses the hash interface to hide most of the details of the underlying module. One feature which I am now looking at incorporating into this interface is locking, so that the same underlying data source can be used by multiple processes. The following is the scheme which I envision for locking the underlying data source within the tied-hash interface:

CLEAR      write lock
DELETE     write lock
EXISTS     write lock
FETCH      read lock
FIRSTKEY   read lock
NEXTKEY    read lock
STORE      write lock
TIEHASH    ?
UNTIE      ?

I am however unsure of what locking or synchronisation, if any, I should be performing upon TIEHASH or UNTIE. Note that I have not outlined any specifics of the underlying data source or locking mechanism, as these are user-modifiable components of this module, e.g.

tie %hash, 'MyClass', { 'Lock' => 'Semaphore', 'Store' => 'DB_File' }, { .. more options .. } or die $!;

As such, what I am looking for with this post is a discussion of the general aspects of synchronisation and locking within tied hashes, rather than specific approaches or caveats.
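For illustration, the per-method scheme in the table above might be sketched roughly as follows. This is only a minimal sketch under several assumptions: the class name LockedHash and the helper _locked are hypothetical, the data lives in a plain in-memory hash rather than a real store, the lock is a flock on a separate lock file, EXISTS takes a read lock, and TIEHASH and UNTIE take no lock at all.

```perl
package LockedHash;    # hypothetical class name, for illustration only
use strict;
use warnings;
use Fcntl qw(:flock);

# No lock is taken here; the lock file is merely opened for later use.
sub TIEHASH {
    my ($class, $lockfile) = @_;
    open my $fh, '+>>', $lockfile or die "Cannot open $lockfile: $!";
    return bless { data => {}, fh => $fh }, $class;
}

# Hypothetical helper: run $code while holding the requested lock, then release it.
sub _locked {
    my ($self, $mode, $code) = @_;
    flock $self->{fh}, $mode or die "flock failed: $!";
    my @result = $code->();
    flock $self->{fh}, LOCK_UN;
    return $result[0];
}

# Read operations take a shared lock.
sub FETCH    { my ($s, $k) = @_; $s->_locked(LOCK_SH, sub { $s->{data}{$k} }) }
sub EXISTS   { my ($s, $k) = @_; $s->_locked(LOCK_SH, sub { exists $s->{data}{$k} }) }
sub NEXTKEY  { my ($s) = @_; $s->_locked(LOCK_SH, sub { each %{ $s->{data} } }) }
sub FIRSTKEY {
    my ($s) = @_;
    $s->_locked(LOCK_SH, sub {
        keys %{ $s->{data} };    # void context: resets the iterator
        each %{ $s->{data} };
    });
}

# Write operations take an exclusive lock.
sub STORE  { my ($s, $k, $v) = @_; $s->_locked(LOCK_EX, sub { $s->{data}{$k} = $v }) }
sub DELETE { my ($s, $k) = @_; $s->_locked(LOCK_EX, sub { delete $s->{data}{$k} }) }
sub CLEAR  { my ($s) = @_; $s->_locked(LOCK_EX, sub { %{ $s->{data} } = () }) }

package main;
use File::Temp qw(tempfile);

my (undef, $lockfile) = tempfile(UNLINK => 1);
tie my %h, 'LockedHash', $lockfile;
$h{apple} = 1;                                  # STORE, exclusive lock
print "apple exists\n" if exists $h{apple};     # EXISTS, shared lock
```

Locking per call like this protects each individual operation, but not a sequence such as "read, modify, write back", which is where the question of holding a lock across calls comes in.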

 

Re: Locking and synchronisation within tied hashes
by perrin (Chancellor) on Sep 08, 2002 at 13:18 UTC
    This already exists: MLDBM::Sync. If that isn't exactly what you want, I still suggest you look at the code, since the synchronization part is nicely split out from the rest.

    You don't need to do any locking when you tie or untie, but you do have to untie and re-tie in a number of situations because of the internal caching that DB_File does. The source of MLDBM::Sync will show you when to do this.
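One conservative way to avoid stale cached data is to tie the file afresh under the lock for every operation and untie it again before releasing the lock. The following is only a rough sketch of that cycle, not the actual MLDBM::Sync code: the function with_locked_db is hypothetical, the lock lives in a separate ".lock" file, and the core module SDBM_File stands in for DB_File so that the example runs without extra dependencies.

```perl
use strict;
use warnings;
use Fcntl qw(:flock O_RDWR O_CREAT);
use SDBM_File;
use File::Temp qw(tempdir);

# Hypothetical helper: lock, tie, run $code on the tied hash, untie, unlock.
sub with_locked_db {
    my ($base, $mode, $code) = @_;
    open my $lock, '+>>', "$base.lock" or die "Cannot open lock file: $!";
    flock $lock, $mode or die "flock failed: $!";

    # A fresh tie for every operation, so nothing is cached between calls.
    tie my %db, 'SDBM_File', $base, O_RDWR | O_CREAT, 0666
        or die "tie failed: $!";
    my $result = $code->(\%db);
    untie %db;

    close $lock;    # closing the handle releases the lock
    return $result;
}

my $base = tempdir(CLEANUP => 1) . '/demo';
with_locked_db($base, LOCK_EX, sub { $_[0]{colour} = 'red' });
my $seen = with_locked_db($base, LOCK_SH, sub { $_[0]{colour} });
print "colour = $seen\n";
```

The cost is a tie and untie on every access, which trades speed for the certainty that each operation sees what the last writer left on disk.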

    By the way, this would all be unnecessary if you used BerkeleyDB instead. It handles locking and synchronization internally.

Re: Locking and synchronisation within tied hashes
by Aristotle (Chancellor) on Sep 08, 2002 at 13:05 UTC

    It becomes a bit clearer, I think, if you use the actual terminology for the locks: shared lock and exclusive lock. It's probably best to get a shared lock right when you tie, then upgrade it to an exclusive lock as soon as you need to write to the file and not downgrade to a shared lock from an exclusive one ever as that will jeopardize integrity. Until you actually close the file, there is no guarantee that your changes will properly become visible.
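As a rough sketch of that lifecycle, assuming flock on a plain lock file: the UpgradingLock class and its method names below are hypothetical, and a real tied class would call them from TIEHASH, from STORE, DELETE and CLEAR, and from UNTIE or DESTROY respectively.

```perl
package UpgradingLock;    # hypothetical helper, not part of any CPAN module
use strict;
use warnings;
use Fcntl qw(:flock);

# Take a shared lock as soon as the object is created (i.e. at tie time).
sub new {
    my ($class, $path) = @_;
    open my $fh, '+>>', $path or die "Cannot open $path: $!";
    flock $fh, LOCK_SH or die "Cannot get shared lock: $!";
    return bless { fh => $fh, mode => 'shared' }, $class;
}

# Upgrade to an exclusive lock before the first write; never downgrade.
sub want_write {
    my ($self) = @_;
    return $self->{mode} if $self->{mode} eq 'exclusive';
    flock $self->{fh}, LOCK_EX or die "Cannot get exclusive lock: $!";
    return $self->{mode} = 'exclusive';
}

# Closing the handle is what finally releases the lock.
sub release {
    my ($self) = @_;
    my $fh = delete $self->{fh};
    close $fh if $fh;
    return $self->{mode} = 'released';
}

package main;
use File::Temp qw(tempfile);

my (undef, $path) = tempfile(UNLINK => 1);
my $lock = UpgradingLock->new($path);
print "after tie:   $lock->{mode}\n";
$lock->want_write;
print "after write: $lock->{mode}\n";
$lock->release;
print "after untie: $lock->{mode}\n";
```

One caveat worth checking on your platform: converting a flock from shared to exclusive is not guaranteed to be atomic everywhere, so another process may get in between the two states, and data read under the shared lock may need re-reading after the upgrade.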

    Btw, am I missing something? EXISTS shouldn't need an exclusive lock..

    Makeshifts last the longest.

      Thanks for your comments, Aristotle. And no, you're not missing anything: it is supposed to be a 'read lock' associated with the EXISTS method.

      It's probably best to get a shared lock right when you tie, then upgrade it to an exclusive lock as soon as you need to write to the file and not downgrade to a shared lock from an exclusive one ever as that will jeopardize integrity.

      Are there any caveats or disadvantages which I should be aware of in acquiring and holding a shared lock for the life of the tied object?

      Update - Curiously, while I would expect a shared lock to be appropriate for the EXISTS method, the module MLDBM::Sync, which perrin referred me to here, actually takes an exclusive lock when calling the EXISTS method of the underlying file store.

        None, so long as everyone's just reading. If anyone wants to write and tries to get an exclusive lock, they'll have to wait for everyone to finish reading and untie, but that's the point of the exercise: it guarantees that you can never end up writing back out-of-date data. Of course, if many processes are to write to that same database simultaneously, you'll have to tell your users to untie, and forget what they've read, as soon as possible.
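The waiting can be seen in miniature with two handles on the same file. This is only a small demonstration under assumptions: it relies on native flock semantics, where locks belong to the open handle (on systems that emulate flock some other way, the result may differ), and it uses a non-blocking request so that the would-be writer reports failure instead of hanging.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempfile);

my (undef, $path) = tempfile(UNLINK => 1);

# The "reader" holds a shared lock for as long as its handle stays open.
open my $reader, '<', $path or die "Cannot open $path: $!";
flock $reader, LOCK_SH or die "Cannot get shared lock: $!";

# The "writer" asks for an exclusive lock without blocking.
open my $writer, '+<', $path or die "Cannot open $path: $!";
my $got_early = flock($writer, LOCK_EX | LOCK_NB) ? 1 : 0;

# Once the reader closes its handle (the equivalent of untie), the writer can proceed.
close $reader;
my $got_late = flock($writer, LOCK_EX | LOCK_NB) ? 1 : 0;

print "while reader holds lock: $got_early, after reader closes: $got_late\n";
```

With a blocking request (LOCK_EX alone) the writer would simply sleep at the first flock call until the reader let go, which is the behaviour described above.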

        Makeshifts last the longest.