Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

Is there an efficient (as in, as good as having your own copy) way to share a rather large hash between sibling mod_perl processes for read-only operations, without having to resort to Storable, Freeze/Thaw, etc.?

There have been many such discussions in the past (using IPC::MM, IPC::SharedCache, etc. etc.) but they all work for read/write access, and therefore are complicated. I'd like only read-only access (except, of course, for the first process which reads the data in from files).

But what I'd like is a simple (but very fast) tied interface to a rather large (and complicated) hash for reading only.

Any ideas, tips, suggestions would be appreciated.

Re: Sharing a rather large data structure between siblings
by Joost (Canon) on Oct 31, 2006 at 22:40 UTC
    If you are sure the data never changes, you can read it in before the apache child processes are fork()ed - for example in startup.pl. On (most) unixes at least, fork() is implemented via copy-on-write, which means child processes share all memory pages as long as they aren't changed.

    edit: creating new references to the data and some other "read-only" operations still write to the internal data structures, so this might not be the best solution. It is pretty simple to implement, though.
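
    A minimal sketch of the load-before-fork idea, assuming a hypothetical My::SharedData package and made-up data in place of the real files. Under mod_perl the load() call would sit in startup.pl; here a plain fork() stands in for Apache spawning its children:

```perl
use strict;
use warnings;

# Hypothetical package name; under mod_perl this would be loaded from startup.pl.
package My::SharedData;

our %DATA;

# Stand-in for reading the real data from files (the data here is made up).
sub load {
    %DATA = map { ("key$_" => { id => $_, tags => [ "a$_", "b$_" ] }) } 1 .. 1000;
    return scalar keys %DATA;
}

# Plain lookup, no tie involved.
sub lookup {
    my ($key) = @_;
    return $DATA{$key};
}

package main;

# Parent: populate the hash once, before any child exists.
My::SharedData::load();

my @pids;
for my $n (1 .. 3) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # Child: the data is already in memory, inherited from the parent.
        my $rec = My::SharedData::lookup("key$n");
        exit($rec && $rec->{id} == $n ? 0 : 1);
    }
    push @pids, $pid;
}

# Parent: collect the children and count any that failed their lookup.
my $failures = 0;
for my $pid (@pids) {
    waitpid($pid, 0);
    $failures++ if $? != 0;
}
print "children with failed lookups: $failures\n";
```

    The children never reload anything, but how many pages actually stay shared depends on what the readers do: copying a reference bumps a reference count inside the structure, which dirties that page in the child.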

Re: Sharing a rather large data structure between siblings
by RMGir (Prior) on Nov 01, 2006 at 12:28 UTC
    How fast do you need?

    Have you tried DB_File, GDBM, or BerkeleyDB?

    I haven't used them much, so this isn't a ringing endorsement. Just a suggestion for a (possibly) simpler alternative :)

    I'm not sure about the others, but a quick glance at the docs shows that BerkeleyDB and DB_File, at least, support read-only access.

    It might just be fast enough for what you want, and is probably simpler and safer than trying to roll your own.


    Mike
      I have used them. BerkeleyDB is extremely fast for read-only access, if you use the BerkeleyDB module and avoid the tied interface.
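
      A rough sketch of the non-tied interface, assuming the BerkeleyDB module from CPAN is installed. The file name and keys are made up for illustration, and the values are flat strings, so a nested structure would still need some encoding of its own:

```perl
use strict;
use warnings;
use BerkeleyDB;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/lookup.db";    # hypothetical file name

# One-off build step: write the data out once.
{
    my $db = BerkeleyDB::Hash->new(
        -Filename => $file,
        -Flags    => DB_CREATE,
    ) or die "cannot create $file: $BerkeleyDB::Error";

    $db->db_put("key$_", "value$_") for 1 .. 100;
    $db->db_close();
}

# Reader: open read-only and call db_get directly instead of going through tie.
my $db = BerkeleyDB::Hash->new(
    -Filename => $file,
    -Flags    => DB_RDONLY,
) or die "cannot open $file: $BerkeleyDB::Error";

my $value;
my $status = $db->db_get("key42", $value);    # status is 0 on success
print "key42 => $value\n" if $status == 0;

$db->db_close();
```

      In a mod_perl setup each child would presumably open the file once and keep the handle around, rather than reopening it per request.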