in reply to Sharing data structures in mod_perl

Sharing perl data between processes without serialization is not possible. Even putting a simple scalar into shared memory involves serialization, since the scalar has to be converted from a perl data structure to a simple string of bytes and back again.

What makes you so sure that serialization is the problem? Storable is very fast. It sounds to me like the problem is that you are serializing the entire data structure every time, instead of just the one tiny chunk of it that you need to look at.

Your problems with the staleness check sound like some kind of bug in your code. There is no problem with globals in mod_perl, or references, or file mod times. I couldn't say more without seeing the code. Anyway, as I said, storing and loading the entire hash is not an efficient way to do this.

I'm presenting a paper on the most efficient data sharing modules at the Perl Conference this year, but I'm not done with my benchmarking yet so I can't tell you the winner. I do recommend that you try MLDBM::Sync or Cache::Cache. Both of them give you a hash-like interface (MLDBM::Sync is a tied hash module, while Cache::Cache provides get/set methods for key/value pairs), and each element of the hash can contain arbitrary data structures. Don't stuff all your data into one element of the hash, or you'll defeat the purpose. The idea is to only de-serialize the small piece of data your program needs at any given moment.
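
To illustrate the per-element idea, here is a minimal sketch that uses only Storable's freeze and thaw, with a plain hash standing in for the shared store. In real use the store would be something like an MLDBM::Sync tied hash or a Cache::Cache object; the %store hash, the key names, and the sample records are made up for illustration.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# Hypothetical application data: many small records keyed by id.
my %data;
$data{"user$_"} = { id => $_, name => "name$_", tags => [ 1 .. 5 ] } for 1 .. 1000;

# Stand-in for a shared store (a tied DBM hash or a cache would go here).
# Each element is frozen separately, so each can be thawed on its own.
my %store;
$store{$_} = freeze( $data{$_} ) for keys %data;

# A request that needs one record thaws only that record.
my $record = thaw( $store{user42} );
print "Fetched $record->{name} from ", length( $store{user42} ), " bytes\n";

# For comparison, the anti-pattern: everything frozen into one element,
# so every lookup has to thaw the whole structure.
my $blob = freeze( \%data );
print "Whole structure is ", length($blob), " bytes\n";
```

The difference in the two byte counts is roughly the amount of extra work each lookup does when all the data lives in a single element.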

Hope that helps. When I have more information about which data sharing modules are fastest, I'll post some data about it on perlmonks.

Re: Re: Sharing data structures in mod_perl
by Hero Zzyzzx (Curate) on Mar 28, 2002 at 17:14 UTC

    With the staleness check above, I was trying to NOT serialize the data, therefore storing a reference in a global shared among httpd processes. This wasn't working, obviously, and for good reason.

    I agree that serializing is fast, except when you are trying to serialize too large a structure. (We're talking foolishly large here. My bad. Basically, I was serializing/deserializing a structure that was MUCH larger than it needed to be. I wouldn't think to do this with a DBI query; I don't know why I thought it'd be OK with Storable.) I'm probably going to move toward splitting my structure into MUCH smaller chunks and still serializing it, or back to straight DBI. I haven't decided; it will depend on the results of some benchmarking I need to whip up, and on the opportunity cost of switching my code.

    -Any sufficiently advanced technology is
    indistinguishable from doubletalk.

      With the staleness check above, I was trying to NOT serialize the data, therefore storing a reference in a global shared among httpd processes.

      Yeah, you can't do that. Globals are not shared between processes. That's not a mod_perl thing; it's just how processes work.
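
      A small standalone sketch of that, using plain fork rather than Apache (the variable name and strings are made up for illustration): the child changes a global, and the parent's copy stays as it was.

      ```perl
      use strict;
      use warnings;

      our $shared = 'original';    # a global, as you might have in a handler

      pipe( my $reader, my $writer ) or die "pipe failed: $!";

      my $pid = fork();
      die "fork failed: $!" unless defined $pid;

      if ( $pid == 0 ) {
          # Child: change its own copy of the global and report it.
          close $reader;
          $shared = 'changed in child';
          print {$writer} $shared;
          close $writer;
          exit 0;
      }

      # Parent: the child's assignment never reaches this process.
      close $writer;
      my $from_child = do { local $/; <$reader> };
      close $reader;
      waitpid( $pid, 0 );

      print "child saw:  $from_child\n";
      print "parent has: $shared\n";
      ```

      The only way the parent learns what the child did is through the pipe, which is just a string of bytes. A reference would have to be serialized to cross it.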

      One more tip: others have seen great results from Cache::Mmap. You may want to look at that.

      Did you try benchmarking the serialization in memory without any disk I/O?

      In my experience Storable is amazingly fast and it was the disk that quickly became the bottleneck.
      But then I was only using ~ 5k structures.
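
      Something along these lines would time Storable purely in memory, using the core Benchmark module. This is only a sketch; the structure sizes and names are made up, so substitute your real data.

      ```perl
      use strict;
      use warnings;
      use Benchmark qw(timethese);
      use Storable qw(freeze thaw);

      # Hypothetical test data: one small record and one large structure.
      my $small = { id => 1, name => 'example', tags => [ 1 .. 20 ] };

      my %large;
      $large{"key$_"} = { id => $_, tags => [ 1 .. 20 ] } for 1 .. 5000;

      # Freeze once up front so the thaw tests measure thawing only.
      my $small_frozen = freeze($small);
      my $large_frozen = freeze( \%large );

      # A negative count means "run each test for at least that many CPU seconds".
      timethese( -1, {
          freeze_small => sub { freeze($small) },
          thaw_small   => sub { thaw($small_frozen) },
          freeze_large => sub { freeze( \%large ) },
          thaw_large   => sub { thaw($large_frozen) },
      } );
      ```

      Nothing here touches the disk, so whatever the numbers come out as, they are the cost of serialization alone.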