nmerriweather has asked for the wisdom of the Perl Monks concerning the following question:

i'm starting to port a rather large mess i wrote into an object oriented framework under mod_perl

an issue i've hit is dealing with something i can only describe as semi-dynamic data

in the project's former life, there were a few data structures i would read from disk on every execution, but write only with administration scripts (1/1000 executions)

the data is small and relatively stupid. for example, one file is a hash of city_states keyed by zipcode. this is mostly used for dropdown select menus. occasionally, a city is added or removed.

i can't figure out how to handle this under mod_perl. i don't like the idea of storing this information in a db and constantly querying it -- and reading it from file every execution seems silly.

i don't even know where to begin approaching this.

help?!?

Replies are listed 'Best First'.
Re: mod_perl and semi-dynamic data
by perrin (Chancellor) on Oct 28, 2003 at 05:43 UTC
    If the data is small enough, just cache it in memory. All you need to do is put it in a global, and it will stay there. To reduce the memory footprint, you can load this in your startup.pl before apache forks. This makes it shared (by copy-on-write) between all the apache processes.
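
    A minimal sketch of the global-variable approach described above. The package name My::CityData, the load function, and the pipe-delimited "zip|city|state" file layout are assumptions for illustration, not part of the original post. The temp file only stands in for the real data file so the example runs on its own.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical package that keeps the lookup table in a package global.
package My::CityData;

our %CITY_BY_ZIP;    # stays in memory for the life of the process

# Read "zip|city|state" lines from $file into %CITY_BY_ZIP.
# Returns the number of entries loaded.
sub load {
    my ($file) = @_;
    open my $in, '<', $file or die "Cannot open $file: $!";
    my %fresh;
    while (my $line = <$in>) {
        chomp $line;
        next unless length $line;
        my ($zip, $city, $state) = split /\|/, $line;
        $fresh{$zip} = [ $city, $state ];
    }
    close $in;
    %CITY_BY_ZIP = %fresh;
    return scalar keys %CITY_BY_ZIP;
}

package main;

# Stand-in for the real data file (assumed format).
my ($fh, $file) = tempfile(UNLINK => 1);
print {$fh} "3000|Melbourne|VIC\n2000|Sydney|NSW\n";
close $fh;

# In startup.pl this call would run once, before Apache forks.
my $count = My::CityData::load($file);

# A request handler would then only read the global.
my ($city, $state) = @{ $My::CityData::CITY_BY_ZIP{3000} };
print "$count entries; 3000 is $city, $state\n";
```

    Under mod_perl the package would normally live in its own .pm file, with the load call made from startup.pl so that the children inherit the already-populated hash.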

    If it's too big for that, just keep it in a file and load it each time. You can use Storable for reading and writing it, which is usually faster than normal parsing if it's a complex structure.
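
    A short sketch of the Storable round trip, assuming the data is a hash of array references like the zipcode example. The file name city_states.sto and the temp directory are made up for the demo.

```perl
use strict;
use warnings;
use Storable qw(nstore retrieve);
use File::Temp qw(tempdir);

# Hypothetical location for the serialized data.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/city_states.sto";

# Admin script side: write the whole structure in one call.
my %city_by_zip = (
    3000 => [ 'Melbourne', 'VIC' ],
    2000 => [ 'Sydney',    'NSW' ],
);
nstore(\%city_by_zip, $file) or die "Cannot store to $file";

# Request side: read it back as a hash reference.
my $loaded = retrieve($file);
print "2000 is $loaded->{2000}[0], $loaded->{2000}[1]\n";
```

    Storable also exports lock_store and lock_retrieve, which may be worth using if an admin script could write the file while a request is reading it.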

Re: mod_perl and semi-dynamic data
by liz (Monsignor) on Oct 28, 2003 at 07:40 UTC
    The question is really how fast you want to "see" your updates and what interruption in service you're willing to take.
    • In the simplest use case, you may want to load the entire database into memory at server startup, and restart the server (semi-)automatically whenever you make a change.

    • If you can live with updates only becoming available once per day, restart the server once every day at a quiet time.

    • Check on each request whether the data has been updated, and reload it only when it has. You'll take a performance hit from the time you make a change until you restart the server.
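
    The check-on-each-request option could be sketched roughly as follows, using the file's modification time as the change marker. The function name refresh_if_changed, the variable names, and the "zip|city|state" file layout are assumptions for illustration. The utime call only forces a visible timestamp change for the demo.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

our %city_by_zip;        # per-process cache
our $loaded_mtime = 0;   # mtime of the file when it was last loaded

# Reload the cache only if the file's mtime differs from the last load.
# Returns 1 if a reload happened, 0 otherwise.
sub refresh_if_changed {
    my ($file) = @_;
    my $mtime = (stat $file)[9];
    die "Cannot stat $file: $!" unless defined $mtime;
    return 0 if $mtime == $loaded_mtime;

    open my $in, '<', $file or die "Cannot open $file: $!";
    my %fresh;
    while (my $line = <$in>) {
        chomp $line;
        next unless length $line;
        my ($zip, $city, $state) = split /\|/, $line;
        $fresh{$zip} = [ $city, $state ];
    }
    close $in;

    %city_by_zip  = %fresh;
    $loaded_mtime = $mtime;
    return 1;
}

# Demo: a temp file stands in for the real data file.
my ($fh, $file) = tempfile(UNLINK => 1);
print {$fh} "3000|Melbourne|VIC\n";
close $fh;

my $first  = refresh_if_changed($file);   # first call loads the file
my $second = refresh_if_changed($file);   # unchanged: only a stat, no read

# Simulate an admin script adding a city.
open my $out, '>>', $file or die "Cannot append to $file: $!";
print {$out} "2000|Sydney|NSW\n";
close $out;
my $later = time + 10;
utime($later, $later, $file) or die "Cannot set mtime on $file: $!";

my $third = refresh_if_changed($file);    # mtime changed, so it reloads
print "reloads: $first $second $third; entries: ", scalar(keys %city_by_zip), "\n";
```

    Each Apache child would run this check and reload on its own, so the cost per request is one stat call, and after a change every child holds a private copy until the server is restarted.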

    Finally, there is one approach I once thought might be possible but have not tried myself, so I can't say whether it is any good. If you're running on *nix, it may be possible to use a signal to tell the parent process to re-read the database. Any new children would then automatically share the data with the parent again. For the moment this is just an idea; I'm not sure whether it will fly in general, or in your particular setup.

    The above may or may not be the same as a graceful restart of Apache: my experiences with graceful restarts and mod_perl have never been good, so I would personally not recommend using some type of graceful restart approach.

    Hope this helps.

    Liz

Re: mod_perl and semi-dynamic data
by Roger (Parson) on Oct 28, 2003 at 05:37 UTC
      and reading it from file every execution seems silly

    Perl is quite good at loading small files into internal structures quickly. Just a few lines of code and that's it. (And the performance penalty for reading it from file is minimal because the files are small and simple.)

        use strict;
        use Data::Dumper;

        chomp(my @data = <DATA>);
        my %states = map {
            my ($p,$c,$s) = split /\|/;
            $p => [ $c, $s ]
        } @data;
        print Dumper(\%states);

        __DATA__
        3000|Melbourne|VIC
        2000|Sydney|NSW

    and the output is

        $VAR1 = {
          '3000' => [ 'Melbourne', 'VIC' ],
          '2000' => [ 'Sydney', 'NSW' ]
        };

Re: mod_perl and semi-dynamic data
by adrianh (Chancellor) on Oct 28, 2003 at 13:42 UTC

    You might want to look at using things like Cache-Cache to handle your caching logic for you.

Re: mod_perl and semi-dynamic data
by nmerriweather (Friar) on Oct 28, 2003 at 16:04 UTC
    Thanks all!

    This is helping me.

    The data is ridiculously small: right now 4 files about 4k each.

    I'd like to avoid constantly reading from disk. I was hoping there would be some efficient way to write updated data to a global variable in memory that all existing apache processes could address, and store it to disk for the next time the server starts.