mhearse has asked for the wisdom of the Perl Monks concerning the following question:

A question regarding mod_perl. I've written one main script which I'm using for several different purposes. For each different case, first I declare a global hash 'our %var', then I require a config file which populates the hash with the values I need for this situation. When using mod_perl, I've heard that it's bad to declare global variables and require files (it can cause memory leaks). Is this in fact a problem? If so, are there other ways to handle it? Thanks.

Replies are listed 'Best First'.
Re: mod_perl memory question
by Tanktalus (Canon) on Jun 02, 2006 at 03:42 UTC

    What mod_perl does to speed up your script is to a) load the perl interpreter into memory, and b) load and compile your perl code into memory. (That's a bit of a simplification, but it should suffice.)

    As part of compiling perl code, anything in a BEGIN block is run. This allows you to do things like load a config file. Or import subs. Or even "use" other modules (since use implies a BEGIN block).

    So, if you have code such as:

    package My::Config;
    our %var = load_config();
    sub load_config {
        #...
    }
    what will happen, assuming that My::Config is loaded at all, is that during apache's initialisation, you will load your configuration. And then, apache will fork.
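    To make the load-then-fork sequence concrete, here is a rough plain-Perl sketch with no Apache involved. It assumes a Unix-like system where fork() is available, and the contents of %var are made up for illustration. The hash is filled once before the fork, the child changes its own copy, and the parent's copy is unaffected.

```perl
use strict;
use warnings;

# Stand-in for configuration loaded once, before the server forks
# (hypothetical contents).
our %var = ( root_dir => '/tmp' );

pipe( my $reader, my $writer ) or die "pipe: $!";

my $pid = fork();
die "fork: $!" unless defined $pid;

if ( $pid == 0 ) {
    # Child: it starts with a copy of %var and changes only that copy.
    close $reader;
    $var{root_dir} = '/changed';
    print {$writer} $var{root_dir};
    close $writer;
    exit 0;
}

# Parent: read what the child saw, then look at our own copy.
close $writer;
my $child_view = do { local $/; <$reader> };
close $reader;
waitpid $pid, 0;

print "child saw $child_view, parent still has $var{root_dir}\n";
```

    Each Apache child is in the same position as the child here: it inherits whatever was loaded before the fork and then goes its own way.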

    In each child, %var will be private to that child, but not necessarily private to that connection. So, if a single child handles 30 different requests before dying, any changes you make to %var in the first request will be visible to the next 29 - by that same child. Other requests can come in and be handled by other children which won't see that change (but will see their own in a similar manner).
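    A small simulation of that effect, as a sketch only: handle_request() is a hypothetical stand-in for whatever code one child runs once per request, and the hash contents are invented. Calling it three times in one process plays the part of one child serving three requests.

```perl
use strict;
use warnings;

# Stand-in for a global populated once at startup (hypothetical data).
our %var = ( greeting => 'hello', hits => 0 );

# Hypothetical handler: one child would call this once per request.
sub handle_request {
    my ($name) = @_;
    $var{hits}++;                                # survives the request
    $var{greeting} = 'hi' if $name eq 'bob';     # so does this change
    return "$var{greeting}, $name (request #$var{hits} in this child)";
}

# Three "requests" served by the same process.
my @out = map { handle_request($_) } qw(alice bob carol);
print "$_\n" for @out;
```

    The third request never asked for 'hi', yet it gets it, because the second request changed the global and nothing reset it.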

    So it's very important for consistency, and, more importantly, for ease of coding and debugging, that anything you do in the BEGIN phase is immutable. That is, it won't get changed during the run of your application. Things like loading modules are generally immutable. Loading a configuration file often is as well. However, if that configuration hash is modified at runtime, then it's no longer immutable.
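    One way to have perl enforce that immutability, offered as a sketch rather than as what the original poster must do: the core module Hash::Util provides lock_hash(), which makes later writes to the hash die. The keys and values below are invented for illustration.

```perl
use strict;
use warnings;
use Hash::Util qw(lock_hash);

# Hypothetical configuration, filled in once at load time.
our %var = ( root_dir => '/srv/app', default_lang => 'en' );

# From here on, any attempt to change or add a key dies.
lock_hash(%var);

# A request handler that tries to modify the config is caught at once.
my $ok = eval { $var{root_dir} = '/elsewhere'; 1 };
print $ok ? "write succeeded\n" : "write refused: $@";
print "root_dir is still $var{root_dir}\n";
```

    Note that lock_hash() is shallow: nested hashes or arrays inside %var stay writable. It also rules out using the same hash as a cache, so any cached values would need a separate, unlocked hash.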

    That, too, is a simplification. If you are writing things to global values that are calculated from immutable values, and are merely caching them, then that's fine, too. But it then becomes important to know what is really request-dependent vs request-independent. For example, this is independent:

    sub get_translation_dir {
        my $lang = shift;

        # have we encountered this before?
        return $vars{"translation_dir_$lang"} if $vars{"translation_dir_$lang"};

        # I think languages are sent as semicolon-separated values, e.g.
        #   en;en-US for someone who only speaks English ... or
        #   fr;fr-FR;en-FR;en for someone who prefers French, but will accept English.
        # modify if that's wrong.
        my @choices = split /;/, $lang;
        for (@choices) {
            # have we seen this one before?
            return $vars{"translation_dir_$lang"} = $vars{"translation_dir_$_"}
                if $vars{"translation_dir_$_"};

            # no? Do we have such a translation?
            if ( -d "$vars{root_dir}/$_" ) {
                # save it to the cache, and return.
                return $vars{"translation_dir_$lang"} = $vars{"translation_dir_$_"}
                    = "$vars{root_dir}/$_";
            }
        }

        # if we get this far, try no language. An empty $lang falls back to
        # the root dir (an assumption) so that the recursion terminates.
        return $vars{root_dir} if $lang eq '';
        return get_translation_dir('');
    }
    Whereas anything that uses CGI parameters is request-dependent. And anything coming from a database ... may or may not be. You have to decide what is invariant.

    Hope that helps.

    PS: above code is untested, even uncompiled, and, even if it were to work the way I think it should, it may not work with the way you handle translated content.

      Some notes:

      When I started my mod_perl app framework, I tried BEGIN blocks too. Unfortunately, I do not remember why I abandoned the idea. My part of httpd.conf looks like this:

      # apache 1.3
      PerlSetEnv FRAMEWORK_HOME /usr/local/framework
      <perl>
        use lib "$ENV{FRAMEWORK_HOME}/perl";
        use MyFramework ();
        MyFramework::new();
        # this new() writes the created instance into
        # $MyFramework::inst, which is our :)
      </perl>
      As for global vars, I use them with this rule: write them when Apache starts (everything happens inside MyFramework::new()) and never later. That gives the children a chance to share the memory the globals occupy. I use $ENV{runtime} for volatile (request-related) data.
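      A rough sketch of that rule in plain Perl. The internals of MyFramework are not shown in this post, so everything inside the package below is hypothetical, and the sketch keeps per-request data in a lexical %runtime hash rather than in $ENV{runtime}.

```perl
use strict;
use warnings;

# Mimics the PerlSetEnv line; the path is the one from the post.
$ENV{FRAMEWORK_HOME} ||= '/usr/local/framework';

package MyFramework;

our $inst;    # written once at startup, read-only afterwards by convention

# Hypothetical constructor: builds the instance once and stores it in $inst.
sub new {
    $inst ||= bless { home => $ENV{FRAMEWORK_HOME} }, __PACKAGE__;
    return $inst;
}

sub home { return $_[0]{home} }

package main;

MyFramework::new();    # the "server startup" step

# Hypothetical per-request handler: volatile data stays in a lexical.
sub handle_request {
    my ($user) = @_;
    my %runtime = ( user => $user );    # gone when the request ends
    return "$runtime{user} served from " . $MyFramework::inst->home;
}

print handle_request('alice'), "\n";
```

      Nothing in handle_request() writes to $MyFramework::inst, so data built at startup is never touched per request, and nothing from one request can leak into the next.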

      I found this question when I Super Searched for caching in Apache, so when I find something important about it, I'll link it here.

Re: mod_perl memory question
by jdtoronto (Prior) on Jun 01, 2006 at 22:54 UTC
    A little difficult to comment on without some pretty specific information.

    As a general rule we should only declare globals with considerable malice of forethought. Why do you need a global? Why not some other scope?

    I have never tried a 'require' in a mod_perl process, but again, why do a require? Why not contemplate loading a config file in some established format into a local data structure as needed, and freed when done?
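    A minimal sketch of that idea, assuming the config is a Perl file whose last expression is a hash reference (my assumption; any established format with a parser would do). The file contents, load_config() and handle_request() are all hypothetical. The config lives in a lexical and is freed when the sub returns.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a throwaway config file so the sketch is self-contained.
my ( $fh, $path ) = tempfile( SUFFIX => '.pl', UNLINK => 1 );
print {$fh} "+{ root_dir => '/srv/app', debug => 0 }\n";
close $fh or die "close: $!";

# Hypothetical loader: evaluates the file and returns its hash reference.
sub load_config {
    my ($file) = @_;
    my $conf = do $file;
    die "cannot parse $file: $@" if $@;
    die "cannot read $file: $!" unless defined $conf;
    die "$file did not return a hash reference" unless ref $conf eq 'HASH';
    return $conf;
}

# Hypothetical handler: the config is a lexical, not a global.
sub handle_request {
    my $conf = load_config($path);    # freed when this sub returns
    return "serving from $conf->{root_dir}";
}

print handle_request(), "\n";
```

    The price is re-reading the file on every request, so whether this beats loading once at startup depends on how large the config is and how often it changes.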

    As to actual memory leaks? Well, even though mod_perl processes are persistent, they are not infinitely persistent - they don't last forever. You really should do some benchmarking specific to your own situation. I would also advocate that you join one of the mod_perl lists at apache.org and ask these questions there as well; there are plenty of very experienced folk there who may not see your question here.

    jdtoronto