spurperl has asked for the wisdom of the Perl Monks concerning the following question:

Fellow monks,

There's an interesting issue I ran into, not for the first time:

My module contains some hash that maps data. It is constant and is changed only by the programmer, for example:

my %opcode_map = (
    'LDA' => [8, 0],
    'STA' => [14, 1],
    ... ...
);

The 'my' makes this map private to the module, as it should be (only functions from the module use it, it's never exported). However, when a table contains many duplications that can be generated automatically, I'm not sure what is the best way to generate the entries.

For instance, there are keys 'LD1', 'LD2', ... up to 'LD6', with the opcode also following a pattern that can be easily generated. So, instead, I place 'LDi' in the table, and then a loop replaces each i by 1..6 and adds all the relevant entries to the map.
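A minimal sketch of what such a loop might look like. The `%opcode_templates` name, the trailing-`i` convention, and the "base opcode plus i" pattern are assumptions made up for illustration, not the real table:

```perl
use strict;
use warnings;

# Entries written out by hand.
my %opcode_map = ( 'LDA' => [ 8, 0 ] );

# Hypothetical templates: a trailing 'i' in the key stands for 1..6.
my %opcode_templates = ( 'LDi' => [ 8, 0 ] );

for my $template ( sort keys %opcode_templates ) {
    my ( $base, @rest ) = @{ $opcode_templates{$template} };
    for my $i ( 1 .. 6 ) {
        ( my $name = $template ) =~ s/i$/$i/;      # 'LDi' becomes 'LD1', 'LD2', ...
        $opcode_map{$name} = [ $base + $i, @rest ]; # assumed pattern: base + i
    }
}

print join( ' ', sort keys %opcode_map ), "\n";
```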

In the past, I just slapped the code "in the wild" in the module's scope. It runs when the module is "use"-d, and that's it. However, it doesn't look "clean" to me, and I wonder whether there are better ways?

Re: code that runs at module loading
by Zaxo (Archbishop) on Dec 07, 2004 at 08:10 UTC

    You can make sure variable initialization code is executed at compile time with either a BEGIN{} block or with use constant ...
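A rough sketch of both options, using made-up opcode values. `OPCODE_FOR` and `$seen_at_compile_time` are hypothetical names for illustration. Note that `use constant` only fixes the reference; the hash it points to can still be modified.

```perl
use strict;
use warnings;

# Option 1: declare the lexical, then fill it inside BEGIN so the
# assignment happens at compile time.
my %opcode_map;
BEGIN {
    %opcode_map = ( LDA => [ 8, 0 ], STA => [ 14, 1 ] );
}

# Option 2: a constant holding a hash reference (hypothetical name).
use constant OPCODE_FOR => { LDA => 8, STA => 14 };

# A later BEGIN block already sees the populated hash.
my $seen_at_compile_time;
BEGIN { $seen_at_compile_time = scalar keys %opcode_map }

print "entries visible at compile time: $seen_at_compile_time\n";
print "STA via constant: ", OPCODE_FOR->{STA}, "\n";
```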

    Your initialization code itself looks fine to me. Even if there is a clever way to generate some of the pairs, it's probably better to avoid anything tricky. Init code is only run once. That opinion might change if there are a gazillion pairs in the hash.

    After Compline,
    Zaxo

Re: code that runs at module loading
by eyepopslikeamosquito (Archbishop) on Dec 07, 2004 at 08:49 UTC

    From your description, I'd say what you're doing is 100% ok. Putting "my" variables in the module is just normal modular data hiding and should work fine with both "use" (compile time) and "require" (run time) in the common case where you don't need the data to be known at compile time. If you need it at compile time (e.g. to push onto @INC), simply enclose in a BEGIN block.

    Since data hiding is a good thing, be sure to declare these "in the wild" variables in the smallest scope you can. In simple cases, organise your module so that the "my" globals and the functions that use them are stuffed at the very end of the file. Alternatively, put a "my" variable and the function/s that use it together in a bare block. For example:

    # some module code
    # ...
    {
        my %some_global = ( ... );
        sub fred {
            # fred() is the only function in the module
            # that uses %some_global
            # ...
        }
    }
    # the %some_global variable is not in scope here.
    # ... rest of module
    Note that fred() is globally known, while %some_global is not. Again, this is just plain old hiding of data into the smallest scope you can. Notice that this is a simple "modular" style of programming. As a good rule of thumb, ask the question "do I need more than one of these?"; if the answer is yes, use O-O style, otherwise you can often get away with the simpler modular style described above.
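A small runnable version of the bare-block pattern, as a sketch. The `%greeting_for` hash and `greet()` function are invented for the example.

```perl
use strict;
use warnings;

{
    # Visible only inside this bare block.
    my %greeting_for = ( en => 'hello', fr => 'bonjour' );

    # Named subs live in the package, so greet() is callable from outside
    # the block even though the hash is not visible there.
    sub greet {
        my ($lang) = @_;
        return $greeting_for{$lang};
    }
}

# %greeting_for is out of scope here; only greet() can reach it.
print greet('fr'), "\n";
```

One caveat: the assignment to the hash runs when execution reaches the block. In a module that happens while the module is being loaded; in a script, calling the function before the block has run would find the hash still empty.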

Re: code that runs at module loading
by conrad (Beadle) on Dec 07, 2004 at 10:57 UTC

    On your main question, since it needs to be initialised somewhere along the line anyway, I don't think that where the code is really matters — unless of course, other modules might be calling your module (and hence needing that data initialised already) from their own BEGIN { } section, in which case you have to place the initialisation in your own BEGIN too…
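To make the ordering issue concrete, here is a single-file sketch. The `My::Opcodes` package and all variable names are hypothetical, and a second package in the same file stands in for a separate module being called from another module's BEGIN block.

```perl
use strict;
use warnings;

package My::Opcodes;    # hypothetical module name

my %plain = ( LDA => 8 );    # assignment runs at run time
my %early;
BEGIN { %early = ( LDA => 8 ) }    # assignment runs at compile time

sub plain_count { return scalar keys %plain }
sub early_count { return scalar keys %early }

package main;

# Simulates a caller that uses the data from its own BEGIN block.
my ( $plain_at_begin, $early_at_begin );
BEGIN {
    $plain_at_begin = My::Opcodes::plain_count();
    $early_at_begin = My::Opcodes::early_count();
}

print "plain hash entries seen in BEGIN: $plain_at_begin\n";
print "BEGIN-filled entries seen in BEGIN: $early_at_begin\n";
print "plain hash entries at run time: ", My::Opcodes::plain_count(), "\n";
```

The plain hash is declared but still empty when the caller's BEGIN block runs, while the one filled inside BEGIN is already populated.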

    Tiny thing that you may well already be using (actually looking at your words again, maybe not), but for those autogeneratable opcodes, you could do something like this:

    my %opcode_map = (
        a => [0],
        (map +( "LD$_" => [1, $_], "TST$_" => [2, $_] ), 1 .. 6),
        b => [1]
    );

    Ignore the a and b entries, they're just there for context — you can use map to replace 1 .. 6 with the computable opcode entries (note use of +( ) to prevent interpretation of the map as a function-style call).
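For comparison, a sketch of the same table using the block form of map. The inner parentheses are there so that Perl parses the braces as a block rather than guessing an anonymous hash; the opcode values are the placeholder ones from the snippet above.

```perl
use strict;
use warnings;

my %opcode_map = (
    a => [0],
    # Outer parens stop map from swallowing the "b => [1]" that follows.
    ( map { ( "LD$_" => [ 1, $_ ], "TST$_" => [ 2, $_ ] ) } 1 .. 6 ),
    b => [1],
);

printf "%d entries; LD3 => [%s]\n",
    scalar( keys %opcode_map ),
    join( ',', @{ $opcode_map{LD3} } );
```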

    Sounds like your LDi notation might be a bit clearer though :-)

Re: code that runs at module loading
by sgifford (Prior) on Dec 07, 2004 at 17:22 UTC
    One technique I use sometimes is to write another very small program to generate Perl code to create the constant hash, then paste the output of that program into your main program. Something like:
    print "my %opcode_map = (\n";
    print "LD$_ => [".join(',',$_*3,$_*5,$_*7)."],\n" foreach (1,2,3);
    print ");\n";

    Also, note that you should use parentheses not squiggly brackets to set up your hash, or else you may not get what you expect; use warnings will warn you about that.
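A variation on the generator idea, sketched with the core Data::Dumper module so the quoting and punctuation of the emitted code are handled for you. The `%generated` and `$perl_source` names are hypothetical, and the 3/5/7 multiples are just the placeholder values from the snippet above.

```perl
use strict;
use warnings;
use Data::Dumper;

# Build the table in memory first (placeholder values).
my %generated = map { ( "LD$_" => [ $_ * 3, $_ * 5, $_ * 7 ] ) } 1 .. 3;

local $Data::Dumper::Sortkeys = 1;    # stable key order in the output
local $Data::Dumper::Indent   = 1;    # simple fixed indentation

# A name starting with '*' makes Dumper emit a hash assignment
# ("%opcode_map = ( ... );") instead of a scalar holding a reference.
my $perl_source = 'my ' . Data::Dumper->Dump( [ \%generated ], ['*opcode_map'] );

print $perl_source;
```

The printed text can then be pasted into the module as the hash definition.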

      Having the script that uses the constant generate the code at compile time (in a BEGIN block) is one of the wonderful features of Perl that distinguishes it from other languages. Why not make use of it?

        I was just throwing out a technique I use sometimes to see if the OP might find it useful; it's not necessarily the best strategy here. It's useful if the constants take a long time to generate, or if they're actual use constant declarations, which Perl can optimize away at compile-time if it knows them beforehand.