in reply to Re^5: Autoloading tie routines
in thread Autoloading tie routines

Please explain "Furthermore ABC looks like an example of the God module anti-pattern. That is a code smell right there." Colorful language!

Obviously there is a lot of stuff in XYZ::ABC, but I'm working toward a successor module to an existing CPAN module, which has the same Scalar, Hash, and BTree stuff. I admit to making it worse by adding the Array stuff. But hopefully the linker will treat the ABC.so file as a library, and only load what's called for.

"everything that the AUTOLOAD has done with those 6 packages"? (There are now 8 because I added BTree as an alias for Hash, to be more compatible with the predecessor module.) Far as I know, AUTOLOAD has done nothing but be there to field attempts to use unknown constants and subroutines. The subsidiary package names cannot be use'd, and exist only to be used in tie statements.

There is no "use AutoLoad" in ABC.pm any more. ABC.pm's AUTOLOAD is free-standing, and passes the tie calls to subs in ABC.xs that are also available as direct calls to users of XYZ::ABC.

Rather than define 82 names in the symbol table at the start of execution, the compromise course between your routine and the direct-through AUTOLOAD that for the moment is still shown below, is for AUTOLOAD to put each name for which it is called in the symbol table (thus eliminating future calls to AUTOLOAD for this name) and then go to the newly defined sub. It will need to make an anonymous sub that dereferences the first operand from perl to match what the destination XS routine expects.
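A minimal sketch of that compromise, as I understand it. Everything named Demo or demo_ here is hypothetical: a pure-Perl demo_scalar_fetch stands in for the XS routine, and the counter exists only to show that AUTOLOAD is entered once per name.

```perl
use strict;
use warnings;

package Demo::Tie::Scalar;   # hypothetical package name, for illustration only

our $AUTOLOAD;
our $autoload_calls = 0;     # counts AUTOLOAD invocations, to show the caching

# Hypothetical pure-Perl stand-in for an XS routine; it expects the
# dereferenced base value as its first argument.
sub demo_scalar_fetch { my ($base) = @_; return "fetched from $base" }

sub TIESCALAR {
    my ($class, $base) = @_;
    return bless \$base, $class;
}

sub AUTOLOAD {
    $autoload_calls++;
    (my $name = $AUTOLOAD) =~ s/.*:://;
    return if $name eq 'DESTROY';           # nothing to clean up

    my $target = __PACKAGE__->can('demo_scalar_' . lc $name)
        or die "$AUTOLOAD is not a known tie method";

    # Wrapper: replace the object with the value it refers to, then
    # hand control to the backend routine.
    my $wrapper = sub {
        my $self = shift;
        unshift @_, $$self;
        goto &$target;
    };

    no strict 'refs';
    *$AUTOLOAD = $wrapper;                  # install under the requested name
    goto &$wrapper;                         # and serve this first call too
}

package main;

tie my $t, 'Demo::Tie::Scalar', 'base-object';
my $first  = $t;     # FETCH is routed through AUTOLOAD
my $second = $t;     # FETCH is now found in the symbol table
print "$first\n$second\nAUTOLOAD calls: $Demo::Tie::Scalar::autoload_calls\n";
```

The wrapper here closes over a code reference rather than a name, which is one possible choice; looking the sub up by name at call time would work as well.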

If I wasn't "confused about how this works" I wouldn't be posting to perlmonks. ABC.pm does not have an import method. It has an export method that used to include AUTOLOAD in @EXPORT_OK. But I took the use's out of the subsidiary packages, and no longer export AUTOLOAD. Thank you for this!

We must be careful about commenting on the clarity of each other's code. I find your routine above quite readable, until I come to *name = thing; which I have not used. Based on your code I made several improvements to my AUTOLOAD routine, which looks like this PENDING THE "STORE THE NAME" IMPROVEMENT NOTED ABOVE:
# AUTOLOAD is used to
#   1) 'autoload' constants from the constant() function in ABC.xs
# If the name is not a constant then it's parsed for
#   2) a tie package-name::function-name, which if matched is executed

our $AUTOLOAD;   # implicit argument of AUTOLOAD

sub AUTOLOAD {
    # make the base name (without the "package::")
    (my $constname = $AUTOLOAD) =~ s/.*:://;

    # call the constant lookup routine in ABC.xs
    my $val = constant($constname, 0);
    if ($!) {
        # the name in $AUTOLOAD is not a constant defined by XYZ::ABC
        # sah = scalar/array/hash
        if (my ($abcx, $sah, $function) =
                $AUTOLOAD =~ /^XYZ::(ABCA?)::(Scalar|Array|Hash|BTree)::([A-Z]+)$/) {
            if ($sah eq 'BTree') {$sah = 'Hash'}
            if ($function eq uc("TIE$sah")) {
                my $self = shift;
                my $base_sah = shift;             # sah = scalar/array/hash
                return bless \$base_sah, $self;   # Scalar or Array or Hash
            }
            elsif ($function eq 'FETCH' || $function eq 'STORE'
                || $sah ne 'Scalar'               # Array or Hash
                   && $function =~ /^(DELETE|EXISTS|CLEAR)$/
                || $sah eq 'Array'
                   && $function =~ /^(FETCHSIZE|STORESIZE|EXTEND|POP|PUSH|SHIFT|UNSHIFT|SPLICE)$/
                || $sah eq 'Hash'
                   && $function =~ /^(FIRSTKEY|NEXTKEY|SCALAR)$/) {
                $function =~ s/KEY$/_KEY/;
                my $subname = lc($abcx) . '_' . lc($sah) . '_' . lc($function);
                my $base_sah_ref = shift;
                unshift @_, $$base_sah_ref;   # dereference the base scalar/array/hash
                goto &$subname;
            }
            elsif ($function eq 'UNTIE' || $function eq 'DESTROY') {
                return;   # do nothing
            }
        }
        croak "$AUTOLOAD is not a defined constant or subroutine for XYZ::ABC";
    }
    # the name in $AUTOLOAD is a constant defined by XYZ::ABC: define it for perl
    eval "sub $AUTOLOAD { $val }";   # can this be $constname rather than $AUTOLOAD?
    goto &$AUTOLOAD;   # can this just be 'return' if all of the defined names really are constants?
}
I would appreciate comments on the last two questions in its comments.

Thanks for being there,
cmac
www.animalhead.com

Re^7: Autoloading tie routines
by tilly (Archbishop) on Feb 03, 2009 at 21:41 UTC
    For more on the God antipattern see http://en.wikipedia.org/wiki/God_object. It refers to the case when one class or module tries to take on too many jobs at once. This defeats the entire point of modularity, which is to enable code to be broken into modular pieces, each of which has reasonable complexity.

    If you do not know what benefits modularity is supposed to bring, or what good modularity looks like, please pick up a book like Code Complete 2. It is a large and important topic, and one I certainly don't have time to go into in any depth today.

    Moving on, it sounds like you are micro-optimizing on memory usage. The vast majority of mod_perl sites should just use a reverse proxy in accelerator mode and then not concern themselves overly much with memory needs. See this guide for an overview of what that is, and how to do it. (Yes, it is a 1.0 specific guide. But the advice still applies pretty well to 2.0. BTW I really, really hope you're using pre-fork mode with 2.0...)

    That said, I don't know the exact amount of memory needed to have 100 entries in a symbol table, but it is going to be at most a few K. (The functions themselves take up no space because they already exist.) If you are having trouble with running out of memory, this is not where you should look for improvements.

    Moving on, the syntax *foo = $coderef; is the standard way to manipulate the package table and install a function. See perlmod for more examples. It is certainly better than the eval solution that you are using in your AUTOLOAD. Better in what way, you ask? Well for a start, if the code you are generating is wrong, then the eval will silently fail and you won't get a good error message. Real example. If one of your constants is supposed to be "Hello, world" then your eval will fail, no error will be reported, and your AUTOLOAD is going to blow up. So compare:

    # This might break badly depending on $val
    eval "sub $AUTOLOAD { $val }";

    # This will work no matter what $val happens to be.
    no strict 'refs';
    *$AUTOLOAD = sub { $val };
    Which means that this is a piece of syntax you really should learn.
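To make the comparison concrete, here is a small sketch you can run. The package name Demo::Const and the sub names GREETING_A and GREETING_B are made up for illustration; the value is the "Hello, world" case from above, and strict is assumed to be on.

```perl
use strict;
use warnings;

package Demo::Const;    # hypothetical package name

my $val = "Hello, world";

# String eval: the value is pasted into source text, giving
#   sub GREETING_A { Hello, world }
# which does not compile under strict. The eval returns undef and
# sets $@, but nothing is reported unless the caller checks.
my $ok         = eval "sub GREETING_A { $val } 1";
my $eval_error = $@;

# Glob assignment: the closure returns the value itself, so the
# content of $val is never parsed as code.
{
    no strict 'refs';
    *{"Demo::Const::GREETING_B"} = sub { $val };
}

package main;

print "eval succeeded? ", ($ok ? "yes" : "no"), "\n";
print "eval error: $eval_error" if $eval_error;
print "GREETING_B returns: ", Demo::Const::GREETING_B(), "\n";
```

If you do keep a string eval somewhere, checking its return value (or $@) right away, as above, is what turns a silent failure into a visible one.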

    Finally we come to your two questions. Yes, you can replace $AUTOLOAD with $constname if you are in the right package. Secondly the goto will go to the new function you just defined if it was properly created by eval, and it just returns a value. So you can replace one by the other. However as your code currently stands, that goto is the only sign you are given that your code broke if it broke.

      The eval was inherited (in the older meaning of the term) from the predecessor package, which was written in 1998. I've happily changed the second-last code line to *$AUTOLOAD = sub{$val};. In the next statement, I forgot that the object of using a constant name was to get the constant's value, which is why the goto is needed. I may be getting too old for this.

      I was not familiar with "God module/antipattern" terminology. Having been in programming since '63 I value modularity highly, and would never create/architect a module like the one I'm working to replace.

      But having inherited (in the classic sense) a bloated package, and knowing how much compatibility means to many people, at the moment I'm keeping compatibility. Not that the older package probably has many users: the problems I know about probably preclude that.

      On the other hand, having 8 packages inherit from XYZ::ABC may be OK this once, seeing as the only thing "in" each package is a 1;.

      Last question for now: to pursue the compromise course, my instantaneous version of this AUTOLOAD contains:
      my $subname = lc($mmx) . '_' . lc($sah) . '_' . lc($function);
      *$AUTOLOAD = sub {
          my $base_sah_ref = shift;
          # dereference the base "sah"
          unshift @_, $$base_sah_ref;
          goto &$subname;   # or &{*$subname{CODE}} ??
      };
      goto &$AUTOLOAD;
      Is Perl smart enough to capture the current value of $subname in the anonymous sub, rather than use a reference to a local variable that will soon go away? Is this one of those "reference count" things wherein perl is superior to C?

      cmac
      www.animalhead.com
        The goto is not needed; you can just return $val.
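A rough sketch of the constant branch written that way. The package Demo::Consts and the %table lookup are hypothetical stand-ins for your module and for constant() in the XS file.

```perl
use strict;
use warnings;

package Demo::Consts;   # hypothetical package name

our $AUTOLOAD;

# Hypothetical stand-in for the constant() lookup done in XS.
my %table = (DEMO_GREETING => 'Hello, world', DEMO_LIMIT => 42);

sub AUTOLOAD {
    (my $constname = $AUTOLOAD) =~ s/.*:://;
    return if $constname eq 'DESTROY';
    die "$AUTOLOAD is not a defined constant"
        unless exists $table{$constname};

    my $val = $table{$constname};
    no strict 'refs';
    *$AUTOLOAD = sub { $val };   # later calls bypass AUTOLOAD
    return $val;                 # this call already has the value in hand
}

package main;

my $first  = Demo::Consts::DEMO_GREETING();   # goes through AUTOLOAD
my $second = Demo::Consts::DEMO_GREETING();   # calls the installed sub
my $limit  = Demo::Consts::DEMO_LIMIT();
print "$first / $second / $limit\n";
```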

        To answer your question about your later code sample, since you are using variables declared with my, Perl will correctly capture the current value in the anonymous subroutine. This is called a closure.
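A small sketch of the capture, with hypothetical demo_hash_* subs standing in for your XS routines. Each pass through the loop creates a fresh $subname, and each anonymous sub keeps the one it was created with even after the loop has ended.

```perl
use strict;
use warnings;

# Hypothetical backend routines standing in for the XS subs.
sub demo_hash_fetch { return "fetch(@_)" }
sub demo_hash_store { return "store(@_)" }

my %wrapper;
for my $function (qw(FETCH STORE)) {
    my $subname = 'demo_hash_' . lc $function;   # a fresh lexical on every pass
    $wrapper{$function} = sub {
        my $base_ref = shift;
        unshift @_, $$base_ref;                  # dereference the base value
        no strict 'refs';
        goto &$subname;                          # looked up by name at call time
    };
}
# $subname is out of scope here, yet each closure still holds its own.

my $base      = 'db';
my $got_fetch = $wrapper{FETCH}->(\$base, 'key');
my $got_store = $wrapper{STORE}->(\$base, 'key', 'value');
print "$got_fetch\n$got_store\n";
```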

        To address an issue commented on in your code comment, Re: what is the difference between *a and *a{GLOB}? explains why &$subname is the same as &{*$subname{CODE}}.
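If you want to convince yourself, a quick check along these lines should do it. The sub demo_hash_fetch is a hypothetical stand-in, and I have written the glob lookup with explicit braces, *{$subname}{CODE}, to keep the parse unambiguous.

```perl
use strict;
use warnings;

sub demo_hash_fetch { return "fetch(@_)" }   # hypothetical stand-in

my $subname = 'demo_hash_fetch';
my ($by_name, $by_slot, $same_ref);
{
    no strict 'refs';
    $by_name  = &$subname('k');                  # symbolic call by name
    $by_slot  = &{ *{$subname}{CODE} }('k');     # explicit CODE slot of the glob
    $same_ref = \&$subname == *{$subname}{CODE}; # both refer to the same sub
}
print "same result\n"         if $by_name eq $by_slot;
print "same code reference\n" if $same_ref;
```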

        Going back to closures, a classic book to learn programming techniques using them is MJD's book Higher Order Perl. Or see Why I like functional programming (long) and Re (tilly) 1 (perl): What Happened...(perils of porting from c) (short) for a couple of examples I wrote that you can puzzle through. I'm a fan of techniques that use closures, but only if I am working with people who understand them. There is a definite learning curve in getting used to them.

        Closures can be used in most languages which support lexical scope and anonymous subroutines. That list includes most current dialects of Lisp (the one in emacs is the major exception), most scripting languages (including Perl, JavaScript, Ruby and Python though Python makes it harder than it should be), and virtually any language that calls itself a functional language. Closures are now making their way into mainstream languages. (They are in C#, and are a likely future addition to Java.)