Glav has asked for the wisdom of the Perl Monks concerning the following question:

Oh Mighty and Venerable Monks of Perl Wisdom, I humbly ask for your advice when using anonymous subs, specifically, their impact and perils. I have spent time lately cleaning and updating several modules that contain a large amount of duplicate code, both in structure and functionality, save for a couple segments. For a fictional example of a problem similar to the one encountered, I present the following code.
sub DoTask1 { my ($object, @arguments) = @_; if (! $object->TaskIsDone() ) { $object->SetupTaskOne(); } if (! @arguments ) { @arguments = qw(ARG1 ARG2 ARG3); } return $object->CommonTask(@arguments); } sub DoTask2 { my ($object, @arguments) = @_; if (! $object->TaskIsDone() ) { $object->SetupTaskTwo(); } if (! @arguments ) { @arguments = qw(ARG6 ARG2 ARG1 ARG5); } return $object->CommonTask(@arguments); } # (...) sub DoTask10 { my ($object, @arguments) = @_; if (! $object->TaskIsDone() ) { $object->DoTask(); $object->DoSomeMoreSetup(); } if (! @arguments) { @arguments = qw(ARG3 ARG5 ARG7); } return $object->CommonTask(@arguments); }
The module ended up having many subs that look almost identical. What do we, as excellent developers do when we see a maintenance issue like this? The interface is set; people rely on these methods to be there; this makes a complete rewrite unacceptable. Instead we take what have and extract, reduce, and reuse. The first obvious thing we can do is to extract the construction of default arguments out.
sub Helper { my ($object, $defaults, @arguments) = @_; if (!@arguments) { @arguments = @$defaults; } return $object->CommonTask(@arguments); } sub DoTask1 { my ($object, @arguments) = @_; if (! $object->TaskIsDone() ) { $object->SetupTaskOne(); } return $object->Helper([ qw(ARG1 ARG2 ARG3) ], @arguments); } sub DoTask2 { my ($object, @arguments) = @_; if (! $object->TaskIsDone() ) { $object->SetupTaskTwo(); } return $object->Helper([ qw(ARG6 ARG2 ARG1 ARG5) ], @arguments); } (...) sub DoTask10 { my ($object, @arguments) = @_; if (! $object->TaskIsDone() ) { $object->DoTask(); $object->DoSomeMoreSetup(); } return $object->Helper([ qw(ARG3 ARG5 ARG7) ], @arguments); }
The code from Helper can even be folded into the CommonTask method; it's simply separate here for simplicity.

Now we see that we still redo the 'if' statement in each, but the code inside is different each time. Arrugh. Could this be a place where our helper can be enhanced? What if the helper routine took a code snippet to work on? Perl has used anonymous subs for custom logic with the 'sort' and 'map' builtins. Would it work here? Would it be worth it?

sub Helper { my ($object, $code, $defaults, @arguments) = @_; if (! $object->TaskIsDone()) { $code->(); } if (!@arguments) { @arguments = @$defaults; } return $object->common_task(@arguments); } sub DoTask1 { my ($object, @arguments) = @_; return $object->Helper( sub { $object->SetupTaskOne() }, [ qw(ARG1 ARG2 ARG3) ], @arguments); } sub DoTask2 { my ($object, @arguments) = @_; return $object->Helper( sub { $object->SetupTaskTwo() }, [ qw(ARG6 ARG2 ARG1 ARG5) ], @arguments); } (...) sub DoTask10 { my ($object, @arguments) = @_; return $object->Helper( sub { $object->DoTask(); $object->DoSomeMoreSetup(); }, [ qw(ARG3 ARG5 ARG7) ], @arguments); }
The real world equivalent resulted in a couple hundred lines of code reduction. The refactoring process also exposed a couple corner cases I had missed, and almost completely reduced the code duplication in that area of the module. The module entered final unit and functional testing, and so far I've seen little to no impact on application behavior. Unfortunately, I do not have benchmarking tools in place to document performance changes, but with this particular application the majority of delays lie outside of Perl.

Later, this intrepid but inexperienced monk realized that there was another section of common code seen in his module. Very often I'd see developers using the library write code like the following:

my $original_state = Module::GetServiceState(); Module::SetServiceState(0); # temporarily disable the service, if it's + not already $object->DoSomething(); $object->DoSomethingElse(); Module::SetStatus($original_state); # restore it to what it was before
After exploring a bit, I set up a simple subroutine containing the duplicate code, taking a single argument - the work to be done while the service is off.
sub TempDisableService { my $code = shift; my $original_state = Module::GetServiceState(); Module::SetServiceState(0); # temporarily disable the service, if i +t's not already $code->(); Module::SetStatus($original_state); return; } # and later... TempDisableService sub { $object->DoSomething(); $object->DoSomethingElse(); }; # and even later... TempDisableService sub { $object->DoSomethingMore(); $object->DoSomethingAlso(); };
Now, there's been talk in the past about prototypes. Even though most talk seems to fall in the 'against' category, using one here seemed like a good application; it wouldn't be called indirectly, and it only had the code as an argument. Using a 'code block' prototype, the syntax now looks incredibly clean:
TempDisableService { $object->DoSomething(); $object->DoSomethingElse(); };
The need for 'sub' has been eliminated. I see this making it easier for less experienced developers to get it right, but it could also introduce some confusion.

Each of the main method/subroutines only contains the data specific to it's needs, and all common functionality is in one place. So far the refactored code seems to work quite well; I haven't run into any real issues in this process.

My main concern comes from only rarely seeing anonymous subs used like this in (an admittedly small number of) CPAN modules I've looked at. If others here know of particular handicaps, issues, or caveats with anonymous subroutines in Perl, I would love to hear them, especially any kinds of new behavior might pop up when using anonymous subs. Should I even be concerned? Maybe there exists many interesting examples of anonymous subroutine out there, and I just haven't seen them.

Oh great perl monks / I see, humbly pray with thee / please enlighten me!

Replies are listed 'Best First'.
Re: Caveats when using anonymous subs as subroutine arguments
by Fletch (Bishop) on Dec 02, 2008 at 00:36 UTC

    Granted it may be a relic of your sample simplification, but in this case I might had factored out the common bits into a helper that looks up the exact arguments from a hash instead (and then sets up methods for each task based on the contents of the hash).

    my %helper_args = ( Task1 => [ qw(ARG1 ARG2 ARG3) ], Task2 => [ qw(ARG6 ARG2 ARG1 ARG5) ], ## ... ); for my $task ( keys %helper_args ) { *{"Do$task"} = sub { $_[0]->_common_runner( $task, @_[1..$#_] ) }; } sub _common_runner { my $self = shift; my $task = shift; ## "Task1", "Task2", ... my @rest = @_; if( !$self->TaskIsDone() and (my( $setup ) = $self->can( "Setup$task +" )) ) { $self->$setup(); } my @args = @{ $helper_args{ $task } || [] }; $self->CommonTask( @args, @rest ); }

    Alternately create a common base class and then have that base class' CommonTask calls methods on itself (with the base class' implementations being noops / stubs); in other words the Gang Of Four "Template Method Pattern".

    Update: Oh, and to actually answer your question no that's not a totally bletcherous solution and that is one of the "accepted" usages of prototypes (to get new keyword-like syntax).

    The cake is a lie.
    The cake is a lie.
    The cake is a lie.

      I like the idea of the common default arguments hash; that would make it easier to update if/when the defaults need to change. And as you suspected, the actual methods involved were not quite as simple. A major factor in deciding to go this route came from the 'if' block contents. Most of subroutines used the same chunk of code with small differences (if any); I put that work into it's own subroutine and passed it directly. Others do mostly the same work, with minor additions and changes. (A side note: there's actually now a common interface to using these methods (eg, DoTask( TASK_1, @args); ) which makes these individual names simply a relic of maintaining the interface.)

      Since the setup is reused by maybe 80% of these methods and others use minor tweaks or additions, I'd want to specify that the same setup is used. That's easy to do with an individual setup method for each 'task', but that would create several new subroutines that simply call the common setup subroutine, leading me back to the same initial problem! I could pre-specify which specific setup to use in a constant, and then build the setup methods based on that string using 'can'; from there I'd simply look up a code reference. And in your example, that would happen at the time the module was 'use'd, making the lookup only happen once, right? Does perl resolve that reference right then, or does it still look it up every time the method is called? If it resolves it then, that seems pretty swanky.

      I'd have to make several extra minor subroutines that only ever get called from one place for each of the 20% corner cases, though. That's what led me to think of creating a common setup subroutine for most and anonymous subs for the one off's, and pass them in via a code reference. If I understand it right, the way I did it, perl will have to reconstruct that anonymous sub every time that method is called. The ones that simply use a reference to the common setup wouldn't, though.

Re: Caveats when using anonymous subs as subroutine arguments
by almut (Canon) on Dec 02, 2008 at 01:17 UTC
    TempDisableService { $object->DoSomething(); $object->DoSomethingElse(); };

    (...) but it could also introduce some confusion.

    That's exactly the point...  Although I do like the syntactic elegance (and would consider it a good use of prototypes), I'd say that employing such a rarely seen facility does not necessarily make it easy to comprehend for anyone with only a cursory knowledge of the language (in case the understanding is supposed to go beyond merely being able to apply the recipe "just put braces around what to run"...).  OTOH, if you document it, and make sure people come across the explanation before encountering the fancy construct, I'd say stick with it.

    (Side note: as an advocate of self-explanatory function/method names, I'd prefer something like "run_with_service_disabled" (or "runWithServiceDisabled" if you like), because that'd make it more obvious that what you pass in is something to run, and not some parameters controlling how to temporarily disable a service. </nitpick :> )