Seek Help Refactoring

jkeenan1 has asked for the wisdom of the Perl Monks concerning the following question:

Suppose that I have two subroutines, each of which has nested if/else conditions, which are identical in all respects except that, deep inside the if/else nest, one sub performs an action in all circumstances, while the other performs an action only if some condition X is true. The action may be called at a number of points within the subroutine depending on the value of arguments.

I would like to be able to refactor out all the code shared by the two subs but am wondering what the best way to handle the differences. Some pseudo-code:

sub alpha {
    my $self = shift;
    ...
    foreach ...
        if ...
            if ...
                if ...
                    push @foo, [ @bar ];
                }
            }
        } else {
            push @foo, [ @bar ];
        }
    }
    push @foo, [ @bar ];
    return \@foo;
}

sub beta {
    my $self = shift;
    ...
    foreach ...
        if ...
            if ...
                if ...
                    push @foo, [ @bar ] if $self->{baz} eq $gamma;
                }
            }
        } else {
            push @foo, [ @bar ] if $self->{baz} eq $gamma;
        }
    }
    push @foo, [ @bar ] if $self->{baz} eq $gamma;
    return \@foo;
}
[download]

Except for the difference between

    push @foo, [ @bar ];
[download]

and

    push @foo, [ @bar ] if $self->{baz} eq $gamma;
[download]

... the two subroutines are identical. Which implies that I could and should refactor out the common code into an internal subroutine. I would like to have a structure like this:

sub alpha {
    my $self = shift;
    my $coderef = sub {1};
    return $self->_internal_sub($coderef);
}

sub beta {
    my $self = shift;
    my $coderef = sub {$self->{baz} eq $gamma};
    return $self->_internal_sub($coderef);
}

sub _internal_sub {
    my $self = shift;
    my $coderef = shift;
    ...
    foreach ...
        if ...
            if ...
                if ...
                    push @foo, [ @bar ] if &$coderef;
                }
            }
        } else {
            push @foo, [ @bar ] if &$coderef;
        }
    }
    push @foo, [ @bar ] if &$coderef;
    return \@foo;
}
[download]

... but of course the values of $self-baz and $gamma are unknown at the time of assignment to $coderef in beta().

I have a feeling I'm missing something obvious here. Can any of the monks suggest a remedy? Thanks.

Jim Keenan

Comment on Seek Help Refactoring Select or Download Code

Replies are listed 'Best First'.
Re: Seek Help Refactoring by CountZero (Bishop) on Apr 02, 2006 at 21:30 UTC
Perhaps surround the `push @foo, [ @bar ] ...` with something like: `if ($mode = 'alfa') { push @foo, [ @bar ]; } elsif ($mode = 'beta') { push @foo, [ @bar ] if $self->{baz} eq $gamma; } else { die "Unknown mode: $mode!"; }` [download] CountZero "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law	[reply] [d/l] [select]
Re^2: Seek Help Refactoring by jkeenan1 (Deacon) on Apr 03, 2006 at 00:41 UTC
Thanks, CountZero! While there was something of interest in all the comments, yours was the most immediately applicable because it clued me into what to do right at the three points the `push` call was made and thereby enabled me to refactor out all the repeated code. In my real code, it meant that I was able to refactor the methods so they looked like this: `sub all_segments { my $self = shift; return $self->_gapfill_engine('all'); } sub all_gapfillers { my $self = shift; return $self->_gapfill_engine('gap'); } sub _gapfill_engine { my $self = shift; my $mode = shift; ... # foreach and if loops (next stage in refactoring) # do one thing if $mode eq 'all' # do another thing if $mode eq 'gap' return \@segments; }` [download] ... which is what I was aiming for. Thanks again. Jim Keenan	[reply] [d/l] [select]
Re: Seek Help Refactoring by GrandFather (Saint) on Apr 02, 2006 at 21:55 UTC
My first step would be to reduce the nesting. Each level of `if` nesting is another `and` clause the reader of the code has to remember to understand the condition that gets a particular line of code executed. Often nested ifs can tidy up a lot if you use early exits or negate the condition. Sometimes a redundant test cleans up the nesting sufficiently that the time cost doesn't matter compared to the ease of understanding. Sometimes it's even worth using a exitable if: `while (condition) { # Exitable if if (condition) { ... last; #early exit } .... last; # Bad things happen if this if omitted :) }` [download] Asside from that you could simply pass a boolen rather than a coderef. Then the test becomes `push @foo, [ @bar ] if $doTest and $self->{baz} eq $gamma;`. DWIM is Perl's answer to Gödel	[reply] [d/l] [select]
Re: Seek Help Refactoring by Tanktalus (Canon) on Apr 02, 2006 at 21:55 UTC
In addition to some perfectly reasonable ideas already presented, here's the way I do a similar thing. In my case, I'm going to use $baz rather than $self->{baz} to show that it's a lexical. Or, more to the point, that $baz may not be known until you're three levels deep inside _internal_sub and completely dynamic. `sub alpha; # unchanged from your example. sub beta { my $self = shift; my $coderef = sub { $_ eq $gamma }; $self->_internal_sub($coderef); } sub _internal_sub { my $self = shift; my $coderef = shift; foreach ... if ... if ... if ... local $_ = $baz; push @foo, [ @bar ] if $coderef->(); } } } else { local $_ = $baz; push @foo, [ @bar ] if $coderef->(); } } local $_ = $baz; push @foo, [ @bar ] if $coderef->(); \@foo; }` [download] I prefer the `$coderef->()` version over `&$coderef` because the former says "dereference this code reference, and call it with no parameters" while the latter says "call this piece of code, (ignoring prototypes - not relevant with code refs), passing in @_." The former is what I'm doing, not the latter. However, since you have three places you call this coderef, it may be better to just pass in the parameter: `sub beta { my $self = shift; my $coderef = sub { $_[0] eq $gamma }; $self->_internal_sub($coderef); } # and ... push @foo, [ @bar ] if $coderef->($baz);` [download] Hope that helps,	[reply] [d/l] [select]
Re: Seek Help Refactoring by xdg (Monsignor) on Apr 02, 2006 at 21:25 UTC
Is the conditional known before the if-else logic? If so, why not just pass them as a precomputed argument rather than as a closure? `sub alpha { my $self = shift; my $condition = 1; # always return $self->_internal_sub($condition); } sub beta { my $self = shift; my $gamma = get_gamma_somewhere(); my $condition = $self->{baz} eq $gamma; return $self->_internal_sub($condition); } sub _internal_sub { my $self = shift; my $condition = shift; ... foreach ... if ... if ... if ... push @foo, [ @bar ] if $condition; } } } else { push @foo, [ @bar ] if $condition; } } push @foo, [ @bar ] if $condition; return \@foo; }` [download] -xdg Code written by xdg and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.	[reply] [d/l]
Re^2: Seek Help Refactoring by jkeenan1 (Deacon) on Apr 03, 2006 at 00:49 UTC
xdg wrote: Is the conditional known before the if-else logic? If so, why not just pass them as a precomputed argument rather than as a closure? No, it is not. Some of the variables that go into the conditional are more tightly scoped than that or not even declared yet. So I can't pass a pre-computed argument. jimk	[reply]
Re: Seek Help Refactoring by dragonchild (Archbishop) on Apr 03, 2006 at 00:00 UTC
I have a few thoughts. It's pretty obvious that there's further refactoring that can be done. The number of if-else constructs is CodeSmell and indicates that further refactoring can be done. So long as you maintain the exact same interface and behaviors to the outside world, it doesn't matter how much you have or have not refactored. Basically, this means you should only refactor as much as is necessary for the items you have at the time. Every change is a risk. Only refactor with an automated test suite. Period, end of story. The first refactoring that should be done is variable names. It can always be done to any code, period. If it works, it passes the First Rule. My criteria for good software: Does it work? Can someone else come in, make a change, and be reasonably certain no bugs were introduced?	[reply]
Re^2: Seek Help Refactoring by moklevat (Priest) on Apr 03, 2006 at 14:13 UTC
Hot dog! Would you look at that! Here it is only 10 minutes into the work day and I've already learned my "something new" for the day. Thank you dragonchild for introducing me to the term CodeSmell, which I Super Searched to find your helpful Meditation that defined it for me (though I did a bit of Googling too). For the rest of the day everything else will be gravy.	[reply]
Re: Seek Help Refactoring by liverpole (Monsignor) on Apr 02, 2006 at 22:04 UTC
It's a very good question, as it deals more with the philosophy of programming, and isn't a specific "why isn't such-and-such working?" the way the usual questions are. In the absence of more information, I like CountZero's answer, because it's clean and simple. But xdg's answer is a good one too, especially if there are cases where you might other subroutines (eg. gamma, delta, etc.) to call the new _internal_sub() subroutine. I did want to point out that, when I encounter a situation like this, it often helps me to ask whether the two subroutines would be likely to evolve together? That is, if you were to make a change/improvement to alpha(), would you be making the same change/improvement to beta()? If so, then by all means use a flag (eg. set to "alpha" or "beta") so that you get more mileage from only having to make future changes to one subroutine rather than two. s''(q.S:$/9=(T1';s;(..)(..);$..=substr+crypt($1,$2),2,3;eg;print$..$/	[reply]
Re: Seek Help Refactoring by TedPride (Priest) on Apr 02, 2006 at 23:42 UTC
You seem to have the structure: if a and b and c then do... if !a then do... Which, if I'm not messing up my logic analysis, can be simplified to: if !a or b and c then do... Both functions should be simplified down to: `sub alphabeta { my ($self, $test) = @_; ... foreach ... if (!... \|\| ... && ...) { push @foo, [ @bar ] if !$test \|\| $self->{baz} eq $gamma; } } push @foo, [ @bar ] if !$test \|\| $self->{baz} eq $gamma; return \@foo; }` [download] Then you'd just call it with the second value turned on if you wanted $self->{baz} tested. I am of course also little confused as to what the second `push @foo, [ @bar ]` is there for.	[reply] [d/l] [select]
Re: Seek Help Refactoring by codeacrobat (Chaplain) on Apr 02, 2006 at 22:43 UTC
how about `my ($alpha, $beta); $alpha = $beta = << "EOF"; my $self = shift; ... foreach ... if ... if ... if ... push @foo, [ @bar ]; } } } else { push @foo, [ @bar ]; } } push @foo, [ @bar ]; return \@foo; END $beta =~ s/(push \@foo, \[ \@bar \]);/$1 if \$self->\{baz\} eq \$gamma +;/gs; $beta =~ s/sub alpha/sub beta/; sub alpha { eval $alpha; } sub beta { eval $beta; EOF` [download] # warning this posting was written with after 2 beers.	[reply] [d/l]