in reply to Call sub first or later

The options are:

  1. Test whether to invoke the routine
  2. $str = fraction($str) if ($str =~ /\d+_\d+/);
  3. Invoke the routine unconditionally
  4. $str = fraction($str); ... sub fraction { my $str = shift; if ($str =~ /\d+_\d+/) { # convert $str to fraction using mimetex syntax } return $str; }

In my mind, it comes down to two factors, code clarity and cost.

Code Clarity

Because of the nature of Perl programs, human readability is important. We spend more time maintaining and updating existing programs than writing new programs. According to Robert L. Glass, Facts and Fallacies of Software Engineering, page 115, "Maintenance typically consumes 40 to 80% ( average 60% ) of software costs." Since one of the selling points of Perl is the ease of maintenance, I suspect Perl scripts/programs are more likely to be modified than C or Java programs would be. The easier it is to understand a program, the quicker you can carry out the required modifications, so I consider it worthwhile devoting some effort in making code readable.

If a condition invokes more than one action, use a traditional if block; if a series of conditions invokes the appropriate transformation, use the if/elsif/else block:

if ($str =~ /\d+_\d+/) { $str = fraction($str) } elsif ( condition 2 ) { action 2 } else { default }

If the important thing is that the value may be modified, use the conditional suffix:

$str = fraction($str) if ($str =~ /\d+_\d+/)

This emphasizes the modification, with the condition as a secondary factor.

To emphasise the transformation even more, hide the condition by moving it into the routine, as you suggested.

I don't totally like using $str in both left hand and right hand sides of the assignment. While far short of what M.J. Dominus calls a red flag, the double appearance suggests that other coding mechanisms should be considered.

One possibility is to pass a reference to the routine, so that it takes the form of a subroutine without any return value:

convert_ratio_to_decimal( \$str );

Or, better yet, specify a prototype for the routine so that you don't have to mark the variable parameter as being passed as a reference:

sub convert_ratio_to_decimal (\$) { ..... }; convert_ratio_to_decimal( $str );

Maybe $str should not be a scalar at all, maybe it should really be an object, or a component of an object. If you can visualize a concrete or abstract thing---an airplane, a bank account---which has a set of operations it can perform, an object is a natural representation:

$obj->convert_ration_to_decimal();

Cost

If this is an operation you perform on user input, efficiency is not too significant. If you are processing multi-million record log files, thinking in terms of costs becomes more significant.

Do not obfuscate code thinking that you are making it faster, until you have run benchmarks to determine where you are spending your time. Often, a better algorithm might pay off better than a better implementation. Still, it can be worth remembering relative costs when writing your code.

Invoking a function involves overhead. The overhead is tolerable for significant routines, but becomes expensive for frequently called routines that perform simple tasks. That's why C++ and Java provide inline functions.

If there are many records, and only a few require modification, test first and invoke the function only if necessary.

One problem with testing outside the routine is that the code needs to know about implementation details, in the regex /\d_\d+/. When the regex is in the function, details about the transformation and when it is performed are better encapsulated. One solution would be to put the condition into a routine of its own. Obviously, this may be expensive for a multi-million record problem, but might be suitable for simpler cases. I have used conditional routines in a error-tracking module I wrote ... actually, it was a class:

$error->logBadDateError( $str ) if (ErrorTracking::invalidDate( $str ) +)

In that case, some of the conditions were more complex, but all benefited from replacing clumps of code with meaningful names.

In your case, it's a simple regex, so you don't need a function call, you could simply store the regex in a variable, possibly in an external module:

$varName = qr(regex goes here); if ( $str =~ /$UnderscoredNumber/ ) ... # poor if ( $str =~ /$SimpleRatio/ ) ... # good if ( $str =~ /$Ratio::simpleFraction/ ) ... # better

--
TTTATCGGTCGTTATATAGATGTTTGCA

Replies are listed 'Best First'.
Re: Code Clarity & Cost Analysis
by dragonchild (Archbishop) on Apr 26, 2004 at 18:54 UTC
    While I wish I could ++ this node more than once, I do have one nit.

    I don't totally like using $str in both left hand and right hand sides of the assignment. While far short of what M.J. Dominus calls a red flag, the double appearance suggests that other coding mechanisms should be considered.

    Perl does this in the core with uc and friends. I originally considered this to be a crufty legacy from the past (and it may be), but I have since found it to be a benefit.

    My reasoning is simple - when I'm parsing something, I often want to keep the original unblemished and transform the original into a copy. While I can easily do something as assign it to a copy, I'm not always at liberty to be working with the absolute original copy. Sometimes, I, the caller of fraction(), am working with a partially-transformed version of the copy. By explicitly saying $str = fraction( $str );, I am telling my maintainer(s) what is going on without them having to read the code for fraction(). Simply saying convert_ratio_to_number( $str ); may not be explicit enough.

    Now, this argument does not apply in the case of an object. Methods are assumed to have side-effects where functions are not.

    ------
    We are the carpenters and bricklayers of the Information Age.

    Then there are Damian modules.... *sigh* ... that's not about being less-lazy -- that's about being on some really good drugs -- you know, there is no spoon. - flyingmoose

Re: Code Clarity & Cost Analysis
by kiat (Vicar) on Apr 27, 2004 at 00:48 UTC
    Thanks for your sharing. There's so much to learn from your inputs :)