apotheon has asked for the wisdom of the Perl Monks concerning the following question:

The point of this node is pretty simple. I have an example of a brief little program, and I wonder what would be done to format it according to "best practices" of readability. Obviously, I expect that there will be several different answers to this, but I'd like to know the whys and wherefores of them. One must always strive to improve, after all.

The program itself does nothing more than take two integers as command line arguments and print the sum of the range of numbers defined by them to STDOUT. The default behavior desired is for the first argument to be the integer immediately preceding the first number of the range, and for the second argument to be the last integer of the range. Additional desired behavior is for a command line option to be available to specify that the first numerical argument is the first number of the range, rather than the preceding integer.

For bonus points, tell me how you might modify the program for usability purposes. For instance, how (if at all) would you handle improper input such as non-integer arguments (e.g. floating point numbers like 2.7)?

And now, the code:

#!/usr/bin/perl -l use strict; use warnings; use Getopt::Std; our ($opt_s); getopts('s'); if ($opt_s) { print( ($ARGV[1] - $ARGV[0] + 1) * ($ARGV[0] + $ARGV[1]) / 2 ) } else { print( ($ARGV[1] - $ARGV[0]) * ($ARGV[0] + $ARGV[1] + 1) / 2 ) }

edit: It is of at least minor historical interest that this method of determining the sum of a range of numbers is credited originally to Leonardo Fibonacci, and secondarily to Carl Gauss in a quaint tale of outsmarting his primary school teacher. In the latter case, the teacher gave the whole class an insipid busywork assignment, to add together all the whole numbers from 1 to 100 (inclusive). Gauss is said to have almost instantly blurted out the answer (5050), having quickly rediscovered the trick of determining the sum of the first and last numbers in the range and multiplying by half the total number of integers in the range.

UPDATE (12:15 MST, 15 JAN 2007): What follows is a newer version of this program, incorporating ideas that have come up in this discussion and ideas that came to me while I stared at the suggestions offered here.

#!/usr/bin/perl -l use strict; use warnings; use Getopt::Std; our $VERSION = '1.0'; $Getopt::Std::STANDARD_HELP_VERSION = 'true'; my $help = ' syntax: series [-s] <first integer> <second integer> --help prints this help text and exits: invoking the help argument causes all other arguments to be discarded by this utility -s takes no arguments: changes program behavior so that the first number specified is the first number of the range, and not the number preceding it (syntax: series a+1 n) -- this option must precede all numerical arguments to the series utility description: This utility takes two positive integers as arguments and produces the sum of the range of integers starting after the first argument and ending with the second argument. It allows 0 to be specified as the first argument, indicating that all numbers in the range are positive integers. credits: Chad L. Perrin (author) PerlMonks community (contributors) -- node 594743 license: This software is licensed CCD CopyWrite. See the webpage at http://ccd.apotheon.org for licensing details. '; our ($opt_s); getopts('s'); die $help unless defined $ARGV[1]; map { /^-?\d+$/ or die "Invalid input: '$_'$!" } @ARGV; die "Error: arguments in descending value order.\n$help" if ($ARGV[0] +> $ARGV[1]); my $start = $ARGV[0] + ( $opt_s ? 0 : 1); my $end = $ARGV[1]; print( ($end - $start + 1) * ($start + $end) / 2 ); sub HELP_MESSAGE { print $help }

Frankly, if I'd thought about it before I started all this, I would have specified positive integers for the range (allowing 0 for the first numerical argument without the -s option). Since I didn't think of it, though, I'm happy to consider the problem of how to fix the fact that for some reason this version of the program errors out when a negative number argument is supplied because it thinks the negative indicates a command line option.

print substr("Just another Perl hacker", 0, -2);
- apotheon
CopyWrite Chad Perrin

Replies are listed 'Best First'.
Re: a simple exercise in readability
by liverpole (Monsignor) on Jan 15, 2007 at 13:57 UTC
    Hi apotheon,

    I'm a fan of producing a syntax message whenever arguments to a program are expected.  That way, the user never has to guess what the program does, nor what the program expects for input.

    Additionally, if you don't check to see that you DID get input, then you'll end up getting errors or "harder to track" unexpected results.

    That's why I'd suggest doing an assignment to your command line variables up front, and aborting with a syntax message if the program doesn't get what it wants.  That has the additional benefit of letting you do validity checking on those variables next, as well as simplifying your code later.

    Here's how I might "neaten it up" a bit:

    #!/usr/bin/perl -l # Libraries use strict; use warnings; use Getopt::Std; use File::Basename; our ($opt_s); getopts('s'); # Globals my $iam = basename $0; my $syntax = " syntax: $iam <first number> <second number> Your syntax message here. "; # Command-line (my $x = shift) or die $syntax; (my $y = shift) or die $syntax; # Validity checking ($x =~ /^-?\d+$/) or die "$iam: value $x not an integer\n"; ($y =~ /^-?\d+$/) or die "$iam: value $y not an integer\n"; # Main program if ($opt_s) { printf "%d\n", ($y - $x + 1) * ($x + $y) / 2; } else { printf "%d\n", ($y - $x) * ($x + $y + 1) / 2; }

    To my eye, the main program is a lot easier to read now, as a result of assigning to $x and $y.  That's partly because the lines are a lot shorter, so each printf statement fits on its own, single line.


    s''(q.S:$/9=(T1';s;(..)(..);$..=substr+crypt($1,$2),2,3;eg;print$..$/

      Here's a version incorporating your approach, but in a way that's slightly more idiomatically "mine":

      #!/usr/bin/perl -l use strict; use warnings; use Getopt::Std; our $VERSION = '1.0'; $Getopt::Std::STANDARD_HELP_VERSION = 'true'; my $help = ' syntax: series [-s] <first integer> <second integer> --help prints this help text and exits: invoking the help argument causes all other arguments to be discarded by this utility -s takes no arguments: changes program behavior so that the first number specified is the first number of the range, and not the number preceding it (syntax: series a+1 n) -- this option must precede all numerical arguments to the series utility description: This utility takes two integers as arguments and produces the sum of the range of integers starting after the first argument and ending with the second argument. '; our ($opt_s); getopts('s'); HELP_MESSAGE() and exit unless $ARGV[1]; foreach (@ARGV) { HELP_MESSAGE() and exit unless /^-?\d+$/ } my $start = shift @ARGV; my $end = shift @ARGV; if ($opt_s) { print( ($end - $start + 1) * ($start + $end) / 2 ) } else { print( ($end - $start) * ($start + $end + 1) / 2 ) } sub HELP_MESSAGE { print $help; }

      I figured I might separate the help text dump for too few numerical arguments from the variable assignments to improve maintainability a tad. I also decided to use the Getopt::Std built in help and version output functionality. Thoughts . . . ?

      edit: johngg pointed out the "unless" bug above, where unless @ARGV should read unless defined @ARGV instead.

      print substr("Just another Perl hacker", 0, -2);
      - apotheon
      CopyWrite Chad Perrin

        First of all, apotheon, that's a very nice syntax message.  So many times, when I've run a script or program which gave no clue as to its purpose, I've wished for a syntax message even 1/10th as good as that one, so kudos for writing something clear and descriptive!

        Secondly, instead of:

        HELP_MESSAGE() and exit unless $ARGV[1]; foreach (@ARGV) { HELP_MESSAGE() and exit unless /^-?\d+$/ } # ... and later ... sub HELP_MESSAGE { print $help; }

        why not just do the following, which is (I think) more clear, takes less space, and is more precise in the case of invalid input ...

        die $help unless $ARGV[1]; map { /^-?\d+$/ or die "Invalid input $_\n" } @ARGV;

        s''(q.S:$/9=(T1';s;(..)(..);$..=substr+crypt($1,$2),2,3;eg;print$..$/
        You have changed liverpole's

        # Command-line (my $x = shift) or die $syntax; (my $y = shift) or die $syntax; # Validity checking ($x =~ /^-?\d+$/) or die "$iam: value $x not an integer\n"; ($y =~ /^-?\d+$/) or die "$iam: value $y not an integer\n";

        to

        my $start = shift @ARGV; my $end = shift @ARGV;

        It seems to me that you have gained more meaningful variable names but you have lost two forms of error checking, making your script less robust. You no longer detect if the user fails to supply enough arguments and you fail to validate those arguments that are supplied. Probably a backward step.

        Cheers,

        JohnGG

        Update: As apotheon pointed out, I completely missed an element of his code. Please ignore this post.

      # Command-line (my $x = shift) or die $syntax; (my $y = shift) or die $syntax;

      One small problem with this approach is that it does not allow zero as an input. You have to say something like:

      die $syntax if ! defined (my $x = shift);

      ...which is a little more clumsy, since the important part ($x = shift) is way off to the side.

        Yes, good point.  Of course my solution is only intended for cases where one explicitly does NOT want the value of zero as valid input.

        As for the "little more clumsy":

        die $syntax if ! defined (my $x = shift);

        and assuming you needed to allow for zero valued input, I'd be inclined to go with the slightly pithier:

        defined(my $x = shift) or die $syntax;

        s''(q.S:$/9=(T1';s;(..)(..);$..=substr+crypt($1,$2),2,3;eg;print$..$/
Re: a simple exercise in readability
by kyle (Abbot) on Jan 15, 2007 at 14:47 UTC

    I'd rewrite as follows:

    #!/usr/bin/perl -l use strict; use warnings; use Getopt::Std; our ($opt_s); getopts('s'); # get the range boundaries my $range_start = ( $opt_s ) ? $ARGV[0] : $ARGV[0] + 1; my $range_end = $ARGV[1]; # compute the sum of the range my $range_length = $range_end - $range_start + 1; my $range_sum = ($range_start + $range_end) * $range_length / 2; print $range_sum;

    The named variables make it easier to tell what it's doing. I might also consider handling the -s option like this:

    my $range_start = $ARGV[0]; $range_start++ if ( $opt_s );

    It's a little cleaner, but I think the ternary is actually clearer here. It's more obvious that there's a decision, and what it means.

    Ordinarily I'd say it should have comments too, but I'm so enamored with the readability of my own code, I think it doesn't need them anymore. If there were more options, they'd definitely need documentation.

    For bonus points, tell me how you might modify the program for usability purposes. For instance, how (if at all) would you handle improper input such as non-integer arguments (e.g. floating point numbers like 2.7)?

    I'd probably just put in a check early on and die with a usage statement:

    if ( $ARGV[0] !~ m{ \A [+-]? \d+ \z }xms || $ARGV[1] !~ m{ \A [+-]? \d+ \z }xms ) { die "usage: $0 [-s] <integer> <integer>\n"; }

    A longer, more detailed usage message would be better, but you get the idea.

      I quite like your use of the ternary operator there. I tend to be reluctant to use it because of the usual increased readability of explicit if/else statements (at least in most of the code that I write), though in this case I think it is well suited to the task.

      I think there's a limit to the value of explicit step-by-step variable assignment and use, however, for readability. In this case, the arithmetic formula used to produce results seems obfuscated by being broken up the way you've done so -- though it may well just be a matter of taste, or I may just be weird to think it's more readable to keep the formula a little more integrated.

      I'd appreciate it if you'd have a look at the relevant parts of my response to liverpole, and perhaps comment on the way I used explicit variable names for the start and end integers, adjusting perhaps for the use of the ternary operator.

      In fact, I'll cut out the middleman and provide a version of the relevant snippet using the ternary operator:

      my $start = ( $opt_s ) ? $ARGV[0] : $ARGV[0] + 1; my $end = $ARGV[1]; print( ($end - $start + 1) * ($start + $end) / 2 );

      edit: code typo pointed out by kyle at roughly the same instant I noticed it myself

      print substr("Just another Perl hacker", 0, -2);
      - apotheon
      CopyWrite Chad Perrin

        I like things broken up and named. For a relatively simple operation like this, it doesn't make as much of a difference. Yours is shorter and still fairly easy to understand (easier with a comment). I'd say it's a matter of taste.

        One thing that might recommend my version over yours is a bug that we both had (I've edited mine since), namely the range length should be ($end - $start + 1). I might say that mine (with the name) makes this bug more obvious. Then again, maybe not. Someone might read the variable name and not the formula, and just assume it's right when it's not.

Re: a simple exercise in readability
by Zaxo (Archbishop) on Jan 15, 2007 at 16:39 UTC

    Pod::Usage will let pod documentation double as a usage message. That's both handy for you and an encouragement to give your users the benefit of perldoc and derived man pages.

    After Compline,
    Zaxo

      I've been meaning to learn more about the use of POD. Maybe this will be the year I become one of the POD people.

      Thanks for the tip about Pod::Usage.

      print substr("Just another Perl hacker", 0, -2);
      - apotheon
      CopyWrite Chad Perrin

        You'll find it's really easy to learn and well worth it. I, too, learned pod after using perl for a while.

        The only mental shift you'll need, I think, is to remember that newlines and whitespace are significant in pod sections. Other than that, just open perlpod in a spare terminal for reference, and start writing it. For examples, perldoc -m will open any module or utility you like with the pod unparsed. You can also use perldoc to see how you're doing as you write.

        The utilities pod2man, pod2html, pod2latex and friends will convert pod to other formats. That is so handy, and pod so simple, that when I have to produce any covered format I first write in pod.

        After Compline,
        Zaxo

Re: a simple exercise in readability
by BrowserUk (Patriarch) on Jan 15, 2007 at 18:20 UTC
    #! perl -slw use strict; our $h; our $s; die <<usage if $h or @ARGV != 2; $0 [-s] low high Calculates sum( low+1 .. high ) With -s calculates sum( low .. high ) (integers only) usage my( $low, $high ) = @ARGV; die "Arguments must be integers\n" unless $low =~ m[^\d+$] and $high =~ m[^\d+$]; my $result = ( $low + $high ) * ( $high - $low + 1 ) / 2; $result -= $low unless $s; print $result;

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: a simple exercise in readability
by jwkrahn (Abbot) on Jan 15, 2007 at 18:42 UTC
    #!/usr/bin/perl use strict; use warnings; use Getopt::Std; getopts( 's', \my %opt ); print( ( $ARGV[ 1 ] - $ARGV[ 0 ] + $opt{ s } ? 1 : 0 ) * ( $ARGV[ 0 ] + $ARGV[ 1 ] + $opt{ s } ? 0 : 1 ) / 2, "\n" );