gu has asked for the wisdom of the Perl Monks concerning the following question:

Hi wise monks

I've been wondering about the ways of dealing with command-line arguments. Are the following equally usable, as a TIMTOWTDI proof, or is there a "best practice" among them? Am I missing any better ones?

In the following examples I assume that the program always needs two mandatory arguments (being filenames).

#1 : Shift and Die

    my $infile = shift || die usage() ;
    my $outfile = shift || die usage() ;

    sub usage () {
        print "usage : prog foo bar\n" ;
    }

#2 : Unless and Exit

    unless(@ARGV == 2) {
        print "usage : prog foo bar\n" ;
        exit(1) ;
    }
    my ($infile, $outfile) = @ARGV ;

#3 : Unless and Die

    die "usage : prog foo bar\n" unless (@ARGV == 2) ;
    my ($infile, $outfile) = @ARGV ;

Thank you, monks.

Gu

Replies are listed 'Best First'.
Re: Best practices for processing @ARGV ?
by ikegami (Patriarch) on Dec 12, 2005 at 17:23 UTC
    • #1 is buggy. A file name "0" will incorrectly die. The (not so pretty) solution:

      defined(my $infile = shift) || die usage();
      defined(my $outfile = shift) || die usage();

    • I find #2's unless (@ARGV == 2) harder to read than if (@ARGV != 2).

    • I'd print usage to STDERR, so I prefer die("message"), warn("message") and print STDERR ("message") over #1 and #2's print("message").
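
The three points above can be combined in a small sketch. The helper name `parse_two_files` is hypothetical and not from the thread; the sketch simply counts the arguments instead of testing their truthiness, so a file named "0" is accepted, and it lets `die` send the usage message to STDERR.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns ($infile, $outfile)
# or dies with the usage message. Taking the list as parameters instead of
# reading @ARGV directly keeps it easy to test.
sub parse_two_files {
    my @args = @_;

    # die() writes to STDERR and exits non-zero; the trailing newline
    # suppresses the "at SCRIPT line N." suffix.
    die "usage : prog foo bar\n" if @args != 2;

    my ($infile, $outfile) = @args;
    return ($infile, $outfile);
}

# In a real script the call would be: parse_two_files(@ARGV)
my ($in, $out) = parse_two_files('0', 'out.txt');
print "in=$in out=$out\n";    # the file name "0" is not rejected
```

Because the check is on the number of arguments rather than on their values, no `defined` test is needed here.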

      I find #2's unless (@ARGV == 2) harder to read than if (@ARGV != 2)

      Me too but, as a non-native English speaker, I find it much easier to translate "unless" as if it were "if not".

      --
      David Serrano

Re: Best practices for processing @ARGV ?
by dragonchild (Archbishop) on Dec 12, 2005 at 17:33 UTC
    Getopt::Long or Getopt::Std. They both support the kinds of solutions you want to do, plus they provide the abilities to do more. There is absolutely no reason to reinvent the wheel, especially not this one.
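
As a rough illustration of this suggestion, here is a minimal sketch using Getopt::Long for the two mandatory file names from the question. The helper name `parse_args` and the `--verbose` and `--help` options are assumptions added for the example; `GetOptionsFromArray` is used instead of `GetOptions` only so that the argument list can be passed in explicitly.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: parses a copy of the argument list and returns a
# hash ref, or dies with a usage message.
sub parse_args {
    my @args = @_;
    my %opt  = (verbose => 0);

    # GetOptionsFromArray removes the options it recognises from @args
    # and returns false if it met something it could not parse.
    GetOptionsFromArray(\@args, \%opt, 'verbose|v', 'help|h')
        or die "usage : prog [-v] foo bar\n";

    # Whatever is left must be exactly the two mandatory file names.
    die "usage : prog [-v] foo bar\n" if $opt{help} || @args != 2;

    @opt{qw(infile outfile)} = @args;
    return \%opt;
}

# In a real script the call would be: parse_args(@ARGV)
my $opt = parse_args('--verbose', 'in.txt', 'out.txt');
print "verbose=$opt->{verbose} infile=$opt->{infile} outfile=$opt->{outfile}\n";
```

The positional arguments are still checked by hand; the module only takes care of the options.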

    My criteria for good software:
    1. Does it work?
    2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?
Re: Best practices for processing @ARGV ?
by Belgarion (Chaplain) on Dec 12, 2005 at 17:14 UTC

    My personal preference is your third option. Only two lines of code (and less code is easier to read, normally) and it's obvious the die statement only executes if the number of arguments is not 2.

Re: Best practices for processing @ARGV ?
by turo (Friar) on Dec 13, 2005 at 21:56 UTC
    I'm not an expert in command-line parsing, but it is a part of your program that must be easy and quick to write, and very easy to extend (program options, if they are not planned, may grow out of control!).
    I like Getopt::Std and Getopt::Long very much (dragonchild already mentioned them), because of their similarity to the 'getopt' function in C (man 3 getopt).
    Anyway, I usually write my own function for parsing arguments, like this (for example):

    #!/usr/bin/perl -w
    use strict;

    my $logfile = "/var/log/whatever.log";
    my $verbose = 0;

    parseArgs($#ARGV + 1, @ARGV);

    #
    # ... do stuff with the logfile ...
    #

    # subroutine parseArgs
    sub parseArgs {
        my ($argc, @argv) = @_;
        my $needs_help = 0;

        # is it mandatory to receive command args?
        usage() if ( $argc == 0 );

        for ( my $i = 0; $i < $argc; $i++ ) {
            foreach ($argv[$i]) {
                if ( /^-logfile$|^-l$/ ) {
                    # defined(), so that a value of "0" is accepted
                    if ( defined $argv[$i + 1] ) {
                        # here you can do, if you want, some checks
                        # before assigning anything ...
                        $logfile = $argv[++$i];
                    }
                    else {
                        # complain!
                        $needs_help = 1;
                    }
                }
                elsif ( /^-v$|^--verbose$/ ) {
                    $verbose = 1;
                }
                elsif ( /^-h$|^--help$/ ) {
                    $needs_help = 1;
                }
                else {
                    print "Command argument ", $argv[$i], " is not valid\n";
                    $needs_help = 1;
                }
            }
        }
        usage() if ( $needs_help );
    }

    sub usage {
        print <<EOF;
    usage: program [-l logfile] [-v] [-h]
    blablabla ....
    EOF
        exit 0;
    }


    It's simple, and not difficult to understand and modify. Yes, I know it's not exactly what you are asking for, sorry ...

    Good Luck!

    perl -Te 'print map { chr((ord)-((10,20,2,7)[$i++])) } split //,"turo"'

      but command line parsing is a part of your program that must be easy and quick to program, and very easy to extend

      It's also a part of the user interface of your program, and like all user interfaces, it should match what the user's expectations are (DWIM, if you will). That's why human interface guidelines specify consistency from one application to another.

      Getopt::Long implements many of the long-standing conventions people expect to be able to use on the command line. Some of the ones you are missing in your home-rolled implementation are:

      • Your long option -logfile does not follow the double dash convention.
      • Your long option names cannot be abbreviated to a unique substring.
      • Your option -logfile that takes an argument cannot be specified as a single argument (which works by juxtaposition (bundling) for short options and --option=value for long options).

      So I suggest using Getopt::Long; it's part of the standard distribution, after all. However, it's not perfect either. For example, bundling is off by default, but people used to the UNIX command line will expect bundling to work.
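
To make the conventions listed above concrete, here is a hedged sketch of the same `-l logfile` and `-v` options handled by Getopt::Long with bundling switched on. The helper name `parse_args` is an assumption; `bundling` and `auto_abbrev` are Getopt::Long configuration options, and `auto_abbrev` is named explicitly only to make the abbreviation behaviour visible.

```perl
use strict;
use warnings;
# Everything after ":config" is passed to Getopt::Long's configuration.
use Getopt::Long qw(GetOptionsFromArray :config bundling auto_abbrev);

# Hypothetical helper: returns the option hash ref followed by whatever
# non-option arguments are left over.
sub parse_args {
    my @args = @_;
    my %opt  = (verbose => 0, logfile => '/var/log/whatever.log');

    GetOptionsFromArray(\@args, \%opt, 'logfile|l=s', 'verbose|v')
        or die "usage: program [-l logfile] [-v]\n";

    return (\%opt, @args);
}

# Different spellings of the same request: bundled short options, a short
# option with its value attached, --option=value, and an abbreviated name.
for my $argv ( ['-vl', 'run.log'], ['-vlrun.log'],
               ['--logfile=run.log'], ['--log', 'run.log'] ) {
    my ($opt) = parse_args(@$argv);
    print "@$argv => verbose=$opt->{verbose} logfile=$opt->{logfile}\n";
}
```

With bundling enabled, long names have to be written with a double dash, since a single dash introduces a bundle of one-letter options.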

      That said, the original question was about multiple modules picking out options from the command line. Take a look at how some modules like Gtk2 deal with this. Gtk2 wants to handle some options but so might the application. So the Gtk2 initialization code fiddles with the program's @ARGV to find the options it is interested in, removes them, and leaves the rest for the application to process (incidentally, it works the same in the C version). The application does not have to worry about the GTK+ specific options. Maybe the OP can use the same method?
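
One way to get a similar effect in plain Perl is Getopt::Long's `pass_through` configuration, which leaves unrecognised options in the array instead of reporting them as errors. The sketch below is not how Gtk2 itself is implemented; `toolkit_init` and the `--display` option are hypothetical stand-ins for the "library side".

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray :config pass_through);

# Hypothetical "library side": takes a reference to the argument array,
# removes the options it owns, and leaves everything else in place.
sub toolkit_init {
    my ($argv) = @_;
    my %toolkit_opt;

    # With pass_through, options that are not in the spec are not errors;
    # they simply stay in @$argv for someone else to handle.
    GetOptionsFromArray($argv, \%toolkit_opt, 'display=s');

    return \%toolkit_opt;
}

# In a real program the library would be handed \@ARGV.
my @args    = ('--display=:1', '--verbose', 'in.txt', 'out.txt');
my $toolkit = toolkit_init(\@args);

print "toolkit display: $toolkit->{display}\n";
print "left for the application: @args\n";
```

The application can then parse what remains with its own option specification, ideally with `pass_through` turned off again so that typos are reported.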

        Okay, I must admit that Getopt::Long is cool and simple to use, but I always forget how to use it, so I prefer to implement my own parseArgs (a reminiscence of C's 'while(c=getopt(...)) { switch(c) { ... } }', I think).

        Well, my example does not accept the double-dash convention, but it was only an example. You can accept this convention by modifying the regular expression:

        if ( /^-{1,2}l(?:ogfile)?$/ ) {
            # do stuff
        }

        I must go eat!! :-)

        Regards

        perl -Te 'print map { chr((ord)-((10,20,2,7)[$i++])) } split //,"turo"'