Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

I'm looking for a module that can parse command-line parameters without setting up expected options.

For instance, if my cmd line looks like:

myscript.pl --name 'joe' val=0 --list 'a' 'b' 'c'

I would like to have in my script:

my $opts = GetOpts::HelpMeHere;
$opts->{name};   # 'joe'
$opts->{list};   # [ 'a', 'b', 'c' ]

It would really be great, but I can't seem to find such a magic module. They all expect me to specify the options I'm supposed to get beforehand.

Thanks,

Miguel

Replies are listed 'Best First'.
Re: generic getopt to hash
by ELISHEVA (Prior) on Oct 01, 2009 at 18:52 UTC

    The recent versions of Getopt::Long support a really wide variety of option syntax, including all of the examples you gave above (except that you would need to use '-v' instead of 'v': it does need a character to distinguish options from option values). You can also use both long and short options, include or omit equal signs, follow each option with a single value or many, and even do some lightweight type checking. Order doesn't matter - you can have a multi-valued option before or after other options. Command lines such as:

    • myscript.pl --name joe -v=0 --list a b c
    • myscript.pl --name joe --val=0 --list a --list b --list c
    • myscript.pl --name joe --list a --list b --list c -v=0
    • myscript.pl --name joe --list a b c -v=0

    can be parsed with the following code:

    use strict;
    use warnings;
    use Getopt::Long;
    use Data::Dumper;

    #-------------------------------------------------------
    sub parseArgs {
        my ($aRules, $aArgs) = @_;

        # version used with 5.8.8 expects arguments in @ARGV
        # the version in the 5.10 core has a function that can
        # read arguments from any array, not just @ARGV.
        local @ARGV = @$aArgs;
        Getopt::Long::GetOptions(my $hOpts={}, @$aRules);
        return $hOpts;
    }

    #-------------------------------------------------------
    my $aRules = [ 'name=s'        #allow any string as value
                 , 'val|v=i'       #allow --val or -v, require number
                 , 'list=s@{,}'    #allow arbitrary lists of values
                 ];
    my $hParsed = parseArgs($aRules, \@ARGV);
    print Dumper($hParsed);

    For all of these various combinations of command line arguments it would produce the same data structure:

    $VAR1 = {
              'name' => 'joe',
              'val' => 0,
              'list' => [
                          'a',
                          'b',
                          'c'
                        ]
            };

    Best, beth

    Update: added example with multivalued option followed by other options, as per request of jakobi.
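The comment in the code above mentions that newer Getopt::Long versions can read from any array. A minimal sketch of that variant, assuming a Getopt::Long recent enough to export GetOptionsFromArray; the helper name parse_from_array is hypothetical and not part of the module:

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: parse a copy of the given argument list into a hash.
sub parse_from_array {
    my ($rules, $args) = @_;
    my @copy = @$args;              # GetOptionsFromArray modifies its array
    my %opts;
    GetOptionsFromArray(\@copy, \%opts, @$rules)
        or die "bad command line\n";
    return (\%opts, \@copy);        # parsed options, leftover arguments
}

# Same rules as in the post above.
my @rules = ('name=s', 'val|v=i', 'list=s@{,}');
my ($opts, $rest) = parse_from_array(\@rules,
    [qw(--name joe --list a b c --val=0)]);

print "name: $opts->{name}\n";
print "val:  $opts->{val}\n";
print "list: @{ $opts->{list} }\n";
```

Working on a copy leaves the caller's array untouched, which avoids the `local @ARGV` step needed with older versions.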

      It made me curious enough to ask the author himself for clarification wrt code vs docs. Here's part of Johan's reply to me:

      > /me
      > According to my reading of the 2.38 docs however, the --list option
      > should have eaten -v as well, as it states that args can start with
      > - or --. Which I'd take to mean that all further elements in @ARGV
      > are eaten by --list.
      
      /johan
      Args can start with -/-- only if mandatory. list=s@{,} allows for
      many arguments, but does not require them. list=s@{,4} would *require*
      4 arguments and eat -/-- args if necessary.
      

      He also pointed me to !FINISH to help with stopping option parsing early. Together with ::Long's option aliasing, this should suffice to replace some of my looping thru @ARGV to massage arguments before getopt :).

      Thanx to Johan & Beth for her excellent example above!

Re: generic getopt to hash
by ikegami (Patriarch) on Oct 01, 2009 at 18:15 UTC

    I've never seen a program that expects arguments like --list a b c. Normally, it's --list a --list b --list c.

    Anyway, how can you tell the difference between the following without configuration:

      --list a b c  ⇒  $opts{list} = [qw( a b c )]; @ARGV = qw( );
      --name a b c  ⇒  $opts{name} = 'a';           @ARGV = qw( b c );
      --flag a b c  ⇒  $opts{flag} = 1;             @ARGV = qw( a b c );
    

    You'll need to add restrictions.
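The ambiguity can be seen by parsing the same arguments under three different specifications. This is only a sketch: it assumes Getopt::Long with GetOptionsFromArray and its default configuration, and the option name opt and the helper try_spec are made up for illustration.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Parse the same four words under a given option spec and report
# both the stored value and whatever is left unparsed.
sub try_spec {
    my ($spec) = @_;
    my @args = qw(--opt a b c);
    my %opts;
    GetOptionsFromArray(\@args, \%opts, $spec) or die "parse failed\n";
    return ($opts{opt}, [@args]);
}

my ($as_list,   $rest_list)   = try_spec('opt=s@{,}');  # list option
my ($as_scalar, $rest_scalar) = try_spec('opt=s');      # single value
my ($as_flag,   $rest_flag)   = try_spec('opt');        # boolean flag

print "list:   [@$as_list] leftover: [@$rest_list]\n";
print "scalar: $as_scalar leftover: [@$rest_scalar]\n";
print "flag:   $as_flag leftover: [@$rest_flag]\n";
```

Only the spec string differs between the three calls, which is the configuration a fully generic parser would have to guess.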

    Updated format

      In his defense, Getopt::Long does support options taking multiple values just as he expressed. Of course, it does require configuration. I used it to implement an "arguments before -x are plaintext, after -x are xml" logic. I'm sure I could have done better, but it worked.

      print pack("A25",pack("V*",map{1919242272+$_}(34481450,-49737472,6228,0,-285028276,6979,-1380265972)))

        Thanx for the pointer.

        According to the docs, the spec = type [ desttype ] [ repeat ] allows things like GetOptions ('list=s{1,}') (?) for a variable-length list. That should work nicely for lists of positive integers (i). But for strings (s) it seems to have a slight problem: all remaining args from @ARGV are eaten [docs contradict the implementation **], even if you'd like to stop the list before, say, the next string starting with a minus [this works], or before an argument that exists in the filesystem, which you may or may not want to treat as the first non-option argument instead [unsupported, AFAICS].

        An argument on why such vararg-options might sometimes be convenient: quoting s/hell. The normal way to pass a multi-word list as the argument to an option is probably a single string that gets split on whitespace. But if the words themselves are allowed to contain both quoting and whitespace, it might be worthwhile to skip having both shell and perl do quote interpolation, and thus shave off an extra layer or two of the quoting one has to use when entering the command in the shell.

        Is there a getopts variant that allows a bit more for such a scenario?

        Updated: ** is from my reading of the docs for 2.38, while Elisheva's code shows that the current module does stop "eating" at the start of a new option. Kudos to Elisheva for her comment below!
Re: generic getopt to hash
by bv (Friar) on Oct 01, 2009 at 17:46 UTC

    But of course! Otherwise, how does it know that --name needs a value? Or that --list takes more than one value? This is the way the getopt libraries work. If you want something different, you will likely have to write your own. Simply, it might look like this:

    sub magic_opts {
        my ($lastopt,%opthash);
        for (split) {
            if (/^-+(.*)/) {
                $lastopt=$1;
                $opthash{$lastopt}=[];
            } else {
                push @{$opthash{$lastopt}}, $_;
            }
        }
        return \%opthash;
    }

    Of course this code is buggy, specifically it does not handle the case if there was no previous --option. Good luck!

    print pack("A25",pack("V*",map{1919242272+$_}(34481450,-49737472,6228,0,-285028276,6979,-1380265972)))

      Your code doesn't allow for parameters that aren't options. Let's introduce --. And let's handle the case with no previous --option. Then you can do

      command --option ... -- file1 file2

      sub parse_args {
          my $last_opt = '';
          my %opts;
          while (@ARGV) {
              my $arg = shift(@ARGV);
              if ($arg eq '--') {
                  last;
              }
              elsif ($arg =~ s/^--?//) {
                  $opts{$last_opt = $arg} = [];
              }
              else {
                  push @{ $opts{$last_opt} }, $arg;
              }
          }

          unshift @ARGV, @{ delete $opts{''} } if $opts{''};
          return \%opts;
      }

      Supporting val=0 as an option makes no sense. It would prevent you from having values with equal signs in them.

      If -list a b c wasn't supported, we could do away with requiring --.

      That's what I was looking for. Thanks.

      I've tweaked one line though:

      $opthash{$lastopt}=[] unless ref $opthash{$lastopt};

      So that it can handle both types of lists:

      script.pl --list aa --list bb
      script.pl --list aa bb

      cheers
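Putting the pieces of this subthread together, a generic parser could be sketched roughly as follows. This is an illustration only: the name parse_generic is made up, it takes its arguments as a list rather than reading @ARGV, and it assumes every option value should end up in an array reference (so a bare flag becomes an empty list).

```perl
use strict;
use warnings;

# Hypothetical helper: collect every "-opt" / "--opt" into an array ref,
# without any predeclared option list.
sub parse_generic {
    my @args = @_;
    my (%opts, @rest);
    my $last_opt;                       # undef until the first option is seen
    while (@args) {
        my $arg = shift @args;
        if ($arg eq '--') {             # explicit end of options
            push @rest, @args;
            last;
        }
        elsif ($arg =~ s/^--?//) {      # strip one or two leading dashes
            $last_opt = $arg;
            $opts{$last_opt} ||= [];    # keep values from an earlier --opt
        }
        elsif (defined $last_opt) {
            push @{ $opts{$last_opt} }, $arg;
        }
        else {
            push @rest, $arg;           # value seen before any option
        }
    }
    return (\%opts, \@rest);
}

my ($opts, $rest) = parse_generic(qw(--list aa --list bb --name joe -- file1));
print "list: @{ $opts->{list} }\n";
print "name: @{ $opts->{name} }\n";
print "rest: @$rest\n";
```

As noted earlier in the thread, without configuration the parser cannot tell a single-valued option from a list, so callers have to unwrap one-element lists themselves.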

Re: generic getopt to hash
by jakobi (Pilgrim) on Oct 01, 2009 at 17:39 UTC

    I don't remember any getopt with multiple-argument options, but I'm looking forward to the other comments, esp. concerning using vararg-style options intermixed with other options.

    A very very ugly way would be normally using getopt, but hiding the list from getopt, with some guesswork on end-of-list vs. next option or first file argument. A sin I committed when ditching an ugly getopts.pl for a more modern approach.

    If you need to do this for a last-ditch fall-back and really want to see an example of doing this, search getopt in github.com/jakobi/script-archive/blob/master/cli.list.grep/Compact_pm/Grep.pm (esp. the boolean/-B stuff).

    Then again, for myself, I tend to enjoy the greater freedom in option naming when not using getopt*, and perl's sufficiently expressive to keep the size manageable.

    Not linked to protect innocent eyes and Mr. Christiansen :)

    update1 due to typo and typing trying too hard to catch up with brains.

    update2: thread threatens with a chance to throw away a few screenfuls of earlier getopt cruft+workarounds with only a tiny loss in flexibility. Promising :)