mrnoname1000 has asked for the wisdom of the Perl Monks concerning the following question:

Hiya, Perl newbie here! I've been learning it alongside Ruby with my background consisting of mostly shell, C, and Python (the latter of which takes a wholly different approach to argument parsing).

As I write more Perl scripts with options, I prefer my option parser to be very strictly configured. My problem is that no matter how I configure Getopt::Long, it always accepts single character options with two preceding dashes. I achieved a compromise when rewriting one of my uglier sh+awk scripts, and here's a stripped down example:

    use Getopt::Long qw( :config posix_default gnu_compat bundling no_auto_abbrev no_ignore_case );

    my $column = 1;

    GetOptions(
        'column|c=i' => \$column,
    ) or die;

    die if $column < 1;

I want this program to accept exactly two strings as options, --column and -c, but it also accepts --c. I want the prefix for short options to be - and only -, similar to how -- is treated for long options when bundling is enabled. I had hoped qw( :config prefix_pattern - long_prefix_pattern -- ) would get me what I want, but it seems like the latter isn't processed if the former doesn't match. Anyone know of a clean way to achieve this?

Also, feel free to suggest improvements or mention best practices! If anyone wants, I can post the full script as well; it's intended to format numbers from stdin/files into human-readable sizes like 681.2K, 12.1M, 3.5G. I should probably figure out my preferred license as well...


Replies are listed 'Best First'.
Re: Enforce single hyphen for single character options in Getopt::Long
by parv (Parson) on Feb 24, 2024 at 22:40 UTC
    use Getopt::Long qw( :config posix_default gnu_compat bundling no_auto_abbrev no_ignore_case );

    my $column = 1;

    GetOptions(
        'column|c=i' => \$column,
    ) or die;
    ...

    I want this program to accept exactly two strings as options, --column and -c, but it also accepts --c. I want the prefix for short options to be - and only -, similar to how -- is treated for long options when bundling is enabled

    posix_default resets the configuration to what it would be if POSIXLY_CORRECT had been set, and then from Configuring Getopt::Long ...
    prefix_pattern
    A Perl pattern that identifies the strings that introduce options. Default is --|-|\+ unless environment variable POSIXLY_CORRECT has been set, in which case it is --|-.

      Post-detour, from some combinations of G::L options of ...

      (nothing provided)
      prefix=-
      long_prefix_pattern=--
      posix_default
      getopt_compat
      no_posix_default
      no_getopt_compat

      ... only the following combinations produce the desired outcome of failure on the --o option ...

      prefix=-
      long_prefix_pattern=--
      prefix=-, long_prefix_pattern=--

      After running the program below, above was obtained thus ...

      perl5.36 getopt-test-combination.pl > out 2> err
      grep '^ok.+input:.+--o' out

      ... there could be bugs or something being overlooked.

      Update: Oh, there is a deficit! The options listed above fail when --o is the only option, but set a value when the options are -o --o ...

      grep --color=always '^not ok.+input: -o --o.+parsed: [0-9]+.+(long_)?prefix' out | head -n 3
      not ok 6 - input: -o --o; parsed: 1; conf: ( prefix=- )
      not ok 9 - input: -o --o; parsed: 1; conf: ( long_prefix_pattern=-- )
      not ok 30 - input: -o --o; parsed: 1; conf: ( prefix=-, long_prefix_pattern=-- )
      ...

      ... same as noted in OP. Nothing has changed. Sh?ucks (╯°□°)╯︵ ┻━┻

        sub parse_option( @conf ) {
          ...
          my $option;
          ...
          try {
            # Use incremented value type to count number of times options matched.
            GetOptions( 'option|o+' => \$option ) or die $!;
            $show_progress and warn qq[option set: $option];
          }
          catch {
            warn qq[Option parsing failed of @arg: $_];
            ## CHANGED.
            # Need to set so as death of "GetOptions" does not.
            $option = undef;
          }
          ;
          return $option;
        }

        With above change, the 3 sets of options now also behave the same for both inputs of --o & -o --o.

Re: Enforce single hyphen for single character options in Getopt::Long
by Danny (Chaplain) on Feb 26, 2024 at 00:43 UTC
    It seems that if you set prefix_pattern or prefix to '-' then setting long_prefix_pattern to '--' doesn't recognize the --long_opt. You can add debug as a config option to get an idea of what it's doing. In Getopt/Long.pm in the FindOption function it first checks if the option starts with '-' if prefix is '-' and it splits the option into $starter=- and the rest of the option into $opt. After this it checks $starter=~/^$longprefix$/ which doesn't match if $longprefix is '--'. That is the only place in the module that evaluates $longprefix so it seems like a bug to me.
      For what it's worth, as the module is written, long_prefix_pattern doesn't seem to be intended as a way to distinguish short and long option prefixes per se. Instead it is intended as a subset of the general prefixes that allow the --option=value syntax in addition to the --option value syntax.
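The two-step check described in this reply can be modelled with a few lines of plain Perl. This is only a simplified sketch of the behaviour as described above, not the module's actual code; the function name split_prefix and its argument order are made up for illustration.

```perl
use strict;
use warnings;

# Simplified model of the prefix handling described in the reply above.
# NOT Getopt::Long's real code; split_prefix() is a hypothetical name.
sub split_prefix {
    my ($arg, $prefix, $longprefix) = @_;

    # Step 1: the general prefix pattern decides whether $arg is an option
    # at all, and how much of it counts as the "starter".
    return unless $arg =~ /^($prefix)(.*)$/s;
    my ($starter, $rest) = ($1, $2);

    # Step 2: only the already-split starter is compared with the long prefix.
    my $is_long = $starter =~ /^$longprefix$/ ? 1 : 0;

    return ($starter, $rest, $is_long);
}

for my $case (
    [ '--column', '--|-', '--' ],    # POSIX default prefix pattern
    [ '--c',      '--|-', '--' ],
    [ '--column', '-',    '--' ],    # prefix narrowed to a single dash
) {
    my ($arg, $prefix, $longprefix) = @$case;
    my ($starter, $rest, $is_long) = split_prefix($arg, $prefix, $longprefix);
    printf "%-9s prefix=%-5s starter=%-3s rest=%-8s long=%d\n",
        $arg, $prefix, $starter, $rest, $is_long;
}
```

In this model, with the default pattern, --c splits into the starter -- and the name c, which is then looked up like any other name. With the prefix narrowed to -, the starter of --column can only ever be a single dash, so it never matches a long prefix of --.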
Re: Enforce single hyphen for single character options in Getopt::Long
by parv (Parson) on Feb 24, 2024 at 22:34 UTC
    Could you rephrase and/or explain more the "latter isn't processed" part ...
    I had hoped qw( :config prefix_pattern - long_prefix_pattern -- ) would get me what I want, but it seems like the latter isn't processed if the former doesn't match.
    ... ?

      Sure, it seems like arguments are checked against prefix_pattern to determine whether they're an option, and only after that succeeds are they additionally checked against long_prefix_pattern. I was hoping the two patterns would be checked separately so I could get mutually exclusive short/long option prefixes.

      In essence, I just want to disallow the use of --c in this example. This may not be possible without some unholy hacks, but I'm open to experimenting!
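One possible hack, sketched here under assumptions, is to inspect @ARGV before Getopt::Long sees it and refuse any single-character option written with two dashes. The helper name double_dash_short_opts is hypothetical and the error message is only an example; nothing here is part of Getopt::Long.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Getopt::Long): list the arguments that look
# like a single-character option written with two dashes, e.g. --c or --c=2.
sub double_dash_short_opts {
    my @bad;
    for my $arg (@_) {
        last if $arg eq '--';    # conventional end-of-options marker
        push @bad, $arg if $arg =~ /\A--[A-Za-z0-9](?:=.*)?\z/s;
    }
    return @bad;
}

# Intended use, before calling GetOptions:
if ( my @bad = double_dash_short_opts(@ARGV) ) {
    die "Single-character options take one dash: @bad\n";
}
```

The check scans every argument up to a bare --, so an option value that itself looks like --c would also be rejected. That is probably acceptable for a script whose values are numbers and file names, but it is a limitation of this approach.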

Re: Enforce single hyphen for single character options in Getopt::Long
by Don Coyote (Hermit) on Feb 29, 2024 at 14:37 UTC

    hello mrnoname1000 and welcome to the Monastery.

    Such requirements will probably subvert the conventions of some systems or other.

    However, a slight manipulation of @ARGV at the right place will let you restrict the input to your requirement.

    The return value from the grep will be true in scalar context when something matches, so you could use the expression as a conditional rather than an assignment if you also wanted to amend the default value in some way based on this format of option.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Getopt::Long qw( :config posix_default gnu_compat bundling no_auto_abbrev no_ignore_case );

    my $column=1;

    @ARGV = grep $_ !~ /\A--c\b/, @ARGV;
    #if( grep s/\A--c\b=?.*//, @ARGV ){ $column = 100 };

    # \b can be replaced with (?!o) for finer resolution
    #if( grep s/\A--c(?!o)=?.*//, @ARGV ){ $column = 100 };

    GetOptions(
        'column|c=i' => \$column,
    ) or die;

    print "\$column is $column\n";
    die if $column < 1;

    __END__

    # grep filtered
    $ ~/Desktop/gol.pl --c 2 --c=2 --c= 2 --c = 2
    $column is 1

    # grep substitution conditional
    $ ~/Desktop/gol.pl --c= 2 --c = 2 --c 2 --c=2
    $column is 100

    The first way greps all the arguments that don't match your condition, and assigns the newly created list back to @ARGV.

    The second way uses a substitution to clear the element, and returns true if it occurs, allowing the conditional block to be entered.

    OK, here's the one that didn't quite work at first. I almost posted it as a response that solved the problem, but it also discounted the long option without an equals sign.

    #!/usr/bin/perl
    # first attempt, non-working as the
    # long option is also discounted
    use strict;
    use warnings;
    use Getopt::Long qw( :config posix_default gnu_compat bundling no_auto_abbrev no_ignore_case );

    my $column = 1;

    #print map "[$_]", @ARGV, "\n";

    #exclude the unwanted format with a filtering grep
    @ARGV = grep $_ !~ /\A--c=/, @ARGV;

    # remove the offending format using substitution operator and test to see if you did
    # if( grep s/\A--c=.*//, @ARGV ){ $column = 100 };

    #print map "[$_]", @ARGV, "\n";

    GetOptions(
        'column|c=i' => \$column,
    ) or die;

    print "\$column is $column\n";
    die if $column < 1;

    __END__

    $ ~/Desktop/gol.pl --column=2
    $column is 2
    $ ~/Desktop/gol.pl -c2
    $column is 2
    $ ~/Desktop/gol.pl --c=2
    $column is 1
    $ ~/Desktop/gol.pl -c=2
    Value "=2" invalid for option c (number expected)
    Unknown option: =
    Unknown option: 2
    Died at ~/Desktop/gol.pl line 19.

    # using grep and substitution as conditional
    $ ~/Desktop/gol.pl --c=2
    $column is 100

    #but also doesnt allow
    $ ~/Desktop/gol.pl --column 2
    $column is 1

    I realised the code must not discount the long option, so I had to use a word-boundary match (\b), or a negative lookahead for more specific filtering.

    Hope this helps


    perl -M="Don Coyote" -e 'while(<ARGV>){$.== 1 and s!.*!pack(q{C*},043,041,057,0165,0163,0162,057,0142,0151,0156,057,0160,0145,0162,0154)!e }' *
Re: Enforce single hyphen for single character options in Getopt::Long
by Don Coyote (Hermit) on Mar 01, 2024 at 08:13 UTC

    There is something up with the way the arguments are processed. The prefix option appears to escape its value, whereas the options with the _pattern suffix do not.

    Alongside the equals sign not being documented, as mentioned recently in Solved- Getopt::Long- Trying to specify "prefix" configuration option produces error, apparently you can also supply a regex.

    With the proviso that the qr is inline, a quoted regex pattern can be supplied to prefix_pattern that appears to do what we need, without the need to import long_prefix_pattern at all.

    Though this is a fairly simple case, in that it only distinguishes between single and double option-introduction characters, and only on one system. So perhaps there is scope for further regex formulations in combination. It would be easy to expect some kind of default fallback hierarchy among these settings for varying levels of complexity. Documented intention would be even easier.

    use strict;
    use warnings;
    use Getopt::Long ( qw( :config posix_default gnu_compat bundling no_auto_abbrev no_ignore_case passthrough ),
        'prefix_pattern=' . qr/((?<=)(?=--\w{2,})--)|((?<=)(?=-\w+)-)/ );

    my $column=1;

    GetOptions(
        'column|c=i' => \$column,
    #   'c=i'=> \$column,
    ) or die;

    # with passthrough, tell the user their option wasnt applied
    # print map "[$_]", @ARGV, "\n";

    print "\$column is $column\n";
    die if $column < 1;

    __END__

    $ ~/Desktop/gol.pl --column=2
    $column is 2
    $ ~/Desktop/gol.pl --column 2
    $column is 2
    $ ~/Desktop/gol.pl --c=2
    $column is 1
    $ ~/Desktop/gol.pl --c 2
    $column is 1

    #'column|c=i' => \$column,
    $ ~/Desktop/gol.pl -c=2
    $column is 1

    # 'column=i' => \$column,
    # 'c=i'=> \$column,
    $ ~/Desktop/gol.pl -c=2
    Value "=2" invalid for option c (number expected)
    Died at /c/Users/archimediiiica/Desktop/gol.pl line 53.
    $ ~/Desktop/gol.pl -c2
    $column is 2

    #passthrough
    $ ~/Desktop/gol.pl --c 2
    [--c][2][
    ]$column is 1

    #passthrough
    $ ~/Desktop/gol.pl --c=2
    [--c=2][
    ]$column is 1

    The difference between the two alternatives in the GetOptions call is that the combined 'column|c=i' form does not generate the warning for -c=2.

    Importing passthrough shows that this is still preferable to my earlier @ARGV-processing solution, as it keeps the failed option and value pair together. It would be kind to let the user know that some option was not applied, otherwise they may get unexpected results from default values.
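Reporting options that were not applied could look roughly like the sketch below. It uses the default prefixes rather than the custom prefix_pattern above, and --bogus is a made-up option used only to show the mechanism: with pass_through, arguments that Getopt::Long does not recognise are left in the argument list, where the script can inspect them.

```perl
use strict;
use warnings;
use Getopt::Long qw( GetOptionsFromArray :config pass_through no_ignore_case );

# Sample argument list for illustration; a real script would pass \@ARGV.
my @args   = qw( --bogus=2 --column 3 );
my $column = 1;

GetOptionsFromArray( \@args, 'column|c=i' => \$column ) or die;

# With pass_through, anything Getopt::Long did not recognise stays in @args.
my @leftover = grep { /\A-/ } @args;
warn "Option not applied: $_\n" for @leftover;

print "\$column is $column\n";
```

Whether to warn or to die on leftover options is a design choice; dying is closer to the strict parsing asked for in the original question.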


    Dont Coyote