HalNineThousand has asked for the wisdom of the Perl Monks concerning the following question:

This is really more academic than anything else, since it can be done with a loop, but I'm interested in this since I've just learned about the map function and am using it as a tool to learn more. (I'm self taught, so there are times I go back and look for obvious things I've missed.)

When I read about map, I thought it might be an easy way to create a hash from @ARGV, then realized that what I want would take three hashes. For my use, I have three kinds of arguments: settings (like foo=bar, usually written --foo=bar), flags or switches (like --redirect or --nooutput), and file names (given as just the file name).

I'm also wondering if I'm not clear on the range of what's possible for expressions, so I've been experimenting. I found that in three lines I could pull out all the settings and put them in one hash with a "key => value" arrangement, that I could also get a list of all the filenames (arguments without a double dash in front of them), and get a hash of all the switches, with each switch name being a key in the hash and the corresponding value being a 1.

Here's what I did:

    %setting = map {/^--(.*?)=/ => /=(.*)$/} @ARGV;
    %file = map {$_ => (s/^([^=-]*)$/$1/)} @ARGV;
    %switch = map {$_ => (s/^--([^=]*)$/$1/)} @ARGV;

The first one is the only one I think works well. It pulls out the key and value for each argument in @ARGV that starts with "--" and has an equals sign, like "--foo=bar." The 2nd one had to go next since the 3rd messes up the arguments for future passes through @ARGV. In the 2nd, I'm counting on the value for the number of replacements made to provide a 0 or a number. If it's 0 and no replacements are made, then the hash value for that argument will be 0, or false.

The third works like the second, but would always have to go last, since it changes the values in @ARGV.

As I mentioned, this is academic and not something I have to have, but now I'm curious and wondering just how much could be done. For example, is there any way I could use the map function and pull out all the settings (--foo=bar) AND also pull out all the switches (--redirect) by using a regex to find a string that starts with "--" and has an "=" in it and pulls the value from after the equals, or, if there is no equals, pulls the entire string as a key and creates a 1 as a value in the hash?

In other words, it'd change "--foo=bar" into a hash key/value set like "foo => bar" but it would also convert "--redirect" to the key/value set "redirect => 1". And if it did that, then I'd like to put filenames in the hash, as well. (And if I could make each filename a key and make the value a number so the values tell the order of the filenames, that'd be great, but I don't think that's possible.)

I don't know if I'm being clear enough. I'm still getting used to map, and this takes me into a new area with regexes. I'm just wondering if there's enough flexibility to do all this, and do it without going through @ARGV multiple times.

Re: Parsing @ARGV w/ Map Function
by ELISHEVA (Prior) on Feb 14, 2011 at 07:18 UTC

        I'm just wondering if there's enough flexibility to do all this....

    map's a great tool and I also wanted to see just how far I could go with it when I first learned it.

    Map does indeed have the flexibility you seek. That {...} block can contain nearly anything, including long if...elsif...else statements. Thus anything you can do with a sequence of map statements most likely can be done in one.

    The main limitation on map is this: you can't "return" from within the {...} block. If you try, you will return from the surrounding subroutine and not just quit mapping or skip an array element. What you can do if you want to skip an array element (e.g. a bad command option) is return an empty list: (). You can also do that if you want to put the data for that particular array element into some other data structure than the hash you are building.
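    A small sketch of that limitation, assuming two made-up subroutines (first_pass and skip_pass are names invented for this illustration). The first uses return inside the block and leaves the whole subroutine; the second yields an empty list and simply skips the element:

```perl
use strict;
use warnings;

# Hypothetical helper: return inside the map block exits first_pass()
# itself, so the mapping stops at the first argument starting with "-".
sub first_pass {
    my @out = map { return "left early at $_" if /^-/; $_ } @_;
    return "finished with @out";
}

# Hypothetical helper: an empty list skips the element and map carries on.
sub skip_pass {
    my @out = map { /^-/ ? () : $_ } @_;
    return "finished with @out";
}

print first_pass('a', '-b', 'c'), "\n";
print skip_pass('a', '-b', 'c'), "\n";
```

    The first call prints "left early at -b" and the second prints "finished with a c".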

    Your original spec would look something like this. Note the use of if..elsif to vary the result of the map based on a regular expression match. This works because map takes the value of the last statement evaluated in the block, not the last line that visually appears in it:

    use strict;
    use warnings;
    use Data::Dumper;

    my $i = 0;
    my %hARGV = map {
        if (/^--(\w+)=(\w+)$/) {
            # option => value
            $1 => $2;
        } elsif (/^--(\w+)$/) {
            # flag => 1
            $1 => 1
        } elsif (m{^[\w/]}) {
            # filename => fileorder
            $_ => $i++;
        } else {
            # bad argument - add nothing to the result hash
            warn "Invalid option: <$_> - options must begin with "
                ."-- or be a legal file name";
            ();
        }
    } @ARGV;

    print Dumper(\%hARGV);

    You could also stuff the filenames into an array so that you wouldn't have to reconstruct the array order by scanning the hash. This example also illustrates the technique of returning () when you want to do something with an array element other than place it immediately in the hash:
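    A minimal sketch of that variant, assuming a sample @args list in place of @ARGV and the made-up names @files and %opts. Anything that is not an option is treated as a file name here, which is a simplification:

```perl
use strict;
use warnings;

# Sample arguments stand in for @ARGV so the sketch runs on its own.
my @args = ('--foo=bar', '--redirect', 'first.txt', 'second.txt');

my @files;
my %opts = map {
    if (/^--(\w+)=(\w+)$/) {
        $1 => $2;            # option => value
    } elsif (/^--(\w+)$/) {
        $1 => 1;             # flag => 1
    } else {
        push @files, $_;     # keep file names in command-line order
        ();                  # add nothing to the hash for this element
    }
} @args;

print "options: ", join(', ', map { "$_=$opts{$_}" } sort keys %opts), "\n";
print "files:   @files\n";
```

    The array keeps the file order for free, so there is no need for a counter in the hash.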

    The main thing to worry about with map is going a little bit crazy and trying to do everything in a map (I know I did at first). Usually, if the map block gets to be more than a few lines I'll define a subroutine and call that subroutine within map, like this:
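    A sketch of that layout, assuming a helper named parse_arg (a name made up for this example). Inside a real subroutine, return is fine, and a bare return hands map an empty list:

```perl
use strict;
use warnings;

# parse_arg() is a hypothetical helper. It returns a key/value pair for
# options and flags, and an empty list for anything else.
sub parse_arg {
    my ($arg) = @_;
    return ($1 => $2) if $arg =~ /^--(\w+)=(\w+)$/;   # option => value
    return ($1 => 1)  if $arg =~ /^--(\w+)$/;         # flag => 1
    return;                                           # not an option: skip it
}

# Sample arguments stand in for @ARGV so the sketch runs on its own.
my @args = ('--foo=bar', '--nooutput', 'notes.txt');
my %opts = map { parse_arg($_) } @args;

print "$_ => $opts{$_}\n" for sort keys %opts;
```

    The map line stays short, and the parsing rules can be tested by calling parse_arg directly.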

    On a final note, learning exercises aside, for real option processing, do take a look at the core module Getopt::Long. It can also convert @ARGV into a hash and goes well beyond the list of argument processing features you discussed above: collapse multiple options into arrays, auto define options (e.g. --noredirect as well as just --redirect), and much, much more.
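    A brief sketch of the hash form, assuming the option names from the question. GetOptionsFromArray is used with a sample @args list only so the example does not depend on the real command line; GetOptions works the same way on @ARGV:

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Sample arguments stand in for @ARGV so the sketch runs on its own.
my @args = ('--foo=bar', '--noredirect', '--nooutput', 'first.txt', 'second.txt');

my %opts;
GetOptionsFromArray(
    \@args,
    \%opts,
    'foo=s',       # --foo takes a string value
    'redirect!',   # negatable flag: --redirect or --noredirect
    'nooutput',    # plain flag
) or die "Bad options\n";

# Options are removed from @args as they are parsed, so whatever is
# left is the list of file names.
print "$_ => $opts{$_}\n" for sort keys %opts;
print "files: @args\n";
```

    Here --noredirect stores 0 under the redirect key, and the two file names are what remain in @args.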

      Wow!

      While all these answers are helpful, this is the one that opened several doors for me that I wasn't even thinking about. From what I read I didn't think whole statements would fit inside map, much less just calling a subroutine.

      I played around and got things working quickly from your examples. (Before that, I was going over conditional regex abilities, and not getting too far.) Once I saw that, it was easy, and an if...elsif...else block did fine for testing (though it ignored malformed arguments; this is just an experiment so far). Then I looked back and found that what I came up with was pretty close to what you had come up with.

      But there is one big question I have when looking at your 2nd example. I wasn't sure if map was a loop or if it handled data in other ways. So what is the difference in whether I use map or a foreach loop with almost the same statements in it? Is there a speed difference or anything else?

      I will be looking at Getopt::Long. It's amazing the things you can miss that everyone else considers standard when you're self-taught. It's the kind of thing where I might see someone using it and say to them, "You never told me you could do that," and the response is, "You never asked." Sometimes there's so much it's hard to know what to ask about what is out there.

      Thank you, everyone, for the helpful posts. While this is quick and simple, so it might get used on some small programs, I can see Getopt::Long has much more to use for parsing arguments. My main intent was that I knew there had to be a way to do this and I wanted to see what it was and what it included that I didn't know about.

Re: Parsing @ARGV w/ Map Function
by wind (Priest) on Feb 14, 2011 at 06:01 UTC
    This is not a direct answer to your question, but I'd like to suggest that you take a look at the CPAN modules Getopt::Long and Pod::Usage. These can help you with parameter processing and good documentation for your scripts, like the following example using the params you stated:
    #!/usr/bin/perl

    use Getopt::Long;
    use Pod::Usage;

    # Parameters
    our $foo = '';
    our $redirect = '';
    our $nooutput = '';

    INIT {
        my $help = '';

        GetOptions(
            'help|?'   => \$help,
            'foo=s'    => \$foo,
            'redirect' => \$redirect,
            'nooutput' => \$nooutput,
        ) or pod2usage(2);

        pod2usage(-exitstatus => 0, -verbose => 2) if $help;
    }

    print "Hello World\n";
    print "Files are " . join(',', @ARGV) . "\n";

    1;

    __END__

    =head1 NAME

    yourscript.pl - This is what I DO

    =head1 SYNOPSIS

    ./yourscript.pl

    =head1 DESCRIPTION

    I'm a description.

      --foo       I take a string
      --redirect  I'm a trigger
      --nooutput  I'm a trigger too
      --help      This help text

    =head1 AUTHOR

    Written by you

Re: Parsing @ARGV w/ Map Function
by jwkrahn (Abbot) on Feb 14, 2011 at 06:46 UTC
        %setting = map {/^--(.*?)=/ => /=(.*)$/} @ARGV;
    The first one is the only one I think works well.

    It would also work just as well as:

    my %setting = map /^--(.*?)=(.*)$/, @ARGV;

        %file = map {$_ => (s/^([^=-]*)$/$1/)} @ARGV;
        %switch = map {$_ => (s/^--([^=]*)$/$1/)} @ARGV;
    The 2nd one had to go next since the 3rd messes up the arguments for future passes through @ARGV.

    Any expression that modifies $_ also modifies @ARGV, so both the 2nd and 3rd modify @ARGV. However, you don't have to modify $_ to get the results that you want:

    my %file = map { $_ => /^([^=-]*)$/ } @ARGV;
    my %switch = map { $_ => /^--([^=]*)$/ } @ARGV;
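
    A short sketch of the aliasing point, assuming a sample @args list in place of @ARGV. It also guards against one caveat: a match in list context returns an empty list when it fails, which would leave a key without a value, so this version wraps the match in scalar() to keep every argument paired with a true or false value. That wrapper is an addition for the illustration, not part of the lines above:

```perl
use strict;
use warnings;

# Sample arguments stand in for @ARGV so the sketch runs on its own.
my @args = ('--foo=bar', '--redirect', 'data.txt');

# s/// edits $_, and $_ is an alias for the current element, so the
# source array changes as a side effect of building the hash.
my @copy = @args;
my %switch_subst = map { $_ => (s/^--([^=]*)$/$1/) } @copy;

# A plain match leaves the array alone. scalar() yields 1 or an empty
# string, so each argument always contributes exactly one pair.
my %switch_match = map { $_ => scalar(/^--[^=]*$/) } @args;

print "after s///:  @copy\n";
print "after match: @args\n";
```

    After the first map, the second element of @copy has lost its leading dashes, while @args is unchanged after the second.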
Re: Parsing @ARGV w/ Map Function
by Anonymous Monk on Feb 14, 2011 at 06:31 UTC
        I'm just wondering if there's enough flexibility to do all this, and do it without going through @ARGV multiple times.

    Yes, there is enough flexibility to do this without going through @ARGV multiple times; map is just a glorified foreach :) This is how you start
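
    One possible starting point, sketched as a single foreach pass. The three hash names follow the question, the sample @args list stands in for @ARGV, and the matching rules are placeholders to be replaced once the real rules are decided:

```perl
use strict;
use warnings;

# Sample arguments stand in for @ARGV so the sketch runs on its own.
my @args = ('--foo=bar', '--redirect', 'first.txt', 'second.txt');

my (%setting, %switch, %file);
my $order = 0;

foreach my $arg (@args) {
    if ($arg =~ /^--([^=]+)=(.*)$/) {
        $setting{$1} = $2;         # --foo=bar  becomes foo => bar
    }
    elsif ($arg =~ /^--([^=]+)$/) {
        $switch{$1} = 1;           # --redirect becomes redirect => 1
    }
    else {
        $file{$arg} = $order++;    # file name => position among the files
    }
}

print "settings: ", join(', ', map { "$_=$setting{$_}" } sort keys %setting), "\n";
print "switches: ", join(', ', sort keys %switch), "\n";
print "files:    ", join(', ', sort { $file{$a} <=> $file{$b} } keys %file), "\n";
```

    The same three branches drop straight into a map block if a single hash is preferred.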

    You'll have to decide the rules, what combination of inputs produces what outputs (think in terms of edge cases), using what $spec (if any).