rookie_monk has asked for the wisdom of the Perl Monks concerning the following question:

Good evening Perl Gurus, I can't seem to figure out how to extract matches of multiple words that are encased by parentheses. How would I also go about counting how many it actually finds, so that I know how many variables I will need to store the individual matches? Here is an example that I made work, but it only works if it's the exact match:

my($text) = "(network)(test)(ifcfg)";
my($first, $second, $third) = $text =~ /\((.*)\)\((.*)\)\((.*)\)/;
print "$first and $second and $third\n";

I need it to do something like this:

my($text) = "(network) (test)_(ifcfg)";
my($first, $second, $third) = $text =~ /\((.*)\)/; #have this match all 3 and store them
print "$first and $second and $third\n";

I would also like it to count how many matches it makes because the input could possibly have more or less than 3 matches. Thanks in advance.... -Paul

Replies are listed 'Best First'.
Re: Extracting multiple matches from Reg Ex
by dasgar (Priest) on Sep 01, 2010 at 22:16 UTC

    I would suggest just storing the matches into an array. From there, you could easily access each element and also retrieve the number of elements in that array. The untested code below illustrates my idea.

    my($text) = "(network)(test)(ifcfg)";
    my(@matches) = ($text =~ m/\((.*?)\)/g);
    my $match_count = scalar(@matches);
    print "The following $match_count matches were found:\n";
    foreach my $match (@matches) {
        print " $match\n";
    }

    In the second line, I'm searching $text for all items that are within a set of parentheses and putting each match found into the @matches array. If you test this out with varying values for $text, the match-finding part will still work correctly.
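
As a quick check of that claim, here is a minimal sketch that runs the same list-context match over a few inputs with different numbers of groups. The helper name extract_groups and the extra sample strings are made up for illustration and are not part of the original reply.

```perl
use strict;
use warnings;

# Hypothetical helper: return the text inside every (...) pair.
sub extract_groups {
    my ($text) = @_;
    # In list context, m//g returns the captures from every match.
    return $text =~ m/\((.*?)\)/g;
}

for my $text ("(network)(test)(ifcfg)", "(network) (test)_(ifcfg)",
              "(one)", "no parens here") {
    my @matches = extract_groups($text);
    printf "%d match(es) in '%s': %s\n",
        scalar @matches, $text, join(' and ', @matches);
}
```

The array grows or shrinks with the input, so there is no need to know the number of variables in advance.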

      I would like to thank everyone for their help and input. These are all great ways to do it and I will surely use it to accomplish what I am trying to do. Thanks again. -Paul
Re: Extracting multiple matches from Reg Ex
by ww (Archbishop) on Sep 01, 2010 at 22:32 UTC

    Part 1:
    Your problem may be that you've escaped the wrong parens [1]:

    my($first, $second, $third) = $text =~ /(\([a-z]*\))(?:\s|_|$)/gi;
    print "$first and $second and $third\n";

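    Part 2:

    # $2 is never set here (the second group is non-capturing), hence the
    # suppressed warning. s///g returns the number of substitutions made;
    # as a side effect it strips the separators from $text.
    no warnings 'uninitialized';
    my $count = $text =~ s/(\([a-z]*\))(?:\s|_|$)/$1$2/gi;
    print $count;
    use warnings;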

    output, parts 1 and 2:

    (network) and (test) and (ifcfg)
    3
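
If only the count is needed, one common Perl idiom gets it without modifying the string or silencing warnings: assign the list-context match to an empty list and read that assignment in scalar context. The sketch below is an alternative to the substitution above, not the reply author's method; the helper name count_groups is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: count parenthesized groups, leaving the string untouched.
sub count_groups {
    my ($text) = @_;
    # A list assignment evaluated in scalar context yields the number of
    # elements on its right-hand side, here one element per match.
    my $count = () = $text =~ /\([^)]*\)/g;
    return $count;
}

print count_groups("(network) (test)_(ifcfg)"), "\n";   # prints 3
```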

    Part 3:
    I'm not at all sure I understand what you want. Please clarify whether you're talking about multiple copies of a parenthesized word or simply "more than 3 parenthesized words in your string."

    Update: Oops, meant to say but forgot to write it down, split to an array would also be a valid approach -- perhaps better than above.
    Also, though I haven't worked it out, it seems that tr/// might also represent a possibility.

    Update2: Adding note 1 -- or the problem may be that I read the OP as seeking to capture the encasing parentheses... which may be true, but is *not* specifically asserted. ...and if 'untrue,' apologies for the initial observation.
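
The tr/// possibility mentioned in the first update could look roughly like the sketch below. It assumes the parentheses are balanced and never nested, so that every opening paren starts exactly one group; it can only count, not extract. The helper name count_open_parens is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: count groups by counting opening parentheses.
sub count_open_parens {
    my ($text) = @_;
    # With an empty replacement list and no /d modifier, tr/// leaves the
    # string unchanged and returns how many characters it matched.
    return ($text =~ tr/(//);
}

print count_open_parens("(network) (test)_(ifcfg)"), "\n";   # prints 3
```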

Re: Extracting multiple matches from Reg Ex
by graff (Chancellor) on Sep 01, 2010 at 23:34 UTC
    In response to this:

    I would also like it to count how many matches it makes because the input could possibly have more or less than 3 matches.

    Here's how that would look using split:

    while (<DATA>) {
        chomp;
        my @pieces = grep /[a-z]/i, split /[()]+/;
        printf( "Got %d pieces from '%s' : %s\n",
                scalar @pieces, $_, join( ' and ', @pieces ));
    }
    __DATA__
    (network)(test)(ifcfg)
    (network) (test)_(ifcfg)
    (foo).(bar)
    (one)(two)--(three)/(four),(five)
    Now, if your real input ever contains letters between a close paren and a following open paren -- e.g. "(foo) X (bar)" -- this approach will catch those as "pieces" as well. If that's a problem, go with Text::Balanced as suggested above.
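
To make that caveat concrete, the sketch below runs the split approach and a plain regex match side by side on the "(foo) X (bar)" example. The regex match is a simpler stand-in used here for comparison, not Text::Balanced, and both helper names are hypothetical.

```perl
use strict;
use warnings;

# The split approach from above: any piece containing a letter survives.
sub pieces_by_split {
    my ($line) = @_;
    return grep { /[a-z]/i } split /[()]+/, $line;
}

# Comparison approach: keep only text between a '(' and the next ')'.
sub pieces_by_match {
    my ($line) = @_;
    return $line =~ /\(([^)]*)\)/g;
}

my $line = "(foo) X (bar)";
my @by_split = pieces_by_split($line);
my @by_match = pieces_by_match($line);
printf "split found %d pieces, match found %d pieces\n",
    scalar @by_split, scalar @by_match;
```

Here split reports three pieces, because the stray " X " contains a letter, while the match reports only the two parenthesized words.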
Re: Extracting multiple matches from Reg Ex
by johngg (Canon) on Sep 01, 2010 at 23:11 UTC
Re: Extracting multiple matches from Reg Ex
by aquarium (Curate) on Sep 02, 2010 at 00:53 UTC
    a larger sample data set would be more useful, as your first example is substantially different to the second. but in any case it sounds like you'll have a variable number of tokens/words, each inside a round bracket pair. rather than trying to match the whole set of tokens in one go with a complex regex, it's probably better to match each token with a regex and use the g modifier to drive a loop, wherein you can build an array or even a linked list structure for the one line of data. a linked list structure in perl becomes an easily navigable hash, and you can even make decisions based on matching a particular token at a given level, if further semantic validation of the data is required.
    note that in regexes i personally prefer the idiom: starting delimiter, followed by anything that is not the ending delimiter, followed by the ending delimiter; so as not to accidentally eat up too much with a .*
    example untested code..insert own code where comments are
    while (my $text = <>) {
        chomp $text;
        # undef/reset preferred data structure before loop
        while ($text =~ /\(([^)]+)\)/g) {
            my $token = $1;
            # append token to chosen data structure and such
        }
    }
    the hardest line to type correctly is: stty erase ^H
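
The difference between a greedy .* and the "not ending delimiter" idiom described above can be seen on the original poster's second string. This is only an illustrative sketch; the two helper names are invented for it.

```perl
use strict;
use warnings;

# Greedy version: .* runs to the end of the string, then backs up to the LAST ')'.
sub greedy_groups {
    my ($text) = @_;
    return $text =~ /\((.*)\)/g;
}

# Delimiter idiom: [^)]+ cannot cross a ')', so each match stops at the first one.
sub negated_groups {
    my ($text) = @_;
    return $text =~ /\(([^)]+)\)/g;
}

my $text = "(network) (test)_(ifcfg)";
print "greedy:  [$_]\n" for greedy_groups($text);
print "negated: [$_]\n" for negated_groups($text);
```

The greedy pattern yields a single capture, "network) (test)_(ifcfg", while the negated character class yields the three separate words.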
Re: Extracting multiple matches from Reg Ex
by perlpie (Beadle) on Sep 02, 2010 at 00:12 UTC
    my($text) = "(network)(test)(ifcfg)";
    my(@matches) = ($text =~ m/\((.*?)\)/g);
    print "The following @{[ $#matches+1 ]} matches were found:\n";
    print join(' and ', @matches) . "\n";

    Just tweaking the version dasgar provided to remove a variable and generate the output in your original question.