Extracting multiple matches from Reg Ex

rookie_monk has asked for the wisdom of the Perl Monks concerning the following question:

Re: Extracting multiple matches from Reg Ex by dasgar (Priest) on Sep 01, 2010 at 22:16 UTC
I would suggest just storing the matches in an array. From there, you can easily access each element and also get the number of elements in the array. The untested code below illustrates my idea. `my($text) = "(network)(test)(ifcfg)"; my(@matches) = ($text =~ m/\((.*?)\)/g); my $match_count = scalar(@matches); print "The following $match_count matches were found:\n"; foreach my $match (@matches) {print " $match\n";}` In the second line, I'm searching `$text` for all items that are within a set of parentheses and putting each match found into the `@matches` array. If you test this out with varying values for `$text`, the match-finding part will still work correctly.
Re^2: Extracting multiple matches from Reg Ex by rookie_monk (Novice) on Sep 02, 2010 at 14:22 UTC
I would like to thank everyone for their help and input. These are all great ways to do it and I will surely use them to accomplish what I am trying to do. Thanks again. -Paul
Re: Extracting multiple matches from Reg Ex by ww (Archbishop) on Sep 01, 2010 at 22:32 UTC
Part 1: Your problem may be that you've escaped the wrong parens: ¹ `my($first, $second, $third) = $text =~ /(\([a-z]+\))(?:\s|_|$)/gi; print "$first and $second and $third\n";`
Part 2: `no warnings 'uninitialized'; my $count = $text =~ s/(\([a-z]+\))(?:\s|_|$)/$1$2/gi; print $count; use warnings;`
Output, parts 1 and 2: `(network) and (test) and (ifcfg) 3`
Part 3: I'm not at all sure I understand what you want. Please clarify whether you're talking about multiple copies of a parenthesized word or simply "more than 3 parenthesized words in your string."
Update: Oops, meant to say but forgot to write it down, `split` to an array would also be a valid approach -- perhaps better than above. Also, though I haven't worked it out, it seems that `tr///` might also represent a possibility.
Update2: Adding note ¹ -- or the problem may be that I read the OP as seeking to capture the encasing parentheses... which may be true, but is not specifically asserted. ...and if 'untrue,' apologies for the initial observation.
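On the `tr///` idea mentioned in the update: a minimal sketch, assuming every token is wrapped in exactly one pair of parentheses, so that counting `(` characters gives the number of tokens. With an empty replacement list and no `/d` modifier, `tr///` only counts and leaves the string unchanged. The sub name `count_open_parens` is hypothetical and not from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: count "(" characters in a string.
# tr/// with an empty replacement list returns the count and changes nothing.
sub count_open_parens {
    my ($text) = @_;
    return ($text =~ tr/(//);
}

print count_open_parens("(network) (test)_(ifcfg)"), "\n";
```

This only counts; it does not extract the tokens, and a stray or nested `(` would inflate the number.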
Re: Extracting multiple matches from Reg Ex by graff (Chancellor) on Sep 01, 2010 at 23:34 UTC
In response to this: "I would also like it to count how many matches it makes because the input could possibly have more or less than 3 matches." Here's how that would look using split: `while (<DATA>) { chomp; my @pieces = grep /[a-z]/i, split /[()]+/; printf( "Got %d pieces from '%s' : %s\n", scalar @pieces, $_, join( ' and ', @pieces )); } __DATA__ (network)(test)(ifcfg) (network) (test)_(ifcfg) (foo).(bar) (one)(two)--(three)/(four),(five)` Now, if your real input ever contains letters between a close paren and a following open paren -- e.g. "(foo) X (bar)" -- this approach will catch those as "pieces" as well. If that's a problem, go with Text::Balanced as johngg suggested in this thread.
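The caveat can be seen in a small sketch that runs the split approach and a capture-based match side by side on "(foo) X (bar)". The sub names `pieces_by_split` and `pieces_by_match` are hypothetical, and the capture regex is one possible alternative rather than anything graff proposed.

```perl
use strict;
use warnings;

# The split approach: break on runs of parens, keep pieces containing a letter.
sub pieces_by_split {
    my ($line) = @_;
    return grep { /[a-z]/i } split /[()]+/, $line;
}

# Alternative sketch: capture only the text between "(" and the next ")".
# Call this in list context to get the captured tokens.
sub pieces_by_match {
    my ($line) = @_;
    return $line =~ /\(([^)]+)\)/g;
}

my $line    = "(foo) X (bar)";
my @split   = pieces_by_split($line);
my @matched = pieces_by_match($line);

printf "split: %d pieces, match: %d pieces\n", scalar @split, scalar @matched;
```

Here the split version reports three pieces because " X " contains a letter, while the capture version reports only the two parenthesized words.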
Re: Extracting multiple matches from Reg Ex by johngg (Canon) on Sep 01, 2010 at 23:11 UTC
You might also consider looking at Text::Balanced. Cheers, JohnGG
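For reference, a minimal sketch of what a Text::Balanced version might look like, using `extract_multiple` together with `extract_bracketed` from that module. The helper name `bracketed_groups` is hypothetical, and stripping the outer parentheses afterwards is an assumption about what the OP wants.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_multiple extract_bracketed);

# Hypothetical helper: return the contents of each balanced (...) group.
sub bracketed_groups {
    my ($text) = @_;

    # Try extract_bracketed at each position in the string. The true fourth
    # argument asks extract_multiple to skip text that is not a bracketed group.
    my @groups = extract_multiple(
        $text,
        [ sub { extract_bracketed($_[0], '()') } ],
        undef,
        1,
    );

    # Each group comes back with its enclosing parentheses; strip them.
    for my $group (@groups) {
        $group =~ s/^\(//;
        $group =~ s/\)$//;
    }
    return @groups;
}

my @groups = bracketed_groups("(network) (test)_(ifcfg)");
printf "Found %d groups: %s\n", scalar @groups, join(' and ', @groups);
```

Unlike a simple regex, this treats nested parentheses as part of one balanced group, which may or may not be what the real data needs.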
Re: Extracting multiple matches from Reg Ex by aquarium (Curate) on Sep 02, 2010 at 00:53 UTC
A larger sample data set would be more useful, as your first example is substantially different from the second. In any case, it sounds like you'll have a variable number of tokens/words, each inside a pair of round brackets. Rather than trying to match the whole set of tokens in one go with a complex regex, it's probably better to match one token at a time and use the g modifier to drive a loop, in which you can build an array or even a linked-list structure for the line of data. A linked-list structure in Perl becomes an easily navigable hash, and you can even make decisions based on matching a particular token at a given level, if further semantic validation of the data is required. Note that in regexes I personally prefer the idiom: starting delimiter, followed by anything that is not the ending delimiter, followed by the ending delimiter, so as not to accidentally eat up too much with a .* Untested example; the `@tokens` array stands in for whatever data structure you choose: `while (my $text = <>) { chomp $text; my @tokens; while ($text =~ /\(([^)]+)\)/g) { push @tokens, $1; } }`
the hardest line to type correctly is: stty erase ^H
Re: Extracting multiple matches from Reg Ex by perlpie (Beadle) on Sep 02, 2010 at 00:12 UTC
`my($text) = "(network)(test)(ifcfg)"; my(@matches) = ($text =~ m/\((.*?)\)/g); print "The following @{[ $#matches+1 ]} matches were found:\n"; print join(' and ', @matches) . "\n";` Just tweaking the version dasgar provided to remove a variable and generate the output from your original question.