in reply to How can I access the number of repititions in a regex?

I don't know of a special variable for that, but you can get the number of matches the following way:

my @matches;
....
while (<>) {
    # In list context, m//g returns every match; the array's size is the count.
    @matches = $_ =~ m/$string/g;
    my $repetition = @matches;
    print "$string was encountered $repetition times.\n" if ($repetition > 0);
}

As for your second question, you are not closing the code tag correctly. You are using '\' where it should be '/'. So the closing tag should read </code>

Update: added the g modifier to the regexp. Thanks wfsp and Fletch for alerting me it was missing. Its absence was a typo. I tested it with the code shown below, but when adapting the test code to the block shown in the OP, the 'g' was somehow skipped.

use strict;
use warnings;

my $text = "match other match not useful match sample word";
my $string = "match";
my @matches = $text =~ m/$string/g;
my $count = @matches;
print $count;
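To illustrate why the missing g mattered: in list context, a match without /g and without capturing parentheses returns just (1) on success, so the array always ends up with one element no matter how often the string occurs. A minimal sketch of the difference (the names @without_g and @with_g are only illustrative):

```perl
use strict;
use warnings;

my $text   = "match other match not useful match sample word";
my $string = "match";

# Without /g: list context yields the single value (1) on success.
my @without_g = $text =~ m/$string/;

# With /g: list context yields every matched substring.
my @with_g = $text =~ m/$string/g;

print scalar(@without_g), "\n";   # prints 1
print scalar(@with_g),    "\n";   # prints 3
```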

Replies are listed 'Best First'.
Re^2: How can I access the number of repititions in a regex?
by ack (Deacon) on Mar 11, 2008 at 17:50 UTC

    Here is an example that may work for the OP. It uses an approach similar to the one found in perlretut (as suggested by the good monk, pancho) on page 22 of the tutorial. It allows multiple matches of various capturing subexpressions and keeps track of the number of matches for each of those subexpressions.

    I just tested it for the admittedly simple case shown and it appears to do what is sought.

    For the input:

    $text = "match other match not useful match not same match\n"

    the output is:

    frequency of regex capture (\b\w+\b) is 8
    frequency of regex capture (\s) is 8
    frequency of regex capture (\w+$) is 1

    The user has to specify the various capturing regex subexpressions and associate an element of the array @word with each of them. This array records each time it is matched by using the in-line regex code subpattern, (?{ }).
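    The code from this reply is not reproduced in the thread, so the following is only a sketch of the general (?{ }) counting technique, not ack's actual program. It follows the backtracking-safe pattern documented in perlre: the counter is incremented with local so that increments made on a path the engine later abandons are undone. The names $words and $result and the pattern itself are assumptions chosen for illustration.

```perl
use strict;
use warnings;

# Package variables are used because local cannot be applied to my variables.
our ($words, $result);

my $text = "match other match not useful match not same match";

# First code block: reset the counter.
# Second code block: count one repetition of the word-plus-space group;
#   local makes the increment disappear if the engine backtracks past it.
# Third code block: copy the count out, since localized values are
#   restored once the match finishes.
$text =~ m{
    ^
    (?{ $words = 0 })
    (?:
        \w+ \s
        (?{ local $words = $words + 1 })
    )*
    \w+ \z
    (?{ $result = $words })
}x;

print "word-plus-space group matched $result times\n";   # 8 for this input
```

    A plain $words++ inside the group can overcount when the engine tries a repetition and then backs out of it, which is one reason the robustness question raised in this reply is a fair one.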

    I think this does what the OP was looking for.

    I'm not sure how robust it is (i.e., how flexible it is, for example, with nested capturing subexpressions, alternating subexpressions, etc.).

    I am always challenged, especially, by alternating subexpressions.

    ack Albuquerque, NM
Re^2: How can I access the number of repititions in a regex?
by pat_mc (Pilgrim) on Mar 11, 2008 at 13:06 UTC
    Hi, olus -

    Thanks for your quick answers on both accounts.

    Ad 1) I understand the solution you propose. It is certainly an option to include all matches into a list and return the size of the list. The downside of this approach is that in more complex regular expressions I need separate lists and regexes for every quantifier I want to access.
    while ( <> ) {
        print "$string_1 matched ", scalar @list_1, " times." if @list_1 = $_ =~ /regex_1/g;
        print "$string_2 matched ", scalar @list_2, " times." if @list_2 = $_ =~ /regex_2/g;
    }


    Clearly, this could get messy ... are there any alternatives to this?

    Ad 2) Good stuff ... the code tags now work! How do I indent?

    Thanks again!

    Cheers -

    Pat
      Erm, just use that big wide key at the bottom of your keyboard? (Burma Shave)

      Update: And to clarify what I think you're saying your problem is: you've got a regex with multiple captures of varying length (say, /(a+)(b+)/) and you want to know how many repetitions each captured subexpression matched (i.e. how many "a"s and how many "b"s) for an arbitrary regex.

      Which is actually a kind of neat question (and I'm drawing a blank on an "elegant" solution off the cuff; I initially was going to comment about the operator-which-shall-not-be-named too (=()=), but then read the above post and saw you possibly had several subexpressions to count).
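      For the specific /(a+)(b+)/ case mentioned above, one simple option is to take the length of each capture. This is only a sketch, and it relies on the assumption that each quantified unit is a single character; for multi-character units the capture length would have to be divided by the unit length, and it does not generalize to arbitrary regexes. The sample text and the @repetitions name are illustrative.

```perl
use strict;
use warnings;

my $text = "xxaaabbbbbyy";
my @repetitions;

if ( $text =~ /(a+)(b+)/ ) {
    # Each capture holds the whole run, so its length is the repetition count.
    @repetitions = ( length($1), length($2) );
    print "a repeated $repetitions[0] times, b repeated $repetitions[1] times\n";
}
```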

      The cake is a lie.
      The cake is a lie.
      The cake is a lie.

        On the off-chance that you're asking about indenting OUTSIDE <c>...</c> tags, try <blockquote>...</blockquote> which

        works like this, indenting both sides from the maintext margins and wrapping long lines like this one.

        HTH

        Fletch -

        You are spot on in your interpretation of the problem.
        But what is your solution?

        And if thou shallt not see, there will not be another but one.

        Pat

      The first solution that occurred to me is shown in the following code. Note that I considered splitting the input text on spaces, and that may not be a solution for you depending on your actual input and the patterns you are looking for.

      use strict;
      use warnings;
      use Data::Dumper;

      my $text = "match other match not useful match sample word";
      my $string1 = "match";
      my $string2 = "not";
      my %repetitions;
      map {$repetitions{$_}++;} grep /$string1|$string2/, split / /, $text;
      print Dumper(\%repetitions);

      ### Outputs:
      $VAR1 = {
                'match' => 3,
                'not' => 1
              };
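      If splitting on spaces is not acceptable, a possible variation is to capture the alternatives directly with a single global match and tally them in a hash. This is a sketch under assumptions: the search strings are treated as literal text (hence quotemeta) and as whole words (hence \b); both choices, and the @strings and $alternation names, are illustrative rather than part of the original post.

```perl
use strict;
use warnings;

my $text    = "match other match not useful match sample word";
my @strings = ( "match", "not" );

# Build one alternation; quotemeta escapes any regex metacharacters.
my $alternation = join '|', map { quotemeta } @strings;

# In list context, /g with one capture group returns every captured string.
my %repetitions;
$repetitions{$_}++ for $text =~ /\b($alternation)\b/g;

print "$_: $repetitions{$_}\n" for sort keys %repetitions;
```

      Because the key is the captured text itself, a word such as "nothing" is not counted under "not", which the grep-on-split version would record as a separate key.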
      We call it programming
      while (<>) {
          for my $pat ( qr/\d/, qr/string/ ) {
              my $count = () = /$pat/g;
              print "$pat matched $count times\n";
          }
      }

      Abstract the matching logic out into a subroutine

      use warnings;
      use strict;

      my @strings = qw/ 1 2 3 4 5 6 7 8 90 /;

      while ( my $line = <DATA> ) {
          print $line;
          for (@strings) {
              my $count = matches( $line, $_ );
              print "$_ matched $count times.\n" if $count;
          }
          print "\n";
      }

      sub matches {
          return () = $_[0] =~ /\Q$_[1]\E/g;
      }

      __DATA__
      9087126348716340789126348907164
      l3klj09934u098u5tio2354uj908rye
      qoiriopuj3u45098479183248r95r77
      [q9u4r0983u490ru340u54ioeuf9p8h
      23qioh89174y9843y7r9843r87e8714
      [9490838945r8974r9834093409tr34