in reply to Regexes: finding ALL matches (including overlap)

For the in-bracket example, you could extract the contents of the brackets, then search through the extracted contents for 'y'. That problem is also very well suited for parsers:
use strict;
use warnings;

my $count;

local $_ = "<Pooh,> said Rabbit kindly, <you haven't any brain> <I know,> said Pooh humbly.";

our $c = 0;
/
   ^
   (?:
      # Outside of brackets
      [^<]
   |
      # Inside of brackets
      <
      [^y>]*
      (?:
         y
         (?{ local $c = $c + 1 })
         [^y>]*
      )*
      >?  # Optional in case of unmatched bracket.
   )*
   $
   (?{ $count = $c })  # Save count.
/x;

print("$count\n");

Since the above will match every string without ever backtracking, using $c is optional. You can replace (?{ local $c = $c + 1 }) with (?{ $count++ }) and drop (?{ $count = $c }).
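As a rough sketch of that simplification, wrapped in a sub so it can be tried on several strings: the sub name count_y_in_brackets and the use of a package variable for the counter are choices made for this illustration, not part of the code above.

```perl
use strict;
use warnings;

# Package variable, so the embedded code block can see it without
# relying on closure behaviour of lexicals inside regex code blocks.
our $count;

# Hypothetical helper: counts every 'y' that appears after a '<'
# and before the next '>' (or end of string if the '>' is missing).
sub count_y_in_brackets {
    my ($text) = @_;
    $count = 0;
    $text =~ /
        ^
        (?:
            [^<]        # Outside of brackets
        |
            <           # Inside of brackets
            [^y>]*
            (?: y (?{ $count++ }) [^y>]* )*
            >?          # Optional in case of unmatched bracket
        )*
        $
    /x;
    return $count;
}

print count_y_in_brackets(
    "<Pooh,> said Rabbit kindly, <you haven't any brain> <I know,> said Pooh humbly."
), "\n";   # prints 2
```

Incrementing directly is only safe here because the pattern never backtracks over a counted 'y'. If you change the pattern so that it can fail partway, go back to the local $c form.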

Sorry, I don't have any general solutions.

Update: Fixed a bug in the regexp.
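
For comparison, the two-step approach mentioned at the top (extract the bracket contents, then search them for 'y') might look something like the sketch below. The sub name is made up for the example, and it assumes the same lenient treatment of a missing '>' as the regex above.

```perl
use strict;
use warnings;

# Hypothetical helper: step 1 extracts each bracketed chunk,
# step 2 counts the 'y' characters inside that chunk.
sub count_y_two_step {
    my ($text) = @_;
    my $count = 0;
    # A chunk starts at '<' and runs to the next '>' or to end of string.
    while ($text =~ /<([^>]*)/g) {
        my $inside = $1;                # copy the capture before counting
        $count += ($inside =~ tr/y//);  # tr with empty replacement just counts
    }
    return $count;
}

print count_y_two_step(
    "<Pooh,> said Rabbit kindly, <you haven't any brain> <I know,> said Pooh humbly."
), "\n";   # prints 2
```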

Re^2: Regexes: finding ALL matches (including overlap)
by tlm (Prior) on Jun 04, 2005 at 11:40 UTC

    Just out of curiosity, why not make the "inside of brackets" subexpression something more like this?

    < (?: [^y>]* y (?{ local $c = $c + 1 }) )* .*? > # Closing bracket not optional

    I see that the original allows for the possibility of unmatched left angle brackets, but I don't see why one would want this; i.e., I don't see why one would want to count the "y" in "<xyz", for example, but not the one in "xyz>".

    the lowliest monk

      but I don't see why one would want this

      I had to make a decision since I had insufficient information. If someone needs a different behaviour, they can change the code or ask me to do so. I decided to adopt Windows quoting behaviour, where a missing closing quote is tolerated. For example, dir "c:\program files works even though the quote is never closed.
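
      To make the two policies concrete, here is a small sketch that counts under either rule. It is a simplified stand-in built from plain captures, not either of the regexes posted above, and the sub name and the $require_close flag are invented for the example.

      ```perl
      use strict;
      use warnings;

      # Hypothetical helper. With $require_close false, a '<' with no
      # matching '>' still opens a bracket that runs to end of string.
      # With it true, only properly closed brackets are counted.
      sub count_y {
          my ($text, $require_close) = @_;
          my $re = $require_close ? qr/<([^>]*)>/ : qr/<([^>]*)/;
          my $count = 0;
          while ($text =~ /$re/g) {
              my $inside = $1;
              $count += ($inside =~ tr/y//);
          }
          return $count;
      }

      for my $text ("<xyz", "xyz>", "<xyz>") {
          printf "%-6s lenient=%d strict=%d\n",
              $text, count_y($text, 0), count_y($text, 1);
      }
      ```

      Under the lenient rule "<xyz" counts one 'y' and "xyz>" counts none; under the strict rule neither counts, and only "<xyz>" does.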