The combinatorial approach has not much to do with perl, but I can try anyway.

You can for example walk through the string, and for all 'a's count the number of 'b's that are also followed by 'c's, counting their number.

So for a target string like 'a b  a c b c c' you find that for the first 'a' you have 1 b followed by 3 c's, and one b followed by 2 c's, which sums up to 5.

For the second 'a' you just have one 'b' followed by two 'c's, so all in all you have 7 possible matches.

Back to Perl, the regex module I mentioned above doesn't really do any work - it just exploits a feature of the perl built-in regex engine. It causes each match to fail, so it forces the regex engine to backtrack into other alternatives.

This small piece demonstrates that:

$ perl -wle '"abacbcc" =~ /a.*b.*c(?{ $count++ })(?!)/; print $count' 7

(?!) is just a "clever" way to write a regex that never matches (on perl 5.10 or newer you can also write (*FAIL) to achieve the same thing, but more readable).

The (?{...}) is just a block of perl code that regex engine runs after it matched the c but before it failed.


In reply to Re^3: all occurences of a target word by regular expression by moritz
in thread all occurences of a target word by regular expression by siskos1

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.