in reply to Global Regex sans //g

With the regexp that you posted:

'<a /><b >' =~ /<([\w\d]+)\s+(?{ print $^N })\s*?>/;
__END__
ab

The reason is that when the engine first reaches the code block, the match is still succeeding, so the print runs. That first attempt then fails, because the / sits where the pattern wants optional whitespace followed by >, so the engine starts searching again further along the string, and finds (and prints) the second match. I'm not sure this answers your question, but at least it shows where the apparent iteration comes from.
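
For what it's worth, here is a small sketch of the same experiment that records each capture instead of printing it, so the failed first attempt and the successful second one can be told apart. The names @seen, $ok, $tag and $whole are made up for illustration, and \w+ stands in for [\w\d]+.

```perl
use strict;
use warnings;

our @seen;   # package variable, so the (?{ ... }) block can reach it safely

# Same string and pattern shape as in the post; record instead of print.
my $ok = '<a /><b >' =~ /<(\w+)\s+(?{ push @seen, $^N })\s*?>/;

# Copy the match variables right away, before any other match can reset them.
my ($tag, $whole) = $ok ? ($1, $&) : ();

print "code block ran for: @seen\n";            # a b
print "overall match: $whole (tag $tag)\n" if $ok;   # <b > (tag b)
```

The code block runs once per attempt that gets as far as the whitespace, but only the attempt at the second tag goes on to complete the match.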

BTW, \w already matches digits, so [\w\d] is redundant; \w alone would do.
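
A quick way to check this, as a sketch only: the sample names in the list below are invented, and the test covers just the ASCII digits 0 through 9.

```perl
use strict;
use warnings;

# Every ASCII digit should be matched by \w on its own.
my @digits_missed = grep { !/^\w$/ } 0 .. 9;
print "digits not matched by \\w: ", scalar(@digits_missed), "\n";

# The two classes should therefore accept exactly the same names.
my $disagreements = 0;
for my $name (qw(a b2 h1 x_9 no-go)) {
    my $with    = $name =~ /^[\w\d]+$/ ? 1 : 0;
    my $without = $name =~ /^\w+$/     ? 1 : 0;
    $disagreements++ if $with != $without;
    print "$name: [\\w\\d]+ => $with, \\w+ => $without\n";
}
print "disagreements: $disagreements\n";
```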

Update: Also, if you remove the \1 at the very end of the regexp in Obfuscated regexp, the print statements are executed only once. Clearly this \1 is what makes the successive match attempts fail, forcing the engine to start searching again. Once it is removed, the match succeeds and the engine stops.
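
The sketch below does not use the regexp from Obfuscated regexp; it uses a made-up stand-in pattern and input ('abc', with invented variable names) just to show the same effect of a trailing \1 in isolation.

```perl
use strict;
use warnings;

our (@with_backref, @without_backref);   # reachable from (?{ ... })

# Stand-in pattern (hypothetical). No character in 'abc' is doubled,
# so the trailing \1 fails and the engine retries from later positions.
my $ok_with = 'abc' =~ /(\w)(?{ push @with_backref, $^N })\1/;

# Without \1 the very first attempt succeeds and the engine stops.
my $ok_without = 'abc' =~ /(\w)(?{ push @without_backref, $^N })/;

printf "with \\1:    ran for [%s], matched: %s\n",
    "@with_backref", $ok_with ? 'yes' : 'no';
printf "without \\1: ran for [%s], matched: %s\n",
    "@without_backref", $ok_without ? 'yes' : 'no';
```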

the lowliest monk

Re^2: Global Regex sans //g
by eibwen (Friar) on May 02, 2005 at 05:03 UTC

    Thanks for pointing that out! The output from use re qw/debug/; makes a lot more sense now that I realize what is happening.

    However, using \1 to force matches to fail seems problematic at best: on a repetitive string the backreference could match, so the regexp would succeed and stop prematurely. A contradiction (e.g. [^\D\d]) would preclude this possibility, but are there any implications of such a contradiction beyond this usage?

      I think that such a contradiction would work fine, as would the more succinct (and zero-width) (?!).

      the lowliest monk

        aka "doesn't match nothing"
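
To make the difference between the two ways of forcing failure concrete, here is a minimal sketch comparing a trailing \1 with a trailing (?!). The input 'abccd', the pattern, and the variable names are all made up for illustration; they are not taken from the regexps discussed in this thread.

```perl
use strict;
use warnings;

our (@via_backref, @via_fail);   # package variables, reachable from (?{ ... })

my $input = 'abccd';             # made-up input with one doubled character

# Trailing \1: fails after 'a' and 'b', but "cc" satisfies it, so the
# match succeeds there and 'd' is never visited.
my $ok_backref = $input =~ /(\w)(?{ push @via_backref, $^N })\1/;

# Trailing (?!): can never match, so the engine keeps moving on to the
# next start position until it runs out of string.
my $ok_fail = $input =~ /(\w)(?{ push @via_fail, $^N })(?!)/;

printf "\\1   visited [%s], matched: %s\n",
    "@via_backref", $ok_backref ? 'yes' : 'no';
printf "(?!) visited [%s], matched: %s\n",
    "@via_fail", $ok_fail ? 'yes' : 'no';
```

The overall match with (?!) always fails, so anything worth keeping has to be collected inside the code block, as the arrays do here.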