http://qs1969.pair.com?node_id=636266

I've just been to demerphq's 5.10 regex talk at YAPC::Europe and I've been overwhelmed by the number of new features in the regex syntax. However, unless I'm missing something in the talk and perlre, there still seems to be no simple way to use Perl code as an assertion. When I say "as an assertion" I mean that I assert that some condition must be met if the (sub)pattern is to match.

The way I would assert in a 5.8 regex that some expression, say whatever(), must be true would be:

/(?(?{! whatever() })(?!))/

Or

/(?(?{ whatever() })|(?!))/

The latter can, in 5.10, be written...

/(?(?{ whatever() })|(*FAIL))/

... which almost reads naturally as "whatever or fail", but this still seems unduly complex for what I would have thought would be a common desire. Indeed, I've always felt that this should be the semantics of simple code constructs. (After all, the documentation does call them assertions.)

/(?{ whatever() })/

The above, of course, does not work because code assertions are defined to always succeed.
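
A minimal sketch of the difference, assuming a stub whatever() that always returns false (the stub and the sample string 'abc' are made up here purely for illustration):

```perl
use strict;
use warnings;

# Hypothetical predicate standing in for whatever(); always false here.
sub whatever { return 0 }

# A bare code block always succeeds, regardless of what the code returns.
my $bare = 'abc' =~ /b(?{ whatever() })/ ? 1 : 0;

# Used as the condition of (?(cond)yes|no) with an empty "yes" branch and
# (?!) as the "no" branch, a false result makes the match fail.
my $asserted = 'abc' =~ /b(?(?{ whatever() })|(?!))/ ? 1 : 0;

print "bare=$bare asserted=$asserted\n";   # prints "bare=1 asserted=0"
```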

Is there a neater way to do this in 5.10 (or indeed 5.8) than the ways I'd currently use in 5.8?

In previous discussion Ilya Zakharevich suggested that we could add flags between the closing brackets so perhaps we could use:

/(?{ whatever() }?)/

Re: Regex code assertion should be able to fail
by nobull (Friar) on Sep 04, 2007 at 08:43 UTC
    Well here's a first cut at a patch against 5.9.5.

    It seems to test OK and I think the documentation is probably minimally acceptable.

Re: Regex code assertion should be able to fail
by zshzn (Hermit) on Sep 02, 2007 at 07:23 UTC
    Wait, something is coming through. I am getting an idea; nay, a premonition. I see...a future where Perl exists entirely between a / and a /, a world held between two edges of a crevice. A world where we lose our conscious knowledge of when the regular expression begins and ends, of where it should and should not, as choice becomes too intermingled with expressions. Alas, a world in confusion is a world true to Perl, no? Perhaps devious Perl Gods make things difficult in order to restrain the instincts of us lesser beings.

    Jokes aside, I really don't know, sorry. My mastery of post-modern Perl regex has yet to arrive, or even, to commence.

        The better link is S05 for the specification, and STD for the full Perl 6 grammar.

        I don't know if there is a tutorial (I wrote one, but it's in German).

Re: Regex code assertion should be able to fail
by diotalevi (Canon) on Sep 04, 2007 at 15:40 UTC

    It's worth noting that the entire contents of the implementation of whatever() must be 100% free of split(), m//, and s///. The engine isn't reentrant and the multiple simultaneous uses will use the same data and corrupt each other.

    This might have gotten better during 5.9. I couldn't say.

    ⠤⠤ ⠙⠊⠕⠞⠁⠇⠑⠧⠊

Re: Regex code assertion should be able to fail
by mmmmtmmmm (Monk) on Sep 03, 2007 at 21:46 UTC
      Thanks, but that's the wrong usage of assertion.

      I'm talking about assertions in regular expressions. These are conditions that must hold true if the regex match is to continue. If the condition is not met, no exception is thrown; the regex engine just backtracks and tries to find another match.
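
A small sketch of that backtracking behaviour, using the 5.8-compatible conditional idiom from the original post. The input '1234' and the "less than 100" condition are invented for illustration only:

```perl
use strict;
use warnings;

my $matched;

# \d+ first grabs "1234". The code condition rejects it, so the engine
# backtracks to "123", then to "12", where the condition finally holds.
# $^N is the text of the most recently closed group.
if ( '1234' =~ /(\d+)(?(?{ $^N < 100 })|(?!))/ ) {
    $matched = $1;
}

print "matched=$matched\n";   # prints "matched=12"
```

No exception is raised at any point; each rejected candidate simply counts as a failed branch.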

Re: Regex code assertion should be able to fail
by halley (Prior) on Sep 04, 2007 at 15:21 UTC
    Some time ago I figured out how to combine the succeed/fail operator and the execute-code operator. This includes the use of (?!) to force a failure. I think this is what you need, if I understood your question. It was not at all obvious to me, so I wrote it up here before.

    combining (?(condition)yes|no) and (?{code})

    Does this help?

    --
    [ e d @ h a l l e y . c c ]

      Unless I'm missing something, the technique you describe is essentially the one I described in my original post as the technique I'd use in 5.8 (and was hoping for something more concise).

      /(?(?{ whatever() })|(?!))/

      Did I miss something?

      Update: halley replied via /msg. His post was describing the same technique but gives more details for the benefit of on-lookers.

      To clarify how I see this being used here's the code example from my patch to perlre:

      @known_animal{ qw( cat dog fox horse rabbit rat ) } = ();
      @animals = /\b(\w++)(?{ exists $known_animal{$^N} }?)/g;
      print "@animals\n";
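
The (?{ ... }?) form above is the syntax proposed in the patch, so it will not compile on an unpatched perl. For comparison, here is a sketch of the same filter written with the conditional plus (*FAIL) idiom from the original post, which should run on a stock 5.10 or later. The sample sentence is an assumption added for illustration, and the hash is declared with our so the code block does not rely on closing over a lexical:

```perl
use 5.010;
use strict;
use warnings;

# Package variable, as in the patch example, so the regex code block
# does not need to capture a lexical.
our %known_animal;
@known_animal{ qw( cat dog fox horse rabbit rat ) } = ();

# Made-up sample input.
my $text = 'the quick brown fox jumps over the lazy dog';

# \w++ is possessive, so a rejected word is not retried as shorter prefixes.
# The condition keeps the word only if it is a key of %known_animal.
my @animals = $text =~ /\b(\w++)(?(?{ exists $known_animal{$^N} })|(*FAIL))/g;

print "@animals\n";   # prints "fox dog"
```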