Your first two examples do differ in a few ways. First, the RE that uses lookahead/lookbehind captures "infix" into \2, because it has an extra, outer set of capturing parens. The second regexp, the one that uses literal text instead of lookahead and lookbehind, captures "infix" into \1, because it doesn't have that extra set of parens.
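To make the numbering concrete, here is a minimal sketch that uses the two patterns from the snippet further down (minus the embedded code blocks) and collects the captures in list context. The variable names are just for illustration.

```perl
use strict;
use warnings;

my $string = "prefixinfixsuffix";

# Lookaround version: the outer parens are group 1, the inner (infix) is group 2.
my @lookaround = $string =~ /((?<=prefix)(infix)(?=suffix))/;

# Literal version: only one set of capturing parens, so "infix" is group 1.
my @literal = $string =~ /prefix(infix)suffix/;

print "lookaround: ", scalar(@lookaround), " groups, group 2 = $lookaround[1]\n";
print "literal:    ", scalar(@literal), " group,  group 1 = $literal[0]\n";
```

In the lookaround version both groups hold "infix", since the assertions contribute no characters to the outer group.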

Also, your first example essentially looks for "infix", and then checks to see if "prefix" comes immediately before it, and if "suffix" comes immediately after. The second example looks for "prefixinfixsuffix", pretty much all at once. In your simple example (discounting the difference in capturing parens) you won't see a practical difference. But there are plenty of cases where lookahead and lookbehind are useful. One example is given in perlretut.
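One case where the two forms stop being interchangeable is substitution. The sketch below is my own illustration, not the perlretut example: with lookaround only "infix" belongs to the match, so only "infix" gets replaced.

```perl
use strict;
use warnings;

# Lookaround: "prefix" and "suffix" are only asserted, so they survive the s///.
(my $with_lookaround = "prefixinfixsuffix") =~ s/(?<=prefix)infix(?=suffix)/X/;

# Literal text: the whole "prefixinfixsuffix" is matched, so all of it is replaced.
(my $with_literal = "prefixinfixsuffix") =~ s/prefix(infix)suffix/X/;

print "$with_lookaround\n";    # prefixXsuffix
print "$with_literal\n";       # X
```

To get the first result with the literal form you would have to capture the prefix and suffix and put them back in the replacement.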

As for why (?<=some(optional)?prefix) is allowed to generate an error, rather than simply being optimized to some(optional)?prefix, well, for one thing that would be inconsistent with the documentation, which says that variable width lookbehind is not supported. It would also mean turning a lookbehind assertion (which doesn't "consume" any of the string it's matching against) into a string-gobbling pattern in a way that couldn't be controlled by the person composing the regular expression, which is also contrary to what the documentation defines.
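The "consume" distinction can be seen by looking at where the overall match starts and ends. This is a small sketch using a fixed-width lookbehind and the built-in @- and @+ arrays; the variable names are mine.

```perl
use strict;
use warnings;

my $string = "prefixinfixsuffix";

# Lookbehind: "prefix" is asserted but is not part of the match itself.
$string =~ /(?<=prefix)infix/ or die "no match";
my ($la_start, $la_end) = ($-[0], $+[0]);

# Literal text: "prefix" is consumed and becomes part of the match.
$string =~ /prefixinfix/ or die "no match";
my ($lit_start, $lit_end) = ($-[0], $+[0]);

print "lookbehind match spans offsets $la_start..$la_end\n";    # 6..11
print "literal match spans offsets $lit_start..$lit_end\n";     # 0..11
```

Silently rewriting the assertion into plain pattern text would move the start of the match from offset 6 to offset 0, which changes what a substitution or a later \G match would see.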

Just for kicks, have a look at the output of the following snippet:

use strict;
use warnings;
use YAPE::Regex::Explain;

my( @REx ) = (
    qr/((?<=prefix)(infix)(?=suffix))(??{ print $^N, "\n" })/,
    qr/prefix(infix)suffix(??{ print $^N, "\n" })/
);

my $string = "prefixinfixsuffix";

for( @REx ) {
    my $exp = YAPE::Regex::Explain->new($_)->explain;
    print $exp;
    1 if $string =~ $_;
}

Dave


In reply to Re: Regular Expression Constructs ?<= and ?= by davido
in thread Regular Expression Constructs ?<= and ?= by eibwen
