Re: Regular Expression Constructs ?<= and ?=

Your first two examples do differ in a couple of ways. First, the RE that uses lookahead/lookbehind captures "infix" into \2, because it has an extra set of capturing parens. The second regexp, the one that spells out "prefix" and "suffix" as literal text instead of using lookahead and lookbehind, captures "infix" into \1, because it doesn't have that extra set of parens.
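To see the numbering difference directly, here is a minimal sketch. It assumes the two patterns look like the ones in the snippet further down; the variable names are mine.

```perl
use strict;
use warnings;

my $string = "prefixinfixsuffix";

# Lookaround version: the outer parens are group 1, the inner (infix) is group 2.
my @with_lookaround = $string =~ /((?<=prefix)(infix)(?=suffix))/;

# Literal version: only one set of capturing parens, so (infix) is group 1.
my @literal = $string =~ /prefix(infix)suffix/;

print "lookaround: ", scalar(@with_lookaround), " group(s), last one is '$with_lookaround[-1]'\n";
print "literal:    ", scalar(@literal), " group(s), last one is '$literal[-1]'\n";
```

In list context a successful match returns the captured groups in order, so the array length tells you how many groups each pattern has.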

Also, your first example essentially looks for "infix", and then checks whether "prefix" comes immediately before it and "suffix" immediately after. The second example looks for "prefixinfixsuffix", pretty much all at once. In your simple example (discounting the difference in capturing parens) there is no practical difference. But there are plenty of cases where lookahead and lookbehind are useful. One example is given in perlretut.
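One case where the two forms stop being interchangeable is substitution, because the lookarounds don't become part of the matched text. This is my own illustration, not the perlretut example, and the replacement string is arbitrary.

```perl
use strict;
use warnings;

my $string = "prefixinfixsuffix";

# With lookaround, only "infix" is part of the match, so only it is replaced.
(my $lookaround = $string) =~ s/(?<=prefix)infix(?=suffix)/INFIX/;

# With literal text, the whole "prefixinfixsuffix" is matched and replaced.
(my $literal = $string) =~ s/prefix(infix)suffix/INFIX/;

print "$lookaround\n";   # prefixINFIXsuffix
print "$literal\n";      # INFIX
```

To get the first result without lookaround you would have to capture the prefix and suffix and put them back in the replacement.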

As for why (?<=some(optional)?prefix) generates an error rather than simply being optimized to some(optional)?prefix: for one thing, that would be inconsistent with the documentation, which says that variable-width lookbehind is not supported. It would also mean silently turning a lookbehind assertion (which doesn't "consume" any of the string it's matching against) into an ordinary string-gobbling subpattern, in a way that couldn't be controlled by the person composing the regular expression. That too is contrary to what the documentation defines.
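If you want to watch the compile-time failure without killing your script, you can build the pattern from a string inside eval. This is only a sketch: compile_error is a helper name I made up, and I've used an unbounded + in the lookbehind because, as far as I know, perls from 5.30 on experimentally accept bounded variable-width lookbehind such as the (optional)? case, while older perls reject both.

```perl
use strict;
use warnings;

# Hypothetical helper: try to compile a pattern given as a string.
# Returns '' if it compiles, otherwise the error message from perl.
sub compile_error {
    my ($pattern) = @_;
    my $re = eval { qr/$pattern/ };
    return defined $re ? '' : $@;
}

# Fixed-width lookbehind: compiles fine.
my $fixed_err = compile_error('(?<=prefix)infix');

# Unbounded-width lookbehind: perl refuses to compile it.
my $unbounded_err = compile_error('(?<=(?:pre)+fix)infix');

print "fixed width:     ", ($fixed_err     eq '' ? "ok" : $fixed_err), "\n";
print "unbounded width: ", ($unbounded_err eq '' ? "ok" : $unbounded_err);
```

The exact wording of the error differs between perl versions, so don't depend on the message text.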

Just for kicks, have a look at the output of the following snippet:

use strict;
use warnings;
use YAPE::Regex::Explain;

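# (?{ ... }) runs its block for the side effect only; (??{ ... }) would go on
# to match print's return value as a pattern. $^N is the most recently closed group.
my( @REx ) = ( qr/((?<=prefix)(infix)(?=suffix))(?{ print $^N, "\n" })/,
               qr/prefix(infix)suffix(?{ print $^N, "\n" })/ );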

my $string = "prefixinfixsuffix";

for( @REx ) {
  my $exp = YAPE::Regex::Explain->new($_)->explain;
  print $exp;
  1 if $string =~ $_;
}

Dave
