Wow, I didn't know the empty capture and backreference trick, nice!
I also didn't know the trick where you put the pattern on the left side of the =~ operator and still get results somehow ;-)
I suspect there's some unexpected behavior in the backtracking?
I tried:
use v5.20;
use Data::Dump "pp";
my @z = glob('{a,b,c}'x6);
my $z = '(?:a()|a()|b()|b()|c()|c()){6}\1\2\3\4\5\6';
for my $j (@z)
{
    $j =~ $z and say pp {$j => \@- };
}
And I got a bunch of values like:
...
{ acbcaa => [0, 6, 6, 3, 3, 4, 4] }
{ acbcab => [0, 5, 5, 6, 6, 4, 4] }
{ acbcac => [0, 5, 5, 3, 3, 6, 6] }
{ acbcba => [0, 6, 6, 5, 5, 4, 4] }
{ acbcca => [0, 6, 6, 3, 3, 5, 5] }
{ accaab => [0, 5, 5, 6, 6, 3, 3] }
{ accaba => [0, 6, 6, 5, 5, 3, 3] }
...
{ cccbab => [0, 5, 5, 6, 6, 3, 3] }
{ cccbac => [0, 5, 5, 4, 4, 6, 6] }
{ cccbba => [0, 6, 6, 5, 5, 3, 3] }
{ cccbca => [0, 6, 6, 4, 4, 5, 5] }
{ ccccab => [0, 5, 5, 6, 6, 4, 4] }
{ ccccba => [0, 6, 6, 5, 5, 4, 4] }
Here the two groups in each pair of identical alternatives (e.g. \1 and \2) always report exactly the same offset in @-, no matter what. I suspect that the identical branches are actually merged by the optimizer.
Is there some other magic to DWIM?
There's this:
my @y = glob('{a,b,c}'x6);
my $y = '(?:(?!\1)a()|(?!\2)a()|(?!\3)b()|(?!\4)b()|(?!\5)c()|(?!\6)c()){6}\1\2\3\4\5\6';
for my $j (@y)
{
    $j =~ $y and say $j;
}
aabbcc
aabcbc
aabccb
aacbbc
aacbcb
aaccbb
ababcc
abacbc
abaccb
...
ccbaab
ccbaba
ccbbaa
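As a sanity check on that listing, here is a minimal sketch that compares the regex against an independent test. The sort-based check and the names @candidates, @regex_hits and @sorted_hits are my own additions for illustration, not part of the original experiment.

```perl
use v5.20;

# Pattern from the post: each branch is guarded by a negative lookahead
# on the empty group that the same branch sets.
my $y = '(?:(?!\1)a()|(?!\2)a()|(?!\3)b()|(?!\4)b()|(?!\5)c()|(?!\6)c()){6}\1\2\3\4\5\6';

# All 729 six-letter strings over a, b, c, as in the post.
my @candidates = glob('{a,b,c}' x 6);

# Strings accepted by the regex.
my @regex_hits = grep { $_ =~ $y } @candidates;

# Independent check (an assumption of this sketch): a string qualifies
# when its sorted letters spell 'aabbcc'.
my @sorted_hits = grep {
    join('', sort { $a cmp $b } split //, $_) eq 'aabbcc'
} @candidates;

say scalar(@regex_hits), ' regex matches, ', scalar(@sorted_hits), ' arrangements of aabbcc';
say "@regex_hits" eq "@sorted_hits" ? 'lists agree' : 'lists differ';
```

There are 6!/(2!*2!*2!) = 90 distinct arrangements of aabbcc, so both counts should come out as 90 if the pattern does what the output above suggests.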
Edit: this also works actually (without \1\2\3\4\5\6 at the end):
# edit reformatted as a multiline regex for clarity
my $y = qr/(?:
(?!\1) a ()
| (?!\2) a ()
| (?!\3) b ()
| (?!\4) b ()
| (?!\5) c ()
| (?!\6) c ()
){6}
/x;
So TIL: (?!\N)XXX(), where \N refers to the empty group that follows XXX, is a pattern that lets that XXX branch match only once in the whole regex... Cool :-)
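Why this seems to work, as far as I understand it: in Perl a backreference to a group that has not taken part in the match fails, while a backreference to a group that captured the empty string succeeds. The guard therefore passes only until its own branch has run once. A minimal sketch of those pieces in isolation (the variable names are mine, purely for illustration):

```perl
use v5.20;

# Group 1 never participates ('a' is matched by the second branch),
# so \1 cannot match and the whole pattern fails.
my $unset_fails = ('a' =~ /^(?:(x)|a)\1$/) ? 0 : 1;

# Group 1 captures the empty string, so \1 matches (emptily).
my $empty_ok = ('a' =~ /^a()\1$/) ? 1 : 0;

# The idiom itself: the guard passes on the first iteration only.
my $once   = qr/^(?:(?!\1)a())+$/;
my $single = ('a'  =~ $once) ? 1 : 0;   # one 'a' is accepted
my $double = ('aa' =~ $once) ? 1 : 0;   # a second 'a' is refused

say "unset backref fails: $unset_fails";
say "empty backref matches: $empty_ok";
say "'a' matches: $single, 'aa' matches: $double";
```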