Based on the examples, I don't believe that nicemank is requiring captured words to be adjacent.
Hmmm... After taking another look at the OP, I think you may be right. In which case:
>perl -wMstrict -le
"my $s = 'xxxx yy zzzzz xxxx qqq xxxx yy zzzzz xxxx qqq';
 ;;
 for my $ar ([2, 4, 3], [5, 3]) {
     my $rx = rxg(@$ar);
     print $rx;
     my @groups = $s =~ m{ $rx }xmsg;
     print qq{'$_'} for @groups;
     }
 ;;
 sub rxg {
     my ($rx) =
         map qr{ \b $_ \b }xms,
         join ' \b .+? \b ',
         map qq{\\w{$_}},
         @_
         ;
     ;;
     return $rx;
     }
"
(?^msx: \b \w{2} \b .+? \b \w{4} \b .+? \b \w{3} \b )
'yy zzzzz xxxx qqq'
'yy zzzzz xxxx qqq'
(?^msx: \b \w{5} \b .+? \b \w{3} \b )
'zzzzz xxxx qqq'
'zzzzz xxxx qqq'
Update: No, darn it, that's still not right! nicemank seems to want 'yy xxxx qqq' from 'yy zzzzz xxxx qqq'. Oh, well...
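For what it's worth, one way to get from 'yy zzzzz xxxx qqq' to 'yy xxxx qqq' might be to give each \w{n} its own capture group, so that only the fixed-length words are returned and whatever .+? skipped over is dropped. The following is only a sketch, assuming that really is what nicemank wants. The names rxg_capture() and capture_words() are made up for this example, and it is written as a script file rather than a command-line one-liner.

```perl
use strict;
use warnings;

# Hypothetical variant of rxg(): same pattern, but every \w{n}
# sits in its own capture group.
sub rxg_capture {
    my @lengths = @_;
    my $pattern = join ' \b .+? \b ', map { "( \\w{$_} )" } @lengths;
    return qr{ \b $pattern \b }xms;
}

# Hypothetical helper: returns one space-joined string per match,
# built from the captured words only.
sub capture_words {
    my ($s, @lengths) = @_;
    my $rx = rxg_capture(@lengths);

    # In list context, m//g returns the captures of every match in one
    # flat list, so peel them off in chunks of scalar(@lengths).
    my @all = $s =~ m{ $rx }xmsg;
    my @found;
    while (my @words = splice @all, 0, scalar @lengths) {
        push @found, join ' ', @words;
    }
    return @found;
}

my $s = 'xxxx yy zzzzz xxxx qqq xxxx yy zzzzz xxxx qqq';

for my $ar ([2, 4, 3], [5, 3]) {
    print "'$_'\n" for capture_words($s, @$ar);
}
```

With the test string above, this should print 'yy xxxx qqq' twice for [2, 4, 3] and 'zzzzz qqq' twice for [5, 3]. Note that .+? will still happily skip over any amount of text (including newlines, because of /s) to find the next word of the right length.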