Sorry if my example was not clear. Yes, I would like all possible overlapping matches of a regular expression in a string, reporting the starting position of each. I understand how to get the length of the match from the expression. Thanks.
Vince
This is what I came up with. It loops, but should not be too slow. I hope that it's going to work for you.
#!perl
use strict;
use warnings;
my $string = qq[FGTXYZGTFABCGHABCFGTXYZADXYZGTYABC];
my ($pos, $nummatch) = (0, 0);
while ($string =~ /^(.{$pos,}?)[XYZ]{3}([A-Za-z]{0,21}?)[ABC]{3}/) {
    my $tpos = $pos;
    $pos = length($1) + 1;
    $nummatch = length($2);
    while ($string =~ /^(.{$tpos,}?)[XYZ]{3}([A-Za-z]{$nummatch,21}?)[ABC]{3}/) {
        print length($1) . ', ' . length($2) . $/;
        $nummatch = length($2) + 1;
    }
}
__OUTPUT__
3, 3
3, 8
20, 8
25, 3
Explanation: $pos holds the minimum starting offset, and $nummatch the minimum number of characters to match between the [XYZ]{3} and [ABC]{3} parts. The outer loop iterates over the starting positions, the inner one over the different match lengths. Because the matching is non-greedy, the matches come out ordered by starting position first and by length second.
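For comparison, here is a minimal sketch of the same two-level enumeration written with explicit loops instead of growing quantifiers. The helper name all_matches and the hard-coded pattern are illustrative assumptions, not part of the code above. It tests every start offset and every gap length from 0 to 21 directly, which is simple rather than fast.

```perl
use strict;
use warnings;

# Hypothetical helper: returns [start, gap] pairs, where start is the offset
# of the [XYZ]{3} part and gap is the number of letters before [ABC]{3}.
sub all_matches {
    my ($text) = @_;
    my @found;
    # The shortest possible match is 6 characters long (gap of 0).
    for my $start (0 .. length($text) - 6) {
        for my $gap (0 .. 21) {
            # Anchor at $start so each (start, gap) pair is tested exactly once.
            push @found, [$start, $gap]
                if substr($text, $start) =~ /^[XYZ]{3}[A-Za-z]{$gap}[ABC]{3}/;
        }
    }
    return @found;
}

my $string = 'FGTXYZGTFABCGHABCFGTXYZADXYZGTYABC';
print "$_->[0], $_->[1]\n" for all_matches($string);
```

On the sample string this should print the same four pairs as the output listed above. Because every attempt is anchored at a fixed offset, the inner loop cannot drift to a later starting position.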
It works somewhat, but I keep getting a repeating loop when I try it on a larger string. After looking through "Mastering Regular Expressions" from O'Reilly, I came up with this:
use strict;
use warnings;
my $string = qq[FGTXYZGTFABCGHABCFGTXYZADXYZGTYABC];
#or try this my $string = qq[FGTXYZABCABC];
$string =~ m/[XYZ]{3}([A-Za-z]{0,21}?)[ABC]{3}?(?{print "matched at [<$&>] $-[0]\n" })(?!)/x;
This seems to work, but I am somewhat confused as to how it works!
Vince
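One way to read the pattern above, offered as a sketch rather than a definitive account: the (?{ ... }) block runs every time the engine gets that far, which means the three pieces before it have just matched. The (?!) that follows is an empty negative lookahead, which can never succeed, so the engine is forced to backtrack. It first tries longer gaps for the non-greedy {0,21}?, then moves on to the next starting position. Each candidate match is therefore reported and then rejected, and the overall match fails once every possibility has been tried. The version below collects the hits in an array instead of printing them. The sub name overlapping_matches is hypothetical, and using a lexical array from inside a code block assumes Perl 5.18 or later.

```perl
use strict;
use warnings;

# Hypothetical helper: returns [start offset, matched text] for every way
# the pattern can match, including overlapping matches.
sub overlapping_matches {
    my ($text) = @_;
    my @hits;
    $text =~ m/
        [XYZ]{3} [A-Za-z]{0,21}? [ABC]{3}
        # Reached only after the three pieces above have matched.
        (?{ push @hits, [ $-[0], $& ] })
        (?!)   # never succeeds, so the engine backtracks and tries the next option
    /x;
    return @hits;
}

printf "%d: %s\n", @$_
    for overlapping_matches('FGTXYZGTFABCGHABCFGTXYZADXYZGTYABC');
```

The trailing ? after [ABC]{3} in the original is left out here, since a fixed count has nothing to be non-greedy about.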