in reply to Challenge: Fast Common Substrings
Returns all the common substrings, since it might as well.. In scalar context returns the number.{ my @matches; my $push = qr/(?{ push @matches, $1 })/; sub match_all_ways { my ($string, $regex) = @_; @matches = (); $string =~ m/($regex)$push(?!)/; return @matches; } } sub common_substr { my ($str1, $str2, $len) = @_; my %substr = map { $_ => 1 } match_all_ways($str1 => qr/.{$len}/); $substr{$_} |= 2 for match_all_ways($str2 => qr/.{$len}/); return grep { $substr{$_} == 3 } keys %substr; } print "$_\n" for common_substr("ABCDEF", "ABDEFCBDEAB", 2); __END__ DE EF AB
Update: Or, if you want the regex engine to do all of the work, instead of computing the intersection of the two lists in perl:
Note that the match_all_ways sub was changed slightly (to account for the different capturing). Disclaimer: If input strings contain a newline or null character, or if $len > 65536, it doesn't work.. but I think you get the idea.{ my @matches; my $push = qr/(?{ push @matches, $1 })/; sub match_all_ways { my ($string, $regex) = @_; @matches = (); $string =~ m/$regex$push(?!)/; return @matches; } } sub common_substr { my ($str1, $str2, $len) = @_; my %subs; @subs{ match_all_ways("$str1\0$str2" => qr/(.{$len}).*\0.*\1/) } + = (); return keys %subs; }
These solutions are both trivial to extend to a range of lengths.. just pass "n,m" as the $len argument.
blokhead
|
---|