in reply to Comparing 2 different-sized strings
This will knock the spots off most every other algorithm implemented in Perl, and many of them when implemented in C:
```perl
#! perl -slw
use strict;

sub fuzzyMatch {
    my( $rHay, $rNee, $misses ) = @_;
    my $lNee = length $$rNee;
    my $min  = $lNee - $misses;    # fewest matching positions that still count as a hit

    # XOR each needle-length window with the needle; equal characters give "\0",
    # and tr[\0][] counts them. Keep the offsets that reach $min matches.
    map {
        ( ( substr( $$rHay, $_, $lNee ) ^ $$rNee ) =~ tr[\0][] ) >= $min ? $_ : ()
    } 0 .. length( $$rHay ) - $lNee;
}

my $hay = 'TCGAGTGGCCATGAACGTGCCAATTG';
my $nee = 'ATGATCCTG';

# Show each match with 5 characters of context on either side.
print substr( $hay, $_-5, length( $nee ) + 10 ) for fuzzyMatch( \$hay, \$nee, 3 );

$hay = 'aacctgacctacgtttgacgatcgtacgtcagtcctccgtgctaactgacgtaaaaaaaatacgtcccccccc';
$nee = 'acgtacgt';

print substr( $hay, $_-5, length( $nee ) + 10 ) for fuzzyMatch( \$hay, \$nee, 3 );

__END__
C:\test>1048594
TGGCCATGAACGTGCCAAT
acctgacctacgtttgac
gacctacgtttgacgatc
gtttgacgatcgtacgtc
gacgatcgtacgtcagtc
atcgtacgtcagtcctcc
gtcagtcctccgtgctaa
tgctaactgacgtaaaaa
aactgacgtaaaaaaaat
aaaaaaaatacgtccccc
aaaatacgtcccccccc
```
The subroutine returns a list of the offsets in the primary string (the haystack) at which fuzzily matched substrings are found, one offset per match.
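To see why the XOR trick works, here is a minimal sketch of the comparison for a single window. `countMatches` is a hypothetical helper name used only for illustration; it is not part of the code above. It assumes both strings are the same length and hold single-byte characters.

```perl
#! perl
use strict;
use warnings;

# Hypothetical helper (illustration only): count the positions at which
# two equal-length strings hold the same character.
sub countMatches {
    my( $x, $y ) = @_;

    # String XOR produces "\0" at every position where the bytes are equal.
    my $xor = $x ^ $y;

    # tr with an empty replacement list counts the "\0" bytes
    # without modifying the string.
    return $xor =~ tr[\0][];
}

# The window at offset 10 of the first haystack, compared with the needle.
my $window = substr( 'TCGAGTGGCCATGAACGTGCCAATTG', 10, 9 );   # 'ATGAACGTG'
my $needle = 'ATGATCCTG';

print countMatches( $window, $needle ), "\n";   # prints 7
```

With 7 of 9 positions matching and up to 3 misses allowed (a minimum of 6 matches), offset 10 is reported as a hit. One caveat when printing context: `substr` with a negative offset counts from the end of the string, so a match found at an offset below 5 would need the start clamped to 0 before calling `substr( $hay, $_-5, ... )`.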
Re^2: Comparing 2 different-sized strings
by AdrianJ217 (Novice) on Aug 09, 2013 at 09:22 UTC
by BrowserUk (Patriarch) on Aug 09, 2013 at 09:40 UTC
by AdrianJ217 (Novice) on Aug 09, 2013 at 11:46 UTC
by BrowserUk (Patriarch) on Aug 09, 2013 at 12:00 UTC
by AdrianJ217 (Novice) on Aug 10, 2013 at 19:27 UTC
by BrowserUk (Patriarch) on Aug 10, 2013 at 21:27 UTC
by AdrianJ217 (Novice) on Aug 11, 2013 at 08:43 UTC

Re^2: Comparing 2 different-sized strings
by AdrianJ217 (Novice) on Aug 14, 2013 at 10:08 UTC
by BrowserUk (Patriarch) on Aug 14, 2013 at 10:46 UTC