why not just from 0 to length(nee)?
Because if you compare at position length( hay ), you aren't comparing anything.
Take the case of a 20-byte haystack, acgtacgtacgtacgtacgt, and a 4-byte needle, acct; at position 20:
000000000011111111112
012345678901234567890
acgtacgtacgtacgtacgt
                    acct
The last position at which you can get a full match is 20 - 4 = position 16:
000000000011111111112
012345678901234567890
acgtacgtacgtacgtacgt
                acct
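To make the loop bound concrete, here is a minimal sketch. The helper name start_positions is made up for illustration and is not part of the code discussed in this thread; it just lists every offset at which the needle still fits completely inside the haystack.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): every offset at which a needle
# of $nee_len bytes still fits entirely inside a haystack of $hay_len bytes.
sub start_positions {
    my( $hay_len, $nee_len ) = @_;
    return () if $nee_len > $hay_len;     # needle longer than haystack: nowhere to start
    return 0 .. $hay_len - $nee_len;      # last start is length(hay) - length(nee)
}

my $hay = 'acgtacgtacgtacgtacgt';         # the 20-byte haystack from above
my $nee = 'acct';                         # the 4-byte needle from above

for my $pos ( start_positions( length( $hay ), length( $nee ) ) ) {
    # Every window is a full length($nee) bytes, so each comparison is meaningful.
    my $window = substr( $hay, $pos, length( $nee ) );
    print "$pos: $window\n";
}
```

With the 20-byte haystack and 4-byte needle this visits offsets 0 through 16, and the last line printed is "16: acgt".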
With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
Hi,
Thank you so much for your help. Could you just tell me what the "for" is when you call the subroutine in the main program? I have seen "for" only in the context of a for loop where you also supply the 3 parameters like initial index, final, and increment.
By the way, everything else you explained to me I completely understood and my script now works perfectly. Thank you so much!!
Could you just tell me what the "for" is when you call the subroutine in the main program? I have seen "for" only in the context of a for loop where you also supply the 3 parameters like initial index, final, and increment.
Sure.
If there are multiple matches in the haystack, the subroutine will return a list of start positions, one for each match.
By giving that list to for (used here as a statement modifier rather than the three-part C-style loop), the print substr statement is executed once for each position returned, with $_ taking on each of those start positions one after the other.
Hence, this
$hay = 'aacctgacctacgtttgacgatcgtacgtcagtcctccgtgctaactgacgtaaaaaaaatacgtcccccccc';
$nee = 'acgtacgt';
print substr( $hay, $_-5, length( $nee ) + 10 ) for fuzzyMatch( \$hay, \$nee, 3 );
prints the 10 matches (+the 5 bytes before and after):
acctgacctacgtttgac
gacctacgtttgacgatc
gtttgacgatcgtacgtc
gacgatcgtacgtcagtc
atcgtacgtcagtcctcc
gtcagtcctccgtgctaa
tgctaactgacgtaaaaa
aactgacgtaaaaaaaat
aaaaaaaatacgtccccc
aaaatacgtcccccccc
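As a rough illustration of how the statement-modifier form of for relates to the block form, here is a small sketch. The sub fuzzy_positions is a hypothetical stand-in written for this example, not the fuzzyMatch from the thread; it is only assumed to behave similarly, returning a list of start positions with at most a given number of mismatches.

```perl
use strict;
use warnings;

# Hypothetical stand-in for fuzzyMatch(): returns every start position at
# which the needle matches the haystack with at most $max_miss mismatches.
sub fuzzy_positions {
    my( $hay_ref, $nee_ref, $max_miss ) = @_;
    my $len = length( $$nee_ref );
    my @found;
    for my $pos ( 0 .. length( $$hay_ref ) - $len ) {
        # String XOR gives a NUL byte wherever the two bytes are equal.
        my $xor  = substr( $$hay_ref, $pos, $len ) ^ $$nee_ref;
        my $miss = $len - ( $xor =~ tr/\0// );   # non-NUL bytes are mismatches
        push @found, $pos if $miss <= $max_miss;
    }
    return @found;
}

my $hay = 'aaccggtt';
my $nee = 'acgg';

# Statement-modifier form: $_ is set to each returned position in turn.
print "$_\n" for fuzzy_positions( \$hay, \$nee, 1 );

# The equivalent block form, with a named loop variable instead of $_.
for my $pos ( fuzzy_positions( \$hay, \$nee, 1 ) ) {
    print "$pos\n";
}
```

Both loops print the same positions; the modifier form is simply the compact way to run a single statement once per list element.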
Ok, thank you. Now I understand a lot more about using bitwise approaches in Perl. I also just noticed your post from a while ago regarding Hamming distance:
my $s1 = 'AAAAA';
my $s2 = 'ATCAA';
my $s3 = 'AAAAA';
print "$s1:$s2 hd:", hd( $s1, $s2 ); # will give value 2
print "$s1:$s3 hd:", hd( $s1, $s3 ); # will give value 0
sub hd{ length( $_[0] ) - ( ( $_[0] ^ $_[1] ) =~ tr[\0][\0] ) }
I just didn't understand the line above defining the subroutine. How do you know which part refers to which sequence ($s1 vs $s2 for example)?
Thank you so much! I can't believe how helpful and patient you are.