Re^7: Comparing 2 different-sized strings

why not just from 0 to length(nee)?

Because if you compare at position length( hay ), you aren't comparing anything.

Take the case of a 20-byte haystack: acgtacgtacgtacgtacgt and a 4-byte needle: acct. At position 20:

000000000011111111112
012345678901234567890
acgtacgtacgtacgtacgt
                    acct

The last position at which you can get a full match is 20 - 4 = position 16:

000000000011111111112
012345678901234567890
acgtacgtacgtacgtacgt
                acct
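The bound above can be sketched as a loop over every offset from 0 to length( hay ) - length( nee ). This is only an illustration under assumptions: the variable names ($last, %mismatches_at, $past_end) are made up here, and the mismatch count uses the string-XOR idea discussed elsewhere in this thread rather than the actual fuzzyMatch code.

```perl
use strict;
use warnings;

my $hay = 'acgtacgtacgtacgtacgt';    # the 20-byte haystack from the example
my $nee = 'acct';                    # the 4-byte needle

# Last offset at which the whole needle still fits: 20 - 4 = 16.
my $last = length( $hay ) - length( $nee );

my %mismatches_at;                   # hypothetical name: offset => mismatch count
for my $pos ( 0 .. $last ) {
    my $window = substr( $hay, $pos, length( $nee ) );
    my $xor    = $window ^ $nee;     # "\0" wherever the bytes agree
    my $same   = ( $xor =~ tr/\0// );    # count the agreeing bytes
    $mismatches_at{ $pos } = length( $nee ) - $same;
}

# At offset 20 nothing is left to compare: substr returns an empty string.
my $past_end = substr( $hay, length( $hay ), length( $nee ) );

printf "last offset: %d, mismatches there: %d, past the end: '%s'\n",
    $last, $mismatches_at{ $last }, $past_end;
```

Stopping at $last means every window handed to the comparison is exactly as long as the needle, so no special case is needed for short windows.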

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.

"Science is about questioning the status quo. Questioning authority".

In the absence of evidence, opinion is indistinguishable from prejudice.


Replies are listed 'Best First'.
Re^8: Comparing 2 different-sized strings by AdrianJ217 (Novice) on Aug 12, 2013 at 14:08 UTC
Hi, thank you so much for your help. Could you just tell me what the "for" is when you call the subroutine in the main program? I have seen "for" only in the context of a for loop where you also supply the 3 parameters like initial index, final, and increment. By the way, everything else you explained to me I completely understood and my script now works perfectly. Thank you so much!!
Re^9: Comparing 2 different-sized strings by BrowserUk (Patriarch) on Aug 12, 2013 at 14:41 UTC
Could you just tell me what the "for" is when you call the subroutine in the main program? I have seen "for" only in the context of a for loop where you also supply the 3 parameters like initial index, final, and increment.
Sure. If there are multiple matches in the haystack, the subroutine will return a list of start positions, one for each match. By giving that list to for, it will execute the print substr statement for each position returned, with `$_` taking on each of those start positions one after the other. Hence, this:
`$hay = 'aacctgacctacgtttgacgatcgtacgtcagtcctccgtgctaactgacgtaaaaaaaatacgtcccccccc';`
`$nee = 'acgtacgt';`
`print substr( $hay, $_-5, length( $nee ) + 10 ) for fuzzyMatch( \$hay, \$nee, 3 );`
prints the 10 matches (plus the 5 bytes before and after each):
`acctgacctacgtttgac`
`gacctacgtttgacgatc`
`gtttgacgatcgtacgtc`
`gacgatcgtacgtcagtc`
`atcgtacgtcagtcctcc`
`gtcagtcctccgtgctaa`
`tgctaactgacgtaaaaa`
`aactgacgtaaaaaaaat`
`aaaaaaaatacgtccccc`
`aaaatacgtcccccccc`
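To make the statement-modifier form of for concrete, here is a small sketch that writes the same loop both ways. fuzzyMatch itself is not reproduced in this part of the thread, so exact_positions below is a hypothetical stand-in that only finds exact matches. The short haystack is also made up for the illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for fuzzyMatch(): returns every offset at which
# $$nee_ref occurs exactly in $$hay_ref (no mismatches allowed).
sub exact_positions {
    my ( $hay_ref, $nee_ref ) = @_;
    my @found;
    my $pos = -1;
    while ( ( $pos = index( $$hay_ref, $$nee_ref, $pos + 1 ) ) != -1 ) {
        push @found, $pos;
    }
    return @found;
}

my $hay = 'ttacgtttacgttt';    # made-up haystack for the illustration
my $nee = 'acgt';

# Statement-modifier form: $_ is set to each returned position in turn.
print substr( $hay, $_, length( $nee ) ), " at $_\n"
    for exact_positions( \$hay, \$nee );

# The same loop written as a block, with a named loop variable.
for my $pos ( exact_positions( \$hay, \$nee ) ) {
    print substr( $hay, $pos, length( $nee ) ), " at $pos\n";
}
```

Both loops call the subroutine once, collect the list it returns, and then run the print once per element. The modifier form is simply the more compact spelling when the body is a single statement.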
Re^10: Comparing 2 different-sized strings by Anonymous Monk on Aug 18, 2013 at 13:20 UTC
Hi, I hope you are doing well. Thank you for your help. I had another question. If I am searching for 2 sequences within the same haystack, and what separates the 2 sequences is always a "T" followed by one other nucleotide (either A, G, C, or T), how can I do that using substr? I know how to do this easily with regular expressions, but here it seems I cannot incorporate `substr( $hay, $_, length( $nee ) ) for fuzzyMatch( \$hay, \$nee, 3 )` into a regular expression.
Re^11: Comparing 2 different-sized strings by BrowserUk (Patriarch) on Aug 18, 2013 at 13:41 UTC
Re^12: Comparing 2 different-sized strings by AdrianJ217 (Novice) on Aug 18, 2013 at 16:19 UTC
Re^11: Comparing 2 different-sized strings by AdrianJ217 (Novice) on Aug 18, 2013 at 13:26 UTC
Re^9: Comparing 2 different-sized strings by AdrianJ217 (Novice) on Aug 12, 2013 at 22:44 UTC
Ok, thank you. Now I understand a lot more about using bitwise approaches in Perl. I also just noticed your post from a while ago regarding Hamming Distance:
`my $s1 = 'AAAAA';`
`my $s2 = 'ATCAA';`
`my $s3 = 'AAAAA';`
`print "$s1:$s2 hd:", hd( $s1, $s2 ); # will give value 2`
`print "$s1:$s3 hd:", hd( $s1, $s3 ); # will give value 0`
`sub hd{ length( $_[0] ) - ( ( $_[0] ^ $_[1] ) =~ tr[\0][\0] ) }`
I just didn't understand the line above defining the subroutine. How do you know which part refers to which sequence ($s1 vs $s2 for example)? Thank you so much! I can't believe how helpful and patient you are.
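For readers with the same question, the one-liner can be unpacked into a longer sketch. This is an illustrative expansion, not the code from the linked post: the name hamming_distance, the named variables, and the length check are additions made here. The point it shows is that arguments arrive in @_ in call order, so `$_[0]` is the first string passed and `$_[1]` the second.

```perl
use strict;
use warnings;

# Expanded, commented equivalent of the hd() one-liner (hypothetical name).
sub hamming_distance {
    my ( $first, $second ) = @_;    # $first is $_[0], $second is $_[1]

    # Assumption added here: only equal-length strings are compared.
    die "strings must be the same length\n"
        unless length( $first ) == length( $second );

    my $xor  = $first ^ $second;        # bytewise XOR: equal bytes become "\0"
    my $same = ( $xor =~ tr/\0// );     # count the "\0" bytes, i.e. the matches

    return length( $first ) - $same;    # total length minus matches
}

my $s1 = 'AAAAA';
my $s2 = 'ATCAA';
my $s3 = 'AAAAA';

print "$s1:$s2 hd:", hamming_distance( $s1, $s2 ), "\n";
print "$s1:$s3 hd:", hamming_distance( $s1, $s3 ), "\n";
```

Because XOR is symmetric, swapping the two arguments gives the same distance, which is why the one-liner never needs to say which sequence is which.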
Re^10: Comparing 2 different-sized strings by BrowserUk (Patriarch) on Aug 12, 2013 at 22:54 UTC
How do you know which part refers to which sequence ($s1 vs $s2 for example)?
Sorry, but you are going to have to clarify that question. Which "part" of what?
(You should also have used `<code></code>` tags; and when you reference another post, it is helpful to link to it, e.g. Re: Hamming Distance Between 2 Strings - Fast(est) Way?, using `[id://500244]`.)
Re^11: Comparing 2 different-sized strings by AdrianJ217 (Novice) on Aug 13, 2013 at 20:27 UTC
Re^12: Comparing 2 different-sized strings by BrowserUk (Patriarch) on Aug 13, 2013 at 21:06 UTC
Re^12: Comparing 2 different-sized strings by choroba (Cardinal) on Aug 13, 2013 at 20:41 UTC