Re^8: Comparing 2 different-sized strings

Re^9: Comparing 2 different-sized strings by BrowserUk (Patriarch) on Aug 12, 2013 at 14:41 UTC
"Could you just tell me what the "for" is when you call the subroutine in the main program? I have seen "for" only in the context of a for loop where you also supply the 3 parameters like initial index, final, and increment."

Sure. If there are multiple matches in the haystack, the subroutine returns a list of start positions, one for each match. Given that list, `for` executes the `print substr` statement once per position, with `$_` taking on each of those start positions in turn.

Hence, this: `$hay = 'aacctgacctacgtttgacgatcgtacgtcagtcctccgtgctaactgacgtaaaaaaaatacgtcccccccc'; $nee = 'acgtacgt'; print substr( $hay, $_-5, length( $nee ) + 10 ) for fuzzyMatch( \$hay, \$nee, 3 );`

prints the 10 matches (plus the 5 bytes before and after each): `acctgacctacgtttgac gacctacgtttgacgatc gtttgacgatcgtacgtc gacgatcgtacgtcagtc atcgtacgtcagtcctcc gtcagtcctccgtgctaa tgctaactgacgtaaaaa aactgacgtaaaaaaaat aaaaaaaatacgtccccc aaaatacgtcccccccc`

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday' Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice.
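To make the statement-modifier form of `for` concrete, here is a minimal sketch. The `fuzzyMatch` source is not shown in this part of the thread, so `fuzzy_positions` below is a hypothetical stand-in that merely returns the start offset of every window with at most `$max` mismatches; the short haystack and needle are also made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for fuzzyMatch(): returns the start offset of every
# window of the haystack that differs from the needle in at most $max places.
sub fuzzy_positions {
    my( $hayRef, $needleRef, $max ) = @_;
    my $len = length $$needleRef;
    my @starts;
    for my $pos ( 0 .. length( $$hayRef ) - $len ) {
        # String XOR yields "\0" wherever the two strings agree.
        my $diff       = substr( $$hayRef, $pos, $len ) ^ $$needleRef;
        my $mismatches = $len - ( $diff =~ tr/\0// );
        push @starts, $pos if $mismatches <= $max;
    }
    return @starts;
}

my $hay = 'aacctgacgtaa';
my $nee = 'acgt';

# Statement-modifier form: the statement runs once per list element,
# with $_ set to each element in turn.
print substr( $hay, $_, length $nee ), "\n"
    for fuzzy_positions( \$hay, \$nee, 1 );

# The same loop written in block form with a named loop variable.
for my $start ( fuzzy_positions( \$hay, \$nee, 1 ) ) {
    print substr( $hay, $start, length $nee ), "\n";
}
```

Both loops do the same work; the modifier form is just shorter when the body is a single statement. The three-part `for( init; test; step )` is a separate, C-style use of the same keyword.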
Re^10: Comparing 2 different-sized strings by Anonymous Monk on Aug 18, 2013 at 13:20 UTC
Hi, I hope you are doing well. Thank you for your help. I had another question. If I am searching for 2 sequences within the same haystack, and what separates the 2 sequences is always a "T" followed by one other nucleotide (either A, G, C, or T), how can I do that using substr? I know how to do this easily with regular expressions, but here it seems I cannot incorporate `substr( $hay, $_, length( $nee )) for fuzzyMatch( \$hay, \$nee, 3 )` into a regular expression.
Re^11: Comparing 2 different-sized strings by BrowserUk (Patriarch) on Aug 18, 2013 at 13:41 UTC
"If I am searching for 2 sequences within the same haystack, and what separates the 2 sequences is always a "T" followed by one other nucleotide (either A, G, C, or T),"

Could you explain that a bit more? I get that you are looking for `???...????T[acgt]???..???`; but that criterion will match everywhere a T occurs in a sequence, except where it is the first, or the second- or third-last, character in the sequence. And without some constraints on the lengths of the sequences before and after the T, there would be multiple (100s or 1000s or millions of) possible matches at every T position.
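One possible reading of the question, sketched under loud assumptions: both flanking sequences are needles of known, fixed length, the haystack and needles use the same letter case, and each needle may differ from the haystack in at most `$max` positions. The names `mismatches` and `paired_positions` are hypothetical and are not part of the code posted earlier in the thread. Fixing the two lengths is what supplies the constraint asked for above.

```perl
use strict;
use warnings;

# Hypothetical helper: number of differing positions between two
# equal-length strings (string XOR gives "\0" where they agree).
sub mismatches {
    my( $s, $t ) = @_;
    my $xor = $s ^ $t;
    return length( $s ) - ( $xor =~ tr/\0// );
}

# Start offsets where $first is followed by 't', any one nucleotide,
# and then $second, allowing up to $max mismatches in each needle.
sub paired_positions {
    my( $hay, $first, $second, $max ) = @_;
    my( $len1, $len2 ) = ( length $first, length $second );
    my $total = $len1 + 2 + $len2;
    my @starts;
    for my $pos ( 0 .. length( $hay ) - $total ) {
        # The two-character spacer must be T plus one nucleotide.
        next unless substr( $hay, $pos + $len1, 2 ) =~ /\At[acgt]\z/i;
        next if mismatches( substr( $hay, $pos, $len1 ), $first ) > $max;
        next if mismatches( substr( $hay, $pos + $len1 + 2, $len2 ), $second ) > $max;
        push @starts, $pos;
    }
    return @starts;
}

# Made-up data: 'acgt' + spacer 'tc' + 'ttag', embedded in padding.
my $hay = 'ggacgttcttagcc';
print "$_\n" for paired_positions( $hay, 'acgt', 'ttag', 0 );
```

If the flanking lengths are not fixed, this approach does not apply as written, which is exactly the ambiguity raised in the reply above.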
Re^12: Comparing 2 different-sized strings by AdrianJ217 (Novice) on Aug 18, 2013 at 16:19 UTC
Re^13: Comparing 2 different-sized strings by BrowserUk (Patriarch) on Aug 18, 2013 at 17:03 UTC
Some notes below your chosen depth have not been shown here
Re^11: Comparing 2 different-sized strings by AdrianJ217 (Novice) on Aug 18, 2013 at 13:26 UTC
Hi, the last post was from me, Adrian. I'm sorry I forgot to log in.
Re^9: Comparing 2 different-sized strings by AdrianJ217 (Novice) on Aug 12, 2013 at 22:44 UTC
Ok, thank you. Now I understand a lot more about bitwise approaches in Perl. I also just noticed your post from a while ago regarding Hamming distance: `my $s1 = 'AAAAA'; my $s2 = 'ATCAA'; my $s3 = 'AAAAA';` `print "$s1:$s2 hd:", hd( $s1, $s2 ); # will give value 2` `print "$s1:$s3 hd:", hd( $s1, $s3 ); # will give value 0` `sub hd{ length( $_[ 0 ] ) - ( ( $_[ 0 ] ^ $_[ 1 ] ) =~ tr[\0][\0] ) }` I just didn't understand the line above defining the subroutine. How do you know which part refers to which sequence ($s1 vs $s2 for example)? Thank you so much! I can't believe how helpful and patient you are.
Re^10: Comparing 2 different-sized strings by BrowserUk (Patriarch) on Aug 12, 2013 at 22:54 UTC
"How do you know which part refers to which sequence ($s1 vs $s2 for example)?"

Sorry, but you are going to have to clarify that question. Which "part" of what? (You should also have used `<code></code>` tags; and when you reference another post, it is helpful to link to it, e.g. Re: Hamming Distance Between 2 Strings - Fast(est) Way?, using `[id://500244]`.)
Re^11: Comparing 2 different-sized strings by AdrianJ217 (Novice) on Aug 13, 2013 at 20:27 UTC
Oh, sorry. In the part of the code where you have `sub hd{ length( $_[ 0 ] ) - ( ( $_[ 0 ] ^ $_[ 1 ] ) =~ tr[\0][\0] ) }` you are passing 2 parameters (2 DNA sequences) to the subroutine. So in the above code, which computes the Hamming distance, how do you know which variable refers to which sequence? Or could you just explain what the code above is doing? Thank you so much!
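As a reading aid for the one-liner quoted above, here is an expanded sketch of the same computation. The variable names `$first`, `$second`, `$xor`, and `$same` are introduced only for illustration, and the sketch assumes both strings have the same length. In Perl, a subroutine's arguments arrive in the array `@_`, so `$_[0]` is the first argument and `$_[1]` the second. Hamming distance is symmetric, so it does not matter which sequence is passed first.

```perl
use strict;
use warnings;

# Expanded form of the one-line hd(); assumes equal-length strings.
sub hd {
    my( $first, $second ) = @_;    # $_[0] and $_[1] respectively

    # String XOR: each byte of the result is "\0" where the two inputs
    # have the same character, and non-zero where they differ.
    my $xor = $first ^ $second;

    # tr/// returns how many characters it matched; here it counts the
    # "\0" bytes, i.e. the positions where the strings are identical.
    my $same = ( $xor =~ tr/\0// );

    # Total length minus identical positions = differing positions.
    return length( $first ) - $same;
}

print hd( 'AAAAA', 'ATCAA' ), "\n";    # 2
print hd( 'AAAAA', 'AAAAA' ), "\n";    # 0
```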
Re^12: Comparing 2 different-sized strings by BrowserUk (Patriarch) on Aug 13, 2013 at 21:06 UTC
Re^12: Comparing 2 different-sized strings by choroba (Cardinal) on Aug 13, 2013 at 20:41 UTC