Re^9: Comparing 2 different-sized strings

Could you just tell me what the "for" is when you call the subroutine in the main program? I have seen "for" only in the context of a for loop where you also supply the 3 parameters like initial index, final, and increment.

Sure.

If there are multiple matches in the haystack, the subroutine will return a list of start positions, one for each match.

Here for is used as a statement modifier. When that list is given to for, the print substr statement is executed once for each position returned, with $_ taking on each of those start positions one after the other.
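As a rough illustration of how the statement-modifier form relates to the loops the question has in mind, the sketch below does the same work three ways. Note that find_all is a hypothetical stand-in invented for this example (it finds exact matches only); it is not the fuzzyMatch subroutine from earlier in the thread, and the short haystack is made up.

```perl
use strict;
use warnings;

# Hypothetical stand-in for fuzzyMatch(): exact matches only.
# Returns the list of start offsets of $nee within $hay.
sub find_all {
    my ( $hay, $nee ) = @_;
    my @starts;
    my $from = 0;
    while ( ( my $i = index( $hay, $nee, $from ) ) >= 0 ) {
        push @starts, $i;
        $from = $i + 1;
    }
    return @starts;
}

my $hay = 'aacgtaacgtaacgt';    # made-up test data
my $nee = 'acgt';

# 1. Statement modifier: $_ is set to each returned offset in turn.
my @modifier;
push @modifier, "$_:" . substr( $hay, $_, length $nee ) for find_all( $hay, $nee );

# 2. The same loop written as a block with a named loop variable.
my @block;
for my $start ( find_all( $hay, $nee ) ) {
    push @block, "$start:" . substr( $hay, $start, length $nee );
}

# 3. The C-style three-part loop, indexing into the returned list.
my @starts = find_all( $hay, $nee );
my @cstyle;
for ( my $i = 0; $i < @starts; $i++ ) {
    push @cstyle, "$starts[$i]:" . substr( $hay, $starts[$i], length $nee );
}

print "@modifier\n";    # prints "1:acgt 6:acgt 11:acgt"
```

All three arrays end up with the same elements; the modifier form is simply the most compact when the loop body is a single statement.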

Hence, this

$hay = 'aacctgacctacgtttgacgatcgtacgtcagtcctccgtgctaactgacgtaaaaaaaatacgtcccccccc';
$nee = 'acgtacgt';

print substr( $hay, $_-5, length( $nee ) + 10 ) for fuzzyMatch( \$hay, \$nee, 3 );

prints the 10 matches, each with up to 5 bytes of context before and after:

acctgacctacgtttgac
gacctacgtttgacgatc
gtttgacgatcgtacgtc
gacgatcgtacgtcagtc
atcgtacgtcagtcctcc
gtcagtcctccgtgctaa
tgctaactgacgtaaaaa
aactgacgtaaaaaaaat
aaaaaaaatacgtccccc
aaaatacgtcccccccc
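One thing to watch with the $_-5 offset: if a match starts within the first 5 bytes of the haystack, the offset goes negative, and substr then counts from the end of the string. A window that runs past the end is simply truncated, which is why the last line above is one byte shorter. A minimal sketch of a clamped window follows; context is a hypothetical helper name and the haystack is made-up test data.

```perl
use strict;
use warnings;

# Hypothetical helper: the match at $start plus up to $pad bytes on
# each side, clamped so the window never starts before offset 0.
sub context {
    my ( $hay, $start, $len, $pad ) = @_;
    my $from = $start - $pad;
    $from = 0 if $from < 0;
    my $to = $start + $len + $pad;    # one past the last byte wanted
    return substr( $hay, $from, $to - $from );
}

my $hay = 'acgtacgtttgacgatcg';       # made-up test data, 18 bytes

# Unclamped: offset -5 is taken relative to the END of the string.
print substr( $hay, 0 - 5, 8 + 10 ), "\n";    # prints "gatcg"

# Clamped: a match at offset 0 keeps its trailing context only.
print context( $hay, 0, 8, 5 ), "\n";         # prints "acgtacgtttgac"

# A match at the very end is truncated by substr itself.
print context( $hay, 10, 8, 5 ), "\n";        # prints "cgtttgacgatcg"
```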

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.

"Science is about questioning the status quo. Questioning authority".

In the absence of evidence, opinion is indistinguishable from prejudice.

Re^10: Comparing 2 different-sized strings by Anonymous Monk on Aug 18, 2013 at 13:20 UTC
Hi, I hope you are doing well. Thank you for your help. I have another question. If I am searching for 2 sequences within the same haystack, and what separates the 2 sequences is always a "T" followed by one other nucleotide (A, G, C, or T), how can I do that using substr? I know how to do this easily with regular expressions, but here it seems I cannot incorporate `substr( $hay, $_, length( $nee )) for fuzzyMatch( \$hay, \$nee, 3 )` into a regular expression.
Re^11: Comparing 2 different-sized strings by BrowserUk (Patriarch) on Aug 18, 2013 at 13:41 UTC
If I am searching for 2 sequences within the same haystack, and what separates the 2 sequences is always a "T" followed by one other nucleotide (either A,G,C,or T),

Could you explain that a bit more?

I get that you are looking for `???...????T[acgt]???..???`; but that criterion will match everywhere a T occurs in a sequence, other than when it is the first, or second or third last, character in the sequence.

And without some constraints on the lengths of the sequences before and after the T, there would be multiple (100s or 1000s or millions) possible matches at every T position.
Re^12: Comparing 2 different-sized strings by AdrianJ217 (Novice) on Aug 18, 2013 at 16:19 UTC
Hi, I'm not just looking for the T. For example, if I have the following sequence:

$hay  = 'AACCCAGGATGCGCCATGCAGGACACAGGACGCCACGGAA';
$nee1 = 'AGGA';
$nee2 = 'CGCCAC';

What I want is the following as a regular expression:

$hay =~ /($nee1)T[ATGC]($nee2)/

So I only want $nee1 when it is directly followed by a T, some other nucleotide and $nee2. I don't want $nee1 and $nee2 anywhere else.
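One possible way to combine fuzzy needles with the fixed T-plus-one-nucleotide spacer is to slide over the haystack and test the three pieces with substr at each offset. The sketch below is only that, a sketch: mismatches and paired_fuzzy are hypothetical names invented here, not the fuzzyMatch routine from earlier in the thread, and applying the mismatch limit to each needle separately is an assumption.

```perl
use strict;
use warnings;

# Count the positions at which two equal-length strings differ.
# XORing the strings gives "\0" wherever the bytes agree.
sub mismatches {
    my ( $x, $y ) = @_;
    my $xor = $x ^ $y;
    return $xor =~ tr/\0//c;
}

# Hypothetical helper: start offsets where $nee1 (fuzzy), then 'T' plus
# any one nucleotide, then $nee2 (fuzzy) occur back to back.
# Assumption: $max is the mismatch limit for EACH needle separately.
sub paired_fuzzy {
    my ( $hay, $nee1, $nee2, $max ) = @_;
    my ( $l1, $l2 ) = ( length $nee1, length $nee2 );
    my $total = $l1 + 2 + $l2;
    my @starts;
    for my $i ( 0 .. length($hay) - $total ) {
        next if mismatches( substr( $hay, $i, $l1 ), $nee1 ) > $max;
        next if substr( $hay, $i + $l1, 2 ) !~ /\AT[ACGT]\z/;
        next if mismatches( substr( $hay, $i + $l1 + 2, $l2 ), $nee2 ) > $max;
        push @starts, $i;
    }
    return @starts;
}

my $hay = 'AACCCAGGATGCGCCATGCAGGACACAGGACGCCACGGAA';
my ( $nee1, $nee2 ) = ( 'AGGA', 'CGCCAC' );

my @exact = paired_fuzzy( $hay, $nee1, $nee2, 0 );
my @fuzzy = paired_fuzzy( $hay, $nee1, $nee2, 1 );

print "exact: @exact\n";    # prints "exact: " (no offsets)
print "fuzzy: @fuzzy\n";    # prints "fuzzy: 5"
print substr( $hay, $_, length($nee1) + 2 + length($nee2) ), "\n" for @fuzzy;
```

With the sample data, the exact pattern does not occur at all: the only AGGA followed by a T and one nucleotide is followed by CGCCAT rather than CGCCAC. Allowing one mismatch per needle finds it at offset 5 (AGGATGCGCCAT).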
Re^13: Comparing 2 different-sized strings by BrowserUk (Patriarch) on Aug 18, 2013 at 17:03 UTC
Re^14: Comparing 2 different-sized strings by AdrianJ217 (Novice) on Aug 18, 2013 at 17:11 UTC
Some notes below your chosen depth have not been shown here
Re^11: Comparing 2 different-sized strings by AdrianJ217 (Novice) on Aug 18, 2013 at 13:26 UTC
Hi, the last post was from me, Adrian. I'm sorry I forgot to log in.