I have at my work a little tricky task. It is a string matching problem. I want to compare two amino acid sequences (here in Letter code, single char represents one amino acid) to find at which positions one sequence lies in the other! example?
simple? new examplesMAAGAAAAFAAAATTTTTTTTFTTTTTTTTTTTTAAAAEAAAARAAAAAA # 1. sequence TTTTTTTTFTTTTTTTTTTTT # 2. sequence result is: 2. lies at position 14 to 34 in 1.
I tried with the regexp and the module String::Approx and aslice with the option 'minimal_distance', but I don't like the return values for this module.SUBSTITUTION AAAAEAAAARGAAATTTTFTTTTTTTTTTTTTTTTAAAAAAAAILVAAAAAAAA # 1. sequence TTTTFTTTATTTTTTDTTTTT # 2. sequence DELETION AAAAAAAAAAAAATTGTTTTTTTXXXXXTTTTTTTTTTMAAAAAAAAAAAAAAAA # 1. sequence TTGTTTTTTTTTTTTTTTTTM # 2. sequence REVERSE TTTTTTTTTTTTTTTTTTTT # 1. sequence AAAAAAAAAAAATTTTTTTTTTTTTTTTTTTTTAAAAAAAAAAAAAAAA # 2. sequence PERFECT MATCHING ONLY AT BEGIN AND END OF 2. SEQUENCE AAAAAAAAAAATTTTTTTTGGGGGGGGGGGGGGGGGGGGGTTTTTTTTTAAAAAAA # 1.sequence TTTTTTTTGGGNNGGGEEGGGEGGGGGGTTTTTTTTT # 2. Sequence
Any hints how to do "the best way"?
Murcia
edit (broquaint): dropped <pre> tags aand added formatting
In reply to similar string matching by Anonymous Monk
| For: | Use: | ||
| & | & | ||
| < | < | ||
| > | > | ||
| [ | [ | ||
| ] | ] |