The data I am working with are DNA sequences. (I know of the BioPerl project but have found nothing there that could help.) I have distilled the data down to the DNA sequences alone. The sequence IDs are in a second file, which is why I also print out the indices of the matching strings. The data look like this:
ATGGAGAACATCACATCAGGACTCCTAGGACCCCTTCTCGTGTTACAGGC
ATGGAGAACATCACATCACGACTCCTAGGACCCCTTCACGTGAAACAGGC
ATGCTCAACGTCACATCAGGACTCCTAGGACCACGTCTCGTGTTACAGGG
ATGGTGTACATCACGACAGGATTCCTCGGAATCGCGCTGGTGACACAGGC
With the sequence IDs, the data would look like this:
>seq1
ATGGAGAACATCACATCAGGACTCCTAGGACCCCTTCTCGTGTTACAGGC
>seq2
ATGGAGAACATCACATCACGACTCCTAGGACCCCTTCACGTGAAACAGGC
>seq3
ATGCTCAACGTCACATCAGGACTCCTAGGACCACGTCTCGTGTTACAGGG
>seq4
ATGGTGTACATCACGACAGGATTCCTCGGAATCGCGCTGGTGACACAGGC
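A minimal sketch of how the two files relate, written in Python purely for illustration. It assumes the ID file holds one ID per line in the same order as the sequence file, so a printed index maps straight back to an ID. The function name `to_fasta` and the in-memory lists are hypothetical stand-ins for the real files.

```python
def to_fasta(ids, seqs):
    """Pair each ID with the sequence at the same position and return FASTA-style text."""
    # Assumption: both lists are in the same order, one entry per sequence.
    if len(ids) != len(seqs):
        raise ValueError("ID and sequence counts differ")
    return "".join(f">{sid}\n{seq}\n" for sid, seq in zip(ids, seqs))


# Stand-ins for the contents of the ID file and the sequence file.
ids = ["seq1", "seq2", "seq3", "seq4"]
seqs = [
    "ATGGAGAACATCACATCAGGACTCCTAGGACCCCTTCTCGTGTTACAGGC",
    "ATGGAGAACATCACATCACGACTCCTAGGACCCCTTCACGTGAAACAGGC",
    "ATGCTCAACGTCACATCAGGACTCCTAGGACCACGTCTCGTGTTACAGGG",
    "ATGGTGTACATCACGACAGGATTCCTCGGAATCGCGCTGGTGACACAGGC",
]

fasta = to_fasta(ids, seqs)
print(fasta, end="")

# A zero-based index reported for a matching string looks up its ID directly.
matched_index = 2
matched_id = ids[matched_index]
```

If the indices are printed one-based instead, subtract one before the lookup.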
### update ###
I have placed real data in my public scratchpad.
In reply to Re^2: Search for identical substrings by bioMan, in thread Search for identical substrings by bioMan