in reply to Re: Bracketing Substring(s) in the String
in thread Bracketing Substring(s) in the String

Take a closer look at the actual original strings. The second two strings match multiple substrings, so the desired output actually shows this, but in a sort-of-confusing way. Take this example:
STRING
-------
GCGCTCGACGC

SUBSTRINGS
----------
GCGC
ACG

== [GCGC]TCG[ACG]C
But when the substrings OVERLAP, the matches should be 'mashed up' into one:
STRING
-------
GCGCTCGACGC

SUBSTRINGS
----------
GCGC
GCTC

== [GCGCTC]GACGC
Do you see how GCGC AND GCTC MERGE into one single substring for the desired output?

So I think the algorithm should look like this:

  • Find every straight match of every substring in the main string
  • If a new match overlaps a region that has already been matched, extend that region to include the new match
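
Here is a rough sketch of those two steps in Python, in case it helps to see them spelled out. The function name `bracket_matches` is made up for illustration, and it assumes that matches which merely touch end to start stay separate and that only truly overlapping ones merge. Treat it as one possible reading of the desired output, not a definitive answer.

```python
def bracket_matches(text, substrings):
    """Wrap every occurrence of each substring in brackets, merging overlaps."""
    # Step 1: collect every straight match as a (start, end) span, end exclusive.
    spans = []
    for sub in substrings:
        if not sub:
            continue  # an empty pattern would match everywhere, so skip it
        start = text.find(sub)
        while start != -1:
            spans.append((start, start + len(sub)))
            # Advance by one so occurrences that overlap themselves are found too.
            start = text.find(sub, start + 1)

    # Step 2: sort the spans and fold any span that overlaps the previous one into it.
    # Assumption: spans that only touch (start == previous end) are NOT merged.
    merged = []
    for start, end in sorted(spans):
        if merged and start < merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])

    # Step 3: rebuild the string with brackets around each merged span.
    pieces = []
    pos = 0
    for start, end in merged:
        pieces.append(text[pos:start])
        pieces.append("[" + text[start:end] + "]")
        pos = end
    pieces.append(text[pos:])
    return "".join(pieces)


print(bracket_matches("GCGCTCGACGC", ["GCGC", "ACG"]))   # [GCGC]TCG[ACG]C
print(bracket_matches("GCGCTCGACGC", ["GCGC", "GCTC"]))  # [GCGCTC]GACGC
```

In the first call the G at position 6 lies outside both matches, so it stays unbracketed between TC and [ACG]. Collecting all the spans first and merging them afterwards means the order of the substrings does not matter, and the brackets are only inserted once at the very end.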

    Can you imagine how messy this would look if you had 100 substrings and a main string running 10,000 letters long (which I assume is possible because this stuff looks like gene sequence data)?
