Here's a question that's scrambled my brains. I'm still at the design stage, so I don't have code yet; I'm looking for a "big picture" answer.

I am working with BioPerl (this is not a BioPerl question), which gives me gene sequence data that I can pull out and manipulate as one large unbroken string of characters (several hundred to several thousand characters long).

It looks something like this...
atgcatgcatgcatgcatgcatgcatgcaattggccatgcatgcatgcaattggccgcat...
The eventual output will be displayed in a Text or ROText widget in GenBank sequence format, which simply means the bases are spaced out in groups of ten and each line starts with the index of its first base pair, like this...
1 atgcatgcat gcatgcatgc atgcatgcaa
31 ttggccatgc atgcatgcaa ttggccgcat
...
I want to be able to input a small subsequence of that sequence (say aattggcc, which occurs in my sample string) and then highlight every occurrence of it in my formatted string. If capital letters stood for highlighting, this would look something like...
1 atgcatgcat gcatgcatgc atgcatgcAA
31 TTGGCCatgc atgcatgcAA TTGGCCgcat
...
So I'm wondering about a nice clean way to do this, and it's escaping me at the moment. I was thinking I could:
  1. Look for the subsequence in the unformatted string (I can do this!)
  2. Mark it in a way that isn't lost when formatting (maybe change the letters to uppercase) (I can do this!)
  3. Format the string and place it in the Text or ROText object (I can do this!)
  4. Tag the data there with a tag configure/regex search (I can do this!)
  5. Remove my old marks so that the text is back in the proper format
    (I'm at a loss here: I'm not sure whether changing the text within a tag using FindAndReplaceAll will cause the widget to clear the tags. Any help here would be FANTASTIC!)
  6. Highlight the text using the tags. (I can do this!)
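One way to sidestep step 5 might be to work out the tag ranges before the text ever reaches the widget, so nothing has to be changed inside a tag afterwards. Below is a minimal sketch of that idea, assuming the sequence is otherwise all lowercase so that uppercase runs can only be my marks. `ranges_from_uppercase` is a name I made up for illustration; it is not part of BioPerl or Tk. The commented lines at the end show how I imagine the result being fed to a Text widget (`$text` is an assumed, already-created widget).

```perl
use strict;
use warnings;

# Given formatted text in which hits are marked in uppercase, return the
# lowercased text plus a list of [start, end] pairs in Tk "line.char" form.
# Lines are 1-based, columns 0-based, and end is one past the last marked char.
sub ranges_from_uppercase {
    my ($formatted) = @_;
    my @ranges;
    my $lineno = 0;
    for my $line (split /\n/, $formatted) {
        $lineno++;
        while ($line =~ /([A-Z]+)/g) {
            my $end   = pos($line);          # offset just past the uppercase run
            my $start = $end - length($1);   # offset of the first marked char
            push @ranges, ["$lineno.$start", "$lineno.$end"];
        }
    }
    return (lc($formatted), \@ranges);
}

my $marked = "1 atgcatgcat gcatgcatgc atgcatgcAA\n"
           . "31 TTGGCCatgc atgcatgcAA TTGGCCgcat";
my ($clean, $ranges) = ranges_from_uppercase($marked);
print "$clean\n";
print "tag $_->[0] to $_->[1]\n" for @$ranges;

# With a Tk Text widget in $text (assumed to exist), something like:
#   $text->insert('end', $clean);
#   $text->tagAdd('hit', @$_) for @$ranges;
#   $text->tagConfigure('hit', -background => 'yellow');
```

Because the group spaces and the index numbers are never uppercase, a hit that straddles a space or a line break simply comes out as two ranges, and the text in the widget is already clean when the tags are applied.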
I'm relatively new to Perl, and I suspect there might be a much easier way. Maybe I could tag the unformatted sequence within the Text widget and build the structure within the widget with a regex? Can I place tags on the data outside the widget? Is there already a tool that I would just have to write a formatter for? Any ideas and/or help with #5 would be greatly appreciated.
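To make the "tags on the data outside the widget" idea concrete, here is a rough sketch of what I have in mind: search the raw string, and while formatting, remember where each raw offset ends up in the formatted text. The three subs (`format_with_index`, `find_hits`, `tag_ranges`) are hypothetical names of my own, and the 30 bases per line and unpadded index prefix are assumptions taken from my sample display above. No uppercase marks are needed, so there would be nothing to remove later.

```perl
use strict;
use warnings;

# Format a raw sequence and record, for every raw offset, the [line, col]
# where that base lands (line 1-based, col 0-based, as Tk indices expect).
sub format_with_index {
    my ($seq, %opt) = @_;
    my $per_line = $opt{per_line} // 30;   # assumption: 30 bases per line
    my $group    = $opt{group}    // 10;   # a space after every 10 bases
    my (@lines, @index);
    for (my $start = 0; $start < length($seq); $start += $per_line) {
        my $chunk  = substr($seq, $start, $per_line);
        my $line   = ($start + 1) . ' ';   # index of the first base on the line
        my $lineno = @lines + 1;
        for my $i (0 .. length($chunk) - 1) {
            $line .= ' ' if $i && $i % $group == 0;
            push @index, [$lineno, length($line)];
            $line .= substr($chunk, $i, 1);
        }
        push @lines, $line;
    }
    return (join("\n", @lines), \@index);
}

# Return [first, last] raw offsets (inclusive) of every hit, overlaps included.
sub find_hits {
    my ($seq, $sub) = @_;
    return unless length($sub);
    my (@hits, $pos);
    $pos = 0;
    while ((my $at = index($seq, $sub, $pos)) >= 0) {
        push @hits, [$at, $at + length($sub) - 1];
        $pos = $at + 1;
    }
    return @hits;
}

# Turn one hit into contiguous [start, end] "line.char" pairs, splitting at
# group spaces and line breaks so only bases get tagged (end is exclusive).
sub tag_ranges {
    my ($index, $first, $last) = @_;
    my @runs;
    for my $off ($first .. $last) {
        my ($line, $col) = @{ $index->[$off] };
        my $prev = $runs[-1];
        if ($prev && $prev->[0] == $line && $prev->[2] == $col) {
            $prev->[2]++;                          # extend the current run
        } else {
            push @runs, [$line, $col, $col + 1];   # start a new run
        }
    }
    return map { ["$_->[0].$_->[1]", "$_->[0].$_->[2]"] } @runs;
}

my $seq = 'atgcatgcatgcatgcatgcatgcatgcaattggccatgcatgcatgcaattggccgcat';
my ($formatted, $index) = format_with_index($seq);
print "$formatted\n";
for my $hit (find_hits($seq, 'aattggcc')) {
    print "tag $_->[0] to $_->[1]\n" for tag_ranges($index, @$hit);
}
```

If this is sound, each pair could presumably be handed straight to the widget's `tagAdd` after inserting `$formatted`, and the widget text would never need to be searched or modified.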

In reply to Searching Formatting and Highlighting Text Problem by janusmccarthy
