There's a much simpler example here that comes close, but it's not quite what I'm looking for:
http://www.perlmonks.org/?node=169190
Here's a quick example of the problem I have at this point. Let's say I want to search for 'gca'. There are 13 occurrences in my unformatted string. When I format the string, however, the matches get broken up across groups and lines, and I can't find them all anymore. Here is the formatted string:
1 atggcgacga aggccgtgtg cgtgctgaag ggcgacggcc cagtgcaggg catcatcaat
61 ttcgagcaga aggaaagtaa tggaccagtg aaggtgtggg gaagcattaa aggactgact
121 gaaggcctgc atggattcca tgttcatgag tttggagata atacagcagg ctgtaccagt
181 gcaggtcctc actttaatcc tctatccaga aaacacggtg ggccaaagga tgaagagagg
241 catgttggag acttgggcaa tgtgactgct gacaaagatg gtgtggccga tgtgtctatt
301 gaagattctg tgatctcact ctcaggagac cattgcatca ttggccgcac actggtggtc
361 catgaaaaag cagatgactt gggcaaaggt ggaaatgaag aaagtacaaa gacaggaaac
421 gctggaagtc gtttggcttg tggtgtaatt gggatcgccc aataaacatt cccttggatg
481 tagtctgagg cccct
The triplets I would find if I just searched the unformatted string are at the following positions (the problem ones are those starting on the 9th or 10th letter of a group, since they run into the next group or line):
- Base pair 45 (Row 1, Group 5, 5th letter in)
- Base pair 50 (Row 1, Group 5, 10th letter in)
- Base pair 66 (Row 2, Group 1, 6th letter in)
- Base pair 104 (Row 2, Group 5, 4th letter in)
- Base pair 129 (Row 3, Group 1, 9th letter in)
- Base pair 166 (Row 3, Group 5, 6th letter in)
- Base pair 181 (Row 4, Group 1, 1st letter in)
- Base pair 240 (Row 4, Group 6, 10th letter in) ** BIG PROBLEM
- Base pair 257 (Row 5, Group 2, 7th letter in)
- Base pair 335 (Row 6, Group 4, 5th letter in)
- Base pair 347 (Row 6, Group 5, 7th letter in)
- Base pair 370 (Row 7, Group 1, 10th letter in)
- Base pair 383 (Row 7, Group 3, 3rd letter in)
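The row, group and letter figures in this list follow from the layout of 60 bases per row in 6 groups of 10. Here is a minimal sketch of that arithmetic, assuming the layout holds for the whole display. `locate` and `find_all` are hypothetical helper names, not part of my test script, and the short sequence is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: convert a 1-based base-pair number into
# (row, group, letter), assuming 60 bases per row in 6 groups of 10.
sub locate {
    my ($bp) = @_;
    my $zero   = $bp - 1;                      # 0-based offset
    my $row    = int($zero / 60) + 1;
    my $group  = int(($zero % 60) / 10) + 1;
    my $letter = $zero % 10 + 1;
    return ($row, $group, $letter);
}

# Hypothetical helper: 1-based start positions of every occurrence
# (overlapping ones included) of $needle in the unformatted sequence.
sub find_all {
    my ($seq, $needle) = @_;
    my @hits;
    my $from = 0;
    while ((my $i = index($seq, $needle, $from)) >= 0) {
        push @hits, $i + 1;
        $from = $i + 1;
    }
    return @hits;
}

# Made-up 20-base sequence, just for illustration.
my $seq = 'ttgcaaaaag' . 'cattttttgc';
for my $bp (find_all($seq, 'gca')) {
    printf "Base pair %d (Row %d, Group %d, letter %d)\n", $bp, locate($bp);
}
```

Any match whose letter comes out as 9 or 10 for a three-letter search is one that spills into the next group or row.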
The problem is that a match can run across groups and across lines. A pure regex that spans the newlines won't work, because I'll end up highlighting the position numbers too. I suppose that I *could* go back in after tagging, find the numbers, and remove the tag from them again, and that might be an option.
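One way around this might be to search a stripped copy of the text and map each matched base back to its position in the widget, so the numbers and spaces never get tagged in the first place. The following is only a sketch under that assumption: `match_ranges` is a hypothetical helper, it treats every letter as a base and everything else as formatting, and it assumes the formatted string starts on line 1 of the Text widget. The two sample rows are rows 4 and 5 of the sequence above.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a list of [start, end] Text-index pairs
# ("line.char") covering every match of $needle, skipping the position
# numbers, spaces and newlines that the formatting added.
sub match_ranges {
    my ($formatted, $needle) = @_;

    my $bases = '';   # the sequence with all formatting stripped
    my @where;        # $where[$i] = [line, column] of base $i
    my $line = 0;
    for my $text (split /\n/, $formatted) {
        $line++;                                     # Text lines count from 1
        while ($text =~ /([a-z])/gi) {
            $bases .= lc($1);
            push @where, [ $line, pos($text) - 1 ];  # columns count from 0
        }
    }

    my @ranges;
    my $from = 0;
    while ((my $i = index($bases, lc($needle), $from)) >= 0) {
        $from = $i + 1;
        # Walk the match one base at a time, merging neighbours on the
        # same line, so a match broken by a space or a newline turns
        # into several separate ranges.
        for my $k ($i .. $i + length($needle) - 1) {
            my ($l, $c) = @{ $where[$k] };
            if (@ranges && $ranges[-1][1] eq "$l.$c") {
                $ranges[-1][1] = "$l." . ($c + 1);   # extend previous range
            } else {
                push @ranges, [ "$l.$c", "$l." . ($c + 1) ];
            }
        }
    }
    return @ranges;
}

my $rows = "181 gcaggtcctc actttaatcc tctatccaga aaacacggtg ggccaaagga tgaagagagg\n"
         . "241 catgttggag acttgggcaa tgtgactgct gacaaagatg gtgtggccga tgtgtctatt";
print "$_->[0] to $_->[1]\n" for match_ranges($rows, 'gca');
# Prints:
# 1.4 to 1.7
# 1.68 to 1.69
# 2.4 to 2.6
# 2.21 to 2.24
```

Each pair could then be handed to `$widget->tagAdd('foundtag', $start, $end)` instead of going through the selection left by `FindAll`. The match at base pair 240 comes out as two ranges, one at the end of the first line and one at the start of the second, so the number 241 stays untagged.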
I tried setting some code to uppercase, then marking the uppercase letters, and then switching the letters back to lowercase. It cleared the highlight tags when I did that, so that option is out.
I might try just building a widget where the position numbers are outside the Text widget, but it seems like a lot of work.
Here's some test code that's a little easier to swallow than yours if anyone wants to play around with it.
use Tk;
$sequence=" 1 atggcgacga aggccgtgtg cgtgctgaag ggcgacggcc cagtgcaggg catcatcaat
61 ttcgagcaga aggaaagtaa tggaccagtg aaggtgtggg gaagcattaa aggactgact
121 gaaggcctgc atggattcca tgttcatgag tttggagata atacagcagg ctgtaccagt
181 gcaggtcctc actttaatcc tctatccaga aaacacggtg ggccaaagga tgaagagagg
241 catgttggag acttgggcaa tgtgactgct gacaaagatg gtgtggccga tgtgtctatt
301 gaagattctg tgatctcact ctcaggagac cattgcatca ttggccgcac actggtggtc
361 catgaaaaag cagatgactt gggcaaaggt ggaaatgaag aaagtacaaa gacaggaaac
421 gctggaagtc gtttggcttg tggtgtaatt gggatcgccc aataaacatt cccttggatg
481 tagtctgagg cccct";
# Main Window
my $mainWindow = MainWindow->new();
$mainWindow->title("Regex Problem Example"); #Sets Title
$mainWindow->minsize(qw(500 500));
$mainWindow->geometry('+500+200');
$mainWindow->optionAdd('*font'=>'Courier 10');
$ROText = $mainWindow->Scrolled('ROText',
-scrollbars=>'osoe');
$ROText->Insert($sequence);
$ROText->pack;
highlightText($ROText, "gca");
MainLoop();
sub highlightText {
    my ($widget, $searchString) = @_;
    # Create a tag to configure the text
    $widget->tagConfigure('foundtag',
        -foreground => "white",
        -background => "red");
    $widget->FindAll(-regex, -nocase, $searchString);
    if ($widget->tagRanges('sel'))
    {
        my %startfinish = $widget->tagRanges('sel');
        foreach(sort keys %startfinish)
        {
            $widget->tagAdd("foundtag", $_, $startfinish{$_});
        }
        $widget->tagRemove('sel', '1.0', 'end');
    }
}