in reply to Re: Regex related question
in thread Regex related question

I think there needs to be a condition so that the last substr is only run when needed. I came up with similar code. If speed is of interest, I would benchmark this substr approach against the regex. I've found that s/// can sometimes be slow, but the regex engine evolves all the time, so benchmarking is the only way to really know for the Perl being used.
#!/usr/bin/perl -w
use strict;

my @strings = qw( ACTGCTAGGGGGGG TCAGCTAGCNA ACTGSCGACAAAA GTCTGAGTTATTT );

foreach my $str (@strings) {
    my $last_char = substr($str, -1, 1);
    my $cur_index = -1;

    # Walk backwards until a character differs from the last one.
    while (substr($str, --$cur_index, 1) eq $last_char) {}

    print "old: $str \n";

    # Cut the trailing run down to two characters, but only when the run
    # has three or more (the walk stopped at index -4 or earlier).
    substr($str, $cur_index + 1, -$cur_index - 3, "") if ($cur_index < -3);
    print "new: $str\n";
}
__END__
old: ACTGCTAGGGGGGG
new: ACTGCTAGG
old: TCAGCTAGCNA
new: TCAGCTAGCNA
old: ACTGSCGACAAAA
new: ACTGSCGACAA
old: GTCTGAGTTATTT
new: GTCTGAGTTATT
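
For anyone who wants to try the comparison, here is a rough sketch using the core Benchmark module. The names trim_substr and trim_regex are made up for illustration, the s/// pattern is only my guess at an equivalent regex (not necessarily the one posted earlier in the thread), and the guard `$cur_index < -3` is my assumption about when the substr is actually needed. The substr version also assumes the string is not one single repeated character all the way through.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my @strings = qw( ACTGCTAGGGGGGG TCAGCTAGCNA ACTGSCGACAAAA GTCTGAGTTATTT );

# substr approach: walk back over the trailing run, then cut it to two chars.
# Assumes the string contains at least one character that differs from the last.
sub trim_substr {
    my ($str) = @_;
    my $last_char = substr($str, -1, 1);
    my $cur_index = -1;
    while (substr($str, --$cur_index, 1) eq $last_char) {}
    substr($str, $cur_index + 1, -$cur_index - 3, "") if $cur_index < -3;
    return $str;
}

# Regex approach (assumed pattern): a trailing run of three or more identical
# characters is replaced by two of them.
sub trim_regex {
    my ($str) = @_;
    $str =~ s/(.)\1{2,}\z/$1$1/;
    return $str;
}

# Make sure both versions agree before timing them.
for my $s (@strings) {
    die "mismatch on $s\n" unless trim_substr($s) eq trim_regex($s);
}

# Run each candidate for at least one CPU second and print a comparison table.
cmpthese(-1, {
    substr => sub { trim_substr($_) for @strings },
    regex  => sub { trim_regex($_)  for @strings },
});
```

The numbers will depend on the Perl version and on how long the real sequences are, so it is worth feeding it representative data rather than these four short strings.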

Re^3: Regex related question
by davido (Cardinal) on Aug 08, 2011 at 08:28 UTC

    I usually would say that the minor speed difference shouldn't matter. But all I know about genome mapping is that it's computationally intensive, so checking it out is probably a good idea.


    Dave