in reply to Finding Start/End Position of the Uppercase Substring

Using pos to get the character positions, along with the corresponding offsets (useful for substr), it could look as outlined below. I'm not saying that this 'solution' is better, however; it's just another possible approach:
use strict;
use warnings;

my @strings = do { local $/; split /\n/, <DATA> };

foreach my $str (@strings) {
    my $ret = offset($str);
    my $substring = substr($ret->[0], $ret->[2][0], $ret->[2][1]);
    print <<"EOT";
$substring
start character: $ret->[1][0]
end character: $ret->[1][1]
start offset: $ret->[2][0]
end offset: $ret->[2][1]
EOT
}

sub offset {
    my $str = shift;
    my $hyphens = 0;
    $hyphens++ while $str =~ /-/g;
    $str =~ /[A-Z]/g and my $pos_start = pos($str);
    $str =~ /[a-z]/g and my $pos_end = pos($str);
    return [
        $str,
        [ ($pos_start - $hyphens), ($pos_end - $hyphens) - 1 ],
        [ $pos_start - 1, ($pos_end - $pos_start) - 1 ]
    ];
}

__DATA__
ccaatTTTGACACACACAGAAgggca
--aatTTTGACACACACAGAAgggca
outputs
TTTGACACACACAGA
start character: 6
end character: 21
start offset: 5
end offset: 15
TTTGACACACACAGA
start character: 4
end character: 19
start offset: 5
end offset: 15
Update: fix formatting.
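
A note on the numbers above: the third argument to substr is a length, not an end position, and ($pos_end - $pos_start) - 1 comes out one short. That is why the printed substring has 15 letters while the uppercase run in the data has 16, and why the value labelled "end offset" is really a length. Below is a minimal sketch of one way around that, assuming the first uppercase run is the one wanted and that only leading hyphens should be discounted. It reads the match boundaries from the built-in @- and @+ arrays instead of calling pos twice; uppercase_span and its hash keys are hypothetical names, not part of the code above.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): locate the first run of
# uppercase letters. Returns a hash ref with the 0-based offset and the length
# (the two values substr expects) plus 1-based character positions that ignore
# leading hyphens. Returns nothing if there is no uppercase letter.
sub uppercase_span {
    my ($str) = @_;
    return unless $str =~ /[A-Z]+/;

    # $-[0] and $+[0] are the start and end offsets of the last successful
    # match; copy them before running another regex.
    my $offset = $-[0];
    my $length = $+[0] - $-[0];

    # Assumption: only hyphens at the start of the string are padding.
    my $hyphens = $str =~ /\A(-+)/ ? length $1 : 0;

    return {
        offset     => $offset,
        length     => $length,
        start_char => $offset + 1 - $hyphens,
        end_char   => $offset + $length - $hyphens,
    };
}

for my $str (qw{ccaatTTTGACACACACAGAAgggca --aatTTTGACACACACAGAAgggca}) {
    my $span = uppercase_span($str) or next;
    printf "%s chars %d..%d offset %d length %d\n",
        substr($str, $span->{offset}, $span->{length}),
        @{$span}{qw(start_char end_char offset length)};
}
```

For the two sample strings this keeps the character positions shown above (6 to 21 and 4 to 19) and extracts the full 16-letter run with offset 5 and length 16.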

Re^2: Finding Start/End Position of the Uppercase Substring
by johngg (Canon) on Jun 25, 2007 at 09:49 UTC
    my $hyphens = 0; $hyphens++ while $str =~ /-/g;

    Although the data given seems clean in this regard, your code will give wrong results if there are any hyphens in the string other than leading ones. Capturing zero or more hyphens at the beginning of the string and finding the length of the capture might be safer.

    my $hyphens = length $1 if $str =~ m{\A(-*)};

    The match will always succeed, so if there are no leading hyphens the length of the capture will be zero.

    $ perl -Mstrict -Mwarnings -le '
    > my @strings = qw{--aacgtACG ctgGTTAtga};
    > foreach my $str ( @strings )
    > {
    >     my $hyphens = length $1 if $str =~ m{\A(-*)};
    >     print qq{$str - $hyphens};
    > }'
    --aacgtACG - 2
    ctgGTTAtga - 0
    $
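
To make the difference between the two counting strategies concrete, here is a small sketch that runs both on the same strings. The third string, with an interior hyphen, is made up for illustration, and count_all_hyphens and count_leading_hyphens are hypothetical helper names. The leading-hyphen version captures in list context rather than using the my ... if form, since perlsyn documents a my combined with a statement modifier as undefined behaviour, even though it works here because the match always succeeds.

```perl
use strict;
use warnings;

# Count every hyphen in the string, like the "$hyphens++ while ..." loop.
sub count_all_hyphens {
    my ($str) = @_;
    my $count = () = $str =~ /-/g;    # list assignment in scalar context
    return $count;
}

# Count only the hyphens at the very start of the string.
sub count_leading_hyphens {
    my ($str) = @_;
    my ($lead) = $str =~ m{\A(-*)};   # always matches; $lead may be ''
    return length $lead;
}

# The last string is invented test data with a hyphen in the middle.
for my $str (qw{--aacgtACG ctgGTTAtga --aa-cgtACG}) {
    printf "%-12s all=%d leading=%d\n",
        $str, count_all_hyphens($str), count_leading_hyphens($str);
}
```

The two counts agree for the first two strings and differ only for the third (3 against 2), which is the case where subtracting the total count would shift the reported character positions.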

    Cheers,

    JohnGG