in reply to Proper usage of rindex function?

Hi, what makes you think that you can pass a list of substrings to rindex?

If the substring is not found, rindex (like index) returns -1.

Your code is trying to match the entire substring. You should check each character individually.
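To illustrate the difference, here is a minimal sketch. The short sample string and the variable names are made up for illustration and are not taken from your code. rindex treats its second argument as one literal substring, so a search for "ATCG" only succeeds if those four characters appear together in that order:

```perl
use strict;
use warnings;

my $str = "GAGAACATTAG";              # illustrative sample, not the original data

my $whole  = rindex( $str, "ATCG" );  # looks for the literal 4-character substring "ATCG"
my $single = rindex( $str, "A" );     # looks for the single character "A"

print "whole: $whole, single: $single\n";
```

With this sample the first call returns -1, because "ATCG" never occurs as a unit, while the second returns 9, the position of the last "A".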

perl -Mstrict -wE '
    my $str = "GAGAACATTAGTGGGTGCAGCGCACAAGCATGGCACATGTATACGTATGTAA";
    say sprintf q{%s is last found at pos %s}, $_, rindex( $str, $_ ) for +qw/A C G T/;
'
Output:
A is last found at pos 51
C is last found at pos 43
G is last found at pos 48
T is last found at pos 49

Also, always use strict; in your code: it will make Perl tell you about mistakes you've made:

perl -Mstrict -wE '
    my $str = "GAGAACATTAGTGGGTGCAGCGCACAAGCATGGCACATGTATACGTATGTAA";
    say rindex( $str, [ATCG] )
'
Output:
Bareword "ATCG" not allowed while "strict subs" in use at -e line 3.
Execution of -e aborted due to compilation errors.

Hope this helps!


The way forward always starts with a minimal test.

Re^2: Proper usage of rindex function?
by Anonymous Monk on Dec 29, 2017 at 12:26 UTC
    Aha, I see, I thought it would work... I tried this:
    if($str=~/[ATCG]+([ATCG])\.+$/) { $last_char=$1; $rightmost_position_of_letter = rindex($str, $last_mapped_char); }

    and it seems to work. Does that make sense?

      See perlrequick.

      Your regexp is capturing into $1 just one character, since you are using a character class. It is looking for:

      [ATCG]+     # one or more of any of A,T,C,G
                  # followed by
      ([ATCG])    # exactly one of A,T,C,G
                  # which is captured into $1
                  # followed by
      \.+         # one or more dots
                  # followed by
      $           # the end of the string

      So, ignoring the fact that you have two variables, $last_char and $last_mapped_char, it makes sense that you get a result: the first part of the regexp is greedy and eats up all the letters except the last one, so $1 holds the last letter before the dots. It's probably not how you want to code it, though.

      Try running some tests:

      perl -Mstrict -wE '
          my $str = "ATCGATCG...";
          if($str=~/[ATCG]+([ATCG])\.+$/) {
              my $last_char=$1;
              say $last_char;
              my $pos = rindex($str, $last_char);
              say $pos;
          }
      '
      Output:
      G
      7

      perl -Mstrict -wE '
          my $str = "ATCGATCGA...";
          if($str=~/[ATCG]+([ATCG])\.+$/) {
              my $last_char=$1;
              say $last_char;
              my $pos = rindex($str, $last_char);
              say $pos;
          }
      '
      Output:
      A
      8

      Hope this helps!



        Yes, sorry, I copy-pasted it from the actual code, hence the two different variable names...
        I changed it a bit:
        if($str=~/([ATCG])\.+$/) { $last_mapped_char=$1; $rightmost_position_of_letter = rindex($str, $last_mapped_char)+1; }

        and now I think it is more sensible (at least to me). The problem I would have with the solution kindly provided above is that I would always need to compare the 4 values and find the largest one, because I only care about the last A, T, G or C found (it does not matter which) before the dots start.
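
Comparing the four values does not have to be done by hand. Below is a minimal sketch of two possible approaches, assuming the string always ends in a run of dots as in the test strings above. The variable names $last_pos and $match_pos are made up for illustration. The first takes the maximum of the four rindex results with max from List::Util; the second skips rindex altogether and reads the start offset of the capture group from the built-in @- array.

```perl
use strict;
use warnings;
use List::Util qw(max);

my $str = "ATCGATCGA...";

# Option 1: run rindex once per letter and keep the largest position.
# rindex returns -1 for a letter that does not occur, so max still works.
my $last_pos = max map { rindex( $str, $_ ) } qw(A C G T);

# Option 2: let the regexp engine report where capture group 1 started.
# $-[1] is the 0-based offset of the start of $1; fall back to -1 on no match.
my $match_pos = $str =~ /([ATCG])\.+$/ ? $-[1] : -1;

print "$last_pos $match_pos\n";
```

For this sample string both options give 8. They answer slightly different questions, though: option 1 finds the last letter anywhere in the string, while option 2 finds the letter immediately before the trailing dots and yields -1 if the string does not end that way. Both positions are 0-based, so add 1 if you need a 1-based position as in the code above.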