For repeated searches, building a lookup table can speed the program up, but you should really Benchmark instead of guessing.
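
A minimal sketch of such a comparison, using `cmpthese` from the core Benchmark module. The sequence, the offset and the names `residue_direct` and `residue_table` are made up for illustration; they are not the data or the code from this thread.

```perl
#!/usr/bin/perl
use warnings;
use strict;
use Benchmark qw{ cmpthese };

# Hypothetical gapped sequence and start offset, for illustration only.
my $string = 'ACD--EFG-HIK' x 50;
my $from   = 36;

# Direct approach: count the dashes before $idx on every call.
sub residue_direct {
    my ($idx) = @_;
    return undef if '-' eq substr $string, $idx, 1;
    return $from + $idx - (substr($string, 0, $idx) =~ tr/-//);
}

# Lookup table: one pass over the string, then each query is an array access.
my @table;
{
    my $gaps = 0;
    for my $idx (0 .. length($string) - 1) {
        if ('-' eq substr $string, $idx, 1) {
            ++$gaps;
            push @table, undef;          # a gap has no residue number
        } else {
            push @table, $from + $idx - $gaps;
        }
    }
}
sub residue_table { return $table[ $_[0] ] }

# Make sure both approaches agree before timing them.
for my $idx (0 .. length($string) - 1) {
    my ($d, $t) = (residue_direct($idx), residue_table($idx));
    die "Mismatch at index $idx\n"
        if (defined $d xor defined $t) or (defined $d and $d != $t);
}

# Time 1000 random queries per iteration, at least 1 CPU second per variant.
my @queries = map { int rand length $string } 1 .. 1000;
cmpthese(-1, {
    direct => sub { residue_direct($_) for @queries },
    table  => sub { residue_table($_)  for @queries },
});
```

Whether the table wins depends on how long the sequences are and how many queries you run against each one, which is why measuring on your real data matters.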

Here's what I had tried before you published your code. It counts the dashes, so it adjusts the positions in larger chunks instead of one at a time:

#!/usr/bin/perl
use warnings;
use strict;
use Syntax::Construct qw{ // };

sub position {
    my ($seq, $query) = @_;
    my $pos = my $sum = $query->[1] - $seq->{from};
    my $start = 0;
    my $changed = 0;
    while (my $count = substr($seq->{string}, $start, $pos + 1) =~ tr/-//) {
        ++$changed;
        $start = $sum;
        $pos = $count - 1;
        $sum += $count;
    }
    --$sum if $changed > 1;
    my $expected = substr $seq->{string}, $sum, 1;
    return $sum, $expected
}

sub find {
    my ($seq, $idx) = @_;
    my $char;
    my $pos;
    if('-' ne ( $char = substr $seq->{string}, $idx, 1 )) {
        $pos = $seq->{from} + $idx;
        $pos -= substr($seq->{string}, 0, $idx) =~ tr/-//;
    }
    return $char, $pos
}

my %seq_a = (
    from   => 36,
    to     => 190,
    string => 'LTIEAVPSNAAEGKEVLLLVHNLPQDPRGYNWYKGETVDANRRIJ'
            . 'GYVISNQQITPGPAYSNRETIYPNASLXMRNVTRNDTGSYTLQVIKLNLMSEEVTGQ-FSVHPETPKPS'
            . 'ISSNNSNPVEDKDAVAFTCEPETQNTTYLWWVNGQSLPVSP'
);

my %seq_b = (
    from   => 206,
    to     => 334,
    string => 'PTISPSYTYYRPGVNLSLSCHAASNPPAQYSWLIDGNIQQHTQE-'
            . '--------------------------LFISNITEKNSGLYTCQANNSASGHSRTTVKTIYVSAELPKPS'
            . 'ISSNNSKPVEDKDAVAFTCEPEAQNTTYLWWVNGQSLPVSP'
);

use Test::More;

my %tab;
for my $pos ($seq_b{from} .. $seq_b{to}) {
    my ($idx, $char) = position(\%seq_b, [ q() => $pos ]);
    $tab{"$char$pos"} = join q(), map $_ // 'undef', find(\%seq_a, $idx);
}

sub assert {
    is $tab{"$_[0]$_[1]"}, "$_[2]$_[3]", "$_[0]$_[1]";
}

assert(P => 206, L => 36);
assert(E => 249, I => 79);
assert(L => 250, L => 107);
assert(F => 251, X => 108);
assert(I => 252, M => 109);
assert(S => 253, R => 110);
assert(N => 254, N => 111);
assert(I => 255, V => 112);
assert(E => 257, R => 114);
assert(A => 271, N => 128);
assert(S => 272, L => 129);
assert(G => 273, M => 130);
assert(H => 274, S => 131);
assert(S => 275, E => 132);
assert(R => 276, E => 133);
assert(T => 277, V => 134);
assert(T => 278, T => 135);
assert(V => 279, G => 136);
assert(K => 280, Q => 137);
assert(T => 281, '-' => 'undef');
assert(I => 282, F => 138);

done_testing();

I've changed the sequences a bit to detect off-by-one errors more easily.

($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,

In reply to Re^2: Per residue sequence alignment - per character string comparison? by choroba
in thread Per residue sequence alignment - per character string comparison? by proteins
