in reply to finding substrings that have been inserted into a string

I don't know why you have a second output there; maybe I misunderstood something.

Nevertheless, here is my attempt, which assumes (and doesn't verify) that both strings are identical except for the number of dashes.

#!/usr/bin/perl
use strict;
use warnings;

my $oldstring = 'ATTGC---AGTCCATGC------ATGC';
my $newstring = 'AT-TGC---AGTCCATGC--------ATGC';

# Collect the (possibly empty) run of dashes in front of each letter or the line end.
my @oldstring = ($oldstring =~ /(-*)(?:A|C|G|T|$)/g);
my @newstring = ($newstring =~ /(-*)(?:A|C|G|T|$)/g);

my $pos = 0;
for (my $i = 0; $i < @oldstring; ++$i) {
    if ($oldstring[$i] ne $newstring[$i]) {
        print length($newstring[$i]) - length($oldstring[$i]),
            " at position ", $pos, '-', $pos + length($oldstring[$i]), "\n";
    }
    # Advance past this dash group and the letter that follows it (old-string coordinates).
    $pos += length($oldstring[$i]) + 1;
}

Result
1 at position 2-2
2 at position 17-23

It works by collecting every group of dashes that sits in front of a letter or the line end. Groups may have length 0. The two resulting lists are then compared element by element, and the reported positions refer to the old string.
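
To make the intermediate lists visible, here is a small sketch that prints the length of each dash group for both strings, and also checks the assumption my script skips (that the strings match once the dashes are removed). The helper names dash_group_lengths and same_without_dashes are made up for this illustration; treat it as one possible way to look at the data, not as part of the solution above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: length of each dash group that precedes
# a letter or the end of the string (groups may have length 0).
sub dash_group_lengths {
    my ($string) = @_;
    return map { length($_) } $string =~ /(-*)(?:[ACGT]|$)/g;
}

# Hypothetical helper: true if both strings are equal once all dashes are removed.
sub same_without_dashes {
    my ($old, $new) = @_;
    (my $old_letters = $old) =~ tr/-//d;
    (my $new_letters = $new) =~ tr/-//d;
    return $old_letters eq $new_letters;
}

my $old = 'ATTGC---AGTCCATGC------ATGC';
my $new = 'AT-TGC---AGTCCATGC--------ATGC';

die "strings differ in more than dashes\n"
    unless same_without_dashes($old, $new);

# One number per letter, plus one for the line end.
print join(',', dash_group_lengths($old)), "\n";
print join(',', dash_group_lengths($new)), "\n";
```

The two printed lists have the same number of entries whenever the check passes, so comparing them index by index is safe; they differ exactly at the entries where dashes were inserted.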

