Thanks so much again, rnahi.
I hope you don't mind looking at my other cases. I'm really sorry I didn't mention them before; I thought they might appear too complex and too discouraging to read.
Suppose I have this:
my $fseq6 = 'CCGCGCTC';
my @nsub6 = ( 'CCGCG', '*****', 'CGCTC', '*****' );

my $fseq5 = 'CCGCGCTC';
my @nsub5 = ( 'CCGCG', '*****', '*****', 'CGCTC' );

my $fseq4 = 'CCCCGCGC';
my @nsub4 = ( 'CCCCG', '*****', 'CGCGC' );


I would like to produce this:
$result4 = [ [ 0, 'CCG--' ],
             [ 1, '*****' ],
             [ 2, '--CTC' ] ];

$result5 = [ [ 0, 'CCG--' ],
             [ 1, '*****' ],
             [ 2, '*****' ],
             [ 3, '--CTC' ] ];

$result6 = [ [ 0, 'CCG--' ],
             [ 1, '*****' ],
             [ 2, '--CTC' ],
             [ 3, '*****' ] ];


Basically, I want to 'skip' the asterisk (*) entries while still keeping their positions in the array.

Update: I've finally succeeded in extending your code so that it handles these situations. It is not entirely neat and rather 'super-naive', but it does the job. I don't think I can use grep on the strings here, because I still need to keep each '*' entry in its position.

My sincere thanks for providing such an excellent starting point.
Here is the final code:
use strict;
use warnings;
use Data::Dumper;

# @nsub holds one of the input arrays shown earlier (e.g. @nsub6).

# Collect non-'*' strings until the second one has been seen.
my $count = 0;
my @ar;
foreach (@nsub) {
    $count++ if ($_ =~ /[ATCG]/);
    next if ($_ =~ /^\*/);
    push @ar, $_;
    last if ($count == 2);
}
my $sec_str = $ar[$#ar];    # Second non-* string
print "$sec_str\n";

my @results;
my %seen;
my $previous = $nsub[0];
my $tmp      = $previous;

# Mask the tail of the first string where it overlaps the second non-* string.
my ($found) = "$nsub[0]#$sec_str" =~ /(\w+)#\1/;
if ($found) {
    $tmp =~ s/$found$/"-" x length($found)/e;
}
push @results, [ 0, $tmp ];

for (1 .. $#nsub) {
    my $current = $nsub[$_];
    if ($current =~ /^\*/) {
        # Keep the placeholder at its original index.
        push @results, [ $_, $current ] unless $seen{$_}++;
    }
    elsif ("$previous#$current" =~ /(\w+)#\1/) {
        # Mask the head of the current string where it overlaps the previous one.
        my $found = $1;
        (my $tmp = $current) =~ s/^$found/"-" x length($found)/e;
        push @results, [ $_, $tmp ];
        $previous = $current;
    }
    else {
        push @results, [ $_, $previous ];
        push @results, [ $_ + 1, $current ];
        #printf "%d -> no overlap\n", $_;
        $previous = $current;
    }
}
print Data::Dumper->Dump([ \@results ], ['result']);
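
For comparison, here is a more compact sketch of the same idea. It is only an illustration, not a tested replacement: the sub names `overlap` and `mask_overlaps` are made up for this example, and it assumes that the real strings contain only word characters and that a placeholder is any entry starting with '*'. It runs grep over the array indices rather than the strings, so the '*' entries keep their positions.

```perl
use strict;
use warnings;

# Hypothetical helper: longest suffix of $left that is also a prefix of $right.
# Returns '' when the two strings do not overlap.
sub overlap {
    my ($left, $right) = @_;
    return "$left#$right" =~ /(\w+)#\1/ ? $1 : '';
}

# Hypothetical helper: mask overlaps between consecutive non-'*' strings,
# leaving every '*' entry untouched at its original index.
sub mask_overlaps {
    my @subs = @_;
    my @out  = @subs;

    # Indices of the real (non-placeholder) strings, in order.
    my @real = grep { $subs[$_] !~ /^\*/ } 0 .. $#subs;

    for my $k (1 .. $#real) {
        my ($i, $j) = @real[ $k - 1, $k ];
        my $len = length( overlap($subs[$i], $subs[$j]) );
        next unless $len;

        # Only the first real string gets its tail masked ...
        substr($out[$i], -$len, $len, '-' x $len) if $k == 1;
        # ... every later real string gets its head masked.
        substr($out[$j], 0, $len, '-' x $len);
    }
    return [ map { [ $_, $out[$_] ] } 0 .. $#out ];
}

my $result6 = mask_overlaps('CCGCG', '*****', 'CGCTC', '*****');
print join(' ', map { "$_->[0]:$_->[1]" } @$result6), "\n";
# prints: 0:CCG-- 1:***** 2:--CTC 3:*****
```

Calling it with @nsub5 instead gives 'CCG--', '*****', '*****', '--CTC', in that order.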


Please kindly advise. I really hope to hear from you again.
Regards,
Edward

In reply to Re^4: Identifying Overlapping Area in a Set of Strings by monkfan
in thread Identifying Overlapping Area in a Set of Strings by monkfan
