Beefy Boxes and Bandwidth Generously Provided by pair Networks
We don't bite newbies here... much
 
PerlMonks  

Re^4: Identifying Overlapping Area in a Set of Strings

by monkfan (Curate)
on Jul 30, 2005 at 04:01 UTC ( [id://479572]=note: print w/replies, xml ) Need Help??


in reply to Re^3: Identifying Overlapping Area in a Set of Strings
in thread Identifying Overlapping Area in a Set of Strings

Thanks so much again rnahi.
I hope you don't mind looking at my other instances. I'm really sorry, I didn't mentioned it before because I thought it may appear too complex and too discouraging to read.
Suppose I have this:
my $fseq6 = 'CCGCGCTC'; my @nsub6 = ( 'CCGCG', '*****', 'CGCTC' '*****',); my $fseq5 = 'CCGCGCTC'; my @nsub5 = ( 'CCGCG', '*****', '*****', 'CGCTC'); my $fseq4 = 'CCCCGCGC'; my @nsub4 = ('CCCCG', '*****', 'CGCGC');


I would like to produce this:
$result4 = [ [ 0,'CCG--'], [ 1,'*****'], [ 2,'--CTC'] ]; $result5 = [ [ 0,'CCG--'], [ 1,'*****'], [ 2,'*****'], [ 3,'--CTC'] ]; $result6 = [ [ 0,'CCG--'], [ 1,'*****'], [ 2,'--CTC'] [ 3,'*****'], ];


Basically 'skipping' the asterisk(*) but yet still keep its position in array in place.

Update: I've finally succeeded in improving your code such that it can take care those situations. It is not entirely neat and 'super-naive' but it does the job. I think I can't use "grep" function in this case because I still need to keep '*' in its position.

My sincere thanks, for providing an excellent starting point to me.
Here is the final code:
my $count; my @ar; foreach (@nsub) { $c++ if ($_ =~ /[ATCG]/); next if ($_ =~ /^\*/); push @ar, $_; last if ($count == 2); } my $sec_str= $ar[$#ar]; #Second non-* strings print "$llm\n"; my @results; my %seen; my $previous = $nsub[0]; my $tmp = $previous; my ($found) = "$nsub[0]#$sec_str" =~ /(\w+)#\1/; if ($found) { $tmp =~ s/$found$/"-" x length($found)/e; } push @results, [ 0, $tmp ]; for (1 .. $#nsub) { my $current = $nsub[$_]; if ($current =~ /^\*/) { push @results, [ $_, $current] unless $seen{$_}++; } elsif ( "$previous#$current" =~ /(\w+)#\1/ ) { my $found = $1; (my $tmp = $current) =~ s/^$found/"-" x length($found)/e; push @results, [ $_, $tmp]; $previous = $current; } else { push @results, [$_,$previous]; push @results, [$_+1,$current]; #printf "%d -> no overlap\n", $_; $previous = $current; } } print Data::Dumper->Dump([ \@results], ['result']);


Please kindly advice. Really hope to hear from you again.
Regards,
Edward

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://479572]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others having an uproarious good time at the Monastery: (2)
As of 2024-04-20 07:10 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found