Yeah dude,
you are right, but I have already done that. I have used index for string matching; this is an attempt to do string matching with a different method, and I am just giving it a try.
This is what I have done so far.
#!/usr/bin/perl
#ENTERING A MOTIF TO BE SEARCHED FROM USER INPUT
print "Enter a pattern to be searched:";
$seq =<STDIN>;
chomp($seq);
@seqss = split ('',$seq);
$m=scalar @seqss;
#FINDING OUT THE FIRST AND THE LAST CHARACTER OF THE MOTIF
$firstch=$seqss[0];
$lastch=$seqss[$m-1];
#ASSIGNING KEY VALUES TO MOTIFS
my @unique = ();
my %seen = ();
@pats=reverse @seqss;
foreach my $elem ( @pats )
{
next if $seen{ $elem }++;
push @uni, $elem;
@unique= reverse @uni;
}
$cc=1;
foreach (@uni)
{
$count{$_} = $cc;
$cc++;
}
$zen = $m+1;
for(my $i=0;$i<scalar @unique;$i++)
{
$mcut=scalar @unique;
$m=$mcut-$i;
%num=("$unique[$i]"=>"$m");
while (($key1, $val1) = each(%num))
{
push(@key,$key1);
push(@val,$val1);
}
}
#OPENING A DATABASE (A TEXT FILE WHERE STRINGS OF ALPHABETS SAVED)
open (PIR,'<','/home/httpd/heidi/fasta/pir/heidi_pir/pirdb.txt') or die "Cannot open pirdb.txt: $!";
$count=0;
while (<PIR>)
{
if (/^ENTRY/)
{$entry = $_;}
elsif (/^>gi/)
{$gi = $_;}
elsif(/^TITLE/)
{$title = (s/ /\n\t\t /g,$_);}
elsif(/^ORGANISM/)
{$org = (s/ /\n\t\t /g,$_);}
elsif(/^ACCESSIONS/)
{$acc = $_;}
else
{
@array2 = $_;
}
#what i need is only this from the database (ie., the string where i have to match the motif or pattern)
if (defined $array2[0])
{
@onlyseq = split('',$array2[0]);
}
@array2=();
#ASSIGNING KEY VALUES FOR THE STRING INCLUDING MOTIFS(VALUES)
foreach $_(@onlyseq)
{
if($count{$_} != $zen)
{
if (defined $count{$_})
{
push(@a,$count{$_});
push(@b,$_);
#print @a;
#print @b;
}
else
{
push(@a,$zen);
push(@b,$_);
#print @a;
#print @b;
}
}
}
#SEARCHING FOR MOTIFS IN THE STRING<< THIS IS WHERE I AM STUCK WITH>>
$m=scalar @seqss;
for(my $i=$m-1;$i<scalar @b;$i++)
{
#COMPARING THE LAST AND THE FIRST CHARACTER
if(($lastch eq $b[$i]) && ($firstch eq $b[$i-($m-1)]))
{
#I WANT TO COMPARE THE INBETWEEN CHARACTERS HERE ITSELF.((( WHERE I NEED YOUR HELP)))
#for(my $j=($i-($m-1));$j<=$i;$j++)
#{
# push(@fnum,$a[$j]);
# push(@flet,$b[$j]);
#}
#$t=$i+1;
#$i=$t+$i-1;
}
}
$m=scalar @seqss;
while (@flet)
{
push(@words, join('', splice(@flet, 0, $m)));
}
#print "@words\n";
foreach (@words)
{
next unless $_ =~ /$seq/;
print("$_\n");
$count++;
}
@onlyseq=();
@a=();
@b=();
@flet=();
@fnum=();
@words=();
}
print "\nThe number of patterns found in PIR database : $count\n";
I bet you won't understand half of what I have done; that's because I haven't explained the whole algorithm to you. So just look near the comment
#SEARCHING FOR MOTIFS IN THE STRING<< THIS IS WHERE I AM STUCK WITH>>
If possible, please suggest a solution.
Since you are checking the first and last characters with $b[$i-($m-1)] and $b[$i] respectively, your in-between characters should be
@b[$i-($m-2)..$i-1]
That is,
say $i = 10;
$m = 8;
According to your check, the first character should be in $b[3] and the last character in $b[10]. So the in-between characters should be
@b[4..9]
But you are checking 3..10 again. From your code,
for(my $j=($i-($m-1));$j<=$i;$j++)
I guess this should be
for(my $j=($i-($m-2));$j<=$i-1;$j++)
#or
for(my $j=($i-($m-2));$j<$i;$j++)
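To show how that in-between check could sit right next to your first/last test, here is a minimal sketch. It assumes @b holds the sequence characters and @seqss the motif characters, as in your script. The helper name motif_ends_at and the sample strings are made up for illustration, so treat it as one possible way to do it rather than a drop-in fix.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): returns 1 when the motif
# in @$pat ends exactly at index $i of @$text, and 0 otherwise.
sub motif_ends_at {
    my ($text, $pat, $i) = @_;
    my $m = scalar @$pat;
    return 0 if $m == 0;

    my $start = $i - ($m - 1);                  # index of the first character
    return 0 if $start < 0 || $i > $#{$text};   # window must lie inside @$text

    # Cheap test first: last and first characters, as in the original check.
    return 0 unless $pat->[-1] eq $text->[$i] && $pat->[0] eq $text->[$start];

    # In-between characters: $text indexes $i-($m-2) .. $i-1
    # compared against $pat indexes 1 .. $m-2.
    for my $k (1 .. $m - 2) {
        return 0 unless $pat->[$k] eq $text->[$start + $k];
    }
    return 1;
}

# Sample data (assumed): the motif has $m = 8 and ends at $i = 10.
my @seqss = split //, 'GATTACAG';
my @b     = split //, 'CCCGATTACAGTT';

# Try every index where a window of length $m could end.
my @hits = grep { motif_ends_at(\@b, \@seqss, $_) } $#seqss .. $#b;
print "motif ends at index: @hits\n";
```

With this sample data the first character sits in $b[3] and the last in $b[10], so only indexes 4..9 are compared inside the loop, the same window as in the example above.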
Does this help?
Thanks..