in reply to Re: doubt in string matching.
in thread doubt in string matching.

Yeah, you are right, but I've already done that: I have used index for string matching. This is an attempt to do the string matching with a different method; I'm just giving it a try. This is what I have done so far.
#!/usr/bin/perl

#ENTERING A MOTIF TO BE SEARCHED FROM USER INPUT
print "Enter a pattern to be searched:";
$seq =<STDIN>;
chomp($seq);
@seqss = split ('',$seq);
$m=scalar @seqss;

#FINDING OUT THE FIRST AND THE LAST CHARACTER OF THE MOTIF
$firstch=$seqss[0];
$lastch=$seqss[$m-1];

#ASSIGNING KEY VALUES TO MOTIFS
my @unique = ();
my %seen = ();
@pats=reverse @seqss;
foreach my $elem ( @pats )
{
    next if $seen{ $elem }++;
    push @uni, $elem;
    @unique= reverse @uni;
}
$cc=1;
foreach (@uni)
{
    $count{$_} = $cc;
    $cc++;
}
$zen = $m+1;
for(my $i=0;$i<scalar @unique;$i++)
{
    $mcut=scalar @unique;
    $m=$mcut-$i;
    %num=("$unique[$i]"=>"$m");
    while (($key1, $val1) = each(%num))
    {
        push(@key,$key1);
        push(@val,$val1);
    }
}

#OPENING A DATABASE (A TEXT FILE WHERE STRINGS OF ALPHABETS SAVED)
open (PIR,'/home/httpd/heidi/fasta/pir/heidi_pir/pirdb.txt');
$count=0;
while (<PIR>)
{
    if (/^ENTRY/) {$entry = $_;}
    elsif (/^>gi/) {$gi = $_;}
    elsif(/^TITLE/) {$title = (s/ /\n\t\t /g,$_);}
    elsif(/^ORGANISM/) {$org = (s/ /\n\t\t /g,$_);}
    elsif(/^ACCESSIONS/) {$acc = $_;}
    else { @array2 = $_; }
    #what i need is only this from the database (ie., the string where i have to match the motif or pattern)
    if (defined $array2[0])
    {
        @onlyseq = split('',$array2[0]);
    }
    @array2=();

    #ASSIGNING KEY VALUES FOR THE STRING INCLUDING MOTIFS(VALUES)
    foreach $_(@onlyseq)
    {
        if($count{$_} != $zen)
        {
            if (defined $count{$_})
            {
                push(@a,$count{$_});
                push(@b,$_);
                #print @a;
                #print @b;
            }
            else
            {
                push(@a,$zen);
                push(@b,$_);
                #print @a;
                #print @b;
            }
        }
    }

    #SEARCHING FOR MOTIFS IN THE STRING<< THIS IS WHERE I AM STUCK WITH>>
    $m=scalar @seqss;
    for(my $i=$m;$i<=scalar @a;$i++)
    {
        #COMPARING THE LAST AND THE FIRST CHARACTER
        if(($lastch eq $b[$i]) && ($firstch eq $b[$i-($m-1)]))
        {
            #I WANT TO COMPARE THE INBETWEEN CHARACTERS HERE ITSELF.((( WHERE I NEED YOUR HELP)))
            #for(my $j=($i-($m-1));$j<=$i;$j++)
            #{
            #    push(@fnum,$a[$j]);
            #    push(@flet,$b[$j]);
            #}
            #$t=$i+1;
            #$i=$t+$i-1;
        }
    }
    $m=scalar @seqss;
    while (@flet)
    {
        push(@words, join('', splice(@flet, 0, $m)));
    }
    #print "@words\n";
    foreach (@words)
    {
        next unless $_ =~ /$seq/;
        print("$_\n");
        $count++;
    }
    @onlyseq=();
    @a=();
    @b=();
    @flet=();
    @fnum=();
    @words=();
}
print "\nThe number of patterns found in PIR database : $count\n";
I bet you won't understand half of what I have done, because I haven't explained the whole algorithm to you. So just look near the comment #SEARCHING FOR MOTIFS IN THE STRING<< THIS IS WHERE I AM STUCK WITH>> and, if possible, do suggest a solution.

Re^3: doubt in string matching.
by Mandrake (Chaplain) on Oct 17, 2006 at 06:07 UTC
    Since you are checking the first and last characters with $b[$i-($m-1)] and $b[$i] respectively, your in-between characters should be
    @b[$i-($m-2)..$i-1]
    That is, say
    $i = 10; $m = 8;
    Your first character according to your check should be in $b[3] and last character should be in $b[10]. So the in-between characters should be
    @b[4..9]
    But you are checking 3..10 again.
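To make the index arithmetic concrete, here is a small sketch with the same numbers. The contents of @b are made up for illustration (in the posted script @b is filled from the database line); only the indices matter here.

```perl
use strict;
use warnings;

# Hypothetical data: one character per element, as @b is built in the post.
my @b = split //, 'XXXABCDEFGHXX';
my ($i, $m) = (10, 8);    # $i = index of the last character, $m = motif length

my $first  = $i - ($m - 1);                  # 3: index of the first character
my @window = @b[$first .. $i];               # whole candidate, indices 3..10
my @inner  = @b[$i - ($m - 2) .. $i - 1];    # in-between only, indices 4..9

print "window: @window\n";    # window: A B C D E F G H
print "inner:  @inner\n";     # inner:  B C D E F G
```

The window has $m elements and the inner slice has $m-2, since the first and last characters were already tested separately.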
    From your code, for(my $j=($i-($m-1));$j<=$i;$j++)
    I guess this should be
    for(my $j=($i-($m-2));$j<=$i-1;$j++) #or for(my $j=($i-($m-2));$j<$i;$j++)
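Putting that together, a minimal sketch of the whole check might look like the following. Note that find_motif is a hypothetical helper I made up for illustration, not something from the posted script. It tests the first and last characters as in the original, then compares only the in-between slice against the middle of the motif.

```perl
use strict;
use warnings;

# find_motif is a hypothetical helper, not part of the original script.
# $motif is the pattern string; $chars is an array ref with one character
# per element (like @b in the post). Returns the start index of each match.
sub find_motif {
    my ($motif, $chars) = @_;
    my @pat = split //, $motif;
    my $m   = @pat;
    return () if $m == 0;

    # Middle of the motif, without its first and last characters.
    my $inner = $m > 2 ? join('', @pat[1 .. $m - 2]) : '';
    my @starts;

    # $i is the index of the last character of the candidate window.
    for my $i ($m - 1 .. $#$chars) {
        my $first = $i - ($m - 1);
        next unless $chars->[$i] eq $pat[-1] && $chars->[$first] eq $pat[0];

        # In-between characters: indices $i-($m-2) .. $i-1.
        my $mid = $m > 2 ? join('', @{$chars}[$i - ($m - 2) .. $i - 1]) : '';
        push @starts, $first if $mid eq $inner;
    }
    return @starts;
}

my @text = split //, 'GATTACAGATTACA';
print join(',', find_motif('TTAC', \@text)), "\n";    # prints 2,9
```

One thing to watch: this sketch runs $i from $m-1 up to the last index, which is my assumption about the intended bounds. The posted loop starts at $m and runs to scalar @a, so it would skip a match at the very start and look one element past the end.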
    Does this help?
    Thanks..