in reply to doubt in string matching.

heidi:

What is: @a= "ABCDEFGHIJKLMNOPCDEFQRST";? Do you want a scalar variable with a long string (no pun intended) of characters? Or do you want an array like: ( "A", "B", "C", ...)?

As ikegami points out, if you want to find where your $find string occurs in your "array", use index. After making it a scalar, of course.

So, cleaning it up a tad:

    my $string = "ABCDEFGHIJKLMNOPCDEFQRST";
    my $find   = "CDEF";

    if( $string =~ /$find/ ){
        print "found $find in $string\n";
    }

    # OR

    my $index = index( $string, $find );
    # index returns -1 when nothing is found; 0 is a valid match position
    if( $index >= 0 ){
        print "found $find in $string at position $index\n";
    }

That will show you if your "find" string occurs in your main string. As for "and then only check for inbetween characters(D and E)", I don't know what you mean. Solid examples will speak volumes; show us what output you want to get.
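Since the sample string contains CDEF more than once, here is a minimal sketch of collecting every position with index, using its optional third argument (the offset to start searching from). The helper name all_positions is hypothetical, not something from the original post.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns every 0-based
# position at which $find occurs in $string, overlapping matches included.
sub all_positions {
    my ($string, $find) = @_;
    my @positions;
    return @positions if $find eq '';    # avoid an endless loop on an empty needle
    my $pos = index($string, $find);
    while ($pos >= 0) {                  # index returns -1 when nothing is found
        push @positions, $pos;
        $pos = index($string, $find, $pos + 1);   # resume one character further on
    }
    return @positions;
}

my @hits = all_positions("ABCDEFGHIJKLMNOPCDEFQRST", "CDEF");
print "found CDEF at positions: @hits\n";   # prints: found CDEF at positions: 2 16
```

Resuming at $pos + 1 rather than past the end of the match is a design choice: it also reports overlapping occurrences, which may or may not be what you want for motifs.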

For bonus points, show us what you've done so far, in case I've completely misinterpreted your first question. For reference, see I know what I mean. Why don't you?.



--chargrill
s**lil*; $*=join'',sort split q**; s;.*;grr; &&s+(.(.)).+$2$1+; $; = qq-$_-;s,.*,ahc,;$,.=chop for split q,,,reverse;print for($,,$;,$*,$/)

Re^2: doubt in string matching.
by heidi (Sexton) on Oct 17, 2006 at 05:35 UTC
    Yeah, you are right, but I've already done that; I mean, I have used index for string matching. This is an attempt at string matching by a different method, and I'm just giving it a try. This is what I have done so far.
    #!/usr/bin/perl
    #ENTERING A MOTIF TO BE SEARCHED FROM USER INPUT
    print "Enter a pattern to be searched:";
    $seq =<STDIN>;
    chomp($seq);
    @seqss = split ('',$seq);
    $m=scalar @seqss;
    #FINDING OUT THE FIRST AND THE LAST CHARACTER OF THE MOTIF
    $firstch=$seqss[0];
    $lastch=$seqss[$m-1];
    #ASSIGNING KEY VALUES TO MOTIFS
    my @unique = ();
    my %seen = ();
    @pats=reverse @seqss;
    foreach my $elem ( @pats )
    {
        next if $seen{ $elem }++;
        push @uni, $elem;
        @unique= reverse @uni;
    }
    $cc=1;
    foreach (@uni)
    {
        $count{$_} = $cc;
        $cc++;
    }
    $zen = $m+1;
    for(my $i=0;$i<scalar @unique;$i++)
    {
        $mcut=scalar @unique;
        $m=$mcut-$i;
        %num=("$unique[$i]"=>"$m");
        while (($key1, $val1) = each(%num))
        {
            push(@key,$key1);
            push(@val,$val1);
        }
    }
    #OPENING A DATABASE (A TEXT FILE WHERE STRINGS OF ALPHABETS SAVED)
    open (PIR,'/home/httpd/heidi/fasta/pir/heidi_pir/pirdb.txt');
    $count=0;
    while (<PIR>)
    {
        if (/^ENTRY/) {$entry = $_;}
        elsif (/^>gi/) {$gi = $_;}
        elsif(/^TITLE/) {$title = (s/ /\n\t\t /g,$_);}
        elsif(/^ORGANISM/) {$org = (s/ /\n\t\t /g,$_);}
        elsif(/^ACCESSIONS/) {$acc = $_;}
        else
        {
            @array2 = $_;
        }
        #what i need is only this from the database (ie., the string where i have to match the motif or pattern)
        if (defined $array2[0])
        {
            @onlyseq = split('',$array2[0]);
        }
        @array2=();
        #ASSIGNING KEY VALUES FOR THE STRING INCLUDING MOTIFS(VALUES)
        foreach $_(@onlyseq)
        {
            if($count{$_} != $zen)
            {
                if (defined $count{$_})
                {
                    push(@a,$count{$_});
                    push(@b,$_);
                    #print @a;
                    #print @b;
                }
                else
                {
                    push(@a,$zen);
                    push(@b,$_);
                    #print @a;
                    #print @b;
                }
            }
        }
        #SEARCHING FOR MOTIFS IN THE STRING<< THIS IS WHERE I AM STUCK WITH>>
        $m=scalar @seqss;
        for(my $i=$m;$i<=scalar @a;$i++)
        {
            #COMPARING THE LAST AND THE FIRST CHARACTER
            if(($lastch eq $b[$i]) && ($firstch eq $b[$i-($m-1)]))
            {
                #I WANT TO COMPARE THE INBETWEEN CHARACTERS HERE ITSELF.((( WHERE I NEED YOUR HELP)))
                #for(my $j=($i-($m-1));$j<=$i;$j++)
                #{
                #    push(@fnum,$a[$j]);
                #    push(@flet,$b[$j]);
                #}
                #$t=$i+1;
                #$i=$t+$i-1;
            }
        }
        $m=scalar @seqss;
        while (@flet)
        {
            push(@words, join('', splice(@flet, 0, $m)));
        }
        #print "@words\n";
        foreach (@words)
        {
            next unless $_ =~ /$seq/;
            print("$_\n");
            $count++;
        }
        @onlyseq=();
        @a=();
        @b=();
        @flet=();
        @fnum=();
        @words=();
    }
    print "\nThe number of patterns found in PIR database : $count\n";
    I bet you won't understand half of what I have done, because I haven't explained the whole algorithm. So just look near the comment #SEARCHING FOR MOTIFS IN THE STRING<< THIS IS WHERE I AM STUCK WITH>> and, if possible, do suggest a solution.
      Since you are checking the first and last characters with $b[$i-($m-1)] and $b[$i] respectively, your in-between characters should be
      @b[$i-($m-2)..$i-1]
      That is, say
      $i = 10; $m = 8;
      Your first character according to your check should be in $b[3] and last character should be in $b[10]. So the in-between characters should be
      @b[4..9]
      But you are checking 3..10 again.
      From your code, for(my $j=($i-($m-1));$j<=$i;$j++)
      I guess this should be
      for(my $j=($i-($m-2));$j<=$i-1;$j++)
      #or
      for(my $j=($i-($m-2));$j<$i;$j++)
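The slice arithmetic above can be checked with a small, self-contained sketch. The sub find_motif is a hypothetical helper, not code from the thread: it applies the same first/last test and then compares the in-between slice with the inside of the motif. It starts the scan at $m-1 instead of $m so that a match at the very beginning of the string is not skipped, and it stops at the last valid index; both choices are my assumptions about what was intended.

```perl
use strict;
use warnings;

# Hypothetical helper: scans @$text for the motif @$motif using the
# first/last check from the thread, then compares the in-between slice.
# Both arguments are array refs of single characters.
sub find_motif {
    my ($text, $motif) = @_;
    my $m = @$motif;
    my @starts;
    return @starts if $m == 0;
    for my $i ($m - 1 .. $#$text) {       # $i is the index of the window's last character
        my $first = $i - ($m - 1);        # index of the window's first character
        next unless $text->[$first] eq $motif->[0]
                 && $text->[$i]     eq $motif->[-1];
        # In-between characters: indices $i-($m-2) .. $i-1
        my $inner_text  = join '', @{$text}[ $i - ($m - 2) .. $i - 1 ];
        my $inner_motif = join '', @{$motif}[ 1 .. $m - 2 ];
        push @starts, $first if $inner_text eq $inner_motif;
    }
    return @starts;
}

# The worked example from the reply: $i = 10, $m = 8
my ($i, $m) = (10, 8);
my @inner = ($i - ($m - 2) .. $i - 1);
print "in-between indices: @inner\n";     # prints: in-between indices: 4 5 6 7 8 9

my @b      = split //, "ABCDEFGHIJKLMNOPCDEFQRST";
my @motif  = split //, "CDEF";
my @starts = find_motif(\@b, \@motif);
print "motif starts at: @starts\n";       # prints: motif starts at: 2 16
```

Joining the slices and comparing them with eq keeps the inner check to one line; an explicit inner loop over $j, as in the corrected for statement, would do the same work character by character.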
      Does this help?
      Thanks..