
Thanks for all the help!
Here is a simple coding question I suspect you may be able to help me with.
Below is a slice from my first subroutine.
In it, I am trying to store the header line of each FASTA record as a hash key, and the sequence lines of that record as the corresponding hash value.

I am trying to use the /$search/gi match near the bottom to count how many times my search sequence occurs across ALL the sequences.
For example, if I search for the amino acid "R", it should come up 350 times. However, $hits is instead equal to 55.
Basically, it stops searching once a line is found to contain "R", so I get the number of lines that satisfy the condition rather than the number of times the search sequence occurs.


    while (my $line = <INPUT>) {
        if ($line =~ m/^>/) {
            if ($seq) {
                $o{$header} = "$seq";
            }
            $count++;
            print "\n\n";
            print "$line";
            $header = "$line";
            $seq = ();
        }
        else {
            chomp $line;
            print $line;
            $seq .= "$line";
            if ($line =~ /$search/gi) {
                $hits++;
            }
        }
    }
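
A possible explanation, offered as a sketch rather than a definitive fix: inside an if, the match is evaluated in scalar context, so it only reports whether the line matched, and $hits goes up by at most one per line. Assigning the match to an empty list forces list context, where /g returns every match, and the number of matches can then be added to $hits. The sample lines below are made up for illustration, and \Q...\E is an added assumption so that $search is treated as literal text rather than as a pattern.

```perl
use strict;
use warnings;

# Hypothetical sample lines standing in for the sequence lines read from INPUT.
my @lines  = ("MRRKAR\n", "LLRQ\n", "GGG\n");
my $search = 'R';
my $hits   = 0;

for my $line (@lines) {
    chomp $line;
    # In list context, /g returns all matches; assigning to () and then
    # to a scalar yields how many there were.
    my $n = () = $line =~ /\Q$search\E/gi;
    $hits += $n;
}

print "$hits\n";    # 3 + 1 + 0 = 4 for the sample lines above
```

Note that counting line by line would miss a multi-letter search string that happens to span a line break in the file.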

Re^3: FASTA riddle.
by MaroonBalloon (Acolyte) on Dec 12, 2009 at 23:12 UTC
    Alternatively, is it possible to look at the values of a hash and find out how many times my search string is present in them?
        # input = $search
        foreach my $key (keys %hash) {
            $searchline = $hash{$key};
            if ($searchline =~ /$search/gi) {
                $contains++;
            }
        }
    This also seems to stop counting once it finds the FIRST match for $search, so the resulting $contains is merely the number of values that contain at least one occurrence of $search.
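
    One way this could be approached, as a minimal sketch: a while loop over a /g match in scalar context resumes from where the previous match ended, so the loop body runs once per occurrence rather than once per value. The hash contents below are invented sample data, and \Q...\E is again an assumption that $search should be matched literally.

```perl
use strict;
use warnings;

# Hypothetical sample data: header lines as keys, joined sequences as values.
my %hash = (
    ">seq1\n" => 'MRRKAR',
    ">seq2\n" => 'LLRQGGG',
);
my $search   = 'R';
my $contains = 0;

for my $searchline (values %hash) {
    # Each successful scalar-context /g match advances pos(), so the
    # loop runs once for every non-overlapping occurrence in the value.
    while ($searchline =~ /\Q$search\E/gi) {
        $contains++;
    }
}

print "$contains\n";    # 3 + 1 = 4 for the sample data above
```

    Because the values hold whole sequences, this also counts occurrences that would have been split across lines in the file. Matches are non-overlapping, so searching for "RR" in "RRR" would count once.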