in reply to how do i count the 22 selected di-peptides from a multifasta file separately for each sequence
G'day SOMEN,
Welcome to the Monastery.
[Assumption: $sum is the global dipeptide count.]
Your problem lies here:
my $sum = sum(values %{$count{$_}});
You're counting the number of unique dipeptides found, not summing their values.
Look at how you populate %count:
$count{$seq}{$1}++
So, %count has a first level of keys representing sequences ($seq). Each of those holds a second level of keys, one per dipeptide ($1), and the values at that second level are the counts.
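To make the shape of %count concrete, here's a minimal sketch of how such a structure gets built. The sequence IDs, the residue strings and the regex are all made up for illustration (your real code reads a multifasta file and matches your 22 selected dipeptides); I've also assumed overlapping pairs are wanted, which may not be what you need.

```perl
use strict;
use warnings;

# Hypothetical input: sequence ID => residue string (not from the original post).
my %sequences = (
    seq1 => 'AAAC',
    seq2 => 'GGAA',
);

my %count;
for my $seq (sort keys %sequences) {
    # Assumed pattern: the lookahead captures overlapping pairs,
    # so 'AAAC' yields AA, AA, AC.
    while ($sequences{$seq} =~ /(?=([A-Z]{2}))/g) {
        $count{$seq}{$1}++;    # level 1: sequence, level 2: dipeptide
    }
}

# Show both levels of keys.
for my $seq (sort keys %count) {
    print "$seq: ", join(', ', sort keys %{ $count{$seq} }), "\n";
}
```

The point is only that a count lives at $count{$seq}{$dipeptide}, two keys deep.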
When you access %count, i.e. with $count{$_}, you're only drilling down to the sequence. You need to go one more level to access the dipeptides.
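Something along these lines shows the extra level of drilling down. It's only a sketch: the data is invented, and %global and $grand_total are names I've introduced on the assumption that you want totals across all sequences.

```perl
use strict;
use warnings;

# Hypothetical data in the same shape as %count:
# sequence ID => { dipeptide => number of occurrences }
my %count = (
    seq1 => { AA => 2, AC => 1 },
    seq2 => { AA => 1, GG => 3, CD => 1 },
);

my %global;             # dipeptide => occurrences across all sequences (assumed goal)
my $grand_total = 0;    # every dipeptide occurrence in every sequence

for my $seq (sort keys %count) {                    # level 1: sequences
    for my $dp (sort keys %{ $count{$seq} }) {      # level 2: dipeptides
        my $n = $count{$seq}{$dp};                  # the actual count
        printf "%s\t%s\t%d\n", $seq, $dp, $n;
        $global{$dp} += $n;
        $grand_total += $n;
    }
}

printf "%s\t%d\n", $_, $global{$_} for sort keys %global;
print "Total: $grand_total\n";
```

The inner loop is the "one more level": $count{$seq} is just a hash reference, and $count{$seq}{$dp} is where the number is.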
-- Ken