Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks, my problem is that, given a DNA sequence, I am trying to find all the pairs in it, e.g. ATACG = AT TA AC CG. I have written a sub-routine that handles this fine. I am trying to extend the sub-routine so that it calculates the frequency of each pair and sorts the pairs accordingly. However, what I have written makes no change to the output. Please could you try to point out my mistakes?! Any help is massively appreciated.
    # snippet
    @pairs = get_nn_pairs (@segment1);

    sub get_nn_pairs {
        (@segment1) = @_;
        my $base;
        my $pair;
        my @pairs;
        my $i;
        my $nn_pair;
        # find and print all nn pairs form the sequence.
        foreach $i (1..$#segment1) {
            $base = $segment1[$i-1];
            $pair = $segment1[$i];
            @pairs = "$segment1[$i-1]$segment1[$i]\n";
            # only seems to know the value of @pairs from inside foreach loop
            # problems start here.
            @pairs = split ('', $nn_pair);
            my %freq;
            foreach $nn_pair (@pairs) {
                {$freq{$_}++}
            }
            my @sorted_array = sort { $freq{$b} <=> $freq{$a} } keys %freq;
            print @sorted_array;
        }
    }

Re: subroutine problem
by Paladin (Vicar) on Feb 26, 2003 at 16:46 UTC
    The indentation in your code makes it a bit difficult to see what is in your foreach loops and what isn't. In addition to the problems with @pairs that jasonk and meetraz pointed out, there are some other things that don't make sense with the code pasted.
    • You declare and assign to $base and $pair but never use them.
    • In your last foreach you have $nn_pair as the loop variable, but use $_ inside the loop.
    • At the top you say @pairs = get_nn_pairs (@segment1); but you are printing @sorted_array at the end of the sub, not returning the array.

    You may want to try something like the code below:

    sub get_nn_pairs {
        my @segment1 = @_;
        my @pairs;

        # find all nn pairs from the sequence.
        foreach my $i (1 .. $#segment1) {
            push @pairs, "$segment1[$i-1]$segment1[$i]";
        }

        # count how often each pair occurs.
        my %freq;
        foreach my $nn_pair (@pairs) {
            $freq{$nn_pair}++;
        }

        # most frequent pairs first.
        my @sorted_array = sort { $freq{$b} <=> $freq{$a} } keys %freq;
        return @sorted_array;
    }

    my @ordered_pairs = get_nn_pairs(@input);
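
    The sub above returns only the pair names, and pairs with equal counts come back in whatever order the hash yields them. If the counts themselves are wanted, one possible variation is sketched below. The helper names count_nn_pairs and sorted_pairs and the sequence ATATACG are made up for illustration and are not from the original post; the alphabetical tie-break is likewise an assumption about what output is wanted.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): count adjacent pairs in a
# list of single bases and return a hash reference of pair => count.
sub count_nn_pairs {
    my @bases = @_;
    my %freq;
    $freq{ $bases[$_ - 1] . $bases[$_] }++ for 1 .. $#bases;
    return \%freq;
}

# Hypothetical helper: order pairs by descending count, breaking ties
# alphabetically so the result is the same on every run.
sub sorted_pairs {
    my ($freq) = @_;
    return sort { $freq->{$b} <=> $freq->{$a} or $a cmp $b } keys %$freq;
}

my @bases = split //, 'ATATACG';    # made-up example sequence
my $freq  = count_nn_pairs(@bases);

# Print each pair with its count, most frequent first.
print "$_ $freq->{$_}\n" for sorted_pairs($freq);
```

    For this input the pairs are AT, TA, AT, TA, AC, CG, so the script prints AT 2, TA 2, AC 1 and CG 1, one per line. Returning a hash reference keeps the counts available to the caller instead of discarding them after sorting.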
Re: subroutine problem
by jasonk (Parson) on Feb 26, 2003 at 16:28 UTC

    You have several problems with your use of the @pairs array. First, you assign to it with @pairs = "$segment1[$i-1]$segment1[$i]\n";, which ensures that the array never contains more than one entry; you probably want push(@pairs, $segment1[$i-1].$segment1[$i]) instead. Immediately after that assignment, however, you assign to it again with @pairs = split('', $nn_pair), so the content from the previous assignment is useless: you have just overwritten it with whatever was split from $nn_pair. Since you never assigned $nn_pair, @pairs is always empty after this line. Perhaps your first assignment to @pairs should have been to $nn_pair instead? (use strict; will help you find these kinds of problems.)
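
    The difference between assigning to an array and pushing onto it can be seen in a minimal sketch like the one below. The input list and the array names @assigned and @pushed are invented for the demonstration.

```perl
use strict;
use warnings;

my @segment1 = qw(A T A C G);    # made-up example input

# Assigning a scalar to an array replaces the whole array each time,
# so only the pair from the final iteration survives.
my @assigned;
for my $i (1 .. $#segment1) {
    @assigned = "$segment1[$i-1]$segment1[$i]";
}

# push appends to the array, so every pair is kept.
my @pushed;
for my $i (1 .. $#segment1) {
    push @pushed, $segment1[$i-1] . $segment1[$i];
}

print scalar(@assigned), " element: @assigned\n";
print scalar(@pushed), " elements: @pushed\n";
```

    With this input the first array ends up holding only CG, while the second holds AT TA AC CG.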

Re: subroutine problem
by meetraz (Hermit) on Feb 26, 2003 at 16:28 UTC
    What is the value of @segment1 ?

    And why are you assigning a string to an array in this line:

    @pairs = "$segment1[$i-1]$segment1[$i]\n";

    You probably mean to do:

    $nn_pair = $segment1[$i-1] . $segment1[$i] . "\n";