in reply to Re^2: Sorting issue
in thread Sorting issue

You didn't show how you tried my suggestions, so I'm not sure why they didn't work for you. Here's a more complete example, which takes your sample input and sorts it by frequency (largest to smallest), printing each line with a header built to your latest spec. Make sure you understand what's going on in the sort {block}: what $tags{$a} means, for instance. I'm sorting on the values, not the keys. The keys go into $a and $b, and I use those to look up the values in the hash, which are what actually get compared.

#!/usr/bin/perl
use warnings;
use strict;

my %tags;                               # hash to store tags/freqs
while(<DATA>){
    chomp;
    my($tag, $freq) = split;            # split the line on whitespace
    $tags{$tag} = $freq;                # save the tag and freq in the hash
}

# sort the hash numerically on its values, descending
for my $tag ( sort { $tags{$b} <=> $tags{$a} } keys %tags ){
    my $freq = $tags{$tag};                 # put the freq for $tag in $freq
    my $header = make_header($tag, $freq);  # make the header
    print ">$header\t$tag\t$freq\n";        # print it out
}

sub make_header {
    my $tag = shift;                    # get parameters
    my $freq = shift;
    my $r = int(rand(500000));          # pick a random number
    return "HWTI_${freq}_$r$tag";       # build the header
}

#input data
__DATA__
CCCDEDFFFES 45
EEBBBBGGGBB 1700
BBBCDDERFGG 850
#output
>HWTI_1700_494932EEBBBBGGGBB	EEBBBBGGGBB	1700
>HWTI_850_10814BBBCDDERFGG	BBBCDDERFGG	850
>HWTI_45_187939CCCDEDFFFES	CCCDEDFFFES	45
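
If the sort block is still unclear, it may help to see key order and value order side by side. This is only a minimal sketch using the same three tags; the array names @by_key and @by_freq are made up for illustration.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Same sample data as above, loaded directly into a hash.
my %tags = (
    CCCDEDFFFES => 45,
    EEBBBBGGGBB => 1700,
    BBBCDDERFGG => 850,
);

# Plain sort compares the keys themselves as strings;
# the frequencies are never looked at.
my @by_key = sort keys %tags;

# $a and $b are still keys here, but the block compares
# the values they index, numerically, largest first.
my @by_freq = sort { $tags{$b} <=> $tags{$a} } keys %tags;

print "by key:  @by_key\n";
print "by freq: @by_freq\n";
```

In both cases sort hands back a list of keys. Only the comparison inside the block decides what the order is based on.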

Re^4: Sorting issue
by Cristoforo (Curate) on Nov 05, 2011 at 03:07 UTC
    There is no need for the sort and print loop after the while loop. His input file is already sorted by frequency in descending order (your sample data would be sorted in descending order). So, the make_header() call and print routine could be done within the while loop.
      True, his sample data was already sorted that way; but it was only three lines, and he said he still needed to sort on that, so I assumed that was coincidence.
        The input file was already sorted by frequency. I tried without the sort command. It didn't give me the sorted results.
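
For comparison, the single-pass approach suggested above might look roughly like the sketch below. It assumes the input really is in descending order already, since nothing here re-sorts it. The in-memory string stands in for the real file, and the optional third argument to make_header and the @tags_seen array are additions for illustration only.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Same header layout as make_header() in the parent post. The optional
# third argument is an assumption added here so the random part can be
# pinned down when checking the result.
sub make_header {
    my ($tag, $freq, $r) = @_;
    $r = int(rand(500000)) unless defined $r;
    return "HWTI_${freq}_$r$tag";
}

# Stand-in for the real file, assumed already sorted, largest first.
my $input = "EEBBBBGGGBB 1700\nBBBCDDERFGG 850\nCCCDEDFFFES 45\n";
open my $fh, '<', \$input or die "Cannot open in-memory input: $!";

my @tags_seen;    # only records the order in which lines were handled
while (my $line = <$fh>) {
    my ($tag, $freq) = split ' ', $line;
    next unless defined $freq;          # skip blank or malformed lines
    print '>', make_header($tag, $freq), "\t$tag\t$freq\n";
    push @tags_seen, $tag;
}
close $fh;
```

The output order is simply the input order, so if the file is not strictly sorted the result will not be either. The hash-and-sort version does not depend on that.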
Re^4: Sorting issue
by bluray (Sexton) on Nov 05, 2011 at 03:59 UTC
    Hi Aaron,

    Thanks! It worked with:

    split(' ', $line).

      Glad that helped. If you were having trouble using $line in place of $_, that's probably because split takes the pattern as its first argument, so a lone variable gets treated as the pattern rather than as the string to split. Writing split ' ', $line; uses split's special case: a single space as the pattern makes it split on whitespace (skipping any leading whitespace), just as it does when you give it no arguments.
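
A small sketch of the difference, using a made-up input line with leading spaces; the array names are illustrative only:

```perl
#!/usr/bin/perl
use warnings;
use strict;

my $line = "  EEBBBBGGGBB 1700\n";

# Pattern first, then the string. A single-space string is the special
# case: split on runs of whitespace and skip leading whitespace.
my @special = split ' ', $line;

# With no arguments, split does the same thing to $_.
my @default;
for ($line) { @default = split }

# A regex matching one literal space is NOT the special case: the two
# leading spaces produce empty fields, and the newline stays attached.
my @regex = split / /, $line;

print scalar(@special), " fields with ' '\n";
print scalar(@default), " fields with no arguments\n";
print scalar(@regex),   " fields with / /\n";
```

The first two give the same two fields. The regex form gives four, two of them empty, which is usually not what you want for whitespace-separated columns.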