odegbon has asked for the wisdom of the Perl Monks concerning the following question:

I have a tab-delimited file (hits from a DB). For each row, I would like to store the first and tenth items, as well as the entire row, in a hash, and then build an array of hashes from the 2nd, 3rd, ... rows in the same way. I would then sort the rows by score and ID (the first and tenth elements of each row) and print the row with the best score.

I have tried this out, but it seems I was only able to pick up one row:

my @CHIMP = <CBLAT>;#CBLAT hold the file to be processed
my %hITS;
my $score;
my @bestHITS;
if(!@CHIMP){
    my $nm = "no_match";
    next}
my $cblast = shift @CHIMP;#process blat output lines(rows)
my @row = split(/\t/,$cblast);
unshift @{$hITS{$score}},{MATCH=>$row[0],qID=>$row[9],cLINE=>$cblast};#keep score,query id and whole lines in hash
@{$hITS{$score}} = sort {$b->{MATCH} <=> $a->{MATCH} || $a->{qID} cmp $b->{qID}} @{$hITS{$score}};#sort by match score/query ID
foreach my $key (@{$hITS{$score}}){
    my $hits = $key->{cLINE};
    push @bestHITS, $hits;#keep sorted lines in array
    last;}

My program seems to be processing only the first row. I need hints/snippets!

Re: Building an array of hashes, and then sorting keys with each hash
by ikegami (Patriarch) on Dec 30, 2009 at 01:42 UTC

    My program seems to be processing only the first row.

    The rows are in @CHIMP. You grab the first row using my $cblast = shift @CHIMP;. You only execute that once, and you never touch @CHIMP again.

    $score is never assigned a value, yet you use it all over the place. Boooo! for disabling warnings or ignoring them. I think you meant $hITS{score} rather than $hITS{$score}.
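    The point about warnings can be illustrated with a small sketch. It is not from the thread: it assumes the same variable names as the question and captures warnings in an array (an illustrative choice) so the effect is easy to see. With warnings enabled, using the unassigned $score as a hash key is reported, and the key actually used is the empty string.

```perl
use strict;
use warnings;

# Collect warnings in an array instead of letting them go to STDERR.
# (Illustrative only; normally you would simply read them on the terminal.)
my @warnings;
$SIG{__WARN__} = sub { push @warnings, @_ };

my %hITS;
my $score;    # declared but never assigned, as in the question

# undef is stringified to the empty-string key "";
# with warnings enabled, Perl complains about it.
push @{ $hITS{$score} }, { MATCH => 97 };

print "keys: ", join( ",", map { "'$_'" } keys %hITS ), "\n";
print "warning: $_" for @warnings;
```

    Every row pushed this way ends up under the same key, so the outer hash adds nothing here; a plain array of rows is enough.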

    Anyway, this is what you want:

    my @hITS;
    while (<CBLAT>) {
        chomp;
        my ($score, $qID) = ( split /\t/ )[ 0, 9 ];
        push @hITS, [ $_, $score, $qID ];
    }

    @hITS = sort { $b->[1] <=> $a->[1] || $a->[2] cmp $b->[2] } @hITS;

    for (@hITS) {
        print $_->[0], "\n";
    }
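    If only the single best row is wanted, as the question says, sorting everything is not strictly necessary. The following is a sketch of a different technique (a single pass that tracks the maximum), not the approach posted above. The helper name best_row and the sample rows are made up for illustration; ties on score are broken by the smaller query ID, mirroring the sort order above.

```perl
use strict;
use warnings;

# Hypothetical helper: return the line with the highest score (field 0),
# breaking ties by the smallest query ID (field 9), without sorting.
sub best_row {
    my @lines = @_;    # work on a copy so the caller's data is untouched
    my ( $best_line, $best_score, $best_qid );
    for my $line (@lines) {
        chomp $line;
        my ( $score, $qid ) = ( split /\t/, $line )[ 0, 9 ];
        next unless defined $score && defined $qid;    # skip short rows
        if (   !defined $best_line
            || $score > $best_score
            || ( $score == $best_score && $qid lt $best_qid ) )
        {
            ( $best_line, $best_score, $best_qid ) = ( $line, $score, $qid );
        }
    }
    return $best_line;    # undef when there were no usable rows
}

# Made-up sample rows with ten tab-separated fields each.
my @sample = map { join "\t", @$_ } (
    [ 90, ('x') x 8, 'q2' ],
    [ 97, ('x') x 8, 'q7' ],
    [ 97, ('x') x 8, 'q3' ],
);

print best_row(@sample), "\n";
```

    With real data the lines would come from the CBLAT filehandle instead of @sample.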

    Update: I suppose you did say AoH. It's more readable, especially if you save the other fields:

    my @hITS;
    while (<CBLAT>) {
        chomp;
        my %row;
        @row{qw( score qID )} = ( split /\t/ )[ 0, 9 ];
        push @hITS, \%row;
    }

    @hITS = sort { $b->{score} <=> $a->{score} || $a->{qID} cmp $b->{qID} } @hITS;

    for (@hITS) {
        print "score: $_->{score} qID: $_->{qID}\n";
    }

      thanks.

      I am also interested in printing/retrieving the entire row after sorting as in the last line above. Does $_ give the entire row, sorted?
        Run it...
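For the follow-up question, a sketch under stated assumptions: inside the final for loop, $_ is a reference to one row's hash, not the text of the row, so the whole line has to be stored in the hash if it is to be printed later. The key name line, the helpers parse_hits and sort_hits, and the sample rows are all hypothetical; they are not from the posts above.

```perl
use strict;
use warnings;

# Hypothetical helper: turn tab-delimited lines into an array of hashes,
# keeping the score, the query ID and the untouched line in each hash.
sub parse_hits {
    my @lines = @_;    # copy, so chomp does not modify the caller's data
    my @hits;
    for my $line (@lines) {
        chomp $line;
        my %row;
        @row{qw( score qID )} = ( split /\t/, $line )[ 0, 9 ];
        $row{line} = $line;    # 'line' is an assumed key name
        push @hits, \%row;
    }
    return @hits;
}

# Highest score first; ties broken by query ID in ascending string order.
# Intended to be called in list context.
sub sort_hits {
    my @hits = @_;
    return sort { $b->{score} <=> $a->{score} || $a->{qID} cmp $b->{qID} } @hits;
}

# Made-up sample rows with ten tab-separated fields each.
my @sample = map { join "\t", @$_ } (
    [ 90, ('x') x 8, 'q2' ],
    [ 97, ('x') x 8, 'q7' ],
    [ 97, ('x') x 8, 'q3' ],
);

my @sorted = sort_hits( parse_hits(@sample) );

print "best: $sorted[0]{line}\n" if @sorted;    # the entire best row
print "$_->{line}\n" for @sorted;               # every row, in sorted order
```

Storing the line alongside the extracted fields costs some memory, but it avoids re-joining the fields and keeps the row exactly as it appeared in the file.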