in reply to Re: How to (get started on) sort AoA or AoH by frequency

See the attached code. I used the map trick again: foreach ( map { $_->[4] } @results ) iterates over the contents of column 4 of every row, and a frequency hash is built from them. What goes into the map is a list of references to rows. The map de-references each one and transforms the list so that its output is a list of the column 4 values.
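As a minimal sketch of that step on its own (the three-row @rows array here is made-up sample data, not the original @results):

```perl
use strict;
use warnings;

# Hypothetical sample rows; column 4 holds the word being counted.
my @rows = (
    ["r1", "s1", "x", "nsubj", "cells"],
    ["r2", "s2", "y", "nsubj", "Animals"],
    ["r3", "s3", "z", "nsubj", "Cells"],
);

# map turns the list of row references into the list of column 4 values.
my @col4 = map { $_->[4] } @rows;

# Count each value, folding case so "Cells" and "cells" share one key.
my %freq;
$freq{lc $_}++ foreach @col4;

print "$_ => $freq{$_}\n" foreach sort keys %freq;
```

The intermediate @col4 array is only there to make the two stages visible; feeding the map straight into foreach, as in the post, does the same thing.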

The way sort works: <---output sort{...} <---input
What goes in is what comes out, only reordered. What is coming in are references to rows of the @results array. What sort needs is a way to compare 2 rows: row A < row B, row A equal to row B, or row A > row B. The function that provides the comparison can be anything that you want as long as it produces a consistent result (it reverses the answer if a and b are reversed).
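A small sketch of the "what goes in is what comes out" point, using a hypothetical two-column @rows array rather than the real data. The sorted list holds the very same row references, just in a new order:

```perl
use strict;
use warnings;

# Hypothetical rows: [ name, number ]
my @rows = ( ["b", 2], ["c", 3], ["a", 1] );

# The comparison block sees two row references, $a and $b, and compares column 0.
my @sorted = sort { $a->[0] cmp $b->[0] } @rows;

print join(" ", map { $_->[0] } @sorted), "\n";

# Comparing references with == compares their addresses:
# the first sorted row is the same array that was third in @rows.
print "same row reference\n" if $sorted[0] == $rows[2];
```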

So I look up the value of col 4 for, say, row A, then I ask the frequency hash what the frequency is, and I compare that result with the same computation for row B. In the case of a tie, I use an alphabetic comparison of column 0. Note that I swapped a and b in the frequency comparison to get the highest frequency first, while column 0 is sorted lowest first.

The way that the sort decider function is written may appear a bit strange, but it is just returning -1, 0 or 1 depending upon how row A and row B compare. When the frequency comparison returns 0 (a tie), the "or" falls through to the second comparison.
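To see those return values directly, here is a short sketch with made-up numbers and strings (none of these variables come from the original code):

```perl
use strict;
use warnings;

my ($lo, $hi) = (2, 5);

# <=> compares numbers, cmp compares strings; each gives -1, 0 or 1.
my @numeric = ( $lo <=> $hi, $hi <=> $hi, $hi <=> $lo );
my @string  = ( "a" cmp "b", "a" cmp "a", "b" cmp "a" );

# With "or", the second comparison is only consulted when the first is 0.
my $tie    = ( $hi <=> $hi or "chpt12" cmp "chpt34" );   # first is 0, so cmp decides
my $no_tie = ( $hi <=> $lo or "chpt12" cmp "chpt34" );   # first is 1, cmp is skipped

print "numeric: @numeric\n";
print "string:  @string\n";
print "tie: $tie, no tie: $no_tie\n";
```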

It is completely legal to assign the sorted result set back to the input variable, and I did that. To get your printout, just do the column 4 lookup in the freq hash to get the frequency. The order of my @results jibes with the order of your output.
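A rough sketch of that printout step, assuming a "frequency word chapter" line layout (the layout and the three sample rows are my guesses for illustration, not your required format):

```perl
use strict;
use warnings;

# Hypothetical sample rows in the same 5-column shape.
my @rows = (
    ["chpt10_2", "sent. 2",  "alice", "nsubj", "animals"],
    ["chpt12_1", "sent. 54", "bob",   "nsubj", "cells"],
    ["chpt34_1", "sent. 1",  "dave",  "nsubj", "cells"],
);

my %freq;
$freq{lc $_->[4]}++ foreach @rows;

# Assigning the sorted list back to the same array is fine.
@rows = sort { $freq{lc $b->[4]} <=> $freq{lc $a->[4]}
                   or
               $a->[0] cmp $b->[0] } @rows;

# Look the frequency up again per row when printing.
my @lines = map { my $n = $freq{lc $_->[4]}; "$n $_->[4] $_->[0]" } @rows;
print "$_\n" foreach @lines;
```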

For printing, of course you can access each element as a 2-D coordinate, but it is usually better to iterate over the rows with a row reference, like this:

foreach my $row (@results) { print "$row->[0] $row->[1]\n"; }
I think the following code does what you want...
#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;
use Data::Dump qw(pp);

my @results = (
    ["chpt10_2", "sent. 2",  "alice", "nsubj", "animals", "protect"],
    ["chpt12_1", "sent. 54", "bob",   "nsubj", "cells",   "protect"],
    ["chpt25_4", "sent. 47", "carol", "nsubj", "plants",  "protect"],
    ["chpt34_1", "sent. 1",  "dave",  "nsubj", "cells",   "protect"],
    ["chpt35_1", "sent. 2",  "eli",   "nsubj", "cells",   "protect"],
    ["chpt38_1", "sent. 1",  "fred",  "nsubj", "animals", "protect"],
    ["chpt54_1", "sent. 1",  "greg",  "nsubj", "uticle",  "protect"],
);

my %freq;
foreach ( map { $_->[4] } @results )   # feeds in list of animals, cells, uticle, etc.
{
    $freq{lc $_}++;
}

@results = sort { $freq{lc $b->[4]} <=> $freq{lc $a->[4]}   # freq order
                      or
                  $a->[0] cmp $b->[0]                       # text col 0
                } @results;

print pp(\@results);

__END__
[
  ["chpt12_1", "sent. 54", "bob", "nsubj", "cells", "protect"],
  ["chpt34_1", "sent. 1", "dave", "nsubj", "cells", "protect"],
  ["chpt35_1", "sent. 2", "eli", "nsubj", "cells", "protect"],
  ["chpt10_2", "sent. 2", "alice", "nsubj", "animals", "protect"],
  ["chpt38_1", "sent. 1", "fred", "nsubj", "animals", "protect"],
  ["chpt25_4", "sent. 47", "carol", "nsubj", "plants", "protect"],
  ["chpt54_1", "sent. 1", "greg", "nsubj", "uticle", "protect"],
]

Replies are listed 'Best First'.
Re^3: How to (get started on) sort AoA or AoH by frequency
by jonc (Beadle) on Jun 14, 2011 at 00:28 UTC

    Marshall,

    You are awesome! This is more than I ever could have asked for, you really helped me understand sorting and the power of hashes. I will try and award you whatever I can, since you've helped me out so much (when I get this rumoured vote fairy). Thanks for your time!

    p.s. I don't think you need to lc the 1st line in sort, since it's numbers... the 2nd line would need it