Well, I couldn't say your problem description is crystal clear, but the following may get you started:
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dump;
srand 0;
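my @aoa = (
    [map {int rand 3} 1 .. 10],
    [map {int rand 6} 1 .. 10],
    [map {int rand 20} 1 .. 10],
);
# One frequency hash per inner array; +{ ... } marks an anonymous hash.
my @aof = map { +{asFreq (@$_)} } @aoa;
print Data::Dump::dump (\@aof);
# Returns a flat key => count list for the elements passed in.
sub asFreq {
    my @elements = @_;
    my %freqs;
    ++$freqs{$_} for @elements;
    return %freqs;
}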
Prints:
[
{ "0" => 6, "1" => 2, "2" => 2 },
{ "0" => 3, "1" => 2, "2" => 2, "3" => 1, "4" => 1, "5" => 1 },
{ "0" => 1, "1" => 1, "5" => 1, "11" => 1, "12" => 1, "13" => 1, "15" => 1, "17" => 1, "19" => 2 },
]
True laziness is hard work
I took your question to mean: how do I make a sorted printout by frequency of the structure that my code builds? First, I found the OP's code to be confusing, so I recoded it.
It is not possible to sort a hash, but it is possible to sort the keys of the hash into an array. Then use that array of keys to print the hash. Below, I used pp() to assist in the printing.
In this example, the sub hash is actually not necessary. A HoA would have sufficed, because an array evaluated in scalar context yields its element count, which is the "count". I'm not quite sure what you mean by sorting an AoA.
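To illustrate the HoA remark, here is a minimal sketch. The sample data and the name %seen are made up for illustration and are not from the OP's code. Each key holds an array of its occurrences, and that array in scalar context gives the count used for sorting:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data, not taken from the thread.
my @all_arrays = ([1 .. 3], [2 .. 4], [3 .. 5]);

# Hash of arrays: each key collects every occurrence of that number.
my %seen;
push @{ $seen{$_} }, $_ for map { @$_ } @all_arrays;

# An array in scalar context yields its length, i.e. the frequency.
my @sorted_keys = sort {
    scalar(@{ $seen{$a} }) <=> scalar(@{ $seen{$b} })
        or
    $a <=> $b
} keys %seen;

printf "%2d occurs %d time(s)\n", $_, scalar(@{ $seen{$_} }) for @sorted_keys;
```

With this data the keys come out as 1, 5, 2, 4, 3: lowest count first, ties broken numerically.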
Update: As clarification to the OP, @all_arrays is an array of references to arrays. The map{@$_} takes each array reference and expands it into a list of numbers. That answers one of the questions: you don't need to know that there are 3 rows. The code below "flattens" the whole structure into one long list of numbers no matter how many rows there are.
#!/usr/bin/perl -w
use strict;
use Data::Dump qw(pp);
my @all_arrays = ([1 .. 20],
[10 .. 30],
[19 .. 40],
);
my %unique_descriptive;
foreach my $num (map{@$_}@all_arrays)
{
    $unique_descriptive{$num}{count}++;
    push @{$unique_descriptive{$num}{values}}, $num;
}
#print pp(\%unique_descriptive);
# example for num=10
#10 => { count => 2, "values" => [10, 10] },
my @sorted_keys = sort{ $unique_descriptive{$a}{count} <=> $unique_descriptive{$b}{count}
                            or
                        $a <=> $b
                      }keys %unique_descriptive;
foreach my $key (@sorted_keys)
{
    printf "%2d=", $key; #make the print out look nice
    print pp($unique_descriptive{$key}),"\n";
}
__END__
Program output:
Update: Another set of code - probably is not what OP needs for AoA, but it does demo how to add a column and how to sort a 2-D array by different column positions...
#!/usr/bin/perl -w
use strict;
use 5.010; #for new //= operator
use Data::Dump qw(pp);
my @all_arrays = ([1 .. 20],
[10 .. 30],
[19 .. 40],
);
my @unique_descriptive;
foreach my $num (map{@$_}@all_arrays)
{
    $unique_descriptive[$num]++; #simple peg counter
}
# add a column to the 2-D array with row number
# undef counts as freq of zero, the //=0 does that
my $i=0;
@unique_descriptive = map{[$i++,$_//=0]}@unique_descriptive;
@unique_descriptive = sort{ $a->[1] <=> $b->[1] #by freq
                                or
                            $a->[0] <=> $b->[0] #by peg number
                          }@unique_descriptive;
foreach my $row (@unique_descriptive)
{
    print "num = $row->[0] \tfreq=$row->[1]\n" if ($row->[1] > 0);
}
AoA output:
Great! The output (more of the 2nd one) is what I was looking for. I guess I'll include it in the question next time.
The HoA won't work for me, because in my actual code the "inner" arrays/hashes (the references) in the AoA/AoH hold 6 strings, not numbers. The "outer" array is a list of all these "sets" of strings, which come from a search-engine type of code and need to be sorted. (How do you indent?) That also means I'm going to create a more complex sorting method, where I sort the "outer" array by certain *values* of the hash, or by certain elements of the "inner" array. Should I include this type of background in my questions?
Thanks a lot, I'm going to try to understand this code. Thanks for explaining map{@$_}... $_ really screws me up; I take it that one comes from the elements of @all_matches.
Whew... This is intense. I'm sorry, but I've tried, and read some articles on hashes/data structures, but still need some assistance understanding this. I get the map statement. But:
$unique_descriptive{$num}{count}++;
push @{$unique_descriptive{$num}{values}}, $num;
is a little confusing. Here's what I've got so far:
The first statement increments the value of "count" (which I guess is a new key made then and there?). The value is in the HoH %unique_descriptive, at the key $num: that is the number, which is the element of the de-referenced array being looped through.
Then the 2nd line is an AoHoH?? But that array is never used later? The keys of the innermost hash are the values of something (what?). The end value of this is the number from the loop being pushed in. The 2nd inner hash is at the key of $num. Was the @ in front only necessary b/c push takes list context?
Then the other problem is:
my @sorted_keys = sort{ $unique_descriptive{$a}{count} <=> $unique_descriptive{$b}{count}
                            or
                        $a <=> $b
                      }keys %unique_descriptive;
The numbers are being sorted based on count first (did you know to put $a where it is b/c $num was there before?). And if that is equal, the numbers themselves are compared. The keys are being sorted. Sorry for the trouble I'm having with this; I hope I was close and that this makes sense to you.
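A minimal sketch of what those two statements do, reduced to a single number. Repeating the statements is an assumption added here to imitate two passes of the loop. The structure is a hash of hashes in which the {values} slot holds an array reference; nothing has to be declared beforehand because Perl creates the missing levels on first use (autovivification):

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %unique_descriptive;
my $num = 10;

# First use: neither $unique_descriptive{10} nor its inner hash exists yet.
# Perl creates both on demand, then ++ turns the undefined count into 1.
$unique_descriptive{$num}{count}++;

# {values} holds a reference to an anonymous array, also created on demand.
# @{ ... } dereferences it, because push needs an actual array to append to.
push @{ $unique_descriptive{$num}{values} }, $num;

# Seeing the same number again updates the existing entries.
$unique_descriptive{$num}{count}++;
push @{ $unique_descriptive{$num}{values} }, $num;

print "count = $unique_descriptive{$num}{count}\n";
print "values = @{ $unique_descriptive{$num}{values} }\n";
```

This prints count = 2 and values = 10 10, matching the commented example for num=10. In the sort block, $a and $b are simply the two keys being compared, which are the same numbers that $num held while the hash was built.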
Okay, so without using hashes, here is a sample showing a crude way of getting the sort I need (annotated for clarity):
#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;
my @results = (["chpt10_2", "sent. 2", "alice", "nsubj", "animals", "protect"],
               ["chpt12_1", "sent. 54", "bob", "nsubj", "cells", "protect"],
               ["chpt25_4", "sent. 47", "carol", "nsubj", "plants", "protect"],
               ["chpt34_1", "sent. 1", "dave", "nsubj", "cells", "protect"],
               ["chpt35_1", "sent. 2", "eli", "nsubj", "cells", "protect"],
               ["chpt38_1", "sent. 1", "fred", "nsubj", "animals", "protect"],
               ["chpt54_1", "sent. 1", "greg", "nsubj", "uticle", "protect"]
              );
my @sort_results = sort {lc $a->[4] cmp lc $b->[4]} @results; ##By alphabet of arg1
my $last_word; my $current_word; my $word_count;
$sort_results[-1][6] = 1; ##This weird step is b/c last element didn't get 7th column appended
for my $j (0 .. $#sort_results) { ##[ROW][COLUMN]
    $current_word = $sort_results[$j][4]; ## current word is arg1 of whichever matchset is being looked at (alphabetical)
    if (lc $last_word eq lc $current_word) {
        $word_count++; ##If seen before, increment freq. count
    }
    else { ##new word
        if ($j != 0) ##unless it's the first row
        {
            for (my $k = 1; $k <= $word_count; $k++)
            {
                ##make a new column with freq. Each of the previous seen word will have to have the same freq. number so iterate back and make them all the same word count
                $sort_results[($j-$k)][6] = $word_count;
            }
        }
        ##Now set up for next iteration
        $last_word = $current_word;
        $word_count = 1;
    }
}
@sort_results = sort {$b->[6] <=> $a->[6]} @sort_results; ##Sort the results by the new 7th freq. column
for my $i (0 .. $#sort_results) {
    print "$sort_results[$i][0], $sort_results[$i][1]: "; ##chptnum, sent num
    print "$sort_results[$i][2]\n\n"; ##sentence
    print "gramatical relation: $sort_results[$i][3]; argument: $sort_results[$i][4]; freq: $sort_results[$i][6]\n\n\n"; ##dependency args
}
I would appreciate either a new, better way to do this (I think hashes are the way to get it done), or just an improvement on this crude code. Thanks again for all your help!
See attached code. I used the map trick again: foreach (map{$_->[4]}@results) iterates over all of the contents of column 4, and a freq hash is built. A list of references to rows is what goes into the map. The map then de-references and transforms this so that the output is a list of the contents of column 4.
The way sort works: output <--- sort{...} <--- input
is that what goes in is what comes out. What comes in are references to rows of the @results array, and the same references come out, reordered. What sort needs is a way to compare 2 rows: row A < row B, row A equal to row B, or row A > row B. The function that provides the comparison can be anything you want as long as it produces a consistent result (the answer reverses if a and b are swapped).
So I look up the value of col 4 for, say, row A, ask the frequency hash what the frequency is, and compare that result with the same computation for row B. In the case of a tie, I use an alphabetic comparison of column 0. Note that I swapped a and b in the frequency comparison to get the highest frequency first, while column 0 is sorted in ascending order.
The way that the sort decider function is written may appear a bit strange, but it just returns -1, 0 or 1 depending upon how row A and row B compare.
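A minimal sketch of such a decider, assuming made-up two-column rows (label, word) and a hypothetical sub name by_freq_then_label. Writing the comparison as a named sub shows that it is an ordinary function returning -1, 0 or 1:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# <=> and cmp return -1, 0 or 1 depending on how their operands compare.
printf "%d %d %d\n", 1 <=> 2, 2 <=> 2, 3 <=> 2;    # prints "-1 0 1"

# Hypothetical rows: [label, word]; not the thread's data.
my @rows = (["c", "cells"], ["a", "plants"], ["b", "cells"]);

my %freq;
$freq{ $_->[1] }++ for @rows;

# sort sets the package variables $a and $b before each call.
# "or" only evaluates the second comparison when the first returns 0 (a tie).
sub by_freq_then_label {
    $freq{ $b->[1] } <=> $freq{ $a->[1] }    # higher frequency first
        or
    $a->[0] cmp $b->[0];                     # tie: label ascending
}

my @sorted = sort by_freq_then_label @rows;
print join(" ", map { $_->[0] } @sorted), "\n";    # prints "b c a"
```

The two "cells" rows come first because their word occurs twice, and within that group the labels are in ascending order.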
It is completely legal to assign the sorted result set back to the input variable, and I did that. To get your printout, just do the column 4 lookup in the freq hash to get the frequency. The order of my @results jibes with the order of your output.
For printing, of course you can access each element as a 2-D coordinate, but it is usually better to iterate over the rows with a row reference like this:
foreach my $row (@results)
{
    print "$row->[0] $row->[1]\n";
}
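Putting the last two points together, here is a sketch of the printout: iterate by row reference and look up column 4 in the frequency hash. The two rows and the hard-coded %freq are a cut-down stand-in for the full data, and the variable names for the columns are guesses based on the comments in your code:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Cut-down stand-in for the sorted @results (two of the seven rows).
my @results = (
    ["chpt12_1", "sent. 54", "bob",   "nsubj", "cells",   "protect"],
    ["chpt10_2", "sent. 2",  "alice", "nsubj", "animals", "protect"],
);
# Hard-coded here; normally built by counting column 4 over all rows.
my %freq = (cells => 3, animals => 2);

my @lines;
foreach my $row (@results) {
    # Column meanings are assumed from the comments in the question.
    my ($chapter, $sentence, $text, $relation, $arg) = @$row;
    my $count = $freq{ lc $arg };    # frequency lookup by column 4
    push @lines, sprintf("%s, %s: %s; relation: %s; argument: %s; freq: %d",
                         $chapter, $sentence, $text, $relation, $arg, $count);
}
print "$_\n" for @lines;
```

Unpacking the row into named variables is a design choice: it keeps the column numbers in one place instead of scattering [$i][4] style indexes through the print statements.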
I think the following code does what you want...
#!/usr/bin/perl
use warnings;
use strict;
use Data::Dumper;
use Data::Dump qw(pp);
my @results = (["chpt10_2", "sent. 2", "alice", "nsubj", "animals", "protect"],
               ["chpt12_1", "sent. 54", "bob", "nsubj", "cells", "protect"],
               ["chpt25_4", "sent. 47", "carol", "nsubj", "plants", "protect"],
               ["chpt34_1", "sent. 1", "dave", "nsubj", "cells", "protect"],
               ["chpt35_1", "sent. 2", "eli", "nsubj", "cells", "protect"],
               ["chpt38_1", "sent. 1", "fred", "nsubj", "animals", "protect"],
               ["chpt54_1", "sent. 1", "greg", "nsubj", "uticle", "protect"]
              );
my %freq;
foreach ( map{$_->[4]}@results) #feeds in list of animals, cells, uticle, etc.
{
    $freq{lc $_}++;
}
@results = sort {$freq{lc $b->[4]} <=> $freq{lc $a->[4]} #freq order
                     or
                 $a->[0] cmp $b->[0]                     #text col 0
                } @results;
print pp(\@results);
__END__
[
["chpt12_1", "sent. 54", "bob", "nsubj", "cells", "protect"],
["chpt34_1", "sent. 1", "dave", "nsubj", "cells", "protect"],
["chpt35_1", "sent. 2", "eli", "nsubj", "cells", "protect"],
["chpt10_2", "sent. 2", "alice", "nsubj", "animals", "protect"],
["chpt38_1", "sent. 1", "fred", "nsubj", "animals", "protect"],
["chpt25_4", "sent. 47", "carol", "nsubj", "plants", "protect"],
["chpt54_1", "sent. 1", "greg", "nsubj", "uticle", "protect"],
]