nefertiti has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks, I was looking for a solution to my problem of getting the top x matches for each error category. I already have the entire table stored in an array - @dataarray. I used a hash to get the count for each category. Colum 3 of the array holds the error code. The below is a simple code showing how I get the count for each category.
my %error = ( 'disk' => 0, 'cpu' => 0, 'hdd' => 0, 'graphics' => 0, 'keyboard' => 0, '' => 0 #no error code ); for my $i (0 .. $#dataarray) { $error{$dataarray[$i][3]}++; }
Column 0 and column 1 holds the computer location and name (respectively). They are unique only as a pair. Can any experts advice how I can get the top x matches for each category? In addition, I also need the error occurance count for each of the top x computer location/name. Thanks. Nefertiti

Replies are listed 'Best First'.
Re: Complex sorting and counting question.
by Corion (Patriarch) on Sep 18, 2008 at 14:40 UTC
      Hi Monks, Below is the code provided by someone. What I do'nt get is the %code = map onwards. I'm not sure what's he's trying to get. I have difficulty understanding this part. Thanks for the help.
      my $query = $db->prepare("select code, username, userid from transacti +ons"); $query->execute(); my %code = map { $_ => [] } 0 .. 13; while (my $r = $qeury->fetchrow_hashref()) { my $c = $$r{code}; push @{$code{$c}}, $r; } # For each code $c, $code{$c} is now an arrayref which # contains the relevant records (as hashrefs). The # number of elements in the array equals the number # of occurances. Now, for the second part... my %topfive; for my $c (keys %code) { my %user; for my $row (@{$code{$c}}) { # Note: this assumes that the userid field # is unique to each user. $user{$$row{userid}}{count}++; $user{$$row{userid}}{name} = $$row{username}; } $topfive{$c} = [(sort { $$b[2] <=> $$a[2] } map { [$_, $user{$_}{name}, $user{$_}{count}] } keys %user)[0 .. 4]]; }

        Maybe you should ask the monk who wrote it directly then? That's why we have threaded replies here. Anyway:

        This prepares and executes an SQL statement, see DBI:

        my $query = $db->prepare("select code, username, userid from transacti +ons"); $query->execute();

        This initializes the %code hash with the keys from 0 to 13. See perlop about the range operator, map and then Data::Dumper for how to inspect data structures.

        my %code = map { $_ => [] } 0 .. 13;

        This line contains a typo that use strict; will catch for you. See strict and likely then Coping with Scoping.

        while (my $r = $qeury->fetchrow_hashref()) {

        This loop overall fetches each retrieved hash and puts it into the corresponding array in %code:

        my $c = $$r{code}; push @{$code{$c}}, $r; } # For each code $c, $code{$c} is now an arrayref which # contains the relevant records (as hashrefs). The # number of elements in the array equals the number # of occurances. Now, for the second part...

        This code determines the top five userids.

        my %topfive; for my $c (keys %code) { my %user; for my $row (@{$code{$c}}) { # Note: this assumes that the userid field # is unique to each user. $user{$$row{userid}}{count}++; $user{$$row{userid}}{name} = $$row{username}; } $topfive{$c} = [(sort { $$b[2] <=> $$a[2] } map { [$_, $user{$_}{name}, $user{$_}{count}] } keys %user)[0 .. 4]]; }

        I already recommended pushing more work back into the database instead of manually sorting and counting.