remluvr has asked for the wisdom of the Perl Monks concerning the following question:

Hi everyone.
I have a new question. I've written some code and I need to modify it in order to make it work with a new input I have. My code was a simple one, used to compute Average Precision on some data.
Now my data has changed: I have a list like the one below, and I need the code to compute Average Precision separately for each distinct element.
My input is the following:

1 acacia-n hyper tree-n 0.838364743354488
2 acacia-n hyper plant-n 0.740581661563839
3 acacia-n mero wood-n 0.687569086370938
4 acacia-n mero flower-n 0.650909477491374
5 acacia-n coord oak-n 0.610092594991099
6 acacia-n coord pine-n 0.537690234715029
7 acacia-n mero branch-n 0.510899195919917
8 acacia-n mero root-n 0.491198721010063
9 acacia-n coord willow-n 0.481680704877001
1 ant-n hyper animal-n 0.634580215370739
2 ant-n hyper insect-n 0.53621081509255
3 ant-n mero head-n 0.535980012877533
4 ant-n mero body-n 0.505827598873949
5 ant-n coord bee-n 0.481918895790599
1 apricot-n hyper fruit-n 0.76748797529242
2 apricot-n coord apple-n 0.685155565667883
3 apricot-n mero juice-n 0.560337418489082
4 apricot-n coord banana-n 0.559525300446683

My output (given that I compute AP for hyper) should be like the following:

acacia-n hyper 1
ant-n hyper 1
apricot-n hyper 1

I've picked lucky cases in which AP is 1!
My code at the moment is the following:

my $conteggio_trovato=0;
my $conteggio_hyper=0;
my $precision_glob=0;
my $precision=0;
my $average_precision=0;
my ($rank, $u, $relaz, $v, $score);
while (<INPUT>) {
    ($rank, $u, $relaz, $v, $score) = split;
    if ($relaz eq "hyper"){
        print "REl:".$relaz;
        $conteggio_hyper++;
        my $precision = $conteggio_trovato/$rank;
        print "precision:".$precision."\n";
        $precision_glob=$precision_glob+$precision;
    }
    $conteggio_trovato++;
    # $precision_glob=$precision_glob+$precision;
}
#$precision_glob=$precision_glob+$precision;
$average_precision=$precision_glob/$conteggio_hyper;
#print "precision totale: ".$precision_glob."\n";
#print "Conteggio: ".$conteggio_hyper."\n";
print $u."\t"."hyper"."\t".$average_precision."\n";

How can I modify it?
Thanks!
Giulia

Replies are listed 'Best First'.
Re: Iterate over multiple data
by kcott (Archbishop) on Mar 08, 2012 at 19:32 UTC

    I can see you've had problems with the code. It would have been useful to provide the algorithm for calculating what you refer to as Average Precision. I've made a guess at what you're attempting to achieve in the following script. If it's not right, please provide more concrete information.

    #!/usr/bin/env perl

    use strict;
    use warnings;

    my %result = ();
    my $wanted_relaz = q{hyper};

    my $last_u = q{};
    my $u_count = 0;
    my $relaz_count = 0;
    my $precision = 0;

    my ($rank, $u, $relaz);

    while (<DATA>) {
        ($rank, $u, $relaz, undef) = split;

        if ($u ne $last_u) {
            if ($last_u) {
                $result{$last_u} = $precision / $relaz_count;
            }

            $last_u = $u;
            $u_count = 0;
            $relaz_count = 0;
            $precision = 0;
        }

        ++$u_count;

        if ($relaz eq $wanted_relaz) {
            ++$relaz_count;
            $precision += $u_count / $rank;
        }
    }

    $result{$u} = $precision / $relaz_count;

    for my $key (sort keys %result) {
        print qq{$key\t$wanted_relaz\t$result{$key}\n};
    }

    __DATA__
    1 acacia-n hyper tree-n 0.838364743354488
    2 acacia-n hyper plant-n 0.740581661563839
    3 acacia-n mero wood-n 0.687569086370938
    4 acacia-n mero flower-n 0.650909477491374
    5 acacia-n coord oak-n 0.610092594991099
    6 acacia-n coord pine-n 0.537690234715029
    7 acacia-n mero branch-n 0.510899195919917
    8 acacia-n mero root-n 0.491198721010063
    9 acacia-n coord willow-n 0.481680704877001
    1 ant-n hyper animal-n 0.634580215370739
    2 ant-n hyper insect-n 0.53621081509255
    3 ant-n mero head-n 0.535980012877533
    4 ant-n mero body-n 0.505827598873949
    5 ant-n coord bee-n 0.481918895790599
    1 apricot-n hyper fruit-n 0.76748797529242
    2 apricot-n coord apple-n 0.685155565667883
    3 apricot-n mero juice-n 0.560337418489082
    4 apricot-n coord banana-n 0.559525300446683

    Here's the output:

    $ pm_avg_precision.pl
    acacia-n hyper 1
    ant-n hyper 1
    apricot-n hyper 1

    -- Ken

Re: Iterate over multiple data
by JavaFan (Canon) on Mar 08, 2012 at 17:47 UTC
    Can you explain what you want the output to be, and how the calculations go? And what are the different elements in this case?

      My output (given that I compute AP for hyper) should be like the following:

      acacia-n hyper 1
      ant-n hyper 1
      apricot-n hyper 1

      Basically, for each distinct element in the data I submitted as input (acacia-n, ant-n, apricot-n), I should repeat the calculations my code does. The only problem is that I don't know how to make the code iterate over the acacia-n lines only and do the calculations on that data, then do the same for ant-n alone, and then for apricot-n alone.
      I hope I made it a little clearer.
      Thanks again,
      Giulia

        Seems you want to use a hash, keyed on $u where you can keep track of the different values you need.
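The hash suggestion above could be sketched roughly as follows. This is only a minimal illustration, not code from the thread: the helper name `average_precision_by_u` and the `%stats` layout are made up for the example, and the formula is an assumption (precision at a matching row is taken to be the number of matching rows seen so far for that `$u`, divided by the row's rank), since the exact calculation was never spelled out. It needs Perl 5.10 or later for the `//=` operator.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): takes the relation of interest
# and a list of input lines, returns a hash ref of $u => average precision.
sub average_precision_by_u {
    my ($wanted, @lines) = @_;

    # One entry per $u: how many wanted rows were seen, and the running
    # sum of the precision values computed at those rows.
    my %stats;

    for my $line (@lines) {
        my ($rank, $u, $relaz) = split ' ', $line;
        next unless defined $relaz && $relaz eq $wanted;

        my $s = $stats{$u} //= { hits => 0, sum => 0 };
        $s->{hits}++;

        # ASSUMED formula: wanted rows seen so far for this $u, over the rank.
        $s->{sum} += $s->{hits} / $rank;
    }

    # Entries exist only for $u values with at least one wanted row,
    # so hits is never zero here.
    my %ap;
    $ap{$_} = $stats{$_}{sum} / $stats{$_}{hits} for keys %stats;
    return \%ap;
}

# A few rows taken from the input in the question.
my @lines = (
    '1 acacia-n hyper tree-n 0.838364743354488',
    '2 acacia-n hyper plant-n 0.740581661563839',
    '3 acacia-n mero wood-n 0.687569086370938',
    '1 ant-n hyper animal-n 0.634580215370739',
    '2 ant-n hyper insect-n 0.53621081509255',
    '1 apricot-n hyper fruit-n 0.76748797529242',
    '2 apricot-n coord apple-n 0.685155565667883',
);

my $ap = average_precision_by_u('hyper', @lines);
print "$_\thyper\t$ap->{$_}\n" for sort keys %$ap;
```

Because the counters live in the hash rather than in variables that are reset at each group boundary, the rows for one `$u` do not have to be contiguous in the input. An element with no rows for the wanted relation simply does not appear in the result, which also avoids dividing by zero.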