Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I really need some wisdom! I am not sure how to go about this problem. I have a table which I have turned into an array; it looks something like this:
1 info
1 info
1 info
2 info
3 info
3 info
4 info
4 info
4 info
4 info
I want to print the results to the screen in order of frequency of the number in $array[0], e.g.
output:
4 info
1 info
3 info
2 info
so the results with the most hits are returned at the top. I am new to Perl and can't think of how to do it. Would I need to count along element $array[1]? Thanks

Edit kudra, 2002-06-06 Changed title, replaced literal brackets outside of code tags

Replies are listed 'Best First'.
Re: wisdom needed
by Abigail-II (Bishop) on Jun 06, 2002 at 09:37 UTC
    Using hashes is one way to tackle the problem, as others have shown. But since your example suggests that the data is clumped (that is, equal numbers follow each other), you don't need a hash, as the program below shows.

    I'm just showing it here for the sake of showing an alternative way. I don't expect it to be faster - hashes are pretty fast, and the sort is likely to dominate the running time anyway.

    Abigail

    #!/usr/bin/perl

    use strict;
    use warnings 'all';

    my @info;
    while (<DATA>) {
        my ($num, $info) = split ' ', $_, 2;
        if (@info && $num == $info [-1] [0]) {
            $info [-1] [2] ++;
        }
        else {
            push @info => [$num, $info, 1]
        }
    }

    print map {"@{$_}[0, 1]"} sort {$b -> [2] <=> $a -> [2]} @info;

    __DATA__
    1 info1
    1 info1
    1 info1
    2 info2
    3 info3
    3 info3
    4 info4
    4 info4
    4 info4
    4 info4

    $ ./count
    4 info4
    1 info1
    3 info3
    2 info2
    $
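The no-hash approach above only works while equal numbers sit next to each other. As a rough sketch of that dependency (the helper name `count_runs` and the `@scattered` sample are made up for illustration, not part of the reply), the following counts adjacent runs and shows that unclumped input has to be sorted first:

```perl
use strict;
use warnings;

# Hypothetical helper: collapse adjacent equal numbers into [number, count] runs.
sub count_runs {
    my @runs;
    for my $num (@_) {
        if (@runs && $runs[-1][0] == $num) {
            $runs[-1][1]++;            # same number as the previous run: extend it
        }
        else {
            push @runs, [$num, 1];     # a new number starts a new run
        }
    }
    return @runs;
}

# Illustrative sample: the same ten numbers as the thread's data, but not clumped.
my @scattered = (4, 1, 4, 1, 3, 4, 2, 1, 3, 4);

# Without sorting, no two neighbours are equal, so every element is its own run.
my @runs_scattered = count_runs(@scattered);

# Sorting first restores the clumping the technique relies on.
my @runs_sorted = count_runs(sort { $a <=> $b } @scattered);

# Rank by run length, highest first; ties fall back to the smaller number.
my @ranking = map  { $_->[0] }
              sort { $b->[1] <=> $a->[1] or $a->[0] <=> $b->[0] }
              @runs_sorted;

print "runs without sorting: ", scalar(@runs_scattered), "\n";
print "ranking after sorting: @ranking\n";
```

If the input is not already grouped, the extra sort is the price of avoiding the hash, which is in line with the remark that this is an alternative rather than a speed-up.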
Re: wisdom needed
by Anonymous Monk on Jun 06, 2002 at 10:38 UTC
    Hi monks, not sure if I explained my problem clearly enough, sorry! Anyway, I'll show you what I've done and it might make it easier to correct me. I am trying to count every number in $array[0] (shown below); all of the same numbers are the same thing (e.g. all the 1's are the same), I just need to know which occurs the most. I have tried counting every number etc. but I can't seem to find which is the most frequent; I just get the number of numbers present returned to me. Hope someone can help.
    1 1 1 1 2 2 3 4 4 4
    i want the number which occurs the most to be ranked at the top of the output (only one of each number needs to be ranked)
    e.g 1 4 2 3
    n.b
    #! /usr/local/bin/perl -w

    use strict;

    my $num_of_params;
    $num_of_params = @ARGV;
    if ($num_of_params < 2) {
        die ("\n You haven't entered enough parameters !! \n\n");
    }

    open (FILE, $ARGV[0]) or die "unable to open file";
    open (OUTFILE, ">$ARGV[1]");

    my $line;
    my @array;
    my $number;
    my $count=0;

    while (<FILE>) {
        $line = $_;
        chomp ($line);
        @array = ();
        @array = split (/\s+/, $line);
        foreach $number ($array[0]) {
            ++$count;
            print "$count\n";
            print OUTFILE "$count\n";
            if ($number != $number-1 ) {
                print "$number\n";
                print OUTFILE "$number\n";
            }
        }
    }

      sub occurence {
          my %count;
          $count{$_}++ for @_;
          return sort { $count{$b} <=> $count{$a} } keys %count;
      }

      print "$_\n" for occurence qw(1 1 1 2 2 3 4 4 4 4);
      In the future, please reply to the node you're replying to (sounds logical, doesn't it?).

      - Yes, I reinvent wheels.
      - Spam: Visit eurotraQ.
      

      The principle of counting frequency is based on this type of code:

      my @array = (1,1,1,2,3,3,4,4,4,4);
      my %count;
      $count{$_}++ for @array;
      my @ordered = sort {$count{$b} <=> $count{$a}} keys %count;
      print map {"$_\n"} @ordered;

      The hash stores each item as a key and its count as the value. The keys are then sorted by comparing their counts with each other, highest count first.
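To check that the most frequent value really does come out on top, it can help to print each count next to its value. The following is only a sketch of that idea: the tab-separated output format and the tie-break on the value itself are assumptions added here, not part of the code above.

```perl
use strict;
use warnings;

my @array = (1, 1, 1, 2, 3, 3, 4, 4, 4, 4);

# Tally every value first; nothing is printed inside this loop.
my %count;
$count{$_}++ for @array;

# Highest count first. The "or $a <=> $b" tie-break is an addition so that
# values with equal counts come out in a predictable order.
my @ordered = sort { $count{$b} <=> $count{$a} or $a <=> $b } keys %count;

# One line per distinct value: the value, a tab, then how often it occurred.
print "$_\t$count{$_}\n" for @ordered;
```

Because the printing happens once, after all the counting is finished, each distinct value appears exactly one time in the output.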

      --
      Steve Marvell

        Thanks marvell, this runs well, but the only problem is that the output prints every number (I only want one of each number printed). It also prints every number again when it comes across a new number! E.g.
        @array = (1,1,1,1,2,2,3,3,3,4,4,4,5,5)
        OUTPUT 1 1 1 1 2 1 2 1 3 2 1 3 2 1 etc.
        I am also not sure if it is printing the most frequent value at the top or not. Any suggestions? :-) The help so far is much appreciated.
Re: wisdom needed
by marvell (Pilgrim) on Jun 06, 2002 at 09:11 UTC

    How about ...

    my %counter;
    $counter{$_}++ while (<DATA>);
    print sort {$counter{$b} <=> $counter{$a}} keys %counter;

    __DATA__
    1 info
    1 info
    1 info
    2 info
    3 info
    3 info
    4 info
    4 info
    4 info
    4 info

    This increments a counter for each distinct line, then sorts the lines in descending order of count.

    Maybe this is close to what you want. Let us know if it needs any changes.

    --
    Steve Marvell

Re: wisdom needed
by Zaxo (Archbishop) on Jun 06, 2002 at 09:20 UTC

    First, read your data in a way that you get:

    my @data = (
        [1,"info"],
        [1,"info"],
        [1,"info"],
        [2,"info"],
        [3,"info"],
        [3,"info"],
        [4,"info"],
        [4,"info"],
        [4,"info"],
        [4,"info"],
    );

    Then make a hash keyed on both values (assuming the 'info' can have another value):

    my %freq = ();
    foreach my $datum (@data) {
        $freq{ sprintf("%d %s", @$datum) }++;
    }

    Finally we print out the keys sorted by frequency. Setting $, to $/ puts a line break in after each element:

    {
        local $,=$/;
        print sort { $freq{$b} <=> $freq{$a} } keys %freq;
    }

    This sort of frequency counting hash is a pretty common idiom.
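The first step, reading the data into an array of pairs, is left to the reader above. One possible sketch follows, assuming each input line is a number, whitespace, and then the info text; the in-memory `$input` string stands in for a real file, and the string tie-break in the sort is an addition for predictable output.

```perl
use strict;
use warnings;

# Sample input held in a string so the sketch is self-contained;
# a real script would open a file here instead.
my $input = join '', map { "$_ info\n" } 1, 1, 1, 2, 3, 3, 4, 4, 4, 4;
open my $fh, '<', \$input or die "cannot open in-memory handle: $!";

# Build the array of [number, info] pairs, one pair per line.
my @data;
while (my $line = <$fh>) {
    chomp $line;
    next unless $line =~ /\S/;                 # skip blank lines
    my ($num, $info) = split ' ', $line, 2;    # limit 2 keeps spaces inside info
    push @data, [$num, $info];
}
close $fh;

# Count each "number info" combination, as in the hash keyed on both values.
my %freq;
$freq{"$_->[0] $_->[1]"}++ for @data;

# Most frequent first; equal counts are ordered by the key string.
my @ranked = sort { $freq{$b} <=> $freq{$a} or $a cmp $b } keys %freq;
print "$_\n" for @ranked;
```

Passing a limit of 2 to `split` means an info field that itself contains spaces stays in one piece.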

    After Compline,
    Zaxo

Re: wisdom needed
by Joost (Canon) on Jun 06, 2002 at 09:19 UTC
    #!/usr/bin/perl -w

    use strict;

    my @array = qw(
        1 info
        1 info
        1 info
        2 info
        3 info
        3 info
        4 info
        4 info
        4 info
        4 info
    );

    my %score;
    for (my $i=0;$i<$#array;$i+=2) {    # get numbers from @array
        $score{ $array[$i] }++;         # count the numbers
    }

    my %infos = @array;

    my @output = map { "$_\t$infos{$_}\n"; }
                 sort { $score{$b} <=> $score{$a} }
                 keys %score;

    print @output;
    Although I would rather use some other data structure to store the data, like
    my @rated_data = (
        [$number1,$info1],
        [$number2,$info2],
        [$number3,$info3],
    );
    and so on. This will make it a bit easier to sort and print later on.
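As a sketch of how such an array of pairs might be counted and printed (the sample rows, the `%info_for` hash, and the rule of keeping the first info seen for each number are all assumptions made for illustration):

```perl
use strict;
use warnings;

# Illustrative rows in the [number, info] layout suggested above.
my @rated_data = (
    [1, 'info1'], [1, 'info1'], [1, 'info1'],
    [2, 'info2'],
    [3, 'info3'], [3, 'info3'],
    [4, 'info4'], [4, 'info4'], [4, 'info4'], [4, 'info4'],
);

my (%score, %info_for);
for my $row (@rated_data) {
    my ($number, $info) = @$row;
    $score{$number}++;                          # count occurrences of each number
    $info_for{$number} = $info                  # remember the first info seen
        unless exists $info_for{$number};
}

# One output line per number, most frequent first; ties go to the smaller number.
my @output = map  { "$_\t$info_for{$_}\n" }
             sort { $score{$b} <=> $score{$a} or $a <=> $b }
             keys %score;

print @output;
```

With pairs, there is no need to step through the array two elements at a time, and the number and its info cannot drift out of alignment.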
    -- Joost
    downtime n. The period during which a system is error-free and immune from user input.