wisdom needed: sorting an array

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Re: wisdom needed by Abigail-II (Bishop) on Jun 06, 2002 at 09:37 UTC
Using hashes is one way to tackle the problem, as others have shown. But your example suggests that the data is clumped (that is, equal numbers follow each other), so you don't need a hash, as the program below shows. I'm showing it here only as an alternative. I don't expect it to be faster: hashes are pretty fast, and the sort is likely to dominate the running time anyway.

Abigail

```perl
#!/usr/bin/perl

use strict;
use warnings 'all';

my @info;
while (<DATA>) {
    my ($num, $info) = split ' ', $_, 2;
    if (@info && $num == $info [-1] [0]) {
        # Same number as the previous line: bump its count.
        $info [-1] [2] ++;
    }
    else {
        # New number: record [number, info, count].
        push @info => [$num, $info, 1]
    }
}

# Most frequent first; $info still carries the trailing newline.
print map {"@{$_}[0, 1]"} sort {$b -> [2] <=> $a -> [2]} @info;

__DATA__
1 info1
1 info1
1 info1
2 info2
3 info3
3 info3
4 info4
4 info4
4 info4
4 info4
```

```
$ ./count
4 info4
1 info1
3 info3
2 info2
$
```
Re: wisdom needed by Anonymous Monk on Jun 06, 2002 at 10:38 UTC
Hi monks, not sure if I explained my problem clearly enough, sorry! Anyway, I'll show you what I've done and it might make it easier to correct me. I am trying to count every number in $array[0] (shown below); all of the same numbers are the same thing (e.g. all the 1's are the same), and I just need to know which occurs the most. I have tried counting every number etc., but I can't seem to find which is the most frequent; I just get the number of numbers present returned to me. Hope someone can help.

`1 1 1 1 2 2 3 4 4 4`

I want the number which occurs the most to be ranked at the top of the output (only one of each number needs to be ranked), e.g.

`1 4 2 3`

N.B. my code so far:

```perl
#! /usr/local/bin/perl -w

use strict;

my $num_of_params;
$num_of_params = @ARGV;
if ($num_of_params < 2) {
    die ("\n You haven't entered enough parameters !! \n\n");
}

open (FILE, $ARGV[0]) or die "unable to open file";
open (OUTFILE, ">$ARGV[1]");

my $line;
my @array;
my $number;
my $count=0;

while (<FILE>) {
    $line = $_;
    chomp ($line);
    @array = ();
    @array = split (/\s+/, $line);
    foreach $number ($array[0]) {
        ++$count;
        print "$count\n";
        print OUTFILE "$count\n";
        if ($number != $number-1 ) {
            print "$number\n";
            print OUTFILE "$number\n";
        }
    }
}
```
Re: Re: wisdom needed by Juerd (Abbot) on Jun 06, 2002 at 10:43 UTC
```perl
sub occurence {
    my %count;
    $count{$_}++ for @_;
    return sort { $count{$b} <=> $count{$a} } keys %count;
}

print "$_\n" for occurence qw(1 1 1 2 2 3 4 4 4 4);
```

In the future, please reply to the node you're replying to (sounds logical, doesn't it?).

- Yes, I reinvent wheels.
- Spam: Visit eurotraQ.
Re: Re: wisdom needed by marvell (Pilgrim) on Jun 06, 2002 at 10:50 UTC
The principle of counting frequency is based on this type of code:

```perl
my @array = (1,1,1,2,3,3,4,4,4,4);
my %count;
$count{$_}++ for @array;
my @ordered = sort {$count{$b} <=> $count{$a}} keys %count;
print map {"$_\n"} @ordered;
```

The hash stores each item as a key and its count as the value. The keys are then sorted by comparing their counts with each other, highest count first.

-- Steve Marvell
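One thing the hash approach leaves open is what happens when two items have the same count: their relative order then follows hash key order, which Perl does not guarantee. A minimal sketch of one way to pin that down, assuming the items are numeric and that ties should fall back to ascending numeric order (the sub name `rank_by_frequency` is made up for illustration):

```perl
use strict;
use warnings;

# Hypothetical helper: return the distinct items, most frequent first.
# Ties are broken by ascending numeric value (an assumption; pick
# whatever secondary order suits your data).
sub rank_by_frequency {
    my %count;
    $count{$_}++ for @_;
    return sort { $count{$b} <=> $count{$a} or $a <=> $b } keys %count;
}

my @ordered = rank_by_frequency(1, 1, 1, 2, 3, 3, 4, 4, 4, 4);
print "$_\n" for @ordered;    # prints 4, 1, 3, 2, one per line
```

For non-numeric items, `$a cmp $b` would be the analogous tie-breaker.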
Re: Re: Re: wisdom needed by Anonymous Monk on Jun 06, 2002 at 14:15 UTC
Thanks marvell, this runs well, but the only problem is that the output prints every number (I only want one of each number printed); it also prints every number again when it comes across a new number! e.g. `@array = (1,1,1,1,2,2,3,3,3,4,4,4,5,5) OUTPUT 1 1 1 1 2 1 2 1 3 2 1 3 2 1 ETC`

I am also not sure if it is printing the most frequent value at the top or not. Any suggestions? :-) The help so far is much appreciated.
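That pattern (the whole ranking printed again and again, growing whenever a new number shows up) is what you would typically see if the sort and print sit inside the loop that reads the data. The modified script isn't shown, so that is only a guess. A minimal sketch of keeping the two phases apart, assuming the number is the first whitespace-separated field on each line; the sub name `rank_first_column` and the sample input are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: count the first whitespace-separated field of
# every line read from $fh, then return the distinct values, most
# frequent first. The sort happens once, after all input is consumed.
sub rank_first_column {
    my ($fh) = @_;
    my %count;
    while (my $line = <$fh>) {
        my ($number) = split ' ', $line;   # first field only
        next unless defined $number;       # skip blank lines
        $count{$number}++;                 # count only; no printing here
    }
    # Ties fall back to ascending numeric order (an assumption).
    return sort { $count{$b} <=> $count{$a} or $a <=> $b } keys %count;
}

# Demo input held in a string; a real script would open $ARGV[0] instead.
my $sample = join '', map { "$_ info\n" } 1, 1, 1, 1, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5;
open my $in, '<', \$sample or die "cannot open in-memory file: $!";
print "$_\n" for rank_first_column($in);   # prints 1, 3, 4, 2, 5, one per line
close $in;
```

Each distinct number appears exactly once because the output loop walks the hash keys, not the input lines.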
Re: wisdom needed by marvell (Pilgrim) on Jun 06, 2002 at 09:11 UTC
How about ...

```perl
my %counter;
$counter{$_}++ while (<DATA>);
print sort {$counter{$b} <=> $counter{$a}} keys %counter;
__DATA__
1 info
1 info
1 info
2 info
3 info
3 info
4 info
4 info
4 info
4 info
```

This increments a counter for each distinct line, then sorts the lines in descending order of count. Maybe this is close to what you want. Let us know if it needs any changes.

-- Steve Marvell
Re: wisdom needed by Zaxo (Archbishop) on Jun 06, 2002 at 09:20 UTC
First, read your data in a way that you get:

```perl
my @data = (
    [1,"info"],
    [1,"info"],
    [1,"info"],
    [2,"info"],
    [3,"info"],
    [3,"info"],
    [4,"info"],
    [4,"info"],
    [4,"info"],
    [4,"info"],
);
```

Then make a hash keyed on both values (assuming the 'info' can differ from one number to the next):

```perl
my %freq = ();
foreach my $datum (@data) {
    $freq{ sprintf("%d %s", @$datum) }++;
}
```

Finally we print out the keys sorted by frequency. Setting $, to $/ puts a line break between the elements (the last one gets none, since $, is only a separator):

```perl
{
    local $,=$/;
    print sort { $freq{$b} <=> $freq{$a} } keys %freq;
}
```

This sort of frequency counting hash is a pretty common idiom.

After Compline,
Zaxo
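Since `$,` is printed only between list items, the final line of that output ends without a newline unless `$\` is set as well. A minimal sketch of an alternative that appends the newline to every key explicitly, assuming a `%freq` hash shaped like the one above; the sub name `frequency_report`, the sample counts, and the string tie-breaker are illustrative choices, not part of the original post:

```perl
use strict;
use warnings;

# Frequency table in the same shape as %freq above (sample values assumed).
my %freq = ('1 info' => 3, '2 info' => 1, '3 info' => 2, '4 info' => 4);

# Hypothetical helper: build the report as one string, one key per line,
# most frequent first, with a newline after every line including the last.
sub frequency_report {
    my ($freq) = @_;
    # Equal counts fall back to string order so the result is repeatable.
    my @keys = sort { $freq->{$b} <=> $freq->{$a} or $a cmp $b } keys %$freq;
    return join '', map { "$_\n" } @keys;
}

print frequency_report(\%freq);
```

This avoids touching the global output variables at all, at the cost of a `map`.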
Re: wisdom needed by Joost (Canon) on Jun 06, 2002 at 09:19 UTC
```perl
#!/usr/bin/perl -w

use strict;

my @array = qw(
    1 info
    1 info
    1 info
    2 info
    3 info
    3 info
    4 info
    4 info
    4 info
    4 info
);

my %score;
for (my $i=0;$i<$#array;$i+=2) {    # get numbers from @array
    $score{ $array[$i] }++;         # count the numbers
}

my %infos = @array;

my @output = map { "$_\t$infos{$_}\n"; }
             sort { $score{$b} <=> $score{$a} }
             keys %score;

print @output;
```

Although I would rather use some other data structure to store the data, like

```perl
my @rated_data = (
    [$number1,$info1],
    [$number2,$info2],
    [$number3,$info3],
);
```

etcetera; this will make it a bit easier to sort and print later on.

-- Joost

downtime n. The period during which a system is error-free and immune from user input.
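To make the list-of-pairs idea concrete, here is a minimal sketch of ranking records stored that way. It assumes each number always comes with the same info string and that the numbers are numeric; the sub name `rank_records` and the sample values are made up for illustration:

```perl
use strict;
use warnings;

# Sample records in the [number, info] shape suggested above (values assumed).
my @rated_data = (
    [1, 'info1'], [1, 'info1'], [1, 'info1'],
    [2, 'info2'],
    [3, 'info3'], [3, 'info3'],
    [4, 'info4'], [4, 'info4'], [4, 'info4'], [4, 'info4'],
);

# Hypothetical helper: return "number<TAB>info" lines, most frequent
# number first, one line per distinct number.
sub rank_records {
    my (%count, %info);
    for my $record (@_) {
        my ($number, $info) = @$record;
        $count{$number}++;
        $info{$number} = $info;    # assumes one info string per number
    }
    return map  { "$_\t$info{$_}\n" }
           sort { $count{$b} <=> $count{$a} or $a <=> $b }
           keys %count;
}

print rank_records(@rated_data);
```

Keeping number and info together in one record avoids the index arithmetic needed to step through a flat list two elements at a time.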