Re: Numeric Sort for Stringified Value (How to Avoid Warning)
by Zaxo (Archbishop) on Sep 16, 2005 at 01:59 UTC
|
Instead of running perl with the -w switch, use warnings; and turn off the particular warning in a scope. That needs Perl 5.6+.
$ perl -Mwarnings -MData::Dumper -e '
> @old = ( "10.5 AA", "9 AC", "2 BB");
> @new = do {no warnings q/numeric/; sort {$b <=> $a} @old};
> print Dumper \@new;'
$VAR1 = [
'10.5 AA',
'9 AC',
'2 BB'
];
$
(Added) If you're in pre-5.6 perl, or you just want to do it this way, this works:
$ perl -w -MData::Dumper -e '
> @old = ( "10.5 AA", "9 AC", "2 BB");
> @new = do {local $^W = 0; sort {$b <=> $a} @old};
> print Dumper \@new;'
$VAR1 = [
'10.5 AA',
'9 AC',
'2 BB'
];
$
That knocks out all warnings in the scope.
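To check that the lexical form really is limited to its block, here is a minimal sketch that collects warnings through a __WARN__ handler rather than printing them. The handler and the variable names (@caught, $inside, $outside) are my own for illustration and are not part of the one-liners above.

```perl
use strict;
use warnings;

my @old = ("10.5 AA", "9 AC", "2 BB");
my @caught;                                   # warnings are collected here
local $SIG{__WARN__} = sub { push @caught, $_[0] };

# Numeric warnings are disabled only inside this do-block.
my @new = do { no warnings 'numeric'; sort { $b <=> $a } @old };
my $inside = @caught;                         # warnings raised by the sort

# Outside the block, the same kind of comparison warns again.
my $s   = "2 BB";
my $cmp = $s <=> 1;
my $outside = @caught - $inside;

print "inside: $inside, outside: $outside\n";
print "$_\n" for @new;
```

It should report no warnings from inside the block and at least one from the comparison after it.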
| [reply] [d/l] [select] |
Re: Numeric Sort for Stringified Value (How to Avoid Warning) (Use the ST)
by bobf (Monsignor) on Sep 16, 2005 at 04:27 UTC
|
my @new = map { $_->[0] }
sort { $b->[1] <=> $a->[1] }
map { [ $_, (split( /\s+/, $_, 2 ))[0] ] }
@old;
The example assumes all of your data is in the same format as it is given in your post (a number separated from the rest of the string by whitespace).
HTH
| [reply] [d/l] |
|
|
my @new = sort {(split /\s+/, $b)[0] <=> (split /\s+/, $a)[0]} @old;
And it is faster:
use Data::Dumper;
use strict;
use warnings;
my @old = ("10.5 AA", "10.6 AA", "9 AC", "2 BB");
my $t0 = time();
for (1..200000) {#yours
my @new = map { $_->[0] }
sort { $b->[1] <=> $a->[1] }
map { [ $_, (split( /\s+/, $_, 2 ))[0] ] }
@old;
}
print time() - $t0, "\n";
$t0 = time();
for (1..200000) {#mine
my @new = sort {(split /\s+/, $b)[0] <=> (split /\s+/, $a)[0]} @old;
}
print time() - $t0, "\n";
I ran it four times: yours took 13, 14, 14, and 13 seconds, while mine took 8, 10, 11, and 9 seconds.
| [reply] [d/l] [select] |
|
|
The ST is a bit more complicated, but unless we know what the actual data looks like and how much of it there is, I would hesitate to say it is over-engineered. We encourage people to post minimal examples, so I would not be surprised if the OP simplified the input data.
I am certainly no expert on benchmarking, but my tests yield opposite results. I created arrays containing 5, 10, 20, 40, and 80 elements each and compared our sort routines. I ran the code 5 times and averaged the results. The ST approach was faster in all cases, and the difference increased with the size of the array.
Array size = 5
         Rate    pg  bobf
pg    23058/s    --  -32%
bobf  33853/s   47%    --
Array size = 10
         Rate    pg  bobf
pg     8606/s    --  -51%
bobf  17506/s  103%    --
Array size = 20
         Rate    pg  bobf
pg     3099/s    --  -64%
bobf   8648/s  179%    --
Array size = 40
         Rate    pg  bobf
pg     1207/s    --  -71%
bobf   4167/s  245%    --
Array size = 80
         Rate    pg  bobf
pg      490/s    --  -75%
bobf   1987/s  305%    --
Benchmarking code and complete results:
Unless the OP is dealing with large data sets, the time difference is probably negligible. In that case, I'd recommend whatever approach the OP is most comfortable maintaining.
TMTOWTDI. :)
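For anyone who wants to repeat a comparison like this, a minimal sketch using cmpthese from the core Benchmark module might look like the following. This is not the code behind the figures above; the generated test data and the labels plain and st are assumptions made for illustration.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test data: 80 strings of the form "<number> <tag>".
my @old = map { sprintf("%.1f", rand(100)) . " T$_" } 1 .. 80;

sub sort_plain {    # split inside the comparator
    return sort { (split /\s+/, $b)[0] <=> (split /\s+/, $a)[0] } @old;
}

sub sort_st {       # Schwartzian Transform
    return map  { $_->[0] }
           sort { $b->[1] <=> $a->[1] }
           map  { [ $_, (split /\s+/, $_, 2)[0] ] } @old;
}

# Run each candidate for about one CPU second and print a comparison table.
cmpthese(-1, {
    plain => sub { my @new = sort_plain() },
    st    => sub { my @new = sort_st() },
});
```

Changing the 80 in the data generator lets you see how the gap moves with array size on your own machine.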
| [reply] [d/l] [select] |
|
|
For tiny sets, N log N is so similar to N that Schwartzian is pretty worthless. I changed
my @old = ("10.5 AA", "10.6 AA", "9 AC", "2 BB");
to
my @old = ("10.5 AA", "10.6 AA", "9 AC", "2 BB") x 100;
(and lowered the number of iterations to 2000) and yours took 2.5 times as long (25s vs 10s). Whether the ST is over-engineered or not depends on the input set.
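The reason is easier to see by counting key extractions than by timing. A rough sketch, where key_of and the counter $calls are hypothetical helpers added only for counting:

```perl
use strict;
use warnings;

my @old = ("10.5 AA", "10.6 AA", "9 AC", "2 BB") x 100;   # 400 elements
my $calls = 0;

# Hypothetical helper: extract the leading number and count each call.
sub key_of { $calls++; return (split /\s+/, $_[0], 2)[0] }

# Plain sort: the key is recomputed for both operands of every comparison.
my @plain = sort { key_of($b) <=> key_of($a) } @old;
my $plain_calls = $calls;

# Schwartzian Transform: the key is computed once per element.
$calls = 0;
my @st = map  { $_->[0] }
         sort { $b->[1] <=> $a->[1] }
         map  { [ $_, key_of($_) ] } @old;
my $st_calls = $calls;

print "plain: $plain_calls key extractions, ST: $st_calls\n";
```

The ST count is always the number of elements, while the plain count is twice the number of comparisons the sort happens to make.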
| [reply] [d/l] [select] |
Re: Numeric Sort for Stringified Value (How to Avoid Warning)
by monkfan (Curate) on Sep 16, 2005 at 01:52 UTC
|
perl -w -MData::Dumper -e '
@old = ( "10.5 AA", "9 AC", "2 BB");
@new = sort {($b =~ /(\d+)/)[0] <=> ($a =~ /(\d+)/)[0]} @old;
print Dumper \@new;
'
Hope that helps.
| [reply] [d/l] |
|
|
use Data::Dumper;
@old = ( "10.5 AA", "10.6 AA", "9 AC", "2 BB");
@new = sort {($b =~ /(\d+)/)[0] <=> ($a =~ /(\d+)/)[0]} @old;
print Dumper \@new;
Your code returns
$VAR1 = [
'10.5 AA',
'10.6 AA',
'9 AC',
'2 BB'
];
Which is wrong: 10.6 should come before 10.5, but /(\d+)/ captures only the integer part, so both compare as 10. Perl's numeric conversion already does what your code is supposed to do (if it were implemented correctly), so the regex has a performance cost for no benefit. In this particular case, I would rather turn off the warnings temporarily:
use Data::Dumper;
use strict;
use warnings;
my @old = ( "10.5 AA", "10.6 AA", "9 AC", "2 BB");
my @new;
{
no warnings;
@new = sort {$b <=> $a} @old;
}
"a" == "a"; # meaningless, other than to demonstrate that warnings are back on
print Dumper \@new;
| [reply] [d/l] [select] |
Re: Numeric Sort for Stringified Value (How to Avoid Warning)
by radiantmatrix (Parson) on Sep 16, 2005 at 16:56 UTC
|
Strings are compared character by character from the left, while numbers are compared as whole values. A string comparison is decided at the first position where the characters differ: '1' is less than '9', so '10' is less than '9' in string-land.
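A quick illustration of the difference, using plain numbers so that no warnings are involved (the variable names are only for the example):

```perl
use strict;
use warnings;

my @nums = (10, 9, 2);

my @as_strings = sort { $a cmp $b } @nums;   # compares character by character
my @as_numbers = sort { $a <=> $b } @nums;   # compares numeric values

print "cmp: @as_strings\n";   # cmp: 10 2 9
print "<=>: @as_numbers\n";   # <=>: 2 9 10
```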
If it is very important for you to be able to use string-sort without warnings (and you can't locally turn off warnings), you could try:
my @old = ( "10.5 AA", "9 AC", "2 BB");
my $max_len = 0;
for (@old) { $max_len = length($_) if length($_) > $max_len }
my @new = sort {$b cmp $a} map {
my $x='';
$x.='0' for (1..$max_len-length($_));
$x.$_;
} @old;
Dumping @new results in:
$VAR1 = [
'10.5 AA',
'0009 AC',
'0002 BB'
];
You may or may not need to trim the zeroes.
Update: a space works equally well as a 0 for this application, and might be better for post-trimming if you need to preserve leading zeroes in the source.
<-radiant.matrix->
Larry Wall is Yoda: there is no try{} (ok, except in Perl6; way to ruin a joke, Larry! ;P)
The Code that can be seen is not the true Code
"In any sufficiently large group of people, most are idiots" - Kaa's Law
| [reply] [d/l] [select] |
|
|
This breaks, e.g. on @old = ("10.5 AA", "100 NO");
10.5 < 100, but "10.5" gt "0100".
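A short sketch that reproduces the problem. It pads to the longest element as in the parent post, though the padding is written with the x operator here rather than a loop:

```perl
use strict;
use warnings;

my @old = ("10.5 AA", "100 NO");

# Left-pad every element with zeroes to the length of the longest one.
my ($max_len) = sort { $b <=> $a } map { length $_ } @old;
my @padded = map { my $pad = '0' x ($max_len - length $_); "$pad$_" } @old;

# Descending string sort, as in the parent post.
my @sorted = sort { $b cmp $a } @padded;
print "$_\n" for @sorted;
# 10.5 AA
# 0100 NO
```

The entry for 100 ends up last even though it is the larger number, because the padding is applied to the whole string rather than to the numeric part.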
| [reply] [d/l] |
|
|
my @old = ("10.5 AA", "100 NO");
my @new = map { join(' ',@$_) } sort {
($b->[0] <=> $a->[0]) || ($b->[1] cmp $a->[1])
} map { m/(.*?)\s(.*)/ ? [$1, $2] : () } @old;
That's a bit cryptic. Basically, split each component of @old into number and string "columns", then use a two-criteria sort (sort on number, then string) and rejoin the "columns" into a string again.
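The example data never exercises the second criterion, so here is a sketch of the same two-column idea on made-up data that contains a tie on the number. It uses split with a limit of 2 instead of the regex, which assumes every element has the form "number, whitespace, rest".

```perl
use strict;
use warnings;

# Hypothetical data with a tie on the numeric column.
my @old = ("9 AC", "100 NO", "9 ZZ", "10.5 AA");

my @new = map  { join ' ', @$_ }
          sort { $b->[0] <=> $a->[0] or $b->[1] cmp $a->[1] }
          map  { [ split /\s+/, $_, 2 ] } @old;

print "$_\n" for @new;
# 100 NO
# 10.5 AA
# 9 ZZ
# 9 AC
```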
| [reply] [d/l] |
Re: Numeric Sort for Stringified Value (How to Avoid Warning)
by Moron (Curate) on Oct 07, 2005 at 08:24 UTC
|
IMO, the only thing wrong with the regexp approach used in some responses was forgetting to include the optional \.*\d* in the number being read in for sorting (I also made some style and performance adjustments to the sort algorithm, i.e. the use of a subroutine and an orcish manoeuvre)
perl -w -MData::Dumper -e '
@old = ( "10.5 AA", "9 AC", "2 BB");
%orcish = ();
@new = sort {($orcish{$b} ||= ForceFloat($b)) <=> ($orcish{$a} ||= ForceFloat($a))} @old;
print Dumper \@new;
sub ForceFloat{
return $_[0] =~ /(\d+\.*\d*)/ ? $1 : 0.0;
}'
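One caveat with ||= as the cache test: a key that evaluates to false (for example 0) is never treated as cached, so it is recomputed on every comparison. A sketch of a variant that tests with exists instead; leading_number, cached_key and $calls are hypothetical names, and the regex is a slightly stricter one than in the one-liner above:

```perl
use strict;
use warnings;

my @old = ("10.5 AA", "9 AC", "2 BB", "9 AC");
my %cache;
my $calls = 0;

# Hypothetical key extractor: leading integer or decimal, else 0.
sub leading_number {
    $calls++;
    return $_[0] =~ /(\d+(?:\.\d+)?)/ ? $1 : 0;
}

# Orcish manoeuvre ("or-cache"): compute a key only when it is not cached yet.
# exists() is used instead of ||= so that a key of 0 is cached as well.
sub cached_key {
    my ($s) = @_;
    $cache{$s} = leading_number($s) unless exists $cache{$s};
    return $cache{$s};
}

my @new = sort { cached_key($b) <=> cached_key($a) } @old;
print "$_\n" for @new;
print "key extractions: $calls\n";
```

With this data the extractor runs once per distinct string, however many comparisons the sort makes.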
| [reply] [d/l] |