After a discussion in the chatterbox, with input from tye and Fastolfe, I tried a few different ways of solving a certain kind of sorting problem.
I've got a number of programs that output alphabetically sorted lists, but often there's an "other" category that we want to list after all the "real" elements in the list.
I had a solution that worked, but it looks really ugly, so I tried out two other suggestions and benchmarked those suckers.
#!/usr/bin/perl -w use strict; use Benchmark; # Yeah, I'm using globals; this is to get around troubles # I was having since Benchmark uses eval ... use vars qw/@array %last/; # some values to put in there : mix cases to avoid "asciibetical" sort @array=qw(xavier colin melissa wally Edward joan Arlen George mohandas + ralph other); $last{other}=1; sub using_hash { defined($last{lc($a)}) cmp defined($last{lc($b)}) or lc($a) cmp lc +($b); } sub using_return { return 1 if lc($a) eq 'other'; return -1 if lc($b) eq 'other'; lc($a) cmp lc($b); } sub using_eq { ( (lc($a) eq 'other') cmp (lc($b) eq 'other') ) || lc($a) cmp l +c($b); } # for gosh sakes, use a sensible value here # I needed a large number on a 4 CPU system timethese (100000, {'use_hash'=> q{ my @sorted = sort using_hash @array; }, use_return=> q{ my @sorted = sort using_return @array }, use_eq=> q{ my @sorted = sort using_eq @array } } );
The results. Surprising (to me) is that my ugly using_return benchmarked the fastest (when I added lc(); going with "ASCIIbetical" sorting made it marginally slower than using_eq) using_hash is nice because it makes it easy to change the value you want placed last on the fly, but it came in third in the speed category. Not that it's *that much* slower, however.
use_eq: 24 wallclock secs (24.27 usr + 0.00 sys = 24.27 CPU) use_hash: 27 wallclock secs (27.68 usr + 0.00 sys = 27.68 CPU) use_return: 23 wallclock secs (22.97 usr + 0.00 sys = 22.97 CPU) $ ./sort_test.pl Benchmark: timing 100000 iterations of use_eq, use_hash, use_return .. use_eq: 24 wallclock secs (23.59 usr + 0.00 sys = 23.59 CPU) use_hash: 28 wallclock secs (27.75 usr + 0.00 sys = 27.75 CPU) use_return: 24 wallclock secs (22.86 usr + 0.00 sys = 22.86 CPU)
The differences *are* pretty minor, but I'd like to try them with larger lists, and also after thinking more about optimization.
Philosophy can be made out of anything. Or less -- Jerry A. Fodor
|
|---|
| Replies are listed 'Best First'. | |
|---|---|
|
RE: A discriminating sort (and some damned lies)
by Fastolfe (Vicar) on Nov 03, 2000 at 04:29 UTC | |
by arturo (Vicar) on Nov 03, 2000 at 20:12 UTC | |
by Fastolfe (Vicar) on Nov 03, 2000 at 20:22 UTC |