in reply to Sorting IP addresses, lots of them, quickly

and faster...
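use Benchmark qw(cmpthese);
use Socket;
use Sort::Key qw(ukeysort keysort);

my $n = 1000000;

# Random dotted-quad addresses to sort.
my @address = map { join ('.', map { int rand 256 } 0..3) } 0..$n;
print "\@address populated\n";

# Original approach: pack to 4 bytes, plain string sort, unpack again.
# Works on a copy because both maps assign to $_ in place.
sub gloryhackish {
    my @in = @address;
    my @sorted = map { $_ = inet_ntoa($_) } (sort (map { $_ = inet_aton($_) } @in));
}

# keysort: string comparison on the packed 4-byte key.
sub ks {
    my @sorted = keysort { inet_aton($_) } @address;
}

# ukeysort: unsigned integer comparison on the address as a 32-bit number.
sub uks {
    my @sorted = ukeysort {
        /^(\d+)\.(\d+)\.(\d+)\.(\d+)$/;
        ($1 << 24) + ($2 << 16) + ($3 << 8) + $4
    } @address;
}

cmpthese -1, {
    gloryhackish => \&gloryhackish,
    keysort      => \&ks,
    ukeysort     => \&uks,
};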
On my PC, the results for 1 million elements are:
             s/iter gloryhackish keysort ukeysort
gloryhackish   24.4           --    -29%     -48%
keysort        17.2          42%      --     -26%
ukeysort       12.7          92%     35%       --
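
For anyone wondering what the ukeysort key actually is: the shifts fold the four octets into one unsigned 32-bit integer, which should be the same number you get from unpacking the inet_aton result as a big-endian long. A minimal sketch using only core modules; ip_to_int and the sample addresses are made up for illustration, and the plain sort here is only a stand-in to show the ordering, not a replacement for ukeysort:

```perl
use strict;
use warnings;
use Socket qw(inet_aton);

# Hypothetical helper: the same key computation as the ukeysort block.
sub ip_to_int {
    my ($ip) = @_;
    my @octets = $ip =~ /^(\d+)\.(\d+)\.(\d+)\.(\d+)$/
        or die "not a dotted quad: $ip";
    return ($octets[0] << 24) + ($octets[1] << 16) + ($octets[2] << 8) + $octets[3];
}

my @sample = qw(10.0.0.2 9.255.255.255 192.168.1.10 192.168.1.9);

# Print each address with both forms of the key; the two numbers should match.
for my $ip (@sample) {
    my $via_socket = unpack 'N', inet_aton($ip);
    printf "%-15s %10u %10u\n", $ip, ip_to_int($ip), $via_socket;
}

# Numeric sort on the integer key gives address order, not string order.
my @sorted = sort { ip_to_int($a) <=> ip_to_int($b) } @sample;
print "@sorted\n";
```

Note that 192.168.1.9 lands before 192.168.1.10, which a plain string sort on the dotted form would get wrong.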

Re^2: Sorting IP addresses, lots of them, quickly
by gloryhack (Deacon) on May 16, 2007 at 10:36 UTC
    Ooh, I like it! ++ and then some!

    I had to install Sort::Key to give it a whirl, and I can't manage to get the same 92% number, but I still like it just fine. Here's what I get, best of three, with ST and GRT thrown into the mix:

    @address populated
    Benchmark: running gloryhackish, grt, ks, schwartzian, uks for at least 30 CPU seconds...
    gloryhackish: 34 wallclock secs (33.28 usr +  0.19 sys = 33.47 CPU) @  0.12/s (n=4)
             grt: 34 wallclock secs (34.62 usr +  0.06 sys = 34.68 CPU) @  0.12/s (n=4)
              ks: 34 wallclock secs (34.04 usr +  0.03 sys = 34.07 CPU) @  0.18/s (n=6)
     schwartzian: 55 wallclock secs (54.62 usr +  0.11 sys = 54.73 CPU) @  0.04/s (n=2)
                  (warning: too few iterations for a reliable count)
             uks: 36 wallclock secs (35.53 usr +  0.24 sys = 35.77 CPU) @  0.17/s (n=6)

                 s/iter schwartzian    grt gloryhackish    uks     ks
    schwartzian    27.4          --   -68%         -69%   -78%   -79%
    grt            8.67        216%     --          -3%   -31%   -35%
    gloryhackish   8.37        227%     4%           --   -29%   -32%
    ukeysort       5.96        359%    45%          40%     --    -5%
    keysort        5.68        382%    53%          47%     5%     --
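
The ST and GRT subs behind those numbers aren't shown in the post, so the following is only a guess at the usual shape of each, written against core modules. Both use the packed 4-byte inet_aton string as the sort key; the sample list is made up for illustration:

```perl
use strict;
use warnings;
use Socket qw(inet_aton);

my @address = qw(192.168.1.10 10.0.0.2 192.168.1.9 9.255.255.255);

# Schwartzian Transform: pair each address with its packed key,
# sort the pairs on the key, then throw the key away.
my @st = map  { $_->[1] }
         sort { $a->[0] cmp $b->[0] }
         map  { [ inet_aton($_), $_ ] } @address;

# Guttman-Rosler Transform: prepend the fixed-width 4-byte key to the
# address, use the default string sort, then strip the key off again.
my @grt = map  { substr($_, 4) }
          sort
          map  { inet_aton($_) . $_ } @address;

print "ST:  @st\n";
print "GRT: @grt\n";
```

The ST allocates an anonymous array per element, while the GRT lets sort run without a comparison block, which is the usual explanation given for a gap like the one in the table.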

    Still, even at 47% I'm suitably impressed and will be using keysort. It strikes me as odd that in your results keysort was bested by ukeysort... hardware differences, perhaps? (Mine's an AMD64 X2 4400+ w/2G DDR2 RAM.)

    Thanks for pointing Sort::Key out to me!

    Edit: Running on my original list of 11818 addresses, gloryhackish comes in 31% faster than ukeysort but 47% slower than keysort. Hmmmm.... curious, but still convincing.