in reply to Sorting ip addresses quickly

Sorting IP addresses is the canonical example of the Guttman-Rosler Transform (aka the packed default sort) given in A Fresh Look at Efficient Perl Sorting.

The paper is well-worth reading in detail, but here is the IP sorting code.

@out = map substr($_, 4) => sort map pack('C4' => /(\d+)\.(\d+)\.(\d+)\.(\d+)/) . $_ => @in;

Notice that by careful choice of a 'pack' function they can use the default sort behaviour, rather than writing a custom sort routine.

The benchmarks in the paper give this version as being about twice as fast as the Schwartzian Transform.

Update: URL replaced with one that works. Thanks to grinder for pointing it out.

--
<http://www.dave.org.uk>

European Perl Conference - Sept 22/24 2000, ICA, London
<http://www.yapc.org/Europe/>

Replies are listed 'Best First'.
RE: RE: Sorting ip addresses quickly
by arturo (Vicar) on Oct 05, 2000 at 23:13 UTC

    I found that split() benchmarks a little faster than the regex in the following code

    #!/usr/bin/perl -w use strict; use Benchmark; use vars qw/@ip_strings/; # these are all made up I hope @ip_strings = qw(192.168.1.1. 192.168.1.2 152.2.100.2 204.165.43.1 1.2 +.3.4 152.2.1.23 112.145.165.205 2.2.2.2. 4.5.6.7 8.9.10.12 2.4.6.8 10 +.12.14.16); timethese (10000, { 'split-em', q{ my @packed_ips; foreach (@ip_strings) { push @packed_ips, pack 'C4', split /\./, $_; } }, 'regex-em', q{ my @packed_ips; foreach (@ip_strings) { push @packed_ips, pack 'C4', /(\d+)\.(\d+)\.(\d+)\.(\d+) +/; } } } );

    Some typical results:

    Benchmark: timing 10000 iterations of regex-em, split-em... regex-em: 2 wallclock secs ( 3.64 usr + 0.00 sys = 3.64 CPU) split-em: 3 wallclock secs ( 3.13 usr + 0.00 sys = 3.13 CPU) bash-2.04$ perl pack_ip.pl Benchmark: timing 10000 iterations of regex-em, split-em... regex-em: 3 wallclock secs ( 4.03 usr + 0.00 sys = 4.03 CPU) split-em: 3 wallclock secs ( 3.76 usr + 0.00 sys = 3.76 CPU)

    Philosophy can be made out of anything -- or less