RE: Sorting ip addresses quickly

Sorting IP addresses is the canonical example of the Guttman-Rosler Transform (aka the packed default sort) given in A Fresh Look at Efficient Perl Sorting.

The paper is well worth reading in detail, but here is the IP sorting code.

@out =
  map  substr($_, 4) =>
  sort
  map  pack('C4' =>
    /(\d+)\.(\d+)\.(\d+)\.(\d+)/)
      . $_ => @in;

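Notice that by choosing the pack template carefully, the authors can rely on the default sort behaviour rather than writing a custom sort routine: each octet becomes a single byte, so the packed 4-byte prefix compares in the same order as the numeric address.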
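
To make that concrete, here is a small sketch that compares a plain default sort of the dotted-quad text with a default sort of the packed keys. The sample addresses are made up for illustration, and the three steps are split into separate statements only for readability; it is not the paper's code.

```perl
use strict;
use warnings;

# Hypothetical sample data, chosen so that text order and numeric order disagree.
my @in = qw(10.0.0.2 9.0.0.1 192.168.1.10 192.168.1.9);

# Plain default sort compares the dotted-quad text character by character.
my @as_text = sort @in;

# Decorate: prefix each address with its four octets packed as four bytes.
my @keyed = map { pack('C4', /(\d+)\.(\d+)\.(\d+)\.(\d+)/) . $_ } @in;

# Default sort now compares the 4-byte prefix first, which is numeric order.
my @sorted = sort @keyed;

# Undecorate: drop the 4-byte prefix to recover the original strings.
my @by_key = map { substr($_, 4) } @sorted;

print "text order:   @as_text\n";
print "packed order: @by_key\n";
```

The text order puts 192.168.1.10 before 192.168.1.9 and 9.0.0.1 last, while the packed order is 9.0.0.1, 10.0.0.2, 192.168.1.9, 192.168.1.10.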

The benchmarks in the paper give this version as being about twice as fast as the Schwartzian Transform.
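
For comparison, this is roughly what a Schwartzian Transform for the same job might look like. It is an illustrative sketch with made-up sample data, not the version benchmarked in the paper; the 32-bit numeric key built with unpack('N', ...) is my own choice. The point to notice is the explicit sort block, which Perl has to call for every comparison, whereas the packed version above needs none.

```perl
use strict;
use warnings;

# Hypothetical sample data.
my @in = qw(10.0.0.2 9.0.0.1 192.168.1.10 192.168.1.9);

# Schwartzian Transform: decorate each address with a numeric key,
# sort with an explicit comparison block, then strip the key again.
my @st_out =
  map  { $_->[1] }
  sort { $a->[0] <=> $b->[0] }
  map  { [ unpack('N', pack('C4', split /\./)), $_ ] } @in;

print "@st_out\n";
```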

Update: URL replaced with one that works. Thanks to grinder for pointing it out.

--
<http://www.dave.org.uk>

European Perl Conference - Sept 22/24 2000, ICA, London
<http://www.yapc.org/Europe/>


Replies are listed 'Best First'.

RE: RE: Sorting ip addresses quickly
by arturo (Vicar) on Oct 05, 2000 at 23:13 UTC

I found that split() benchmarks a little faster than the regex in the following code:

#!/usr/bin/perl -w

use strict;
use Benchmark;

use vars qw/@ip_strings/;

# these are all made up I hope

@ip_strings = qw(
        192.168.1.1. 192.168.1.2 152.2.100.2 204.165.43.1 1.2.3.4
        152.2.1.23 112.145.165.205 2.2.2.2. 4.5.6.7 8.9.10.12 2.4.6.8
        10.12.14.16
);

timethese (10000, {
        'split-em',
        q{
                my @packed_ips;
                foreach (@ip_strings) { 
                        push @packed_ips, pack 'C4', split /\./, $_;
                }
        },
        'regex-em',
        q{
                my @packed_ips;
                foreach (@ip_strings) {
                        push @packed_ips,
                                pack 'C4', /(\d+)\.(\d+)\.(\d+)\.(\d+)/;
                }
        }
        }
);

Some typical results:

Benchmark: timing 10000 iterations of regex-em, split-em...
  regex-em:  2 wallclock secs ( 3.64 usr +  0.00 sys =  3.64 CPU)
  split-em:  3 wallclock secs ( 3.13 usr +  0.00 sys =  3.13 CPU)
bash-2.04$ perl pack_ip.pl
Benchmark: timing 10000 iterations of regex-em, split-em...
  regex-em:  3 wallclock secs ( 4.03 usr +  0.00 sys =  4.03 CPU)
  split-em:  3 wallclock secs ( 3.76 usr +  0.00 sys =  3.76 CPU)
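
A timing comparison like this only means something if both variants produce the same packed bytes, so here is a quick sanity check one could run alongside it. The variable names ($via_split, $via_regex, $mismatches) and the short address list are hypothetical. Note that the two are only equivalent for strings that contain nothing but the address: the regex picks the first dotted quad out of any surrounding text, while split does not.

```perl
use strict;
use warnings;

# Hypothetical subset of well-formed addresses.
my @ip_strings = qw(192.168.1.2 152.2.100.2 1.2.3.4 10.12.14.16);

my $mismatches = 0;
for my $ip (@ip_strings) {
    # Same two packing strategies as in the benchmark.
    my $via_split = pack 'C4', split /\./, $ip;
    my $via_regex = pack 'C4', $ip =~ /(\d+)\.(\d+)\.(\d+)\.(\d+)/;
    $mismatches++ if $via_split ne $via_regex;
}

print "mismatches: $mismatches\n";
```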

Philosophy can be made out of anything -- or less
