This is a fast storage and lookup method to match IP-addresses against IP-prefixes (or networks).
In a recent SoPW, Fangorn asked about caching of Net::Patricia objects, to make that cache available to multiple processes.
I had a brief look at Net::Patricia and what all that was about and thought "hmm, tree lookup? don't we have that close to the perl core?"
Turns out that DB_File, using a database of type DB_BTREE, fits that purpose very well. Memory requirements are minimal, and lookup time is no more than 5 micromiliseconds worst case (tested with 1.000.000 randomly generated prefixes).
Example: having the CIDR notation 10.223.2.0/23, we get
net: 0 0 0 0 1 0 1 0 1 1 0 1 1 1 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 0 key: 1010101011101110111110111111111110101010101011000000000000000000
We get thus a 64 bit key (well, 64 char string :-). Upon search, the searched IP adress is zipped with the netmask 0xffffffff in the same manner, and starting with the leftmost 2 bits, a partial match against the tree is done. DB_BTREE returns the next key which is equal or greater than the key at hand. The resulting key is splitted back into network and netmask, and it is checked whether the ip matches the resulting range. If so, the value is pushed to a list. While the search key matches the returned key, further bits (2 at a time) are added. The search is repeated until there's a mismatch or no new key is returned, which means we're done.
The last element in the resulting list is the network prefix matching the given IP, with the most bits set in its mask.
#!/usr/bin/perl use DB_File; use Fcntl; use Getopt::Std; # 120000 2.217429, avg. 16 lookups, 360.434319750957/sec my $maxprefix = 1000000; # ip prefix limit my %o; getopts('f:ast',\%o); unless (@ARGV || grep { $_ } @o{qw(t s)}) { die <<EOH; usage: $0 -s [-f file] | [-a ] [ -t | ipaddress ] -s : setup -f : use values from file for setup ("xxx.xxx.xxx.xxx/nn nnnnn") -t : average timing -a : show all matching prefixes EOH } # unless @ARGV || grep {defined $_ } values %o; my $filename = 'astable.db'; my $x; # DB_File handle if ($o{s}) { if(! -e $filename) { $x = tie %h, "DB_File", $filename, O_RDWR|O_CREAT, 0666, $DB_B +TREE or die "cannot open $filename: $!\n"; } else { die "file '$filename' exists, won't overwrite!\n"; } } else { $x = tie %h, "DB_File", $filename, O_RDONLY, 0666, $DB_BTREE or die "cannot open $filename: $!\n"; } # set up AS table. To perform a binary search, # we build a bit string as a sequence of alternating bits # of mask and network address, i.e. having 10.223.2.0/23, we get # # net: 0 0 0 0 1 0 1 0 1 1 0 1 1 1 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 + 0 # mask: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 +0 # key: 10101010111011101111101111111111101010101010110000000000000000 +00 # # for ip searches, we set the "mask part bits" of an ip all to 1, and +do # a incremental partial match against the keys in the table. if ($o{s}) { # setup # generate some random IP addresses and masks # set all non-mask bits to 0 and store the generated # IP address (minus 1, if last byte is non-zero) # in the files 'ips' for further use in lookup timing unless (-f 'ips') { open IP,'>', 'ips'; } else { die "file 'ips' exist, won't overwrite!\n"; } my $have_file; if ($o{f}) { $have_file = open I,'<', $o{f} or die "Can't read '$o{f}': $!\ +n"; } my $c; while ($. <= $maxprefix) { my (@bytes,$mask,$value); if($have_file) { $_ = <I> or last; chomp; (@bytes[0..3],$mask,$value) = m!(\d+).(\d+).(\d+).(\d+)/(\ +d+)\s+(.*)!; } else { @bytes = map { int rand 255 } 0..3; $mask = int rand (30) + 1; # no null mask :) } my @nbits = map {split//,unpack "B*",pack "C",$_ } @bytes; my @mbits = ((1) x $mask, (0) x (32 - $mask)); my $key = ''; my @b = (); for (0..31) { my $mbit = shift @mbits; my $n = shift @nbits; my $nbit = $mbit ? $n : 0; $key .= $mbit . $nbit; push @b, $nbit; } redo if $h{$key}; my $prefix = join('.',unpack "C4",pack"B32",join'',@b).'/'.$ma +sk; $bytes[3] += 1 if $bytes[3] == 0; # XXX should check network # boundaries here... my $ip = join('.',@bytes); print IP $ip,"\n"; $h{$key} = $value || $prefix; $.++ unless $have_file; } untie %h; exit 0; } #setup my $c; # lookup counter if($o{t}) { # average open I, '<', 'ips'; eval { require Time::HiRes }; $@ and die "Sorry, no average here - Time::HiRes missing"; Time::HiRes->import(qw(gettimeofday tv_interval)); my $t0 = [gettimeofday()]; my $t1 = [gettimeofday()]; my $l; while(<I>) { chomp; my $net = match($_); $l += $c; unless($. % 1000) { my $e = tv_interval ( $t1, [gettimeofday()]); my $g = tv_interval ( $t0, [gettimeofday()]); print "$. $e, avg. ",int($l/$.); print ' lookups, ',$./$g,"/sec\n"; $t1 = [gettimeofday()]; } } my $elapsed = tv_interval ( $t0, [gettimeofday()]); print "elapsed: $elapsed\n"; print "avg. lookup time: ",$elapsed / $.,"\n"; } else { /^\d+\.\d+\.\d+\.\d+$/ and print "$_ => ",match($_),"\n" for @ARGV +; } sub match { my $ip = shift; my $packedip = pack "C4",split/\./,$ip; my $numericip = unpack "N", $packedip; my @bits = split //,unpack "B*",$packedip; # reset cursor. $x->seq(my $foo, my $bar, R_FIRST); # now search until mismatch. my ($key, $ok, $v, @net); $c = 1; # lookup counter for(@bits) { my $lk = $key .= 1 . $_; #print "$key\n"; next if $ok and $ok =~ /^$key/; # shortcut $x->seq($lk,$v,R_CURSOR); # check if this key is a candidate push @net, $v if $v and in_range($numericip, $lk); unless ($lk =~ /^$key/ && $v && length ($lk) == 64) { if ($o{a}) { print " $_\n" for @net; } print "$c lookups - " if wantarray; return pop @net; # return net with longest mask } $ok = $lk; $c++; die "more than 32 lookups" if $c > 32; } # only one or no prefix found print "$c lookups - " if wantarray; pop @net; } sub in_range { my ($ip, $bits) = @_; my $net; $bits =~ s/(.)(.)/$net .= $2; $1/ge; my ($addr, $mask) = map { pack "B32",$_ } $net, $bits; my ($n,$m,$b) = map { unpack"N",$_} $addr,$mask,$addr|~$mask; $ip >= $n && $ip <= $b; }
The big advantages of this method over Net::Patricia:
Disadvantage:
update: tweaked so you can feed it a real AS-table (e.g. ...ip_to_as/bgp.txt) to setup the database. Running this script against the aformentioned file, I get an average of 0.0021 secs / lookup.
--shmem
_($_=" "x(1<<5)."?\n".q·/)Oo. G°\ /
/\_¯/(q /
---------------------------- \__(m.====·.(_("always off the crowd"))."·
");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
|
---|
Replies are listed 'Best First'. | |
---|---|
Re: IP prefixes: match IP against "best" network
by Fangorn (Initiate) on Jul 03, 2007 at 12:41 UTC | |
by kaiserbiz (Initiate) on Mar 01, 2008 at 15:58 UTC |