in reply to IP Lookup Tables

I think you're looking for Binary Search. First, you convert the IP addresses to strings of bytes, then sort them, then cut the list in half repeatedly until you find the one that matches. Here's the problem solved for IPv4. You'd need to do some additional work to solve for IPv6.
use v5.36; use JSON::MaybeXS; use Path::Tiny; use Socket "inet_aton"; my @ranges; for (path($ARGV[0])->lines) { my $range= decode_json($_); if ($range->{start_ip} =~ /^\d+\./) { $range->{min}= inet_aton($range->{start_ip}); $range->{max}= inet_aton($range->{end_ip}); push @ranges, $range; } } @ranges= sort { $a->{min} cmp $b->{min} } @ranges; sub find_ip($ip) { my $ip_str= inet_aton($ip); my ($min, $max)= (0, $#ranges); while ($min <= $max) { my $mid= int(($min+$max)/2); if ($ranges[$mid]{min} gt $ip_str) { $max= $mid-1; } elsif ($ranges[$mid]{max} lt $ip_str) { $min= $mid+1; } else { return $ranges[$mid]; } } return undef; } use Data::Printer; say "Enter IPv4"; say "Type ^D to terminate"; while (<STDIN>) { chomp; &p( find_ip($_) ); }

Examples:

$ perl test.pl country_asn.json Enter IPv4 Type ^D to terminate 1.1.1.1 { as_domain "cloudflare.com", as_name "Cloudflare, Inc.", asn "AS13335", continent "OC", continent_name "Oceania", country "AU", country_name "Australia", end_ip "1.1.1.255" (dualvar: 1.1), max "&#65533;", min "\0", start_ip "1.1.1.0" (dualvar: 1.1) } 8.8.4.4 { as_domain "google.com", as_name "Google LLC", asn "AS15169", continent "NA", continent_name "North America", country "US", country_name "United States", end_ip "8.8.4.255" (dualvar: 8.8), max "\b\b&#65533;", min "\b\b\0", start_ip "8.8.4.0" (dualvar: 8.8) }

Parsing that blob of json is fairly slow, so you probably want this to stay loaded in memory.

I don't have any good ideas offhand for how to query this out of a database... maybe someone else has ideas. I bet Postgres has some special index type that handles ranges of values.

If you loaded this into a database, I think you'd get decent performance from

CREATE TABLE ip_ranges ( ... ipmin varbinary(4), ipmax varbinary(4), ... ); CREATE INDEX ON ip_ranges (ipmax ASC); SELECT * FROM ip_ranges WHERE ipmax >= ? and ipmin <= ? ORDER BY ipmax LIMIT 1
That should follow the index straight to the record you want, then stop iterating as soon as it finds it.

Replies are listed 'Best First'.
Re^2: IP Lookup Tables
by monsignor (Acolyte) on Mar 30, 2024 at 15:56 UTC

    I just wanted to say a quick thank you for all the input. I did some experimenting, and for my use case I ended up with an sqlite3 database that I built myself from the csv tables.

    It took way too long to ingest the data every time I needed it, and I have sqlinte3 on the small box that I want to run this. Using an external call takes to long, and could potentially rate limit. I put a primary index using the first octet, and that made the search way faster.