monsignor has asked for the wisdom of the Perl Monks concerning the following question:

Greetings, fellow monks. I am trying my hand at Perl again and hope that someone can point me in the right direction.

I found a good free IP-range-to-Country/City/ASN database in CSV format:

https://ipinfo.io/pricing - (Partway down the page you can see the free db downloads)

The database seems good, but that leaves the issue of how to do lookups. I'm wondering if anybody has any suggestions?

I would think that there should be code available to find an IPv4 address in a table of low/high addresses.

I'm open to other free solutions, or some code that I can hack. It's just for personal use to help with spam filtering.

Any suggestions would be much appreciated.

Re: IP Lookup Tables
by afoken (Chancellor) on Mar 03, 2024 at 11:19 UTC

    Just a little reminder: Guessing the user's location from the user's IP address is just that - guessing. I've explained one problem with IP address location years ago in Re: Help with Geo::IP output. And if you think that things got better over the past years, you are completely wrong. It got worse:

    Google tries to guess my location from my IP address. I'm still using the same internet provider, who is strictly local to Hamburg and the surrounding areas in Schleswig-Holstein and Lower Saxony. My IP address changes every 24 hours, simply because almost all providers in Germany do so (and demand an extra fee for a fixed IP address, or just completely refuse to provide private users with a fixed IP address). And so, every now and then, Google guesses that I'm somewhere in Ukraine and switches the user interface to Ukrainian, with Cyrillic letters all over. And of course, because YouTube belongs to Google, all of the video ads are in Ukrainian. I don't mind video ads in foreign languages, because I simply bypass them. But switching the search engine to Ukrainian is really annoying, because most search results are Ukrainian and it is hard to switch the Google website back to English or German.

    My guess is that perhaps some of the Ukrainian refugees run VPN servers located in or around Hamburg for the people in the Ukraine, and so parts of my provider's IP range look like they are in the Ukraine. Or perhaps it's just that a lot of the Ukrainian refugees use my provider, and use Google to search in Ukrainian for information from the Ukraine. Combined with the regular change of IP addresses, every now and then I might get an IP address that had a very strong relation to the Ukraine for a few days.

    Of course, that guessing by Google is utter nonsense. The IP address ranges used by my provider are used only in and around Hamburg. And that does not change, even if a lot of traffic is related to the Ukraine at the moment.

    Alexander

    --
    Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
Re: IP Lookup Tables
by NERDVANA (Priest) on Mar 03, 2024 at 08:34 UTC
    I think you're looking for Binary Search. First, you convert the IP addresses to strings of bytes, then sort them, then cut the list in half repeatedly until you find the one that matches. Here's the problem solved for IPv4. You'd need to do some additional work to solve for IPv6.

        use v5.36;
        use JSON::MaybeXS;
        use Path::Tiny;
        use Socket "inet_aton";

        my @ranges;
        for (path($ARGV[0])->lines) {
            my $range= decode_json($_);
            if ($range->{start_ip} =~ /^\d+\./) {
                $range->{min}= inet_aton($range->{start_ip});
                $range->{max}= inet_aton($range->{end_ip});
                push @ranges, $range;
            }
        }
        @ranges= sort { $a->{min} cmp $b->{min} } @ranges;

        sub find_ip($ip) {
            my $ip_str= inet_aton($ip);
            my ($min, $max)= (0, $#ranges);
            while ($min <= $max) {
                my $mid= int(($min+$max)/2);
                if ($ranges[$mid]{min} gt $ip_str) {
                    $max= $mid-1;
                }
                elsif ($ranges[$mid]{max} lt $ip_str) {
                    $min= $mid+1;
                }
                else {
                    return $ranges[$mid];
                }
            }
            return undef;
        }

        use Data::Printer;
        say "Enter IPv4";
        say "Type ^D to terminate";
        while (<STDIN>) {
            chomp;
            &p( find_ip($_) );
        }
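The "additional work" for IPv6 mentioned above could look roughly like the following. This is only a sketch, not the poster's code: it assumes Socket's `inet_pton` (which, unlike `inet_aton`, does no DNS lookups), keeps one sorted list per address family because packed IPv4 and IPv6 keys have different widths, and uses made-up sample rows and labels in place of the real data file. The helper names `pack_ip` and `find_ip` are illustrative.

```perl
use strict;
use warnings;
use Socket qw(inet_pton AF_INET AF_INET6);

# Pack an address into bytes: 4 for IPv4, 16 for IPv6.
# Returns (family, packed), or an empty list if the text does not parse.
sub pack_ip {
    my ($ip) = @_;
    my $family = $ip =~ /:/ ? AF_INET6 : AF_INET;
    my $packed = inet_pton($family, $ip);
    return defined $packed ? ($family, $packed) : ();
}

# Hypothetical sample rows: [start, end, label]. Real rows would come
# from the downloaded database.
my @sample = (
    [ '8.8.4.0',    '8.8.4.255',      'v4-b'   ],
    [ '1.1.1.0',    '1.1.1.255',      'v4-a'   ],
    [ '2001:db8::', '2001:db8::ffff', 'v6-doc' ],
);

# One list per address family, each sorted by packed start address.
my %ranges;
for my $row (@sample) {
    my ($family, $min) = pack_ip($row->[0]) or next;
    my (undef,   $max) = pack_ip($row->[1]) or next;
    push @{ $ranges{$family} }, { min => $min, max => $max, label => $row->[2] };
}
@$_ = sort { $a->{min} cmp $b->{min} } @$_ for values %ranges;

# Binary search within the list that matches the address family.
sub find_ip {
    my ($ip) = @_;
    my ($family, $key) = pack_ip($ip) or return;
    my $list = $ranges{$family} or return;
    my ($lo, $hi) = (0, $#$list);
    while ($lo <= $hi) {
        my $mid = int(($lo + $hi) / 2);
        if    ($list->[$mid]{min} gt $key) { $hi = $mid - 1 }
        elsif ($list->[$mid]{max} lt $key) { $lo = $mid + 1 }
        else                               { return $list->[$mid] }
    }
    return;
}

for my $ip ('1.1.1.1', '2001:db8::1', '9.9.9.9') {
    my $hit = find_ip($ip);
    printf "%s => %s\n", $ip, $hit ? $hit->{label} : 'no match';
}
```

Comparing packed addresses with `gt`/`lt` works because both keys in a comparison always have the same length, so string order matches numeric order.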

    Examples:

    $ perl test.pl country_asn.json Enter IPv4 Type ^D to terminate 1.1.1.1 { as_domain "cloudflare.com", as_name "Cloudflare, Inc.", asn "AS13335", continent "OC", continent_name "Oceania", country "AU", country_name "Australia", end_ip "1.1.1.255" (dualvar: 1.1), max "&#65533;", min "\0", start_ip "1.1.1.0" (dualvar: 1.1) } 8.8.4.4 { as_domain "google.com", as_name "Google LLC", asn "AS15169", continent "NA", continent_name "North America", country "US", country_name "United States", end_ip "8.8.4.255" (dualvar: 8.8), max "\b\b&#65533;", min "\b\b\0", start_ip "8.8.4.0" (dualvar: 8.8) }

    Parsing that blob of json is fairly slow, so you probably want this to stay loaded in memory.

    I don't have any good ideas offhand for how to query this out of a database... maybe someone else has ideas. I bet Postgres has some special index type that handles ranges of values.

    If you loaded this into a database, I think you'd get decent performance from

        CREATE TABLE ip_ranges (
          ...
          ipmin varbinary(4),
          ipmax varbinary(4),
          ...
        );
        CREATE INDEX ON ip_ranges (ipmax ASC);

        SELECT * FROM ip_ranges
        WHERE ipmax >= ? and ipmin <= ?
        ORDER BY ipmax
        LIMIT 1

    That should follow the index straight to the record you want, then stop iterating as soon as it finds it.

      I just wanted to say a quick thank you for all the input. I did some experimenting, and for my use case I ended up with an sqlite3 database that I built myself from the csv tables.

      It took way too long to ingest the data every time I needed it, and I have sqlite3 on the small box where I want to run this. Using an external call takes too long and could potentially get rate limited. I put a primary index on the first octet, and that made the search way faster.
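The follow-up above doesn't show the schema, so the following is not what the poster built. It is a minimal sketch of one way to do range lookups in SQLite from Perl, assuming DBI and DBD::SQLite are installed. Instead of a first-octet index, it stores each range as a pair of 32-bit integers with the start as the primary key, fetches the last range that starts at or below the address, and then checks the end. The table and column names, the in-memory database, and the two sample rows (taken from the example output earlier in the thread) are all illustrative assumptions.

```perl
use strict;
use warnings;
use DBI;                        # assumption: DBI and DBD::SQLite are installed
use Socket qw(inet_aton);

# Convert a dotted-quad IPv4 address to an unsigned 32-bit integer.
sub ip_to_int {
    my ($ip) = @_;
    my $packed = inet_aton($ip);
    return defined $packed ? unpack('N', $packed) : undef;
}

# In-memory database for the sketch; a real setup would use a file.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
                       { RaiseError => 1, AutoCommit => 1 });

# Hypothetical schema: the range start doubles as the primary key,
# so SQLite can seek straight to the candidate row.
$dbh->do(<<'SQL');
CREATE TABLE ip_ranges (
    ip_min  INTEGER PRIMARY KEY,
    ip_max  INTEGER NOT NULL,
    country TEXT
)
SQL

my $ins = $dbh->prepare('INSERT INTO ip_ranges (ip_min, ip_max, country) VALUES (?, ?, ?)');
$ins->execute(ip_to_int('1.1.1.0'), ip_to_int('1.1.1.255'), 'AU');
$ins->execute(ip_to_int('8.8.4.0'), ip_to_int('8.8.4.255'), 'US');

# Find the last range starting at or below the address, then verify
# that the address is not past that range's end.
sub lookup {
    my ($dbh, $ip) = @_;
    my $n = ip_to_int($ip);
    return undef unless defined $n;
    my $row = $dbh->selectrow_hashref(
        'SELECT ip_min, ip_max, country FROM ip_ranges
          WHERE ip_min <= ? ORDER BY ip_min DESC LIMIT 1',
        undef, $n,
    );
    return ($row && $row->{ip_max} >= $n) ? $row : undef;
}

for my $ip ('1.1.1.1', '8.8.4.4', '2.2.2.2') {
    my $row = lookup($dbh, $ip);
    printf "%s => %s\n", $ip, $row ? $row->{country} : 'no match';
}
```

This only works if the ranges don't overlap, which seems likely for a country table but is worth verifying against the actual data.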

Re: IP Lookup Tables
by marto (Cardinal) on Mar 03, 2024 at 08:11 UTC
Re: IP Lookup Tables
by Tux (Canon) on Mar 07, 2024 at 13:04 UTC

    What might help if you *do* have a local postgres database, is App::geoip, a tool that uses GeoIP2 data from MaxMind ...

    (Note that you need a free MaxMind account, and you need to enter its details in ~/.config/geoip. Your own location is optional, but when available, the tool can show the distance.)

        $ cpan App::geoip
        $ echo "create database geoip;" | psql
        $ mkdir ~/geoip
        $ cd ~/geoip
        $ cat >~/.config/geoip <<::
        cat ~/.config/geoip
        local-location  = 50.607080/50.607080
        use_distance    : True
        json-pretty     : yes
        maxmind-account : 123456
        license-id      : GeoIP2-Lite
        license-key     : aAbBcCdDeEfFgGhH
        ::
        $ geoip --fetch
        -- be patient. This will take a while to populate your database
        $ geoip -d -l 50.607080/50.607080 ipinfo.io
        GeoIP data for 34.117.186.192 - ipinfo.io:
           CIDR      : 34.117.0.0/16
           IP range  : 34.117.0.0 - 34.117.255.255
           Provider  : GOOGLE-CLOUD-PLATFORM
           City      : Kansas City, 616, 64184
           Country   : US  United States
           Continent : North America
           Timezone  : America/Chicago
           Location  : 39.1027 / -94.5778 (1000)  39°06'09.72" / -94°34'40.08"
                       https://www.openstreetmap.org/#map=10/39.1027/-94.5778
                       https://www.google.com/maps/place/@39.1027,-94.5778,10z
           Location  : 50.6071 / 50.6071          50°36'25.49" / 50°36'25.49"
           Distance  : ± 9477.56km
           EU member : No
           Satellite : No
           Anon Proxy: No

    Enjoy, Have FUN! H.Merijn
Re: IP Lookup Tables
by cavac (Prior) on Mar 07, 2024 at 08:50 UTC

    I pull GeoIP data into a PostgreSQL database and do the lookups from there. Here's my writeup from 2018 (the software still works fine on my system, no guarantees though): GeoIP revisited

    PostgreSQL is nice for this. Since it supports CIDR address ranges, you can do stuff like this:

    SELECT country_code, city_name, latitude, longitude, radius FROM geoip WHERE ? << netblock LIMIT 1

    With the proper GIST index, it works reasonably fast, too.

    CREATE INDEX geoip_netblock_idx ON geoip USING GIST(netblock inet_ops);

        Cables_DB=# SELECT country_code, city_name, latitude, longitude, radius FROM geoip WHERE '8.8.8.8' << netblock LIMIT 1;
         country_code | city_name | latitude | longitude | radius
        --------------+-----------+----------+-----------+--------
         US           |           |   37.751 |   -97.822 |   1000
        (1 row)

        Cables_DB=# EXPLAIN SELECT country_code, city_name, latitude, longitude, radius FROM geoip WHERE '8.8.8.8' << netblock LIMIT 1;
                                              QUERY PLAN
        ---------------------------------------------------------------------------------------
         Limit  (cost=0.41..8.43 rows=1 width=35)
           ->  Index Scan using geoip_netblock_idx on geoip  (cost=0.41..8.43 rows=1 width=35)
                 Index Cond: ((netblock)::inet >> '8.8.8.8'::inet)
        (3 rows)
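For readers without PostgreSQL, the containment test behind `<<` can be approximated in plain Perl for IPv4. The sketch below is an illustration, not part of the post: `ip_in_cidr` is a hypothetical helper that masks both addresses to the prefix length and compares them. It also returns true when the block is a /32 equal to the address, so it behaves more like PostgreSQL's `<<=` than the strict `<<`.

```perl
use strict;
use warnings;
use Socket qw(inet_aton);

# Return true if the IPv4 address $ip lies inside the block $cidr,
# e.g. ip_in_cidr('8.8.8.8', '8.8.8.0/24'). A missing prefix means /32.
sub ip_in_cidr {
    my ($ip, $cidr) = @_;
    my ($net, $bits) = split m{/}, $cidr;
    $bits = 32 unless defined $bits;

    my $ip_packed  = inet_aton($ip);
    my $net_packed = inet_aton($net);
    return 0 unless defined $ip_packed && defined $net_packed;

    # Network mask with the top $bits bits set; /0 matches everything.
    my $mask = $bits == 0 ? 0 : (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF;

    my $ip_n  = unpack 'N', $ip_packed;
    my $net_n = unpack 'N', $net_packed;
    return (($ip_n & $mask) == ($net_n & $mask)) ? 1 : 0;
}

# Address and block taken from the geoip output earlier in the thread.
print ip_in_cidr('34.117.186.192', '34.117.0.0/16') ? "inside\n" : "outside\n";
print ip_in_cidr('34.118.0.1',     '34.117.0.0/16') ? "inside\n" : "outside\n";
```

A linear scan with this test over every block would be slow on a full database, which is the problem the GIST index solves on the PostgreSQL side.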

    PerlMonks XP is useless? Not anymore: XPD - Do more with your PerlMonks XP