A couple of days ago i looked into GeoIP in Perl+PostgreSQL+GeoIP = Awesome!. Since then i have learned that the GeoIP lists i was using were out of support. The new public lists are in a new format. So i took the opportunity to rewrite the whole thing and do pretty much everything with Perl, not using external commands like "unzip" and "wget". This should make things a bit more portable.
I'm sorry, it isn't written "nice" and isn't really documented. I designed it as a cron job for a single private server ;-)
I'm still calling the perl interpreter from a bash script so i can set the correct environment variables and stuff. But it's a lot smaller now:
#!/usr/bin/env bash . ~/.bashrc_activestate cd /home/myuser/src/geoip perl updategeoip.pl
The database tables stays exactly the same as in the last post, here again for reference:
CREATE TABLE geoip ( netblock cidr NOT NULL, country_code text NOT NULL, country_name text NOT NULL, CONSTRAINT geoip_pk PRIMARY KEY (netblock) USING INDEX TABLESPACE "NAMEOFINDEXTABLESPACE" ) WITH ( OIDS=FALSE ) TABLESPACE "NAMEOFDATATABLESPACE"; ALTER TABLE geoip OWNER TO "mydatabaseuser";
And here is the new all-in-one script:
#!/usr/bin/env perl use strict; use warnings; use diagnostics; use LWP::Simple qw[get]; use Archive::Zip; use Data::Dumper; use DBI; my $dbh = DBI->connect("dbi:Pg:dbname=Cables_DB;host=localhost", 'myda +tabaseuser', 'secretpassword', {AutoCommit => 0, RaiseError => 0}) or die("can't connect to DB"); my $dodownload = 1; if($dodownload) { print "Downloading...\n"; my $url = 'http://geolite.maxmind.com/download/geoip/database/GeoL +ite2-Country-CSV.zip'; my $zipdata = get $url; if(!defined($zipdata) || !length($zipdata)) { die("Could not get ZIP data\n"); } open(my $OFH, '>', 'geoip.zip') or die($!); binmode $OFH; print $OFH $zipdata; close $OFH; } my $zip = Archive::Zip->new(); $zip->read('geoip.zip'); my @fnames = $zip->memberNames(); #print Dumper(\@fnames); # Read english "locations" file my ($locmembername) = $zip->membersMatching('.*\/GeoLite2\-Country\-Lo +cations\-en\.csv$'); $zip->extractMember($locmembername, 'locations.csv'); open(my $locfh, '<', 'locations.csv') or die($!); my %locations; my $tmp = <$locfh>; # Ignore header; while((my $line = <$locfh>)) { chomp $line; my ($id, undef, $continentcode, $continentname, $countrycode, $cou +ntryname) = split/\,/, $line; if($countrycode eq '' && $continentcode ne '') { $countrycode = $continentcode; } if($countryname eq '' && $continentname ne '') { $countryname = $continentname; } if($id eq '' || $countrycode eq '' || $countryname eq '') { print("Locations read error on line: $line\n"); next; } my %country = ( code => $countrycode, name => $countryname, ); $locations{$id} = \%country; } close $locfh; # Extract IPv4 my ($ipv4membername) = $zip->membersMatching('.*\/GeoLite2\-Country\-B +locks\-IPv4\.csv$'); $zip->extractMember($ipv4membername, 'ipv4.csv'); # Extract IPv6 my ($ipv6membername) = $zip->membersMatching('.*\/GeoLite2\-Country\-B +locks\-IPv6\.csv$'); $zip->extractMember($ipv6membername, 'ipv6.csv'); # Load both files my @netblocks; foreach my $fname (qw[ipv4.csv ipv6.csv]) { open(my $ipfh, '<', $fname) or die($!); $tmp = <$ipfh>; # Ignore header while((my $line = <$ipfh>)) { chomp $line; my ($cidr, $countryid, $countryid2, $countryid3) = split/\,/, +$line; if($countryid eq '' && $countryid2 ne '') { $countryid = $countryid2; } if($countryid eq '' && $countryid3 ne '') { $countryid = $countryid3; } if($cidr eq '' || $countryid eq '') { print ("Unknown country in File $fname on line: $line\n"); next; } if(!defined($locations{$countryid})) { die("Fail in file $fname, country code $countryid unknown +in line: $line\n"); } my %netblock = ( cidr => $cidr, code => $locations{$countryid}->{code}, name => $locations{$countryid}->{name}, ); push @netblocks, \%netblock; } close $ipfh; } print 'Loaded ', scalar @netblocks, " entries\n"; unlink('locations.csv'); unlink('ipv4.csv'); unlink('ipv6.csv'); print "Truncating table...\n"; $dbh->do("TRUNCATE geoip"); my $insth = $dbh->prepare_cached("INSERT INTO geoip (netblock, country +_code, country_name) VALUES (?,?,?)") or die($dbh->errstr); my $linecount = 0; print "Inserting data into table...\n"; foreach my $block (@netblocks) { $linecount++; if($linecount % 10000 == 0) { my $percent = int(($linecount / scalar @netblocks) * 10000) / +100; print " $linecount = $percent%...\n"; } my $countrycode = sanitize($block->{code}); my $countryname = sanitize($block->{name}); my $cidr = $block->{cidr}; if(!$insth->execute($cidr, $countrycode, $countryname)) { $dbh->rollback; print "Fail on $linecount: ", $dbh->errstr, "\n", Dumper($bloc +k), "\n"; exit(1); } } $dbh->commit; $dbh->disconnect; unlink('geoip.zip'); print "Done\n"; exit(0); sub sanitize { my ($val) = @_; $val =~ s/^\"//; $val =~ s/\"$//; return $val; }
This script uses the newish "GeoLite2" databases from MaxMind. If you use them, please make sure you comply to the open source licensing stated on their official page
.In reply to GeoIP revisited by cavac
| For: | Use: | ||
| & | & | ||
| < | < | ||
| > | > | ||
| [ | [ | ||
| ] | ] |