Compare two files (Ip addresses)

jorain has asked for the wisdom of the Perl Monks concerning the following question:

Perl newbie needs help comparing two files that contain IP addresses to find matches. I need to match on the first three octets from both files first, and then I have to handle a range (e.g., 123.45.67.89-123.89.45.67). I would appreciate any assistance.

File1
138.63.20.48
63.208.170.231
132.3.0.193
63.208.170.198
63.236.1.136
63.236.1.139
205.161.5.239


The second file looks like this:
File2 
Company1-100.45.0.0-100.45.255.255
Company2-227.133.171.0-227.133.173.0
Company3-63.208.170.5-63.208.170.254
Company4-95.214.36.0-95.214.39.255
Company5-35.117.181.0-35.117.181.127
Company6-55.207.128.0-55.207.143.255 
Company7-138.63.20.12-138.63.20.95

I need a script that will match the IP 138.63.20.48 with the Company7 entry in File2. The code I have started looks like this:

open (IP1,"ipsnew.txt");
open (WHERE,"whereby.txt");

while (<IP1>) {
    if (/^(.*?)(\d+\.\d+\.\d+)(.*?)$/) {
        $beg_line1{2} = $1;
        $ip1 = $2;
        # print IP1 "$ip1\n";
        # print "@ip1\n";
    }
}

while (<WHERE>) {
    if (/^(.*?)(\d+\.\d+\.\d+)(.*?)$/) {
        $beg_line2 = $1;
        $ip2 = $2;
        # print IP1 "$ip1\n";
        # print "@ip2\n";
    }
}

if ($beg_line1{$ip1} == $beg_line2{$ip2}) {
    print "we have a match at $ip1\n";
}

Would appreciate any help.


Replies are listed 'Best First'.
Re: Compare two files (Ip addresses) by NetWallah (Canon) on May 16, 2007 at 05:03 UTC
Others have pointed out various structures and file-processing mechanics. Two modules will help you validate and manipulate IP addresses: Regexp::Common will help validate and extract IP addresses from text, and NetAddr::IP will help deal with IP address ranges and makes it easy to discover whether a particular address is within a specified range.

use NetAddr::IP;

my $ip = NetAddr::IP->new('loopback');
print "The address is ", $ip->addr, " with mask ", $ip->mask, "\n";

if ($ip->within(NetAddr::IP->new("127.0.0.0", "255.0.0.0"))) {
    print "Is a loopback address\n";
}

# Or - more likely you want to use this ..
# $me->contains($other) ...

"An undefined problem has an infinite number of solutions." - Robert A. Humphrey
"If you're not part of the solution, you're part of the precipitate." - Henry J. Tillman
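The validation step mentioned above can also be approximated in core Perl when installing a module is not an option. What follows is only a minimal sketch, not how Regexp::Common is implemented: `parse_ipv4` is a hypothetical helper name, and the sketch assumes each input string holds exactly one dotted-quad address with optional surrounding whitespace.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of any module): returns the four octets
# if $text is a dotted-quad IPv4 address, or an empty list otherwise.
sub parse_ipv4 {
    my ($text) = @_;
    return unless defined $text;

    # Four groups of one to three digits, separated by dots.
    my @octets = $text =~ /^\s*(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})\s*$/
        or return;

    # Reject values such as 300 that the digit pattern alone lets through.
    return if grep { $_ > 255 } @octets;

    return @octets;
}

for my $candidate ( '138.63.20.48', '300.1.1.1', 'Company7' ) {
    my @octets = parse_ipv4($candidate);
    print "$candidate: ", ( @octets ? "valid" : "not an IPv4 address" ), "\n";
}
```

Checking the numeric value of each octet separately keeps the pattern short; a single regex that enforces the 0-255 limit is possible but much harder to read.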
Re: Compare two files (Ip addresses) by graff (Chancellor) on May 16, 2007 at 02:23 UTC
I think it would make more sense to read File2 first, to get the labels that need to be associated with the various IP ranges. If the known IP values are treated as hash keys (and company names are the hash values), it becomes very simple to look up the addresses in File1 and spit out the company name when there's a match.

You just need to make sure to handle the IP-range issues properly. If I understand the question, a File2 entry like "Company6-55.207.128.0-55.207.143.255" would be a hit for any File1 IP that starts with 55.207 and whose third component falls between 128 and 143. Something like this could get you started:

use strict;
use warnings;

my %ip_company;

# Read the company ranges (File2) first, keyed on the first three octets.
open( my $ranges, '<', "File2" ) or die "File2: $!";
while (<$ranges>) {
    s/\s+$//;    # drop the newline and any trailing whitespace
    my ( $company, $bgn_IP, $end_IP ) = split /-/;
    next unless defined $end_IP;
    next unless ( $bgn_IP =~ /^(\d+\.\d+\.)(\d+)\.\d+$/ );
    my ( $bgn_q12, $bgn_q3 ) = ( $1, $2 );
    next unless ( $end_IP =~ /^(\d+\.\d+\.)(\d+)\.\d+$/ );
    my ( $end_q12, $end_q3 ) = ( $1, $2 );
    # NB: if $bgn_q12 ne $end_q12, we need some different logic...
    $ip_company{$bgn_q12.$bgn_q3} = $company;
    if ( $bgn_q3 != $end_q3 ) {
        for my $next_q3 ( $bgn_q3+1 .. $end_q3 ) {
            $ip_company{$bgn_q12.$next_q3} = $company;
        }
    }
}
close $ranges;

# Then look up each File1 address by its first three octets.
open( my $ips, '<', "File1" ) or die "File1: $!";
while (<$ips>) {
    s/\s+$//;
    next unless length;    # skip blank lines
    ( my $lookup = $_ ) =~ s/\.\d+$//;
    if ( exists( $ip_company{$lookup} )) {
        print "$_ is part of $ip_company{$lookup}\n";
    }
    else {
        print "$_ is not part of any known company\n";
    }
}

(not tested)

Handling a range like "123.45.67.89-123.89.45.67" is left as an exercise... (or maybe you don't have to go there).

(update: fixed last sentence so it makes sense)
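One way to approach the range case left as an exercise above is to convert every address to a 32-bit integer and compare numerically, which also honours the fourth octet. This is a sketch under stated assumptions rather than the method from the reply: `ip_to_int`, `parse_range`, and `companies_for` are hypothetical names, the range lines follow the "Name-start-end" layout shown in File2, and company names contain no hyphens.

```perl
use strict;
use warnings;

# Convert a dotted-quad string to an unsigned 32-bit integer.
sub ip_to_int {
    my ($ip) = @_;
    my @octets = $ip =~ /^(\d+)\.(\d+)\.(\d+)\.(\d+)$/
        or die "Not an IPv4 address: '$ip'\n";
    return unpack 'N', pack 'C4', @octets;
}

# Parse one "Name-start-end" line into [ name, start_int, end_int ].
# Assumes the name itself contains no hyphen.
sub parse_range {
    my ($line) = @_;
    $line =~ s/^\s+|\s+$//g;
    my ( $name, $start, $end ) = split /-/, $line, 3;
    return unless defined $end;
    return [ $name, ip_to_int($start), ip_to_int($end) ];
}

# Return the names of all ranges that contain $ip (linear scan).
sub companies_for {
    my ( $ip, @ranges ) = @_;
    my $n = ip_to_int($ip);
    return map { $_->[0] } grep { $_->[1] <= $n && $n <= $_->[2] } @ranges;
}

my @ranges = map { parse_range($_) } (
    'Company3-63.208.170.5-63.208.170.254',
    'Company7-138.63.20.12-138.63.20.95',
);

for my $ip ( '138.63.20.48', '63.236.1.136' ) {
    my @hits = companies_for( $ip, @ranges );
    print "$ip: ", ( @hits ? join( ', ', @hits ) : 'no match' ), "\n";
}
# Prints:
# 138.63.20.48: Company7
# 63.236.1.136: no match
```

Because the comparison is numeric, a range that spans several second or third octets, such as 123.45.67.89-123.89.45.67, needs no special handling. The linear scan is fine for a handful of ranges; a large range file would likely call for sorting the ranges and using a binary search.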
Re: Compare two files (Ip addresses) by thezip (Vicar) on May 15, 2007 at 22:52 UTC
Your data representation in File2 does not match the filespec you describe verbally (i.e., Company2 spans subnets, and hence it is unclear how or why it would match the entries from File1 if only the last octet is to be compared).

Do you have any control over how the data will be represented in the files? If you do, you might benefit from redesigning the data file layouts.

Also, please reformat your question with code tags, as in:

<code>
... your tidy Perl code goes here ...
</code>

Where do you want them* to go today?*