ewhitt has asked for the wisdom of the Perl Monks concerning the following question:

I could use some regex help. I am trying to extract every IP address from many text files. The IP addresses can appear anywhere in the files. What I can't figure out using the following code is:
if ($ip2 =~ m/(\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3})/)
1) How to ensure each octet doesn't exceed 255 (e.g. 300.1.2.2 is not a valid IP address)

2) How to distinguish subnet masks from IP addresses. A mask should always have at least 8 bits set, so I was thinking of matching "255" in the first octet to identify a mask. I can't think of a better way.

Thoughts?

Thanks!

/Ethan

Replies are listed 'Best First'.
Re: Parsing IPv4 Addresses and distinguishing Masks
by bobf (Monsignor) on Dec 08, 2007 at 04:14 UTC

    Regexp::Common::net should get you started:

    use Regexp::Common qw(net);
    $ip2 =~ m/^$RE{net}{IPv4}$/;
    (note that the regex is anchored, so $ip2 must contain an address and nothing else).

    I'm afraid I can't help you with question 2 - perhaps another monk has an idea.

    Update to clarify my response about the subnet mask:
    I guess if you were looking for traditional masks you could identify those that start with 255, but without knowledge of the networks you're investigating, I have a feeling that approach won't be very robust.

Re: Parsing IPv4 Addresses and distinguishing Masks
by TOD (Friar) on Dec 08, 2007 at 04:20 UTC
    first of all, you have to escape the period so that it matches a literal dot:
    /(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/
    because otherwise you mean "one to three digits followed by any character", which might not be what you have in mind. but now to your questions:

    1) How to ensure each octet doesn't exceed 255 (e.g. 300.1.2.2 is not a valid IP address)
    i would rewrite the regex and the controlling if:
    my $ip;
    if ($line =~ /(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})/
        && $1 <= 255 && $2 <= 255 && $3 <= 255 && $4 <= 255) {
        $ip = "$1.$2.$3.$4";
    }
    2) How to distinguish subnet masks from IP address.
    the answer is quite simple now, since we already have all four octets:
    my ($ip, $subnet);
    if ($line =~ /(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})/
        && $1 <= 255 && $2 <= 255 && $3 <= 255 && $4 <= 255) {
        $ip = "$1.$2.$3.$4";
        $subnet = 1 if $1 == 255;
    }
    --------------------------------
    masses are the opiate for religion.
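
Since the original question asks for every address in a file, not just the first one on a line, here is a minimal sketch of the same octet check applied with a /g match. The helper name extract_ipv4 and the sample line are made up for illustration; the lookarounds are an added assumption meant to keep the pattern from matching inside a longer run such as 1.2.3.4.5, and they are not part of the reply above.

```perl
use strict;
use warnings;

# Hypothetical helper: return every dotted quad in $text whose
# four octets are all in the range 0..255.
sub extract_ipv4 {
    my ($text) = @_;
    my @found;
    # The lookbehinds reject a match that starts right after a digit
    # or after "digit dot"; the lookahead rejects one that is followed
    # by a digit or by "dot digit". A trailing full stop is still allowed.
    while ($text =~ /(?<!\d)(?<!\d\.)(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})(?!\d|\.\d)/g) {
        my @octets = ($1, $2, $3, $4);
        next if grep { $_ > 255 } @octets;   # e.g. 300.1.2.2
        push @found, join '.', @octets;
    }
    return @found;
}

# Sample input (invented for this sketch)
my $line = "src 10.11.12.13 mask 255.255.255.0 bad 300.1.2.2 end 192.168.1.1.";
print "$_\n" for extract_ipv4($line);
```

For that sample line the sketch prints 10.11.12.13, 255.255.255.0 and 192.168.1.1, one per line. Masks still come back mixed in with the addresses, so question 2 needs a separate test on each result.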
Re: Parsing IPv4 Addresses and distinguishing Masks
by Anonymous Monk on Dec 08, 2007 at 06:01 UTC
    1)
    $ perl -le'
    use Socket;
    my @tests = ( "127.1", "127.0.0.3", "10.11.12.13", "300.1.2.2" );
    for my $test ( @tests ) {
        print "$test is ", inet_aton( $test ) ? "" : "NOT ", "a valid IP address.";
    }
    '
    127.1 is a valid IP address.
    127.0.0.3 is a valid IP address.
    10.11.12.13 is a valid IP address.
    300.1.2.2 is NOT a valid IP address.

    2)
    $ perl -le'
    use Socket;
    my @tests = ( "255.243.0.0", "255.240.0.0", "255.255.255.0", "255.255.248.0" );
    for my $test ( @tests ) {
        print "$test is ", unpack( "B*", inet_aton( $test ) ) =~ /^1+0+$/ ? "" : "NOT ", "a subnet mask.";
    }
    '
    255.243.0.0 is NOT a subnet mask.
    255.240.0.0 is a subnet mask.
    255.255.255.0 is a subnet mask.
    255.255.248.0 is a subnet mask.
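
The bit-pattern test above can be wrapped in a small reusable function. This is only a sketch: the name looks_like_netmask is hypothetical, and it swaps inet_aton for a plain pack of the four octets, because inet_aton also accepts short forms such as 127.1 and will try to resolve hostnames, which may not be wanted when scanning arbitrary text.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $addr is a strict dotted quad whose
# 32 bits are a run of 1s followed by a run of 0s.
sub looks_like_netmask {
    my ($addr) = @_;
    # Require exactly four numeric fields; bail out otherwise.
    my @octets = $addr =~ /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/
        or return 0;
    return 0 if grep { $_ > 255 } @octets;
    # Pack the octets into 4 bytes, then view them as a 32-character bit string.
    my $bits = unpack('B32', pack('C4', @octets));
    return $bits =~ /^1+0+$/ ? 1 : 0;
}

for my $test ("255.243.0.0", "255.240.0.0", "255.255.255.0", "10.11.12.13") {
    printf "%s %s a mask\n", $test,
        looks_like_netmask($test) ? "looks like" : "does not look like";
}
```

Note that the bit pattern alone cannot settle question 2: 192.0.0.0 and 128.0.0.0 also consist of 1s followed by 0s, so a real address of that shape would be reported as a mask. Combining this test with the "first octet is 255" rule from the question narrows it down, at the cost of rejecting masks shorter than 8 bits. Like the one-liner, this pattern also rejects 255.255.255.255 and 0.0.0.0.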
Re: Parsing IPv4 Addresses and distinguishing Masks
by NetWallah (Canon) on Dec 09, 2007 at 01:53 UTC
    If you will be manipulating IP addresses/masks, you should take advantage of a number of excellent modules that were designed by experts for that purpose.

    I recommend NetAddr::IP. If the "new" function returns a defined object, you have a valid IP address that you can manipulate (and perform more rigorous tests with).

         "As you get older three things happen. The first is your memory goes, and I can't remember the other two... " - Sir Norman Wisdom