In private conversations with ybiC, we discussed one of the major problems with trying to use regex-en to test an IP for validity: whether it makes sense to be using it in a particular application. For instance, it would not make sense to use a class D or E address when configuring a PC. Or, in many cases, addresses in RFC-defined private addressing space would not be appropriate, but is the case under examination one where such an address is appropriate? To truly test for validity of the address would thus seem to require knowledge specific to the application and its environment, either coded into the application, or determined by some form of active testing.

If appropriate, one direction you can go is to remove the user's ability to cause errors by presenting them with a list of valid addresses to select from, which is the approach I have taken in one of the applications I have written for work. That list, however, depends upon the addresses entered into it being valid, and so again the problem raises its head.

In the case of adding the addresses for the application I mentioned, unfortunately I can only truly depend upon the vigilance of the administrators adding the data the application will pull from to make sure it is correct and valid, as I can only test for those cases where the data is formatted incorrectly, not where it is valid but inappropriate.

In discussing this with ybiC, we found there are cases that fall into ranges that can serve as useful filters, such as the aforementioned class D/E address space, the localhost addressing space, or the RFC-defined private address space. To that end, I offer what I hope are some useful filters that may aid in this. Assuming we have validated that the format is proper (remembering both the "Traps and Snares" and "Multiple Representations" sections above), let us first convert the address in question to a number (my apologies if there are errors in these, as I generally only use the a.b.c.d format). Having done so, it is now much easier to filter, or to convert to whichever format is needed (by doing much the reverse of the ip2bin? functions). Now, sample code.

sub ip2bin4 {
    my $ip = shift;    # ip format: a.b.c.d
    return(unpack("N", pack("C4", split(/\D/, $ip))));
}

sub ip2bin3 {
    my $ip = shift;    # ip format: a.b.c (c is a 16-bit value)
    # "n" packs c as a big-endian 16-bit value on every platform
    return(unpack("N", pack("C2n", split(/\D/, $ip))));
}

sub ip2bin2 {
    my $ip = shift;    # ip format: a.b (b is a 24-bit value)
    my ($net, $host) = split(/\D/, $ip);
    return(($net << 24) | $host);
}

sub ip2bin1 {
    my $ip = shift;    # ip format: a - for consistency
    return($ip);
}

sub is_rfc_private {
    my $address = shift;
    return(1) if ((0x0A000000 <= $address) and ($address <= 0x0AFFFFFF));    # 10.0.0.0/8
    return(1) if ((0xAC100000 <= $address) and ($address <= 0xAC1FFFFF));    # 172.16.0.0/12
    return(1) if ((0xC0A80000 <= $address) and ($address <= 0xC0A8FFFF));    # 192.168.0.0/16
    return(0);
}

sub is_class_d {
    my $address = shift;
    return(1) if ((0xE0000000 <= $address) and ($address <= 0xEFFFFFFF));    # 224.0.0.0/4
    return(0);
}

sub is_class_e {
    my $address = shift;
    return(1) if ((0xF0000000 <= $address) and ($address <= 0xFFFFFFFF));    # 240.0.0.0/4
    return(0);
}

sub is_localhost {
    my $address = shift;
    return(1) if ((0x7F000000 <= $address) and ($address <= 0x7FFFFFFF));    # 127.0.0.0/8
    return(0);
}

sub is_linklocal {
    my $address = shift;
    return(1) if ((0xA9FE0000 <= $address) and ($address <= 0xA9FEFFFF));    # 169.254.0.0/16
    return(0);
}

sub is_testnet {
    my $address = shift;
    return(1) if ((0xC0000200 <= $address) and ($address <= 0xC00002FF));    # 192.0.2.0/24
    return(0);
}
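For the conversion back to a.b.c.d mentioned above ("much the reverse of the ip2bin? functions"), something like the following might serve as a minimal sketch. The name bin2ip4 is my own and purely illustrative; it assumes the input is already a number in the range 0 to 0xFFFFFFFF.

```perl
use strict;
use warnings;

# Hypothetical helper (illustrative name): turn a 32-bit number back into
# a.b.c.d notation by reversing the pack/unpack used in ip2bin4.
sub bin2ip4 {
    my $address = shift;
    # "N" packs a 32-bit big-endian value; "C4" reads it back as four bytes.
    return(join(".", unpack("C4", pack("N", $address))));
}

print bin2ip4(0x7F000001), "\n";    # prints 127.0.0.1
print bin2ip4(0xC0A80001), "\n";    # prints 192.168.0.1
```

Because the packing goes through "N" (network byte order), the result should not depend on the byte order of the machine running it.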

Other, similar tests could, I believe, easily be written from this point; these were only examples. Admittedly, while I am sure there are modules on CPAN that perform tests of this type, I do not know them off-hand, so I welcome the input of others.
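As one way of writing such further tests without working out the hex bounds by hand each time, a single routine could take the range in CIDR notation. This is only a sketch under my own assumptions: in_cidr is an illustrative name, the address is expected as a number (as produced by the ip2bin? routines), and the block is expected in well-formed a.b.c.d/bits form.

```perl
use strict;
use warnings;

# Hypothetical helper (illustrative name): returns 1 if the numeric
# $address lies inside the block given as "a.b.c.d/bits", else 0.
sub in_cidr {
    my ($address, $cidr) = @_;
    my ($base, $bits) = split(m{/}, $cidr);
    my $network = unpack("N", pack("C4", split(/\./, $base)));
    # Build a mask with the top $bits bits set; /0 is handled separately
    # so that we never shift by the full word width.
    my $mask = ($bits == 0) ? 0 : ((0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF);
    return((($address & $mask) == ($network & $mask)) ? 1 : 0);
}

print in_cidr(0xAC1F0001, "172.16.0.0/12"), "\n";     # 172.31.0.1  -> prints 1
print in_cidr(0xAC200001, "172.16.0.0/12"), "\n";     # 172.32.0.1  -> prints 0
print in_cidr(0xA9FE0101, "169.254.0.0/16"), "\n";    # 169.254.1.1 -> prints 1
```

Writing the range as a CIDR string keeps the bounds next to the notation people actually read, which may make a mistyped hex constant less likely.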

It is important to remember that to truly validate the representation of an IP address, regex-en are but one part, as one must understand the environment in which it is to be used.

Update: Extended comments in ip2bin? subroutines.

Update: Fixed bug in test for 172.16.0.0, because of incorrect CIDR (was /16, is /12).

Update: Added routines for Link-Local (169.254.0.0/16) and TEST-NET (192.0.2.0/24) address ranges.

Update: Fixed typo in code.

Update: (17 Mar 2005) Fixed missing '(' in conditions in is_linklocal and is_testnet functions.


In reply to Re: Don't Use Regular Expressions To Parse IP Addresses! by atcroft
in thread Don't Use Regular Expressions To Parse IP Addresses! by ybiC
