RFC 1912 spells out the rules for naming a machine on a network. Here's a quote:

DNS domain names consist of "labels" separated by single dots. Allowable characters in a label for a host name are only ASCII letters, digits, and the `-' character. Labels may not be all numbers, but may have a leading digit (e.g., 3com.com). Labels must end and begin only with a letter or digit.

This seems simple enough to write as one massive regex, but I've had little to no luck making _everything_ work. So here's what I'm using (after testing for bad characters with tr/a-zA-Z0-9.\-//c):
$error = 1 if ($hostname =~ /\.$/);     # trailing . is bad
my @labels = split(/\./, $hostname);
foreach my $foo (@labels) {
    $error = 1 if ($foo =~ /^-/);       # can't start with a -
    $error = 1 if ($foo =~ /-$/);       # can't end with a -
    $error = 1 if ($foo =~ /^\d+$/);    # can't be only numeric
    last if $error;
}
if ($error) {
    print "Not a hostname!\n";
} else {
    print "A hostname!\n";
}
This works, but doesn't look very JAPH-esque. I started to write a regex, but after it got longer than 1 line I gave up (faster development always beats l33ter code). Here's what I started on:
$hostname =~ /^[^-]([a-zA-Z\-])+[^-](\.[^-][a-zA-Z0-9\-]+?[^-])+?/;
Ugh. Each [^-] has to consume a character of its own at the start of a label (and another at the end), so this failed when a label was shorter than 3 characters. Anyone have an idea on how to implement this in one line?
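
One possible way around the 3-character problem is to make the middle and end of each label optional as a unit, and to handle the "not all numbers" rule with a negative lookahead. This is only a sketch of the rules quoted above, not a complete hostname validator: it does not check label or name lengths, and the names $label, $hostname_re and is_hostname() are made up for illustration.

```perl
use strict;
use warnings;

# One label: starts and ends with a letter or digit, hyphens only inside.
# The negative lookahead rejects a label that is digits all the way to the
# next dot or to the end of the string.
my $label = qr/(?![0-9]+(?:\.|\z))[a-zA-Z0-9](?:[a-zA-Z0-9-]*[a-zA-Z0-9])?/;

# Whole name: labels joined by single dots, no leading or trailing dot.
my $hostname_re = qr/\A$label(?:\.$label)*\z/;

# is_hostname() is a hypothetical helper name, not from the original post.
sub is_hostname {
    my ($name) = @_;
    return (defined $name && $name =~ $hostname_re) ? 1 : 0;
}

for my $name (qw(3com.com a.b foo-.com -foo.com 123.com example.com.)) {
    printf "%-14s %s\n", $name, is_hostname($name) ? "ok" : "bad";
}
```

Because the character classes already exclude everything except letters, digits and "-", the separate tr check for bad characters should not be needed with this pattern. The two qr// pieces can be pasted together into a single line if that is the goal, at some cost in readability.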

Bonus Question:

More from the RFC:

You should also be careful to not have addresses which are valid alternate syntaxes to the inet_ntoa() library call. For example 0xe is a valid name, but if you were to type "telnet 0xe", it would try to connect to IP address 0.0.0.14. It is also rumored that there exists some broken inet_ntoa() routines that treat an address like x400 as an IP address.


Any ideas? Perhaps evaling an inet_ntoa call on each label?
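
A rough sketch of one alternative to calling the library routine: check the syntax with a pattern instead. Some assumptions here: the routine that parses such strings is normally inet_aton() (inet_ntoa() goes the other way, from address to string), and as far as I know Perl's Socket::inet_aton falls back to a name lookup when the string is not a numeric address, so it is not a pure syntax test. The pattern below treats one to four dot-separated parts, each hex (0x...) or plain digits, as "looks like an address". It is deliberately loose, and looks_like_inet_address() is a made-up name.

```perl
use strict;
use warnings;

# One numeric part as C-style parsers read it: hex (0x...) or a run of
# digits (decimal or octal). Loose on purpose, so it over-rejects a little.
my $num = qr/(?:0[xX][0-9a-fA-F]+|[0-9]+)/;

# looks_like_inet_address() is a hypothetical helper name.
# One to four numeric parts separated by dots, e.g. 0xe, 10.1, 127.0.0.1.
sub looks_like_inet_address {
    my ($name) = @_;
    return ($name =~ /\A$num(?:\.$num){0,3}\z/) ? 1 : 0;
}

for my $name (qw(0xe 0xe.com x400 127.0.0.1 3com.com)) {
    printf "%-10s %s\n", $name,
        looks_like_inet_address($name) ? "address-like" : "name-like";
}
```

This only covers the documented alternate syntaxes. It cannot predict what a broken routine does with something like x400, so that case would still need its own rule.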

BlueLines

Disclaimer: This post may contain inaccurate information, be habit forming, cause atomic warfare between peaceful countries, speed up male pattern baldness, interfere with your cable reception, exile you from certain third world countries, ruin your marriage, and generally spoil your day. No batteries included, no strings attached, your mileage may vary.

In reply to Matching RFC1912 compliant hostnames by BlueLines