A very quick skim of both RFC1034 and RFC1035 only gives the following grammar as a "should" - if there are more rules in the text, I haven't found them yet. RFC2181 ("Clarifications to the DNS Specification") does say "Occasionally it is assumed that the Domain Name System serves only the purpose of mapping Internet host names to data, and mapping Internet addresses to host names. This is not correct, the DNS is a general (if somewhat limited) hierarchical database, and can store almost any kind of data, for almost any purpose."

<domain> ::= <subdomain> | " " <subdomain> ::= <label> | <subdomain> "." <label> <label> ::= <letter> [ [ <ldh-str> ] <let-dig> ] <ldh-str> ::= <let-dig-hyp> | <let-dig-hyp> <ldh-str> <let-dig-hyp> ::= <let-dig> | "-" <let-dig> ::= <letter> | <digit> <letter> ::= any one of the 52 alphabetic characters A through Z in upper case and a through z in lower case <digit> ::= any one of the ten digits 0 through 9

So splitting on dots seems to be fairly safe. If you wanted to go full CPAN on this:

use Net::DNS::DomainName; my $host = "foo.example.com"; my $bare = (Net::DNS::DomainName->new($host)->label)[0]; print "$bare\n"; # prints "foo"

Although according to Net::DNS::DomainName, several of the examples in your tests are not valid domain names.

For IP addresses, you might use Regexp::Common::net to identify those, and as for port numbers and query strings, I might use URI (if you are dealing with URLs, at least).


In reply to Re: Extracting a bald host name by haukex
in thread Extracting a bald host name by eyepopslikeamosquito

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.