hrcerq has asked for the wisdom of the Perl Monks concerning the following question:
Hello again.
I've always used naive regexps for hostname validation, but recently I've been trying to build something more robust and more compliant with the relevant RFCs.
Mostly, I've consulted the following RFCs:
From that I understand that:
If the hostname is qualified (i.e. there are at least 2 labels), then:
BTW, consulting RFCs sometimes feels like walking a complex maze full of hidden traps, because there's always some obscure detail you might overlook.
Things get worse if we consider that some hostnames in the wild don't follow these rules (e.g. some use underscores, which are valid in DNS names but not in hostnames), and that internationalized domain names exist as well.
I've tested my regex, but chances are there are corner cases I'm not aware of, so maybe some of you can help me find them.
This is what I have so far:
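my $hname_re = qr/
    ^
    (?=(?&validchar){1,255}$)     # whole name: 1 to 255 allowed characters
    (?!\d+$)                      # not made of digits only
    (?&label)
    (?: (?:\.(?&label))* \.(?&tld) \.? )?
    $
    (?(DEFINE)
        (?<validchar> [a-zA-Z0-9.-] )
        (?<alnum>     [a-zA-Z0-9] )
        (?<alnumdash> [a-zA-Z0-9-] )
        # {0,61} rather than {,61}: the short form is a quantifier only since Perl 5.34
        (?<label>     (?> (?&alnum) (?: (?&alnumdash){0,61} (?&alnum) )? ) )
        (?<tld>       (?!(\d+|.)\.?$) (?&label) )
    )
/x;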
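One way to hunt for corner cases is a small table of inputs with the result each one is expected to give. The sketch below is only that, a sketch: the helper name `is_hostname` and the choice of cases are my own assumptions, and the pattern is the one above with the label quantifier written as `{0,61}` (the `{,61}` form is only parsed as a quantifier from Perl 5.34 on). The expected values describe what the pattern does as written, not necessarily what the RFCs require.

```perl
use strict;
use warnings;

# The pattern under test, with the label quantifier spelled {0,61}.
my $hname_re = qr/
    ^
    (?=(?&validchar){1,255}$)
    (?!\d+$)
    (?&label)
    (?: (?:\.(?&label))* \.(?&tld) \.? )?
    $
    (?(DEFINE)
        (?<validchar> [a-zA-Z0-9.-] )
        (?<alnum>     [a-zA-Z0-9] )
        (?<alnumdash> [a-zA-Z0-9-] )
        (?<label>     (?> (?&alnum) (?: (?&alnumdash){0,61} (?&alnum) )? ) )
        (?<tld>       (?!(\d+|.)\.?$) (?&label) )
    )
/x;

# Hypothetical helper: 1 if the pattern accepts the name, 0 otherwise.
sub is_hostname {
    my ($name) = @_;
    return $name =~ $hname_re ? 1 : 0;
}

# Each case: [ input, result expected from the pattern as written, note ]
my @cases = (
    [ 'localhost',                  1, 'single label' ],
    [ 'example.com',                1, 'plain qualified name' ],
    [ 'example.com.',               1, 'trailing root dot' ],
    [ '123.example.com',            1, 'label made of digits only' ],
    [ 'xn--mnchen-3ya.de',          1, 'IDN in its ASCII (A-label) form' ],
    [ ('a' x 63) . '.com',          1, 'label of exactly 63 characters' ],
    [ "example.com\n",              1, 'trailing newline slips past $' ],
    [ '',                           0, 'empty string' ],
    [ '12345',                      0, 'digits only' ],
    [ '-example.com',               0, 'label starts with a dash' ],
    [ 'example-.com',               0, 'label ends with a dash' ],
    [ 'exa_mple.com',               0, 'underscore' ],
    [ 'example..com',               0, 'empty label' ],
    [ 'example.123',                0, 'numeric last label' ],
    [ '1.2.3.4',                    0, 'looks like an IPv4 address' ],
    [ ('a' x 64) . '.com',          0, 'label of 64 characters' ],
    [ join('.', ('a' x 63) x 4) . '.com', 0, 'more than 255 characters' ],
);

my $failures = 0;
for my $case (@cases) {
    my ($input, $expected, $note) = @$case;
    my $got = is_hostname($input);
    $failures++ if $got != $expected;

    # Make the input printable: show newlines, shorten very long names.
    (my $shown = $input) =~ s/\n/\\n/g;
    $shown = substr($shown, 0, 30) . '...' if length $shown > 33;

    printf "%-6s %-36s %s\n", ($got == $expected ? 'ok' : 'NOT OK'), "'$shown'", $note;
}
print $failures ? "$failures unexpected result(s)\n" : "all cases behaved as listed\n";
```

The case worth a second look is the trailing newline: without `/m`, `$` also matches just before a newline at the very end of the string, so `"example.com\n"` is accepted. If the input is not chomped beforehand, anchoring with `\z` instead of `$` would reject it.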
Thanks for any suggestions.
return on_success() or die;
Replies are listed 'Best First'.

Re: Regex for hostname validation
by Discipulus (Canon) on May 02, 2025 at 08:05 UTC
by hrcerq (Monk) on May 03, 2025 at 01:03 UTC

Re: Regex for hostname validation
by Fletch (Bishop) on May 02, 2025 at 11:38 UTC
by hrcerq (Monk) on May 03, 2025 at 01:05 UTC