in reply to Re^2: regex to extract fully-qualified domain name from full URL
in thread regex to extract fully-qualified domain name from full URL


ooo, you're quite right - https completely slipped my mind.
I'd have to agree with
^(https?://[^/]*)
But frankly, if you're going to include mailto, I think by rights all the other (multifarious) possibilities ought to match as well... in which case it really is time to reach for a module, as you initially suggested.
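For what it's worth, here is a minimal sketch of both approaches side by side. It is written in Python purely for illustration (the pattern is the same in any Perl-style regex engine), and the helper names `prefix_by_regex` and `host_by_parser` are made up for this example. Note the regex captures the scheme plus everything up to the first slash, so any username, password or port comes along with the hostname.

```python
import re
from urllib.parse import urlsplit

# The pattern from the post: scheme plus everything up to the first slash.
PATTERN = re.compile(r'^(https?://[^/]*)')

def prefix_by_regex(url):
    """Return 'scheme://authority' for http(s) URLs, or None if no match."""
    m = PATTERN.match(url)
    return m.group(1) if m else None

def host_by_parser(url):
    """Return the hostname using the standard-library parser (any scheme)."""
    return urlsplit(url).hostname

print(prefix_by_regex("https://www.example.com/path/page.html"))  # https://www.example.com
print(prefix_by_regex("mailto:someone@example.com"))              # None
print(host_by_parser("ftp://user:secret@ftp.example.com:21/pub/"))  # ftp.example.com
```

The regex is fine while only http and https matter; once other schemes come into it, a parser (or the equivalent module in your language of choice) saves you from enumerating them by hand.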

I disagree with you about possible slashes, at signs and other funny characters in the username and password though - my (cursory) examination of the RFCs indicates they're both 'unsafe' and 'reserved' - and RFC1738 says: "Within the user and password field, any ":", "@", or "/" must be encoded." Not sure this is still the current RFC though (?). And this seems to make sense, given the slash is a delimiter within the URL.
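To make the encoding point concrete, here is a rough sketch, again in Python for illustration only. The `build_url` helper and the sample credentials are hypothetical. It percent-encodes ":", "@" and "/" in the username and password, so the first literal slash after the scheme still marks the end of the host part.

```python
from urllib.parse import quote, unquote, urlsplit

def build_url(user, password, host, path="/"):
    """Assemble an http URL, percent-encoding the user and password fields."""
    # safe="" makes quote() encode "/" as well, along with ":" and "@".
    userinfo = quote(user, safe="") + ":" + quote(password, safe="")
    return "http://" + userinfo + "@" + host + path

url = build_url("an/dy", "p@ss:word", "www.example.com", "/index.html")
print(url)  # http://an%2Fdy:p%40ss%3Aword@www.example.com/index.html

parts = urlsplit(url)
print(parts.hostname)           # www.example.com
print(unquote(parts.username))  # an/dy
```

With the credentials encoded like this, a pattern that stops at the first "/" never cuts the URL short inside the username or password.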

andy.

looking into it further... RFC1738 superseded by RFC2396... but I need to go and do some Real Work... ;)
