in reply to Re: regex to extract fully-qualified domain name from full URL
in thread regex to extract fully-qualified domain name from full URL

To be more thorough, perhaps:
    ^(https?://[^/]*)
Or further:
    ^((?:https?|mailto)://[^/]*)
You should also hope that your 'username' and 'password' do not contain any slashes. The only restriction would appear to be that the username cannot contain a ':', and the password cannot contain an '@', though this could be browser dependent.
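To make the slash caveat concrete, here is a rough sketch of the first pattern in use. It is written in Python rather than Perl purely for illustration, and the helper name `url_prefix` and the example.com URLs are made up for the demo.

```python
import re

# First pattern from the post: the scheme plus everything up to the first slash.
PREFIX = re.compile(r"^(https?://[^/]*)")

def url_prefix(url):
    """Return the scheme://authority part, or None if the URL is not http(s)."""
    m = PREFIX.match(url)
    return m.group(1) if m else None

# A plain URL: the match stops at the slash that starts the path.
print(url_prefix("https://www.example.com/path/page.html"))

# An unencoded slash in the password cuts the match short,
# so the host name is never reached.
print(url_prefix("http://user:pa/ss@www.example.com/index.html"))
```

The second call returns only `http://user:pa`, which is exactly the failure the slash warning is about.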

Re: Re: Re: regex to extract fully-qualified domain name from full URL
by andye (Curate) on Mar 23, 2001 at 15:15 UTC

    ooo, you're quite right - https completely slipped my mind.
    I'd have to agree with
    ^(https?://[^/]*)
    But frankly, if you're going to include mailto, I think by rights all the other (multifarious) possibilities ought to match as well... in which case it really is time to reach for a module, as you initially suggested.
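As a sketch of the "reach for a module" route: the thread presumably has a Perl URI module in mind, but the same idea can be shown with Python's standard `urllib.parse`, used here only as a stand-in. The helper name `host_of` and the sample URLs are hypothetical.

```python
from urllib.parse import urlsplit

def host_of(url):
    """Return the host name of a URL, or None if it has no authority part."""
    return urlsplit(url).hostname

# The parser separates user, password, host and port without a hand-written regex.
parts = urlsplit("https://andy:secret@www.example.com:8443/some/path")
print(parts.username, parts.password, parts.hostname, parts.port)

# mailto URLs have no "//" authority section, so there is no host to report.
print(host_of("mailto:someone@example.com"))
```

Note that a `mailto:` URL carries no `//` at all, so a pattern that requires `://` after the scheme would not match one.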

    I disagree with you about possible slashes, at signs and other funny characters in the username and password, though. My (cursory) examination of the RFCs indicates that they're both 'unsafe' and 'reserved', and RFC1738 says: 'Within the user and password field, any ":", "@", or "/" must be encoded.' I'm not sure this is still the current RFC, though (?). And this seems to make sense, given that the slash is a delimiter within the URL.
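A small sketch of what that encoding rule means in practice, again in Python for illustration only. The helper names `encode_userinfo` and `url_prefix`, the credentials, and the host are all invented for the example.

```python
import re
from urllib.parse import quote

PREFIX = re.compile(r"^(https?://[^/]*)")

def encode_userinfo(text):
    """Percent-encode a user name or password, including ':', '@' and '/'."""
    return quote(text, safe="")

def url_prefix(url):
    """Return the scheme://authority part, or None if the URL is not http(s)."""
    m = PREFIX.match(url)
    return m.group(1) if m else None

# A password containing all three delimiters becomes harmless once encoded.
password = encode_userinfo("p@ss/w:rd")
url = "http://andy:" + password + "@www.example.com/index.html"
print(url)
print(url_prefix(url))
```

Once the password is encoded, the first literal slash in the URL is the one that starts the path, so the `[^/]*` pattern stops in the right place.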

    andy.

    looking into it further... RFC1738 superseded by RFC2396... but I need to go and do some Real Work... ;)