PerlMonks  

Re: Extracting a domain name from a url

by blazar (Canon)
on Oct 15, 2006 at 15:27 UTC ( [id://578392] )


in reply to Extracting a domain name from a url

From -> http://yahoo.co.uk/notfound.html
extract only -> yahoo.co.uk

rhesa already suggested using a specialized module, which is generally the superior solution, but this shouldn't be hard to do with a match or split, in which case some familiarity with elementary regexen should help you. In particular, the following should be fine for you:

my $url  = 'http://yahoo.co.uk/notfound.html';
my $host = (split m(/+), $url)[1];
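To see why index 1 is the host, it may help to look at every piece the split produces. The following is only a minimal sketch using the same URL; the array name @parts is illustrative and not part of the original snippet.

```perl
use strict;
use warnings;

my $url = 'http://yahoo.co.uk/notfound.html';

# Split on runs of one or more slashes, so "//" counts as a single separator.
my @parts = split m{/+}, $url;

# @parts now holds ('http:', 'yahoo.co.uk', 'notfound.html'),
# which is why element 1 is the host for a URL of this shape.
my $host = $parts[1];
print "$host\n";
```

This relies on the URL having a scheme followed by "//"; anything between the slashes, such as a port or login details, ends up in that element too.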

Replies are listed 'Best First'.
Re^2: Extracting a domain name from a url
by rhesa (Vicar) on Oct 15, 2006 at 18:26 UTC
    Yup, your solution does work for the majority of urls.

    For the record, here are some urls that wouldn't be handled properly with your regex:

    http://proxy.aol.com:8080/
    http://user:pass@yahoo.com/login
      Thanks monks, as always you shed light on things I cannot see :)
      This one is one I've been using in production for a while, and seems to hold up well:

      my $url = "http://proxy.aol.com:8080/login";
      my ($host) = $url =~ /http:\/\/([^\/]+)/;

      word!
      -Ev

      Update: Sorry, I'm a moe - I forgot to add the point of this post in my example!

      Good judgement comes with experience. Unfortunately, the experience usually comes from bad judgement.

        Your code fails with the input you gave it. Furthermore, it doesn't work with the other example in the post to which you replied either.

        http://proxy.aol.com:8080/ gives proxy.aol.com:8080 instead of proxy.aol.com.

        http://user:pass@yahoo.com/login gives user:pass@yahoo.com instead of yahoo.com.
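For what it's worth, a regex that skips an optional login part and stops at the port would cope with both of those inputs. This is only a sketch under stated assumptions: the helper name host_only is hypothetical, it accepts only http and https, and it does not try to handle IPv6 literals or other corner cases that a dedicated URL-parsing module would cover.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the bare host, or undef if the URL does not match.
sub host_only {
    my ($url) = @_;
    my ($host) = $url =~ m{
        ^ https? ://          # scheme (http or https only, an assumption)
        (?: [^/\@]* \@ )?     # optional user:pass@ part, discarded
        ( [^/:?\#]+ )         # host: stops at port, path, query or fragment
    }xi;
    return $host;
}

for my $url (
    'http://proxy.aol.com:8080/',
    'http://user:pass@yahoo.com/login',
    'http://yahoo.co.uk/notfound.html',
) {
    my $host = host_only($url);
    print( (defined $host ? $host : '(no match)'), "\n" );
}
```

With the three URLs above this prints proxy.aol.com, yahoo.com and yahoo.co.uk, one per line.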
