Re: Re: extracting web links

Re: Re: Re: extracting web links
by Corion (Patriarch) on Dec 27, 2003 at 22:22 UTC

All of the modules will only find links in anchor tags or image links. Your regular expression doesn't look quite right to me, so I doubt that it will find more links, but it will surely find different ones: it will more or less gobble up anything in double quotes that remotely looks like a link, while leaving out links in single quotes.

I'm not sure about your requirements, but for me, any of these modules has always been enough. If you have special requirements as to the nature of links extracted, please state them more specifically and if possible, with examples.
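
To make the quoting point concrete, here is a minimal sketch using only core Perl. Both patterns are illustrative stand-ins, not the expression from the parent post, and the sample HTML is made up. The first pattern only understands double-quoted href values; the second accepts either quote style by using a backreference.

```perl
use strict;
use warnings;

# Made-up sample input: one double-quoted and one single-quoted link.
my $html = <<'HTML';
<a href="http://example.com/one">one</a>
<a href='http://example.com/two'>two</a>
HTML

# Hypothetical pattern that only understands double quotes.
my @double_only = $html =~ /href\s*=\s*"([^"]*)"/gi;

# Hypothetical pattern that accepts either quote style; \1 forces the
# closing quote to be the same character as the opening one.
my @either;
while ($html =~ /href\s*=\s*(["'])(.*?)\1/gi) {
    push @either, $2;
}

print "double only: @double_only\n";
print "either:      @either\n";
```

The first pattern should report only the double-quoted link, the second both. Neither copes with unquoted attribute values, which is one more reason to reach for a parser module when the input is arbitrary HTML.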

perl -MHTTP::Daemon -MHTTP::Response -MLWP::Simple -e ' ;    # The  
$d = new HTTP::Daemon and fork and getprint $d->url and exit;#spider
($c = $d->accept())->get_request(); $c->send_response( new   #in the
HTTP::Response(200,$_,$_,qq(Just another Perl hacker\n))); ' #  web

Re: Re: Re: Re: extracting web links

by PodMaster (Abbot) on Dec 28, 2003 at 09:53 UTC

I believe you meant to say all except HTML::LinkExtractor, which gets them all :).

MJD says "you can't just make shit up and expect the computer to know what you mean, retardo!"
I run a Win32 PPM repository for perl 5.6.x and 5.8.x -- I take requests (README).
** The third rule of perl club is a statement of fact: pod is sexy.

Re: Re: Re: extracting web links
by dominix (Deacon) on Dec 27, 2003 at 22:49 UTC

http://user:passwd@site

perl -Mre=debug -e '"  "=~/href\s*=\s*"*([^"\s]+)"*\s*>/gi'
put your URL here    ^^

maybe
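
For anyone who would rather see captures than the re=debug trace, here is a small sketch that runs the same pattern over a few inputs. Only the first URL form comes from this post; the other two samples and all the labels are made up for illustration.

```perl
use strict;
use warnings;

# The pattern from the one-liner above, unchanged apart from dropping /g.
my $re = qr/href\s*=\s*"*([^"\s]+)"*\s*>/i;

# Made-up test inputs; only the first URL form appears in this post.
my @samples = (
    [ 'double quotes'      => '<a href="http://user:passwd@site">x</a>' ],
    [ 'single quotes'      => q{<a href='http://example.com/'>x</a>} ],
    [ 'trailing attribute' => '<a href="http://example.com/" class="nav">x</a>' ],
);

my %captured;
for my $sample (@samples) {
    my ($label, $html) = @$sample;

    # A match in list context returns the capture, or nothing on failure.
    my ($url) = $html =~ $re;
    $captured{$label} = $url;

    printf "%-18s %s\n", $label, defined $url ? $url : '(no match)';
}
```

On these inputs the double-quoted link comes out clean, the single-quoted one drags the quotes and the following markup into the capture (the character class excludes only double quotes and whitespace), and the link with a trailing attribute is not matched at all because the pattern wants the closing > right after the value.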

--
dominix

Re: Re: Re: Re: extracting web links

by drake50 (Pilgrim) on Dec 27, 2003 at 22:55 UTC
