in reply to Link Extraction when grabbing web page with USER/PASS

Why bother with LinkExtor when you can just:

    use HTML::TokeParser;

    my $parser = HTML::TokeParser->new( \$content );
    my @links;
    while ( my $token = $parser->get_tag(qw( a img )) ) {
        my $link = $token->[1]{href} || $token->[1]{src} || next;
        push @links, $link;
    }

You will need to convert relative links to absolute if that is what you need. See Link Checker for more code.
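A minimal sketch of that relative-to-absolute conversion using the URI module's new_abs method (the base URL and the sample links here are made up for illustration):

    use URI;

    # Hypothetical base: the URL the page was fetched from.
    my $base = 'http://example.com/dir/page.html';

    for my $link ( 'img/pic.png', '/top.html', 'http://other.com/' ) {
        # new_abs resolves $link against $base; absolute links pass through unchanged.
        my $abs = URI->new_abs( $link, $base );
        print "$abs\n";
    }

In practice you would take $base from the response (e.g. $response->base with LWP) rather than hard-coding it.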

cheers

tachyon

s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

Re: Re: Link Extraction when grabbing web page with USER/PASS
by cdherold (Monk) on Mar 04, 2003 at 05:54 UTC
    OK, so you could use either of those, but the problem is: why can't I get anything out with either one? Is it because my web page is grabbed as a string? If so, how do I change that so that I can extract links?

      Eh? Get page as string, stick in $content.

      my $content = <<HTML;
      <a href="http://what.the.com">hello?</a>
      <a href="http://is.dis.org">hello?</a>
      <a href="http://your.net">hello?</a>
      <a href="http://problem">hello?</a>
      HTML

      use HTML::TokeParser;

      my $parser = HTML::TokeParser->new( \$content );
      my @links;
      while ( my $token = $parser->get_tag(qw( a img )) ) {
          my $link = $token->[1]{href} || $token->[1]{src} || next;
          push @links, $link;
      }
      print "@links";

      cheers

      tachyon

      s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print