Re: Extracting full links from HTML



by Scott7477 (Chaplain)

on Feb 02, 2007 at 18:17 UTC ( [id://597985]=note )

in reply to Extracting full links from HTML

Here is code that takes the URL of an HTML page from the command line and prints a link to each linked image (an img tag inside an a tag) found in that page. I just took wfsp's code and swapped out his hardcoded links. Update: I also changed the code so that the full URL of each image prints. I figure that would be handy for downloading any or all of the images if so desired.

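use strict;
use warnings;
use LWP::Simple;
use HTML::TokeParser::Simple;
use URI;

# usage: imglinker http://www.example.com

my $url = shift or die "usage: imglinker URL\n";
my $content = get($url);
die "Could not fetch $url\n" unless defined $content;

my $p = HTML::TokeParser::Simple->new(\$content);
my $in_anchor = 0;
while (my $t = $p->get_token) {
    if ($t->is_start_tag('a')) {
        $in_anchor = 1;
        next;
    }
    if ($t->is_end_tag('a')) {
        $in_anchor = 0;    # left the anchor without finding an image
        next;
    }
    if ($t->is_start_tag('img') and $in_anchor) {
        my $src = $t->get_attr('src');
        next unless defined $src;
        # resolve a relative src against the page URL instead of
        # gluing the two together with a slash
        print URI->new_abs($src, $url), "\n";
        $in_anchor = 0;
    }
}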
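
Since the point of the update is printing the full URL of each image, here is a minimal sketch of how a src attribute can be resolved against the page URL with the URI module's new_abs. The helper name full_url and the example.com / example.net addresses are made up for illustration; they are not from wfsp's code.

```perl
use strict;
use warnings;
use URI;

# Hypothetical helper: turn an img src into an absolute URL,
# using the page's own URL as the base.
sub full_url {
    my ($src, $base) = @_;
    return URI->new_abs($src, $base)->as_string;
}

# Assumed example page; any http URL would do.
my $page = 'http://www.example.com/gallery/index.html';

# A path-relative src, a root-relative src, a parent-directory src,
# and a src that is already absolute.
for my $src ('images/logo.png', '/images/logo.png', '../pic.jpg',
             'http://cdn.example.net/a.gif') {
    print full_url($src, $page), "\n";
}
```

Plain string concatenation only gives the right answer when the page URL is a bare host and the src is relative to it; new_abs also copes with src values that start with a slash, climb directories, or are already absolute.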