Re: Using URI::Find with HTML

You could use this technique with HTML::Parser: by default, pass the tags, attributes, and comments through untouched, but perform the substitution above on the text portions. Be sure to set the unbroken_text option so you don't get two callbacks for a single run of text.

Adapting one of the examples there, it'd be something like:

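  use HTML::Parser;
  HTML::Parser->new(
                    unbroken_text => 1,
                    # tags, comments, etc. are printed exactly as they appeared
                    default_h => [sub { print shift }, 'text'],
                    # only text runs are rewritten before printing
                    text_h => [sub {
                                 my $text = shift;
                                 # (URI::Find substitution on $text goes here)
                                 print $text;
                               }, 'text'],
                   )->parse_file(shift || die) || die $!;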
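
To make the placeholder concrete, here is a minimal sketch of one way the pieces might fit together. It assumes you want each URL in the text wrapped in an anchor tag, and it collects the output in a string instead of printing it. The name linkify_html is hypothetical, not part of either module. Note that it does not skip text that is already inside an <a> element, so such URLs would be linked twice.

```perl
use strict;
use warnings;
use HTML::Parser;
use URI::Find;

# linkify_html is a hypothetical helper name used only for this sketch.
sub linkify_html {
    my ($html) = @_;
    my $out = '';

    # The callback gets the URI object and the text as originally matched;
    # whatever it returns replaces the match. Wrapping it in <a> is an assumption.
    my $finder = URI::Find->new(sub {
        my ($uri, $orig) = @_;
        return qq{<a href="$uri">$orig</a>};
    });

    my $parser = HTML::Parser->new(
        api_version   => 3,
        unbroken_text => 1,    # one callback per run of text
        # Everything that is not text is copied through unchanged.
        default_h => [sub { $out .= shift }, 'text'],
        # Text runs are searched for URIs and modified in place.
        text_h    => [sub {
            my $text = shift;
            $finder->find(\$text);
            $out .= $text;
        }, 'text'],
    );
    $parser->parse($html);
    $parser->eof;              # flush any buffered text
    return $out;
}

print linkify_html('<p>See http://example.com/ for details.</p>'), "\n";
```

Because the URL handling lives only in text_h, URLs that appear in attributes (an img src, say) go through default_h and are left alone.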

-- Randal L. Schwartz, Perl hacker
Be sure to read my standard disclaimer if this is a reply.

Re^2: Using URI::Find with HTML by skx (Parson) on Dec 06, 2005 at 15:59 UTC
Thanks a lot for your help, that pointed me in the right direction. Steve