in reply to Re^2: Pulling a Page with LWP::UserAgent and fixing URLs?
in thread Pulling a Page with LWP::UserAgent and fixing URLs?
    use HTML::TokeParser::Simple;

    # $html holds the fetched page; $new_url is the new base for media files.
    my $parser   = HTML::TokeParser::Simple->new( string => $html );
    my $new_html = '';
    while ( my $token = $parser->get_token ) {
        for my $attr ( 'src', 'href' ) {
            my $value = $token->get_attr($attr);
            next unless defined $value;
            # Only touch links to images and Flash files.
            next unless $value =~ /\.(gif|jpe?g|png|swf)$/;
            # Replace everything before the file name with $new_url.
            $value =~ s{^(?:.*/)?([.\w-]+)$}{$new_url$1};
            $token->set_attr( $attr, $value );
        }
        $new_html .= $token->as_is;
    }

Then your result is in $new_html. Of course, this won't handle everything, since you could have references to images, etc. in JavaScript, for example.
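For what it's worth, here is a minimal sketch of just the URL-rewriting step on its own, so you can see what it does to a few sample values. The helper name rewrite_url and the example base URL are made up for illustration, and this variant drops the whole directory part of the path and keeps only the file name.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the code above): point media links at
# $new_url, keeping only the file name.
sub rewrite_url {
    my ( $value, $new_url ) = @_;

    # Leave anything that is not an image or Flash file alone.
    return $value unless $value =~ /\.(?:gif|jpe?g|png|swf)$/;

    # Drop the directory part (if any) and prepend the new base.
    $value =~ s{^(?:.*/)?([.\w-]+)$}{$new_url$1};
    return $value;
}

my $base = 'http://example.com/images/';    # assumed value, for illustration only
print rewrite_url( $_, $base ), "\n"
    for '/pics/logo.gif', 'photo.jpeg', '/docs/page.html';
```

The last sample is printed unchanged because its extension is not in the list. File names containing characters outside letters, digits, underscore, dot and hyphen are also left alone by this pattern.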