Thanks to Ovid's HTML::TokeParser::Simple, you can do this:
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TokeParser::Simple;

my $file = $ARGV[0] or die "Usage: $0 file.html\n";

# Pass 1: print the document with every <a> start and end tag removed
warn "Strip HREF\n";
my $p = HTML::TokeParser::Simple->new($file);
while ( my $token = $p->get_token ) {
    next if $token->is_start_tag('a') || $token->is_end_tag('a');
    print $token->as_is;
}

# Pass 2: print the href of every <a> tag, one per line
warn "Extract HREF\n";
$p = HTML::TokeParser::Simple->new($file);
while ( my $token = $p->get_token ) {
    next unless $token->is_start_tag('a');
    my $href = $token->return_attr->{href};
    print "$href\n" if defined $href;
}
HTH, Valerio
In reply to Re: Is there a faster / more efficient / quicker or easier way to do this ?
by valdez
in thread Stripping a-href tags from an HTML document
by keyDemun