in reply to Stripping a-href tags from an HTML document
Thanks to Ovid's HTML::TokeParser::Simple, you can do this:
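    #!/usr/bin/perl
    use HTML::TokeParser::Simple;
    use strict;

    # Pass 1: print the document with every <a> start and end tag removed.
    warn "Strip HREF\n";
    my $p = HTML::TokeParser::Simple->new($ARGV[0]);
    while ( my $token = $p->get_token ) {
        next if $token->is_start_tag('a') || $token->is_end_tag('a');
        print $token->as_is;
    }

    # Pass 2: re-parse the same file and print the href of each <a> start tag.
    warn "Extract HREF\n";
    $p = HTML::TokeParser::Simple->new($ARGV[0]);    # reuse $p, no second "my"
    while ( my $token = $p->get_token ) {
        next unless $token->is_start_tag('a');
        my $href = $token->return_attr->{href};
        next unless defined $href;    # skip anchors without href, e.g. <a name="...">
        print "$href\n";
    }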
HTH, Valerio