in reply to Re: Re: Is there a faster / more efficient / quicker or easier way to do this ?
in thread Stripping a-href tags from an HTML document
while (defined(my $t = $p->get_token())) {
    print(TEMPO $t->as_is), next unless $t->is_tag('a');
    my $attr = $t->return_attr;
    print( "\nHREF TAG-->[", ++$hrefCount, "]-->", $attr->{href}, "\n\n" )
        if exists $attr->{href};
}
While writing that, it occurred to me that it will discard A NAME anchors too. Fixing that is not entirely trivial, because when you come across a closing /A you need to know whether the matching start tag was dropped or kept.
Update: this should work. Untested, but you get the idea.
my @stack;
while (defined(my $t = $p->get_token())) {
    if ($t->is_start_tag('a')) {
        my $attr = $t->return_attr;
        # true if this <a> carries an href and is therefore dropped
        push @stack, exists $attr->{href};
        print( "\nHREF TAG-->[", ++$hrefCount, "]-->", $attr->{href}, "\n\n" ), next
            if $stack[-1];
    }
    # drop a closing </a> only if its start tag was dropped
    next if $t->is_end_tag('a') and pop @stack;
    print TEMPO $t->as_is;
}
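If you want to check the stack bookkeeping without a parser at hand, here is a minimal sketch that runs the same logic over a hand-built token list. The token layout ([type, tag, attributes, raw text]) and the name strip_href_anchors are made up for this illustration and are not part of HTML::TokeParser::Simple; the sub also collects its output in a string instead of printing to TEMPO.

```perl
use strict;
use warnings;

# Hypothetical token format for this sketch only:
#   [ type, tag, attr_hashref, raw_text ]
# where type is 'S' (start tag), 'E' (end tag) or 'T' (text).
sub strip_href_anchors {
    my @tokens = @_;
    my (@stack, @hrefs);
    my $kept = '';
    for my $t (@tokens) {
        my ($type, $tag, $attr, $raw) = @$t;
        if ($type eq 'S' and $tag eq 'a') {
            # Remember whether this <a> is dropped (1) or kept (0).
            push @stack, exists $attr->{href} ? 1 : 0;
            if ($stack[-1]) {
                push @hrefs, $attr->{href};
                next;
            }
        }
        # A closing </a> is dropped only if its start tag was dropped.
        next if $type eq 'E' and $tag eq 'a' and pop @stack;
        $kept .= $raw;
    }
    return ($kept, \@hrefs);
}

my @tokens = (
    [ S => 'a', { name => 'top' },                 '<a name="top">' ],
    [ T => '',  {},                                'Intro' ],
    [ E => 'a', {},                                '</a>' ],
    [ T => '',  {},                                ' ' ],
    [ S => 'a', { href => 'http://example.com/' }, '<a href="http://example.com/">' ],
    [ T => '',  {},                                'link text' ],
    [ E => 'a', {},                                '</a>' ],
);

my ($kept, $hrefs) = strip_href_anchors(@tokens);
print "$kept\n";                  # <a name="top">Intro</a> link text
print "href: $_\n" for @$hrefs;   # href: http://example.com/
```

The named anchor survives with its closing tag, while the href anchor loses both tags but keeps its text. A stray closing /A with nothing on the stack is simply passed through, since pop on an empty array returns undef.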
Makeshifts last the longest.