in reply to Re: Cleaning up HTML
in thread Cleaning up HTML
Now, how do you recommend combining two span tags? For example, I have this crufty html:

    <span style="font-family:Arial"><span style="font-size:10pt; color:#000080;">text</span></span>

I'd like to combine the two span tags, merging their style attributes:

    <span style="font-family:Arial; font-size:10pt; color:#000080;">text</span>

Well, I know HTML::StripScripts has built-in handling of style tags...

Now, if I do a callback for the span like

    use HTML::StripScripts::Parser;
    use Data::Dumper;

    my $p = HTML::StripScripts::Parser->new({
        Rules => {
            span => sub {
                my ($filter,$element) = @_;
                # Dump only spans whose content starts with another span
                print Dumper $element if $element->{content} =~ /^<span\W/;
                1;
            },
        }
    });

then I get this result for the outer span:

    $VAR1 = {
        'content' => '<span style="font-size:10pt; color:#000080;">text</span>',
        'tag' => 'span',
        'attr' => {
            'style' => 'font-family:Arial'
        }
    };
How would you recommend proceeding from here? Should I parse the "content" again, and if so, how?
Also... how do you remove a tag but keep its content? If I return '0' from this callback sub, then both the tags and the inner HTML are gone.
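
For the style-merging part of the question, here is a minimal sketch of how two style strings could be combined once both are in hand. It uses plain Perl and does not rely on HTML::StripScripts. The helper name `merge_styles` is hypothetical, and the rule that the inner span's declarations override the outer span's is an assumption based on how nested inline styles normally cascade.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of HTML::StripScripts): merge two CSS
# style strings. Properties keep their first-seen order; a property set
# by the inner span overrides the same property from the outer span.
# Simplification: values that themselves contain ';' are not handled.
sub merge_styles {
    my ($outer, $inner) = @_;
    my (%value, @order);
    for my $decl (map { split /;/, $_ } ($outer, $inner)) {
        # Skip empty or malformed declarations
        my ($prop, $val) = $decl =~ /^\s*([\w-]+)\s*:\s*(.*?)\s*$/ or next;
        $prop = lc $prop;
        push @order, $prop unless exists $value{$prop};
        $value{$prop} = $val;
    }
    return join ' ', map { "$_:$value{$_};" } @order;
}

# prints: font-family:Arial; font-size:10pt; color:#000080;
print merge_styles('font-family:Arial', 'font-size:10pt; color:#000080;'), "\n";
```

This only covers the string merge. Extracting the inner span's style from the "content" string and writing the result back into the element still depends on the callback API, which this sketch does not address.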
Re^3: Cleaning up HTML
by clinton (Priest) on Apr 22, 2008 at 09:51 UTC