It fulfils all of your listed requirements, and is in active use on our production sites.
This code should do what you need (untested):

    use HTML::StripScripts::Parser;

    my $s = HTML::StripScripts::Parser->new({
        Context   => 'Flow',
        # Only allow these tags
        BanAllBut => [qw(p a img h3 div em)],
        # Allow src and href
        AllowSrc  => 1,
        AllowHref => 1,
        Rules     => {
            # remove empty p tags
            p => sub { return length $_[1]->{content} },
            # a must have a local href
            a => {
                href => \&strip_abs_uri,
                tag  => sub { return $_[1]->{attr}{href} ? 1 : 0 },
            },
            # img must have a local src
            img => {
                src => \&strip_abs_uri,
                tag => sub { return $_[1]->{attr}{src} ? 1 : 0 },
            },
            # Allow id and class for all tags
            '*' => {
                id    => 1,
                class => 1,
            },
        },
    });

    # Attribute callback: return undef to drop the attribute,
    # or the (possibly modified) value to keep it
    sub strip_abs_uri {
        my ( $filter, $tag, $attr_name, $attr_val ) = @_;
        return $attr_val unless $attr_name =~ /^(?:href|src)$/;
        # Anything containing "://" is treated as an absolute URI and dropped
        return $attr_val =~ m{://} ? undef : $attr_val;
    }

    print $s->filter_html($html);
In reply to Re: Dynamically cleaning up HTML fragments
by clinton
in thread Dynamically cleaning up HTML fragments
by SilasTheMonk