in reply to Re: Pondering Portals
in thread Pondering Portals

I use HTML::Scrubber on one of my sites. The only problem I have with it (which I was vaguely thinking of posting as a new question only yesterday) is that I see no way to force an attribute to be included.

Say the user submits:

<a href="http://example.com">text</a>

I would like to automatically insert, or mandate, the rel="nofollow" attribute and value. I can't see a simple way of doing this short of re-using HTML::Parser or writing a fragile regexp.

That's the only shortcoming I see with HTML::Scrubber.

Steve
---
steve.org.uk

Re^3: Pondering Portals
by bmann (Priest) on May 01, 2005 at 06:09 UTC
    Have you considered subclassing HTML::Scrubber? Below, I inject the rel attribute into each anchor before validation.

    $ cat XREL.pm
    package XREL;
    use strict;
    use base 'HTML::Scrubber';

    # Force rel="nofollow" onto every anchor, then let the parent validate.
    # $t is the tag name, $a the attribute hash, $as the attribute order.
    sub _validate {
        my ($self, $t, $r, $a, $as) = @_;
        if ( $t eq 'a' ) {
            $$a{rel} = 'nofollow';
            # exact match, so names that merely contain "rel" are not confused with it
            push @$as, 'rel' unless grep { $_ eq 'rel' } @$as;
        }
        $self->SUPER::_validate( $t, $r, $a, $as );
    }
    1;

    $ cat scrub.pl
    #!/usr/bin/perl
    use warnings;
    use strict;
    use XREL;

    my $scrubber = XREL->new( allow => [ qw[ a p b i u hr br ] ] );
    $scrubber->rules(
        a => {
            href => 1,
            rel  => qr/^nofollow$/i,
            '*'  => 0,
        }
    );

    my $html = q[<a href="http://perlmonks.org">link </a>];
    print $scrubber->scrub($html), $/;

    $html = q[<a href="http://perlmonks.org" rel="nofollow">link </a>];
    print $scrubber->scrub($html), $/;

    $html = q[<a href="http://perlmonks.org" rel="xxx">link </a>];
    print $scrubber->scrub($html), $/;

    $html = q[<a href="http://perlmonks.org" rel="xnofollow">link </a>];
    print $scrubber->scrub($html), $/;

    __END__
    output:
    <a href="http://perlmonks.org" rel="nofollow">link </a>
    <a href="http://perlmonks.org" rel="nofollow">link </a>
    <a href="http://perlmonks.org" rel="nofollow">link </a>
    <a href="http://perlmonks.org" rel="nofollow">link </a>

    Update: changed xrel="nofollow" to rel="nofollow".
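
    For anyone who wants to see the override pattern on its own, without HTML::Scrubber installed, here is a minimal sketch using core Perl only. ToyScrubber and ForceRel are hypothetical names invented for this illustration, and ToyScrubber's _validate is a simplified stand-in with its own three-argument signature, not HTML::Scrubber's real method. The point is only the shape: the subclass edits the attributes, then hands off to the parent through SUPER::.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for a scrubber base class; NOT HTML::Scrubber itself.
package ToyScrubber;

sub new {
    my ($class, %rules) = @_;
    return bless { rules => {%rules} }, $class;
}

# Render an opening tag, keeping only the attributes the rules allow,
# in the order given by @$order.
sub _validate {
    my ($self, $tag, $attrs, $order) = @_;
    my $allowed = $self->{rules}{$tag} || {};
    my @kept = grep { $allowed->{$_} && exists $attrs->{$_} } @$order;
    return join '', "<$tag",
        ( map { sprintf ' %s="%s"', $_, $attrs->{$_} } @kept ), '>';
}

# Subclass that makes rel="nofollow" mandatory on anchors.
package ForceRel;
our @ISA = ('ToyScrubber');

sub _validate {
    my ($self, $tag, $attrs, $order) = @_;
    if ( $tag eq 'a' ) {
        # Overwrite whatever the user supplied, and make sure the
        # attribute appears in the output order exactly once.
        $attrs->{rel} = 'nofollow';
        push @$order, 'rel' unless grep { $_ eq 'rel' } @$order;
    }
    return $self->SUPER::_validate( $tag, $attrs, $order );
}

package main;

my $s = ForceRel->new( a => { href => 1, rel => 1 } );

# No rel supplied: it gets added.
print $s->_validate( 'a', { href => 'http://example.com' }, ['href'] ), "\n";

# Wrong rel and a disallowed attribute: rel is replaced, onclick is dropped.
print $s->_validate(
    'a',
    { href => 'http://example.com', rel => 'xxx', onclick => 'x()' },
    [ 'href', 'rel', 'onclick' ],
), "\n";
```

    Both calls print <a href="http://example.com" rel="nofollow">. Because the subclass changes the attributes before delegating, the parent's allow-list still applies to everything else.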

      Perfect ++

      I admit I wasn't too sure where to start, though I'd made attempts at hacking the original module to allow 'mandatory' attributes.

      Steve
      ---
      steve.org.uk