in reply to Cleaning up HTML tags

I can't even think of why you wouldn't want to use an object-oriented interface for this sort of task. The constructor (new) parses the incoming data into some sort of workable form, which is stored so that methods can be called on it. This reduces the amount of work each method needs to do, since you'd otherwise need to do some sort of parsing in each and every sub called. I just don't see any valid reason you would want to use anything else. Sorry if I'm being a bit daft.

antirice    
The first rule of Perl club is - use Perl
The ith rule of Perl club is - follow rule i - 1 for i > 1

Re: Re: Cleaning up HTML tags
by cleverett (Friar) on Aug 25, 2003 at 05:58 UTC
    As these modules are written, new() takes some rules as to what tags/attributes are and aren't allowed and turns them into a list of allowed tags/attributes.

    So far so good.

    But I'm not seeing where massaging the rules about allowed and denied tags and attributes generates a win except it might make the checking easier to write.

    The algorithm for filtering against the rules will still boil down to:

    1. get a token, which is either HTML markup or text; stop when none are left
    2. if the token is text, add it to the output
    3. otherwise, drop the tag if it's not allowed
    4. drop each attribute that's not allowed
    5. repeat
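
For illustration, the loop described above could be sketched as a plain function along these lines. Everything here is an assumption made for the example: filter_html and the shape of the %allowed hash are hypothetical and not taken from any of the modules under discussion, and the regex tokenizer is not a real HTML parser (a '>' inside a quoted attribute value, or an unterminated '<', will confuse it).

```perl
use strict;
use warnings;

# Hypothetical helper: filter_html($html, \%allowed).
# %allowed maps a lowercase tag name to an array ref of permitted attributes.
sub filter_html {
    my ($html, $allowed) = @_;
    my $out = '';

    # Get the next token: either markup (<...>) or a run of text.
    while ($html =~ /\G(<[^>]*>|[^<]+)/gc) {
        my $token = $1;

        # Text goes straight to the output.
        if ($token !~ /^</) { $out .= $token; next; }

        # Drop anything that is not a recognisable tag (comments, doctypes).
        my ($slash, $name, $rest) =
            $token =~ m{^<(/?)([A-Za-z][A-Za-z0-9]*)([^>]*)>$}
            or next;
        $name = lc $name;

        # Drop the tag if it is not allowed.
        my $attrs_ok = $allowed->{$name} or next;
        if ($slash) { $out .= "</$name>"; next; }

        # Drop each attribute that is not allowed.
        my %ok   = map { lc($_) => 1 } @$attrs_ok;
        my $kept = '';
        while ($rest =~ /([A-Za-z][-A-Za-z0-9]*)\s*=\s*("[^"]*"|'[^']*'|[^\s>]+)/g) {
            $kept .= " $1=$2" if $ok{lc $1};
        }
        $out .= "<$name$kept>";
    }
    return $out;
}

my %rules = ( p => ['class'], b => [] );
print filter_html('<p class="x" onclick="evil()">Hi <script>no</script><b>there</b></p>', \%rules), "\n";
```

Note that only the tags themselves are dropped; the text between a disallowed pair of tags still reaches the output, exactly as the steps above describe.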

    With a linear problem, I don't see what maintaining state wins for me unless the object accumulates a result for me as I intermittently obtain text to feed it with. Which I admit could be useful, but just not the way I've been programming.
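
To make the "accumulating" case concrete, here is a rough sketch of an object that is fed text in pieces and keeps both the result so far and any half-received tag. ChunkFilter and its feed/result methods are hypothetical names invented for this example; the sketch only checks tag names (attributes of allowed tags pass through untouched) and uses a naive regex tokenizer.

```perl
use strict;
use warnings;

package ChunkFilter;   # hypothetical class, for illustration only

sub new {
    my ($class, @allowed_tags) = @_;
    my %allowed = map { lc($_) => 1 } @allowed_tags;
    return bless { allowed => \%allowed, pending => '', out => '' }, $class;
}

# Accept the next piece of input; a tag may be split across two calls.
sub feed {
    my ($self, $chunk) = @_;
    my $buf = $self->{pending} . $chunk;

    while ($buf =~ /\G(<[^>]*>|[^<]+)/gc) {
        my $token = $1;
        if ($token =~ m{^</?([A-Za-z][A-Za-z0-9]*)}) {
            # A tag: keep it only if its name is allowed.
            $self->{out} .= $token if $self->{allowed}{lc $1};
        }
        elsif ($token !~ /^</) {
            # Plain text: always keep it.
            $self->{out} .= $token;
        }
    }

    # Whatever is left starts with '<' but has no '>' yet: hold it for later.
    $self->{pending} = substr($buf, pos($buf) // 0);
    return $self;
}

sub result { $_[0]{out} }

package main;

my $f = ChunkFilter->new(qw(b i));
$f->feed('Hello <b>wor')->feed('ld</b> <scr')->feed('ipt>x</script>!');
print $f->result, "\n";
```

The pending buffer is the state a purely procedural call would have to hand back to the caller between invocations.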

    UPDATE: actually, an issue just as important as linearity in the sense above is the fact that there's only one thing to do with the object, and when you've done it, its useful life is over.

      It seems to me that keeping state is not the issue but more the holding on to the rules. If you have a procedural interface you are presumably going to need some internal variable that holds the rules between setting them and using them. This is fine unless you want to apply different rules to the html depending on the conditions. If that's the case then you are always going to keep changing the rules with a procedural interface. With the OO one you can just have different objects and you can give them nice helpful names.

      Of course if you're not doing this then it might be moot, but it seems like a good reason to use an OO interface to these sorts of modules.
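
A small sketch of that point, under stated assumptions: TagFilter, its allow argument and its clean method are made up for this example and are not the interface of any module mentioned in the thread. Each object carries its own rule set, so two sets of rules can coexist under descriptive names without resetting anything in between.

```perl
use strict;
use warnings;

package TagFilter;   # hypothetical stand-in for the modules discussed

sub new {
    my ($class, %args) = @_;
    my %allowed = map { lc($_) => 1 } @{ $args{allow} || [] };
    return bless { allowed => \%allowed }, $class;
}

sub clean {
    my ($self, $html) = @_;
    # Keep each tag whose name is allowed; replace any other tag with nothing.
    $html =~ s{(</?([A-Za-z][A-Za-z0-9]*)[^>]*>)}
              { $self->{allowed}{lc $2} ? $1 : '' }ge;
    return $html;
}

package main;

# Two rule sets living side by side, each with a helpful name.
my $for_comments = TagFilter->new(allow => [qw(b i)]);
my $for_articles = TagFilter->new(allow => [qw(b i p a)]);

my $html = '<p>See <a href="/x">this</a> <b>now</b></p>';
print $for_comments->clean($html), "\n";
print $for_articles->clean($html), "\n";
```

With a procedural interface the equivalent would be either passing the rules on every call or swapping a module-level rule set back and forth.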

      Struan

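        use strict;
        use HTML::CleanerUpper ('sterilize');

        my %disinfecting_with = (
            soap  => {kills => 'nothing'},
            steam => {doesnt_kill => [ qw/i b/ ]},
            uv    => {kills => 'everything'},
        );

        # lowercase the argument so that "UV" matches the 'uv' key
        my $agent = lc(shift || '');
        die "$0: I need a disinfecting agent: soap, steam or UV\n"
            unless $disinfecting_with{$agent};

        # lines read from STDIN already end in newlines, so join on ''
        my $media = join('', <STDIN>);

        # pass the rule set itself rather than just its name
        print sterilize($media, with => $disinfecting_with{$agent});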