in reply to Code does Remove Html

Why can't your tags contain any spaces?
/[^ >]/
In plain English: any character except ">" or a space.

Besides, quoted attributes, either with single or with double quotes, may contain ">" characters.
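
To illustrate that last point, here is a minimal sketch. The sample tag and the variable names are made up for the demonstration, and the pattern is the simplest "everything up to the first >" form rather than the exact one from the original post.

```perl
use strict;
use warnings;

# Hypothetical sample: the title attribute contains a ">" character.
my $html = q{<a href="next.html" title="go > next">Next</a>};

# Naive tag stripper: everything from "<" up to the first ">".
(my $stripped = $html) =~ s/<[^>]*>/ /g;

# The first match ends at the ">" inside the title attribute,
# so the tail of that attribute leaks into the output.
print "$stripped\n";
```

The leftover next"> in the output is the rest of the attribute value, which the pattern mistook for text outside the tag.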

Re^2: Code does Remove Html
by SFLEX (Chaplain) on Dec 17, 2006 at 14:08 UTC
    bart, from my findings, if one were to remove the space in
    /<(([^ >]|\n)*)>/ /
    to get
    /<(([^>]|\n)*)>/ /

    the code would also remove text that is not HTML, like:
    1) < sfsdf > 2) < sddsds "dfdsfds" > 3) < sdsds sdsd"sdsdasd" >
    In those three cases the content is text, not HTML.
      But what about tags like these?
      • <a href="http://perlmonks.org" class="link">
      • <br />
        You're right, the code /<(([^ >]|\n)*)>/ / does not remove those two.
        But the code I posted,
        $value =~ s/<(([^ >]|\n|\s\w)*)>/ /gso;
        will remove the <a href="http://perlmonks.org" class="link">,
        and adding |\s\/ will also remove the <br />.

        Like this:

        $value =~ s/<(([^ >]|\n|\s\w|\s\/)*)>/ /gso;
        I just came up with it and it's not 100% tested, but it does the trick. :D
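
A quick way to check the pattern above against the examples in this thread is sketched below. The helper name strip_tags and the sample list are assumptions for the test, not part of the posted code; the /s and /o modifiers are left off here because they make no difference to this pattern.

```perl
use strict;
use warnings;

# Pattern from the post above, compiled once for reuse.
my $tag = qr/<(([^ >]|\n|\s\w|\s\/)*)>/;

# strip_tags is a hypothetical helper name used only for this check.
sub strip_tags {
    my ($text) = @_;
    $text =~ s/$tag/ /g;
    return $text;
}

my @samples = (
    '<a href="http://perlmonks.org" class="link">',   # tag with attributes
    '<br />',                                          # self-closing tag
    '< sfsdf >',                                       # plain text, not a tag
);

for my $sample (@samples) {
    my $out = strip_tags($sample);
    printf "%-48s => %s\n", $sample, $out eq ' ' ? 'removed' : 'kept';
}
```

With these three samples the two tags are replaced by a single space and the "< sfsdf >" text is left alone, because a space directly before ">" matches none of the alternatives.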

        Update: You could also try $value =~ s/<(([^ >]|\n|\s\/|\s\S\S)*)>/ /gso;
        But escaping is much better!
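
For comparison, escaping could look roughly like the sketch below. The helper name escape_html and the choice of four characters are assumptions, not something from the posts above; the HTML::Entities module on CPAN provides encode_entities for the same job.

```perl
use strict;
use warnings;

# Map each HTML-significant character to its entity.
my %entity = (
    '&' => '&amp;',
    '<' => '&lt;',
    '>' => '&gt;',
    '"' => '&quot;',
);

# escape_html is a hypothetical helper name, not from the original posts.
sub escape_html {
    my ($text) = @_;
    # Single pass, so an "&" produced by one replacement is never escaped again.
    $text =~ s/([&<>"])/$entity{$1}/g;
    return $text;
}

print escape_html('<br /> and < sfsdf >'), "\n";
```

Nothing is removed this way, so there is no need to guess which angle brackets are tags and which are text; the browser shows all of them literally.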