
bart, from my findings, if one were to remove the space in
s/<(([^ >]|\n)*)>/ /
to get
s/<(([^>]|\n)*)>/ /

The code would then remove more text that is not HTML, such as:
1) < sfsdf > 2) < sddsds "dfdsfds" > 3) < sdsds sdsd"sdsdasd" >
In those three cases the matched content is plain text, not HTML.
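
A quick way to check this is to run both patterns over the three samples. This is only a minimal sketch: strip_strict and strip_loose are helper names I made up for illustration, and the /g flag is my addition so every match in a string is replaced.

```perl
use strict;
use warnings;

# Hypothetical helpers: the pattern with the space in the character class
# (strict) and the pattern with the space removed (loose).
sub strip_strict { my ($s) = @_; $s =~ s/<(([^ >]|\n)*)>/ /g; return $s }
sub strip_loose  { my ($s) = @_; $s =~ s/<(([^>]|\n)*)>/ /g;  return $s }

# The three plain-text samples from the post.
my @samples = (
    '< sfsdf >',
    '< sddsds "dfdsfds" >',
    '< sdsds sdsd"sdsdasd" >',
);

for my $text (@samples) {
    my $strict = strip_strict($text) eq $text ? 'kept' : 'stripped';
    my $loose  = strip_loose($text)  eq $text ? 'kept' : 'stripped';
    print "$text  strict: $strict  loose: $loose\n";
}
```

The strict version leaves all three samples unchanged, because none of them can get from < to > without crossing a space. The loose version replaces each of them with a single space.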

Re^3: Code does Remove Html
by bart (Canon) on Dec 17, 2006 at 15:01 UTC
    But what about tags like these?
    • <a href="http://perlmonks.org" class="link">
    • <br />
      You're right, the code s/<(([^ >]|\n)*)>/ / does not remove those two.
      But the code I posted,
      $value =~ s/<(([^ >]|\n|\s\w)*)>/ /gso;
      will remove the <a href="http://perlmonks.org" class="link">,
      and if you add |\s\/ it will also remove the <br />.

      Like this:

      $value =~ s/<(([^ >]|\n|\s\w|\s\/)*)>/ /gso;
      I just came up with it and it's not 100% tested, but it does the trick. :D
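
Since the pattern is not fully tested, here is a small sketch that runs it against the two tags bart asked about and the three plain-text samples from earlier in the thread. The strip_tags wrapper is a hypothetical name used only for this check. The last case is an extra one I added to show a limitation.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the substitution shown above.
sub strip_tags {
    my ($value) = @_;
    $value =~ s/<(([^ >]|\n|\s\w|\s\/)*)>/ /gso;
    return $value;
}

my @cases = (
    '<a href="http://perlmonks.org" class="link">',  # tag from bart's reply
    '<br />',                                        # tag from bart's reply
    '< sfsdf >',                                     # plain text
    '< sddsds "dfdsfds" >',                          # plain text
    '< sdsds sdsd"sdsdasd" >',                       # plain text
    '<a href = "x">',                                # extra case: spaces around =
);

for my $case (@cases) {
    my $status = strip_tags($case) eq $case ? 'kept' : 'stripped';
    print "$status: $case\n";
}
```

The two tags from bart's reply are stripped and the three plain-text samples are kept. The extra case is kept as well, because a space followed by = matches neither \s\w nor \s\/, so valid tags written that way still slip through.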

      Update: you could also try $value =~ s/<(([^ >]|\n|\s\/|\s\S\S)*)>/ /gso;
      But escaping is much better!
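
For comparison, escaping could look roughly like the sketch below. The escape_html helper is a hypothetical name, and it handles only the four characters &, <, > and "; treat it as an illustration rather than a complete encoder.

```perl
use strict;
use warnings;

# Hypothetical helper: replace the characters that have special meaning
# in HTML with their entities, so the text is displayed literally.
sub escape_html {
    my ($value) = @_;
    my %entity = (
        '&' => '&amp;',
        '<' => '&lt;',
        '>' => '&gt;',
        '"' => '&quot;',
    );
    $value =~ s/([&<>"])/$entity{$1}/g;
    return $value;
}

print escape_html('< sfsdf > and <br />'), "\n";
# prints: &lt; sfsdf &gt; and &lt;br /&gt;
```

With this approach there is no need to guess what is a tag and what is text, since every < and > is rendered harmless either way. The encode_entities function from the CPAN module HTML::Entities does the same job more thoroughly.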