in reply to JavaScript allowed in posts!
You seem to be testing the PerlMonks engine to its limits :) But you are right: not stripping the <SCRIPT> tags from HTML is bad, as I don't see any valid reason why people would want JavaScript in the posts here.
I think the following code should strip all <SCRIPT> tags:

    my $post;
    $post =~ s!<SCRIPT(?: [^>]*)?>.*?</SCRIPT(?: [^>]*)?>!!imgs;

My reason for the parentheses is that only a person with bad intent would want to use JavaScript in nodes anyway, and such a person could try to trick the code into not stripping the script by adding some attributes to the closing part of the script tags.
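To see the substitution at work, here is a minimal sketch wrapped in a helper. The name `strip_script_tags` and the sample post are made up for illustration, and the pattern assumes the closing-tag group takes a `*` just like the opening one, so that attributes on `</SCRIPT ...>` are matched too.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of any engine): removes every
# <SCRIPT>...</SCRIPT> block from a post and returns the result.
sub strip_script_tags {
    my ($post) = @_;
    # i: case-insensitive, s: let . match newlines, g: remove every block.
    # The optional groups allow attributes in both the opening and closing tag.
    $post =~ s!<SCRIPT(?: [^>]*)?>.*?</SCRIPT(?: [^>]*)?>!!imgs;
    return $post;
}

# Made-up sample with attributes on the closing tag as well.
my $sample = qq{Hello <script language="JavaScript">\nalert("hi");\n</script junk="1"> world};
print strip_script_tags($sample), "\n";
```

Note that the pattern only accepts a plain space after the tag name, so a newline or tab in that position would still slip through; using `\s` there would be one way to tighten it.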
On the other hand, I think <FONT> tags and other things (like color etc.) should also be avoided. Maybe it would be better to have a positive list of allowed tags instead of allowing everything and then banning some special tags...
My list of "good" tags would be more or less the following:
| Section | Allowed tags |
|---|---|
| Font manipulation | B,I,U,TT,CODE,H1,H2,H3,H4,H5,H6 |
| Layout | TABLE,THEAD,TBODY,TR,TH,TD,CENTER,P,DIV,UL,OL,LI |
| Links | A |
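A positive list like this could be applied roughly as follows. This is only a regex-based sketch, not a real HTML parser; the helper name `keep_allowed_tags` is hypothetical, and the tag list is copied from the table above.

```perl
use strict;
use warnings;

# Allowed tags, taken from the table above.
my %allowed = map { $_ => 1 } qw(
    B I U TT CODE H1 H2 H3 H4 H5 H6
    TABLE THEAD TBODY TR TH TD CENTER P DIV UL OL LI
    A
);

# Hypothetical helper: drops every tag whose name is not in %allowed.
# Only the tag itself is removed; the text between the tags is kept.
# Attributes of allowed tags are passed through unchanged, so a real
# filter would have to check those as well.
sub keep_allowed_tags {
    my ($post) = @_;
    $post =~ s{<(/?)([A-Za-z][A-Za-z0-9]*)\b([^>]*)>}
              { $allowed{uc $2} ? "<$1$2$3>" : '' }ge;
    return $post;
}

print keep_allowed_tags('<b>bold</b> <font color="red">red</font> <blink>x</blink>'), "\n";
# prints: <b>bold</b> red x
```

Because disallowed tags are removed but their contents are kept, this would leave the body of a <SCRIPT> block behind as visible text, so it would be combined with a script-stripping step rather than replace it.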
Also, the engine could maybe even check for ill-formed HTML, that is, unclosed tags. I hate it when somebody posts with <PRE> and then does not close the tag, so that all subsequent text is rendered as preformatted in Courier New. But that one requires much more analysis, I think. Or maybe not. An idea off the top of my head:
This method is crude and may do more harm than good. Maybe, instead of fixing the HTML, the engine should simply return a warning like
Update: vroom has a post about his position on HTML online now.
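Coming back to the unclosed-tag check: one way the warn-instead-of-fix variant might look is a simple stack of open tags. This is a rough sketch under my own assumptions, not the engine's actual code; the helper name `unclosed_tag_warnings`, the warning texts, and the short list of tags that take no closing tag are all made up for illustration.

```perl
use strict;
use warnings;

# Tags that never take a closing tag, so the check skips them (assumed list).
my %void = map { $_ => 1 } qw(BR HR IMG);

# Hypothetical helper: returns a list of warnings about unbalanced tags.
sub unclosed_tag_warnings {
    my ($post) = @_;
    my (@stack, @warnings);
    while ($post =~ m{<(/?)([A-Za-z][A-Za-z0-9]*)\b[^>]*>}g) {
        my ($closing, $tag) = ($1, uc $2);
        next if $void{$tag};
        if (!$closing) {
            push @stack, $tag;              # remember the open tag
        }
        elsif (@stack && $stack[-1] eq $tag) {
            pop @stack;                     # properly nested close
        }
        else {
            push @warnings, "Unexpected closing tag </$tag>";
        }
    }
    # Whatever is still on the stack was opened but never closed.
    push @warnings, map { "Tag <$_> is never closed" } @stack;
    return @warnings;
}

print "$_\n" for unclosed_tag_warnings("<PRE>some code\n<b>bold</b>");
# prints: Tag <PRE> is never closed
```

The sketch is strict: HTML allows <P> and <LI> to be left unclosed, and it would warn about those too, so a real check would need a list of such exceptions.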
Replies are listed 'Best First'.
Allowed/forbidden HTML
by turnstep (Parson) on Jun 04, 2000 at 18:30 UTC
by . (Acolyte) on Jun 10, 2000 at 08:21 UTC