Over the weekend, we had another instance of a root node in SoPW which broke the HTML rendering for the whole page. This was evident from view source.

I know we have the footnote below the textarea box, and the list of Perl Monks Approved HTML tags, and there is the preview button. Is there any way that text can be validated for the terminally lazy, prior to allowing submit? To ensure:

anything else we can think of.

Re: bad tagging that breaks the page
by giulienk (Curate) on May 13, 2002 at 14:02 UTC
    Well, I thought that after Why I like functional programming by tilly, MarkupHandler.pm would have been implemented in the Monastery, but I guess it has not been. It would also be really useful to strip JavaScript from home nodes (the checkbox in the user preferences doesn't seem to work...).

    Is there any special problem with implementing the parsing described by tilly in that node? I'm using that code myself in a website I'm developing, and it works perfectly.


    $|=$_="1g2i1u1l2i4e2n0k",map{print"\7",chop;select$,,$,,$,,$_/7}m{..}g

      The markup parsing will be improved. It is just a matter of time. It has been on the (long) to-do list for a long time.

              - tye (but my friends call me "Tye")
Re: bad tagging that breaks the page
by mt2k (Hermit) on May 13, 2002 at 23:16 UTC
    The problem of matching tags is really quite simple... Why not something more or less simple like what I have posted below? Right off the bat I will say that the code itself probably needs improving (especially those nested loops halfway through the code). It is not the best code, but it should work fine. This is something I wrote about a year ago, so have mercy :) The code is set up for command-line use (not CGI, so editing will be necessary).

    Quick explanation of my code:

    1. Checks for disallowed tags (ex: <script> and/or </script>)
    2. Checks for missing/too many tags (ex: there is a <strong> tag, but no </strong>)
    3. Checks for misspelled tags (ex: there is a <strong> and a </string> tag)
    4. Combinations of numbers 1-3 (ex: there is a <strong> tag and a </b> tag. First, there are mismatching <strong> and </strong>, plus </b> is not even allowed)

    I did not include the -w switch simply because you will get a lot of warnings about the following line:

    if ($tags{$_} != $tags{"/" . $_}) {

    since that line ends up testing nonexistent hash keys. Now to the code:
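    First, a small aside on that warning. One way the comparison could be kept quiet under -w is to treat a tag that was never seen as a count of 0. This is only a minimal sketch: the helper name tag_count and the sample counts are made up for illustration and are not part of the script that follows.

```perl
use strict;
use warnings;

# Hypothetical helper: a tag that was never seen counts as 0 instead of undef.
sub tag_count {
    my ($counts, $name) = @_;
    return $counts->{$name} || 0;
}

# Sample counts for illustration only: one <strong> opened, none closed.
my %tags = (strong => 1);

foreach my $tag (qw(strong em)) {
    # Both sides are always defined numbers, so no "uninitialized" warning.
    if (tag_count(\%tags, $tag) != tag_count(\%tags, "/$tag")) {
        print "Mismatched Tags: <$tag> and </$tag>\n";
    }
}
```

    With these sample counts, only the <strong> pair is reported; <em> compares 0 to 0 and stays silent.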

    #!path to perl
    use strict;

    my %tags;
    my @errors;
    my $var_containing_message = qq#
    Hello everyone!<p>
    This is a sample test of the things that the <strong>awesome</b> language perl can do!<p>
    Anyway, when I say <pre>$|++</pre> I am changing buffering!
    #;

    #Just a list of all allowed tags (opening and closing)
    #I had to get rid of the CODE entries to post this
    #Also, this list is incomplete. DO note however,
    #that I did not include <i>, <b>, or <u>
    #<strong> and <em> are much better :)
    my @allowed_tags = qw(
        p /p br ul /ul li ol /ol em /em strong /strong
        small /small sub /sub sup /sup pre /pre
    );

    #A list of the tags from @allowed_tags list
    #that REQUIRE a closing tag
    my @match_required = qw(ul ol em strong pre small sub sup);

    #Here it is! The code that makes sure all closing
    #tags are in the message somewhere

    #loop through to find each HTML tag in message
    #This counts up all the different HTML tags
    while ($var_containing_message =~ /<(.*?)>/gs) {
        my $tmp = lc($1);
        $tags{$tmp}++;
    }

    #Loop through all the found HTML tags and see if
    #any not-permitted/invalid ones are in there
    foreach my $found_tag (keys(%tags)) {
        my $allow = 0;
        foreach my $permitted_tag (@allowed_tags) {
            if ($found_tag eq $permitted_tag) {
                $allow = 1;
            }
        }
        if ($allow != 1) {
            push @errors, "Tag Not Allowed: $found_tag";
        }
    }

    #Loop through all the tags requiring closing tags
    #If they do not have the same # of opening/closing tags,
    #generate and present the error to the poster
    foreach (@match_required) {
        if ($tags{$_} != $tags{"/" . $_}) {
            #Here is where the error is generated and presented to the poster
            #Example:
            push @errors, "Mismatched Tags: <$_> and </$_>";
        }
    }

    if (@errors == 0) {
        print "hey, it's all good!";
    } else {
        print "Whoops. There is a problem...\n";
        print "$_\n" foreach (@errors);
    }

    sleep 3;
    exit;
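
    One limitation of counting: a message such as <em><strong>hi</em></strong> has equal numbers of opening and closing tags, so it passes the check above even though the tags are closed in the wrong order. A stack would catch that. The following is only a rough sketch of that idea, not a tested replacement; the sub name check_nesting and its error messages are invented for illustration, and it ignores any tag that is not in the list it is given.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the script above: returns a list of
# error strings for tags that are closed out of order or never closed.
sub check_nesting {
    my ($text, @paired) = @_;
    my %needs_close = map { $_ => 1 } @paired;
    my (@stack, @errors);

    # Visit every <tag> or </tag> in the order it appears in the text.
    while ($text =~ m{<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>}g) {
        my ($closing, $name) = ($1, lc $2);
        next unless $needs_close{$name};    # <p>, <br> etc. are skipped here

        if (!$closing) {
            push @stack, $name;             # remember the open tag
        }
        elsif (@stack && $stack[-1] eq $name) {
            pop @stack;                     # closes the innermost open tag
        }
        else {
            my $expected = @stack ? "</$stack[-1]>" : "no closing tag";
            push @errors, "Unexpected </$name>, expected $expected";
        }
    }

    # Anything still on the stack was opened but never properly closed.
    push @errors, "Unclosed <$_>" for reverse @stack;
    return @errors;
}

my @paired = qw(ul ol em strong pre small sub sup);
print "$_\n" for check_nesting('<em><strong>hi</em></strong>', @paired);
```

    For that sample input the sketch reports the out-of-order </em> and then the <em> that is left unclosed. A disallowed tag such as </b> would still need the separate allowed-tags check, since this sub only looks at the tags it is told to pair.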