htmanning has asked for the wisdom of the Perl Monks concerning the following question:

I am totally stumped. This script stopped working and I have no idea why. I'm using the exact same badwords routine in other scripts but this one just stops. As soon as this thing calls &filter_badwords the script prints out the header (done by the &Print_Page_Top call) and stops. I have no idea why it will not go forward. If I comment out the &filter_badwords; routine the script works fine. I hope someone can spot something because I've spent hours on this thing. It was working a week ago. The badwords file is just a list of badwords, one per line.

#!/usr/local/bin/perl # this line required for display to browser print "Content-type: text/html\n\n"; ######## REQUIRED FILES ########### require "cgi-lib.pl" || die "Error loading cgi-lib.pl"; require "db-common.sub" || die "Error loading db-common.sub"; require "all-common.sub" || die "Error loading all-common.sub"; require "build-albums.sub" || die "Error loading build-albums.sub"; require "build-content.sub" || die "Error loading build-content.sub"; ######## INPUT VARIABLES ########## # get input variables using this sub &Init_Common_Vars; $badword_file = "$rootpath/bad-words.txt"; # bad word file for filte +r - for all sites $thisfile = "$rooturl/cgi-bin/add-user-opinion.pl"; # this script &ReadParse; ######## MAIN SCRIPT ############ # get input variables from readparse $submit=$in{'submit'}; $preview=$in{'preview'}; $prodtype = $in{'prodtype'}; # hardware or software or article $prodID = $in{'prodID'}; # ID of item in database being rev +iewed $itemtitle = $in{'itemtitle'}; # title of sw or hw item for user' +s reference $itemlink = $in{'itemlink'}; # author of sw or hw item for user +'s reference $itemtitlefix = $itemtitle; $itemtitlefix =~ s/%20/ /gi; # setup links after top site link include if ($prodtype eq "album") { $top_site_links = "<a href=\"$rooturl$mainalbumpage\">Albums</a>: +"; $title_tag = "Software:"; } elsif ($prodtype eq "article") { $top_site_links = "<a href=\"$rooturl$maincontpage\">Articles</a>: + "; $title_tag = "Articles:"; } else { # error - no prodtype exit; } # end if ($prodtype eq "software") $title_tag = "$title_tag Add User Opinion"; $top_site_links = "$top_site_links Add User Opinion"; # print top of page in all cases &Print_Page_Top; if ($submit || $preview) { # get value of fields submitted by user & truncate lengths $name = substr($in{'name'}, 0, 100); $email = substr($in{'email'}, 0, 150); $location = substr($in{'location'}, 0, 150); $features = substr($in{'features'}, 0, 1); $usability = substr($in{'usability'}, 0, 1); $title = substr($in{'title'}, 0, 150); $body = substr($in{'body'}, 0, 4000); # get rid of repeating characters that attract attention $title =~ s/\#{3,}/\#/g; $title =~ s/!{3,}/!/g; $title =~ s/\?{3,}/\?/g; $title =~ s/-{3,}/\?/g; $title =~ s/\.{3,}/\?/g; $title =~ s/\*{3,}/\?/g; $name =~ s/\#{3,}/\#/g; $name =~ s/!{3,}/!/g; $name =~ s/\?{3,}/\?/g; $name =~ s/-{3,}/\?/g; $name =~ s/\.{3,}/\?/g; $name =~ s/\*{3,}/\?/g; # fix line feeds submitted by preview $body =~ s/<p>/\n\n/g; $body =~ s/<br>/\n/g; # strip out illegal characters and html @vars[0] = $name; @vars[1] = $email; @vars[2] = $location; @vars[3] = $title; @vars[4] = $body; for ($n=0; $n<=4; $n++) { # filter html and other illegal tags @vars[$n] =~ s/<!--(.|\n)*-->//g; # tak +e out SSI/Comment tags @vars[$n] =~ s/<!--(.|\n)*//g; # ta +ke out begin SSI/Comment tags @vars[$n] =~ s/(.|\n)*-->//g; # take + out end SSI/Comment tags @vars[$n] =~ s/<([^>]|\n)*>//g; # take + out html tags @vars[$n] =~ s/</&lt;/g; # conv +ert legit < to &lt @vars[$n] =~ s/>/&gt;/g; # conv +ert legit > to &gt @vars[$n] =~ s@(<|&lt;?)\s?/?blink(>|&gt;?)@ @gi; # kill + blinks @vars[$n] =~ s@(<|&lt;?)\s?/?script(>|&gt;?)@ @gi; # kill + scripts @vars[$n] =~ s@(<|&lt;?)\s?/?meta@ @gi; # kill + meta @vars[$n] =~ s@(<|&lt;?)\s?/?ssi@ @gi; # kill + ssi @vars[$n] =~ s@(<|&lt;?)\s?/?form@ @gi; # kill + forms } # end for ($n=0; $n<=4; $n++) $name = @vars[0]; $email = @vars[1]; $location = @vars[2]; $title = @vars[3]; $body = @vars[4]; # fix special characters in body $body =~ s/\n\n/<p>/g; $body =~ s/\n/<br>/g; $body =~ s/\&/\&amp\;/g; $body =~ s/&lt;/</g; $body =~ s/&gt;/>/g; $body =~ s/\cM//g; # filter bad words from bad words file, use same file for all boar +ds and opinions &filter_badwords; # get current date to store in record ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime( +time); #year needs to have 1900 added $year = $year+1900; #month starts at 0 for Jan, so add 1 $mon = $mon+1; $dateadded=$year."-".$mon."-".$mday; # check for missing fields &error(missing_field) if (!$name || !$title || !$email || !$body); } # if ($submit || $preview) if ($submit) { # fix single quotes so SQL insert will work right $namefix = $name; $namefix =~ s/\'/\\'/g; $emailfix = $email; $emailfix =~ s/\'/\\'/g; $locationfix = $location; $locationfix =~ s/\'/\\'/g; $titlefix = $title; $titlefix =~ s/\'/\\'/g; $bodyfix = $body; $bodyfix =~ s/\'/\\'/g; # fix quotes back from hidden variables $namefix =~ s/\&quot\;/"/g; $emailfix =~ s/\&quot\;/"/g; $titlefix =~ s/&quot\;/"/g; $bodyfix =~ s/\&quot\;/"/g; $locationfix =~ s/\&quot\;/"/g; # check for same name or title for this product on this day to pre +vent multiposts &Conn_to_DB; $SQL = "SELECT * from $op_tbl WHERE (prodID='$prodID' AND dateadde +d='$dateadded' AND prodtype='$prodtype' AND (name='$namefix' OR title +='$titlefix'))"; # run SQL against the DB &Do_SQL; &error(no_spam) if ($pointer = $sth->fetchrow_hashref); $SQL = "INSERT INTO $op_tbl (name,email,location,features,usabilit +y,title,body,dateadded,prodID,prodtype) VALUES ('$namefix','$emailfix','$locationfix','$features','$us +ability','$titlefix','$bodyfix','$dateadded','$prodID','$prodtype')"; # run SQL against the DB &Do_SQL; if ($prodtype eq "album") { $prod_table = $album_tbl; } elsif ($prodtype eq "article") { $prod_table = $cont_tbl; } # end if ($software) $quick_build=1; #$quick_ID = $prodID; # get record to match user opinion $SQL = "SELECT * from $prod_table WHERE ID=$prodID"; &Do_SQL; $pointer = $sth->fetchrow_hashref; $quick_build_type = "modify"; $quick_filename = $pointer->{'filename'}; $uo_filename = substr($quick_filename, 0, length($quick_filename)- +5) . "-uo1.html"; if ($prodtype eq "album") { $opinions_url = $rooturl.$albumdir."/$uo_filename"; $details_url = $rooturl.$albumdir."/$quick_filename"; &Build_Albums; } elsif ($prodtype eq "article") { $contenttype = $pointer->{'contenttype'}; &Init_Content_Type; $articledir = "$rooturl$contentdir/$typedir"; $opinions_url = $articledir."/$uo_filename"; $details_url = $articledir."/$quick_filename"; &Build_Content; } # end if ($software) print <<HTML; <center> <TABLE border=0 cellpadding=3 width=100%> <tr bgcolor=$barbgcolor> <td width=100%> <center> <font color=#FFFFFF><b>Submission Confirmation for $it +emtitlefix</b></font> </center> </td> </TR> <tr> <td> <center> <br> <br> Thank you for letting your opinion be heard at <b><a href += "$rooturl/index.html">$sitename</a></b>.</font> <br> <br> Return to [ <b><a href = "$details_url">Details Page</a></ +b> | <b><a href = "$opinions_url">Opinions Page</a></b> ] <br> <br> If the above links do not show the opinion you added, clic +k reload in your browser<br>after clicking either link. <br> <br> <br> </center> </td> </tr> </table> </center> HTML } elsif ($preview) { # fix quotes so " shows correctly in input variables $name =~ s/"/\&quot\;/g; $email =~ s/"/\&quot\;/g; $title =~ s/"/\&quot\;/g; $location =~ s/"/\&quot\;/g; $body =~ s/"/\&quot\;/g; $submitter = "<b>Submitted by:</b> $name</b>"; $submitter = $submitter . "<br><b>Email:</b> $email" if ($email); $submitter = $submitter . "<br><b>Location:</b> $location" if ($lo +cation); $submitter = $submitter . "<br><b>Date Added:</b> $dateadded"; $rating_img = "<font size=-2>&nbsp;&nbsp;1&nbsp;&nbsp;2&nbsp;&nbsp +;3&nbsp;&nbsp;4&nbsp;&nbsp;5&nbsp;</font>"; $features_img = "<img src = \"$rooturl/images/review-dot$features. +gif\">"; $usability_img = "<img src = \"$rooturl/images/review-dot$usabilit +y.gif\">"; print <<HTML; <center> <form method=POST action="$thisfile"> <input type = "hidden" name="prodID" value="$prodID"> <input type = "hidden" name="prodtype" value="$prodtype"> <input type = "hidden" name="itemtitle" value="$itemtitle"> <input type = "hidden" name="features" value="$features"> <input type = "hidden" name="usability" value="$usability"> <input type = "hidden" name="title" value="$title"> <input type = "hidden" name="body" value="$body"> <input type = "hidden" name="name" value="$name"> <input type = "hidden" name="email" value="$email"> <input type = "hidden" name="location" value="$location"> <input type = "hidden" name="itemlink" value="$itemlink"> HTML # fix quotes back $name =~ s/\&quot\;/"/g; $email =~ s/\&quot\;/"/g; $title =~ s/\&quot\;/"/g; $location =~ s/\&quot\;/"/g; $body =~ s/\&quot\;/"/g; print <<HTML; <TABLE border=0 cellpadding=3 width=100%> <tr colspan=2 bgcolor=$barbgcolor align=center> <td><font color=#FFFFFF><b>Preview Opinion for $itemtitlefix</ +b></font></td> </TR> <tr> <td> <br><center><b>This opinion has NOT yet been submitted.</b></c +enter><br> <br>If this opinion is correct, press the Submit this Opinion +button below to submit it. If you need to make changes, click your b +rowser's back button and make changes on the previous page. <br>&nbsp; </td> </TR> </table> <hr width=100% size=1> <TABLE border=0 cellpadding=6 width=100%> <tr> HTML if ($prodtype eq "album") { print <<HTML; <td align=right valign=top nowrap=on> $rating_img<br> <b>Rating</b> $features_img<br> </td> HTML } print <<HTML; <td width=100% height=100% valign=top> <b><font face="arial" color=#b22222>$title</font></b><br> <font face="arial">$body<br> <br>$submitter</font> </td> </TR> </table> <hr width=100% size=1> <TABLE border=0 cellpadding=6 width=100%> <tr colspan=3> <td> <br> <center><INPUT TYPE = "submit" NAME = "submit" VALUE = "Submit + this Opinion"></center> <br> </td> </TR> </TABLE> <br> </FORM> </center> HTML } else { # this part happens if we didn't press submit or preview, i.e. fir +st load of file print <<HTML; <center> <form method=POST action="$thisfile"> <input type = "hidden" name="prodID" value="$prodID"> <input type = "hidden" name="prodtype" value="$prodtype"> <input type = "hidden" name="itemtitle" value="$itemtitle"> <input type = "hidden" name="itemlink" value="$itemlink"> <TABLE border=0 cellpadding=3 width=500> <tr bgcolor=$barbgcolor> <td colspan=2> <center> <font color=#FFFFFF><b>Submit Opinion for $itemtitlefi +x</b></font> </center> </td> </TR> <tr> <td colspan=2> HTML if ($prodtype eq "software") { print <<HTML; <br>If there is a problem with a download link or some other issue wit +h the software listing, please let us know <a href = "$rooturl/mail.h +tml">HERE</a>. <br><br>Please do not submit an opinion telling about a broken link, l +et us know <a href = "$rooturl/mail.html">HERE</a>. <br> HTML } print <<HTML; <br> Thank you for taking the time to share your opinion with other rea +ders. Please fill in the blanks below and click preview to view your + opinion. <br>&nbsp; </td> </TR> <tr> <TD ALIGN = "left" valign="center">Name:</TD> <TD><INPUT TYPE = "text" NAME = "name" value="" SIZE = "32" MAXSIZ +E = "100"></TD> </TR> <tr> <TD ALIGN = "left" valign="center">Email:</TD> <TD><INPUT TYPE = "text" NAME = "email" value="" SIZE = "32" MAXSI +ZE = "150"></TD> </TR> <tr> <TD ALIGN = "left" valign="center" nowrap=on>Location (city, state +):</TD> <TD><INPUT TYPE = "text" NAME = "location" value="" SIZE = "32" MA +XSIZE = "150"></TD> </TR> HTML if ($prodtype eq "album") { print <<HTML; <input type = "hidden" name="usability" value="3"> <tr> <TD ALIGN = "left" valign="center"><br>Rating:</TD> <TD> <br> <select name = "features"> <option value="1"> 1 Poor <option value="2"> 2 Fair <option value="3" selected> 3 Average <option value="4"> 4 Good <option value="5"> 5 Excellent </select> </TD> </TR> HTML } else { print <<HTML; <input type = "hidden" name="features" value="3"> <input type = "hidden" name="usability" value="3"> HTML } print <<HTML; <tr> <TD ALIGN = "left" valign="center"><br>Opinion Title:</td> <TD><br><INPUT TYPE = "text" NAME = "title" value="" SIZE = "40" M +AXSIZE = "150"></TD> </TR> <tr> <td colspan=2> Opinion Body (~500 words or less):<br> <textarea name="body" wrap=virtual rows="9" cols="58"></textarea>< +br> <br> <center><INPUT TYPE = "submit" NAME = "preview" VALUE = "Preview + this Opinion"></center> <br>&nbsp; </td> </tr> </TABLE> </FORM> </center> HTML } # end if ($submit) # print bottom of page for all options &Print_Page_Bottom; $dbh->disconnect; exit; ## END of main script ## ############################# # filter_badwords subroutine sub filter_badwords { $badword_found = 0; open(BADWORDS,"$badword_file"); @badwords = <BADWORDS>; close(BADWORDS); foreach $badword (@badwords) { # Strip any extra CR/LF's $badword =~ s/\n//g; $badword =~ s/\r//g; if (($name =~ /$badword/i) || ($title =~ /$badword/i) || ($bod +y =~ /$badword/i)) { #print "found"; $badword_found = 1; # If a bad word is found, highlight all occurances of it $name =~ s/($badword)/<B>$1<\/B>/ig; $email =~ s/($badword)/<B>$1<\/B>/ig; $location =~ s/($badword)/<B>$1<\/B>/ig; $title =~ s/($badword)/<B>$1<\/B>/ig; $body =~ s/($badword)/<B>$1<\/B>/ig; } # end if (($name =~ /$badword/i) || ($email =~ /$badword/i) } # end foreach $badword (@badwords) #print "anything"; &error(bad_words) if ($badword_found == 1); return; } # end filter_badwords subroutine ############################ # Error Messages Subroutine # sub error { $error = $_[0]; print "<table width=90% cellpadding=3><tr bgcolor=\"$barbgcolor\"> +<td width=100% align=center><font color=#FFFFFF><b>"; if ($error eq 'missing_field') { print <<HTML; ERROR: Missing Field</b></font></td></tr> <tr><td><br><br><blockquote><center>You forgot to fill in the +<b>name</b>, <b>title</b>, <b>email</b>, or <b>body</b> field in your posting. <br><br>Please go back and correct this and resubmit. HTML } elsif ($error eq 'bad_words') { print <<HTML; ERROR: Illegal Words</b></font></td></tr> <tr><td><br><br><blockquote>Our script has read your post and +has determined that it may contain banned words. The purpose of this +site is for serious computing. We want to keep the quality of the site up while we save you f +rom being flamed! If this warning has been triggered by accident, ple +ase go back and remove the offending words / comments / concepts from your post and r +e-send. <!--<br><br>Remove the <b>bold</b> language / words / or conce +pts and repost:<br><BR><P> Name: $name<BR> Email: $email<BR> Location: $location<BR> Title: $title<BR> Body: $body<P>--><P> HTML } elsif ($error eq 'no_spam') { print <<HTML; ERROR: Post Limit Exceeded</b></font></td></tr> <tr><td><br><br><blockquote>To prevent spam a posting limit ru +le has been implemented. You cannot post more than one message per p +roduct per day. This error may occur if the submit button was clicked more tha +n once. <br><br>If you feel you have reached this page in error, pleas +e try to post your message again later or try posting with a differen +t name or title. HTML } else { print <<HTML; ERROR: Undefined</b></font></td></tr> <tr><td><br><br><blockquote>An undefined error has occurred.<b +r><br>Please go back and try to the post again. HTML } # end if ($error eq 'no_name') print "</blockquote></td></tr></table><br><br>\n"; &Print_Page_Bottom; exit; } 1; # below subs tag

Replies are listed 'Best First'.
Re: Need help with badwords routine
by graff (Chancellor) on Oct 21, 2008 at 02:37 UTC
    Have you looked at the httpd error_log file yet? If the script is dying, maybe there's a relevant error message in the web server log. If not, I'm not sure where else to begin.

    Also, since there is no "use CGI" (or use of any CPAN modules), and instead there are several "require" statements for (presumably) home-grown chunks of code, I'm assuming this thing was originally written for Perl 4. Would you consider refactoring the code so that it takes advantage of Perl 5 and CPAN (e.g. at least CGI, and possibly something like HTML::Template)?

    Also, since you are using regex substitutions to highlight "bad" words, I wonder if the problem might be related to the particular data associated with the current script / web-site, rather than with the code. People using current perl modules and methods do not use regex substitutions for this sort of editing. HTML::Parser or something built on top of that (e.g. HTML::TokeParser or others) would be a safer, less brittle approach. But again, that entails a fairly drastic refactoring/rewrite of the current code.

    (I googled "cgi-lib.perl" -- at least it doesn't appear to be the infamous "Matt's Scripts", but it looks to be about 10 years old. Time for an update, don't you think? Esp. considering that some of the content at http://cgi-lib.berkeley.edu/, which purports to be the "cgi-lib.pl home page", appears to have vanished, for instance the stuff that was linked to "CGI Security" -- not a good sign...)

    (updated to fix typo in first paragraph)

Re: Need help with badwords routine
by AnomalousMonk (Archbishop) on Oct 21, 2008 at 08:18 UTC
    Please update your post to enclose huge chunks of code in <readmore> ... </readmore> tags. Please see Writeup Formatting Tips and related links for info.
Re: Need help with badwords routine
by chrism01 (Friar) on Oct 21, 2008 at 06:57 UTC
    As above, upgrade to Perl5 syntax:
    #!/usr/local/bin/perl -w use strict;
    Don't use '&' to call a sub, that's deprecated, basically now means a ref-to-sub.

    To access/assign 1 element of an array, don't use

    @vars[0] = $name;
    use
    $vars[0] = $name;
      Don't use '&' to call a sub, that's deprecated, basically now means a ref-to-sub.
      That's wrong. It's deprecated, but:
      \&foo is a reference to the sub foo.
      &foo is a call to the sub foo which circumvents prototypes, and additionally because of the lack of parentheses it gives the current @_ as an argument to foo.
      Darn, no edit facility.

      Anyway, amend

      open(BADWORDS,"$badword_file");
      to
      open(BADWORDS,"<","$badword_file") or die "Unable to open badword_ +file: $! \n";
      or similar
Re: Need help with badwords routine
by JadeNB (Chaplain) on Oct 21, 2008 at 20:39 UTC
    &error(bad_words) if ($badword_found == 1);
    Note that you are calling error with the bareword bad_words, not the variable $badwords. (Yet another of many reasons to use strict; use warnings—but many people have pointed out that this looks Perl 4-ish, and I guess that these pragmata weren't available then.) Thus, your error subroutine is always called.

    UPDATE: Immediately after posting, I saw that I'd completely ignored the trailing if ..., so your error routine need not always be called after all. Still, you probably don't want to call it this way.