jlongino has asked for the wisdom of the Perl Monks concerning the following question:

I realize that this is not Perl related, but seeing so many CGI-related posts leads me to believe that this is a fairly common problem with HTML forms.

I've only written four Web applications, and even then under severe time constraints (which aren't very conducive to tidy code or in-depth research), but in at least three of those applications I've encountered the same problem and haven't found a satisfactory solution. The problem is that TEXTAREA fields that contain an apostrophe (´) are truncated or completely obliterated upon submission.

So I guess the question is: how should this situation be handled so that the apostrophes are preserved? The last application I wrote intercepted the submission via "onsubmit=" to a js sub, escaped the original TEXTAREA to a hidden TEXTAREA field, and then submitted the form.

It doesn't seem efficient to submit two TEXTAREA fields for each one on a given form. Nor does it seem efficient to create a shadowed hidden form and then submit only the hidden form data. There must be something obvious that I'm overlooking. I doubt that there is a Perl solution since the unescaped field never makes it (intact) to the Perl program once submitted.

Thanks for any advice. I'd appreciate it if someone could point me in the right direction (code is not necessary but welcome).

--Jim

Re: (OT) TEXTAREA and the Single Quote
by jlongino (Parson) on Nov 18, 2001 at 05:30 UTC
    OK, I discovered the problems with my previous code and I think that they will be instructional for others who doubt or disregard the advice given regularly by the sages at PM.

    Mistake #1. I arrived at the solution by following the standard advice to use:
    use CGI::Carp qw(fatalsToBrowser); which, had I known about it when I originally wrote the application, would have immediately led me to the source of the problem (pun intended).

    I happen to be a sysadmin, so I can't blame a lack of access to the system logs, because I had access. Since I never received any error messages in the browser and the applications appeared to work despite the original problem I described, it never occurred to me to look in the logs.

    Mistake #2. I relied on my memory of the steps I took to debug the problem (this was over 9 months ago) and thought that I had tested the query string by substituting "GET" for "POST". Apparently I hadn't, because after I replied to hacker, I began second-guessing myself and actually wrote some code to test it. Well, the whole TEXTAREA contents were displayed in the browser's location bar. I apologize for the mistaken assumption and realize that I should have gone back and retested my claim before making it.

    Mistake #3. As Trimbach suggested, I was relying on a hand-rolled solution for retrieving CGI parameters. Well, actually I pieced together snippets I'd found from various sources and wrote my own sub. I was going to post a question about replacing my own code with CGI::import_names, but after some research decided that I could figure that out on my own. Instead, I wrote the first post in this thread, which has brought me back full circle.

    In my previous applications I used the following code to assign my CGI params. I am already aware of how horrible this is. Even though I thought at the time that I had ensured the "untaintedness" of my data, I didn't realize the other implications (see this thread).

    HTML:

    <html>
    <form name="Survey" method="get" action="/cgi-bin/textarea.pl">
    <table border=0 width="100%">
      <tr><td><TEXTAREA NAME="xiv" ROWS="6" COLS="55" wrap="soft"></TEXTAREA></td></tr>
      <tr><td><input type="submit" name="Submit" value="Submit"></td></tr>
    </table>
    </form>
    </html>

    CGI:
    use CGI;
    use CGI::Carp qw(fatalsToBrowser);

    doGetCGIvars();
    print "Content-type: text/html\n\n";
    # my $query = new CGI;
    # my $xiv = $query->param('xiv');
    print "<html><body>\$xiv=$xiv</body></html>";

    sub doGetCGIvars {
        ### for future revisions look into CGI::import_names
        my $VarName;
        my $query = new CGI;
        foreach $VarName ($query->param) {
            $assign = "\$$VarName = '" . $query->param($VarName) . "'";
            &UnTaint($assign);
            ### print "$assign<br>";
            eval($assign);
        }
    }

    sub UnTaint {
        my $test = shift;
        unless ($test =~ /^([^<]*)$/) {
            die "Couldn't untaint variable \$test:\n\n";
        }
    }

    The problem is that if you comment out the line use CGI::Carp qw(fatalsToBrowser);, you don't get error messages to the browser, and since the $xiv assignment broke as demonstrated by fatalsToBrowser:

    Software error:
    Substitution pattern not terminated at (eval 5) line 2.
    For help, please send mail to the webmaster (xxx@yyyyyyyyyy), giving this error message and the time and date of the error.
    Content-type: text/html $xiv=

    $xiv has an undefined value.
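
The failure can be reproduced without a web server. The following is only a minimal sketch: the value "it's broken" is an assumed TEXTAREA submission, and build_assignment and %form are hypothetical names used for illustration, not part of the original application. It builds the same kind of assignment string that doGetCGIvars passes to eval, shows that the eval fails and leaves $xiv undefined, and then keeps the value in a hash instead, which needs no eval at all.

```perl
use strict;
use warnings;

# Hypothetical helper that mirrors how doGetCGIvars builds its eval string.
sub build_assignment {
    my ($name, $value) = @_;
    return "\$$name = '" . $value . "'";
}

our $xiv;
my $value  = "it's broken";                       # assumed TEXTAREA content
my $assign = build_assignment('xiv', $value);     # $xiv = 'it's broken'

# The apostrophe closes the quoted string early, so the generated code
# does not compile: eval returns undef, $@ holds the error, and $xiv
# is never assigned.
my $ok    = eval "$assign; 1";
my $error = $@;
print "eval failed: $error" unless $ok;

# Keeping submitted values in a hash keyed by field name avoids eval.
my %form = (xiv => $value);
print "xiv=$form{xiv}\n";
```

With CGI.pm the hash would simply be filled from $query->param($name) for each field name, which is what the commented-out lines in the script above already hint at.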

    --Jim

Re: (OT) TEXTAREA and the Single Quote
by Trimbach (Curate) on Nov 18, 2001 at 05:23 UTC
    When you say "it never makes it to the CGI intact" are you sure? Have you examined the contents of ENV directly (in the case of a GET request) or STDIN (in the case of a POST)? I suspect the data IS being transmitted to your CGI, it's just that your CGI is not handling the transmitted data correctly. The fact that you don't know how to use CGI.pm implies that you're using a hand-rolled method to extract your form information, and that way, dear brother, lies madness.
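
One way to examine what actually arrives is to echo the raw request data back as plain text. This is a rough sketch, not a replacement for CGI.pm: raw_form_data is a hypothetical helper name, and it assumes the standard CGI environment variables REQUEST_METHOD, QUERY_STRING and CONTENT_LENGTH are set by the web server.

```perl
use strict;
use warnings;

# Return the raw, still URL-encoded form data for one request.
# $env is a hash ref such as \%ENV; $in is a filehandle such as \*STDIN.
sub raw_form_data {
    my ($env, $in) = @_;
    my $method = uc($env->{REQUEST_METHOD} || 'GET');

    # GET: the data travels in the query string.
    if ($method ne 'POST') {
        return defined $env->{QUERY_STRING} ? $env->{QUERY_STRING} : '';
    }

    # POST: the data is the request body, CONTENT_LENGTH bytes on STDIN.
    my $length = $env->{CONTENT_LENGTH} || 0;
    my $body   = '';
    read($in, $body, $length) if $length > 0;
    return $body;
}

# Echo the raw data so it can be inspected in the browser.
print "Content-type: text/plain\n\n";
print raw_form_data(\%ENV, \*STDIN), "\n";
```

If the apostrophe shows up in this output (either literally or as %27), the browser sent it and the problem lies in how the script decodes or uses the data afterwards.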

    Why don't you post your CGI code and we'll see if that's where the problem is?

    Gary Blackburn
    Trained Killer

Re: (OT) TEXTAREA and the Single Quote
by hacker (Priest) on Nov 18, 2001 at 03:37 UTC
    Two modules come to mind:
       URI::Escape
       CGI::escapeHTML (part of CGI.pm)
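
A small sketch of how these two might be used together, assuming both URI::Escape and CGI.pm are installed (neither is guaranteed to ship with every Perl). The sample text is made up for illustration. uri_escape and uri_unescape handle the URL encoding used in query strings, while escapeHTML is for echoing submitted text back into a page.

```perl
use strict;
use warnings;
use URI::Escape qw(uri_escape uri_unescape);
use CGI qw(escapeHTML);

my $text = "Don't <b>truncate</b> me";    # made-up sample input

# URL-encode for use in a query string, then decode on the receiving side.
my $encoded = uri_escape($text);
my $decoded = uri_unescape($encoded);

# Escape HTML metacharacters before printing the text into a page.
my $html = escapeHTML($decoded);

print "encoded: $encoded\n";
print "decoded: $decoded\n";
print "html:    $html\n";
```

Note that neither call is needed just to receive a form field: CGI.pm's param() already returns the decoded value.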
    
      As I mentioned in the original node, from what I can tell, the data doesn't make it to the Perl program intact after the form is submitted. I tested this using "METHOD=GET" instead of "METHOD=POST".

      --Jim

Re: (OT) TEXTAREA and the Single Quote
by dws (Chancellor) on Nov 18, 2001 at 03:02 UTC
    The problem is that TEXTAREA fields that contain an apostrophe (´) are truncated or completely obliterated upon submission.

    I have no problems preserving apostrophes in my webapps. Can you post a small sample of code that demonstrates the problem?

      BTW, this was written before I found PM, and I knew nothing about CGI.pm. Not that it matters, but I intercept the submit via "onclick=" and not "onsubmit=" as I originally posted. I don't know if this is enough code or not:
      print << "--eot1--";
      <!doctype html public "-//w3c//dtd html 4.0 transitional//en">
      <html>
      <head>
        <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
        <meta name="GENERATOR" content="Mozilla/4.75 [en] (Win98; U) [Netscape]">
        <title>USA Fourteenth Faculty Survey, 2001</title>
      </head>
      . . .
      <form name="Survey" method="POST" action="$SubmitURL">
      . . .
      <table border=0 width="100%">
        <tr><td><TEXTAREA NAME="xiv" ROWS="6" COLS="55" wrap="soft"></TEXTAREA></td></tr>
        <tr><td><input type="button" name="SubmitSurvey" value="Submit Survey" onclick="javascript:return SurveySubmit(document.forms[0])"></td></tr>
      </table>
      . . .
      </form>
      --eot1--
      . . .
      function SurveySubmit (form) {
        var CGI_URL = "http://jaguar1.usouthal.edu/cgi-bin/surveys/facsenate/"
        form.xiv.value = escape(form.xiv.value)
        form.action = CGI_URL + "writesurvey.pl"
        form.submit()
        return true
      }

      Update: Added javascript sub SurveySubmit(). Also, in one application there was only one person using the form for input so I told them to be sure and use &acute; instead of the single quote and there's no problem (still an unacceptable solution though).

      --Jim

      1. This doesn't appear to be a Perl problem; it's a JavaScript problem.
      2. Why are you calling escape? What happens if you comment that line out?
        The whole SurveySubmit sub was an attempt to fix the initial problem. It appeared to me at the time that the single quotes were screwing things up because they weren't being escaped. You could bypass the whole sub, but the results would be essentially the same. I assume (maybe erroneously) that the double escaping didn't cause any noticeable problems. At least when I unescaped the data in the Perl CGI, I didn't get anything that appeared to be corrupted. Then again, I might have luckily bypassed circumstances that could have led to data corruption.

        It's ironic that the post was not OT at all. It was poor Perl programming that caused it. It just wasn't obvious at first.

        --Jim