sstevens has asked for the wisdom of the Perl Monks concerning the following question:

I have a simple form with a textarea that allows people to type stuff. I've had complaints that sometimes not everything that they enter gets saved. So I tested it out with a part of Don Quixote:
<html>
<form action='server-scripts/spellchecker.pl' method='post'>
<textarea name='textinputs'>In a village of La Mancha, the name of which I have no desire to call to mind, there lived not long since one of those gentlemen that keep a lance in the lance-rack, an old buckler, a lean hack, and a greyhound for coursing. An olla of rather more beef than mutton, a salad on most nights, scraps on Saturdays, lentils on Fridays, and a pigeon or so extra on Sundays, made away with three-quarters of his income. The rest of it went in a doublet of fine cloth and velvet breeches and shoes to match for holidays, while on week-days he made a brave figure in his best homespun. He had in his house a housekeeper past forty, a niece under twenty, and a lad for the field and market-place, who used to saddle the hack as well as handle the bill-hook. The age of this gentleman of ours was bordering on fifty; he was of a hardy habit, spare, gaunt-featured, a very early riser and a great sportsman. They will have it his surname was Quixada or Quesada (for here there is some difference of opinion among the authors who write on the subject), although from reasonable conjectures it seems plain that he was called Quexana. This, however, is of but little importance to our tale; it will be enough not to stray a hair's breadth from the truth in the telling of it.

You must know, then, that the above-named gentleman whenever he was at leisure (which was mostly all the year round) gave himself up to reading books of chivalry with such ardour and avidity that he almost entirely neglected the pursuit of his field-sports, and even the management of his property; and to such a pitch did his eagerness and infatuation go that he sold many an acre of tillageland to buy books of chivalry to read, and brought home as many of them as he could get.
But of all there were none he liked so well as those of the famous Feliciano de Silva's composition, for their lucidity of style and complicated conceits were as pearls in his sight, particularly when in his reading he came upon courtships and cartels, where he often found passages like "the reason of the unreason with which my reason is afflicted so weakens my reason that with reason I murmur at your beauty;" or again, "the high heavens, that of your divinity divinely fortify you with the stars, render you deserving of the desert your greatness deserves." Over conceits of this sort the poor gentleman lost his wits, and used to lie awake striving to understand them and worm the meaning out of them; what Aristotle himself could not have made out or extracted had he come to life again for that special purpose. He was not at all easy about the wounds which Don Belianis gave and took, because it seemed to him that, great as were the surgeons who had cured him, he must have had his face and body covered all over with seams and scars. He commended, however, the author's way of ending his book with the promise of that interminable adventure, and many a time was he tempted to take up his pen and finish it properly as is there proposed, which no doubt he would have done, and made a successful piece of work of it too, had not greater and more absorbing thoughts prevented him.
</textarea>
<input type='hidden' name='foo' value='foo'>
<input type='submit'>
</form>
</html>
I have a very simple CGI script:
#!/usr/local/bin/perl
use strict;
use CGI qw/ :standard /;

print "Content-type: text/html\n\n";
print "<html>\n";
print Dump;
print "</html>\n";
This is what I get:
<html>
<ul>
<li><strong>textinputs</strong>
<ul>
<li>In a village of La Mancha, the name of which I have no desire to call to mind, there lived not long since one of those gentlemen that keep a lance in the lance-rack, an old buckler, a lean hack, and a greyhound for coursing. An olla of rather more beef than mutton, a salad on most nights, scraps on Saturdays, lentils on Fridays, and a pigeon or so extra on Sundays, made away with three-quarters of his income. The rest of it went in a doublet of fine cloth and velvet breeches and shoes to match for holidays, while on week-days he made a brave figure in his best homespun. He had in his house a housekeeper past forty, a niece under twenty, and a lad for the field and market-place, who used to saddle the hack as well as handle the bill-hook. The age of this gentleman of ours was bordering on fifty; he was of a hardy habit, spare, gaunt-featured, a very early riser and a great sportsman. They will have it his surname was Quixada or Quesada (for here there is some difference of opinion among the authors who write on the subject), although from reasonable conjectures it seems plain that he was called Quexana. This, however, is of but little importance to our tale; it will be enough not to stray a hair&#39;s breadth from the truth in the telling of it.
<br>
<br>
You must know, then, that the above-na
</ul>
</ul></html>
I have no clue why this would happen. I thought that using POST allowed you to send an unlimited amount of data. Any ideas? Update: In case it matters, I'm using Perl 5.8.8 on Linux.
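One way to check whether the data is lost before Perl ever parses it is to compare the Content-Length the browser declared with the number of bytes that actually arrive on STDIN. This is only a minimal sketch, assuming plain CGI (not mod_perl) and a POST request. check_post_body is a made-up helper name, and because it reads STDIN itself it has to run instead of CGI.pm's parsing, not alongside it:

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# check_post_body is a hypothetical helper, not part of CGI.pm.
# It reads up to $declared bytes from $fh and reports how many arrived.
sub check_post_body {
    my ($fh, $declared) = @_;
    my $body = '';
    my $got  = 0;
    while ($got < $declared) {
        # read() stores data at offset $got; it returns 0 at EOF, undef on error
        my $n = read($fh, $body, $declared - $got, $got);
        last unless $n;
        $got += $n;
    }
    return {
        declared => $declared,
        received => $got,
        complete => ($got == $declared ? 1 : 0),
    };
}

# Only act as a CGI script when the web server handed us a POST request.
if (($ENV{REQUEST_METHOD} || '') eq 'POST') {
    binmode STDIN;
    my $r = check_post_body(\*STDIN, $ENV{CONTENT_LENGTH} || 0);
    print "Content-type: text/plain\n\n";
    printf "declared=%d received=%d complete=%s\n",
        $r->{declared}, $r->{received}, ($r->{complete} ? 'yes' : 'no');
}
```

If received comes back smaller than declared, the body was cut short before the script got to it (server, proxy, or browser), which would point away from Perl and CGI.pm.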

Replies are listed 'Best First'.
Re: Perl truncating HTML form input
by wfsp (Abbot) on Apr 10, 2008 at 17:04 UTC
    This html
    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
    <html>
    <head>
    <title>test</title>
    <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
    </head>
    <body>
    <form action='z/bin/test.cgi' method='post'>
    <textarea name='textinputs'>In a village of (as in your post)</textarea>
    <input type='hidden' name='foo' value='foo'>
    <input type='submit'>
    </form>
    </body>
    </html>
    and script
    #!/usr/local/bin/perl
    use strict;
    use CGI qw/ :standard /;

    print "Content-type: text/html\n\n";
    print "<html>\n";
    print Dump;
    print "</html>\n";
    gave
    <html>
    <ul>
    <li><strong>textinputs</strong></li>
    <ul>
    <li>In a village of La Mancha, ... telling of it.
    <br />
    <br />
    You must know, then, that the above-named ... thoughts prevented him.
    </li>
    </ul>
    <li><strong>foo</strong></li>
    <ul>
    <li>foo</li>
    </ul>
    </ul></html>
    I've truncated the text but it shows that all of it was returned. There is also the foo field there. This is the hidden field you have in your html. This doesn't show in your output.

    Are you sure you are looking at the files (html/cgi) that you think you are? This is often the reason when I hit frustrating snags like this. :-)

    Good luck!

      What the? I am _positive_ I'm running the same code that I posted. I promise, I copy-pasted. Maybe it has to do with the DOCTYPE tag? Hm.

      What version of Perl are you running?

      In regards to foo not showing up in my output, that was just one of the side effects (affects?). The Quixote text was truncated and I would lose any inputs (hidden or otherwise) from the form that occur after the Quixote textarea.

      Update: After being baffled by wfsp's reply, I've tried a few hundred things. I finally moved the two files (blank.html and spellchecker.pl) to a different domain on the same server. It worked! The exact same files with the same permissions worked! I noticed that mod_perl was not enabled for the domain where the script worked (domain A), and it was on for the domain where the script didn't work (domain B). I turned on mod_perl for domain A and the script stopped working! I got really excited and turned mod_perl off again... but then it still didn't work. I tried it a few times and sometimes it would work and sometimes it wouldn't. I know that's hard to believe. There must be some kind of user error here, right?

      I thought it might be a browser issue, so I tried IE 6. It consistently doesn't work for IE 6.

      If I saw this thread, I would think to myself that the person was screwing something up. I'm not saying that I'm not screwing something up, but it isn't as simple as calling the wrong script. I've copied this stuff over to an accessible location, in case someone wants to try it out.


      And now it's working consistently. I have no idea what's going on. :(
        It doesn't have anything to do with the Perl version. Maybe the Apache version, the CGI.pm version, or the browser version, but not the Perl version.
Re: Perl truncating HTML form input
by ww (Archbishop) on Apr 10, 2008 at 14:08 UTC
    :-) Other than infinity, very little in this universe is unlimited.

    But I don't think that's the major issue here. Your output appears to have a variety of markup (<ul> and <li>, for example) whose source is not apparent.

    Update3 (.oO This one must have really bugged me, especially after the reply re enctype and multi-part.) Note also that the markup shown in your initial (input) example is missing a few tags... such as <body>. What's more, the cgi shown has no plausible nexus to the name in the form action which is "spellchecker.pl". Tentative conclusion: Information presented is ne what's really extant.

    Perhaps more information about your form would be helpful. Oops. found in the input stanza.

    Update2: Searching with big G, found several references (of unknown accuracy) to use of config for Apache to limit max-length of text-area input. But I have not found that documented in Apache itself (yet?).

    And, OT, if you're letting (unknown) users input whatever they choose, without taint checking and other precautions, you're begging to lose more than just parts of their submissions.
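    As a rough illustration of the kind of precaution meant here, the sketch below validates a textarea value before it is used. The helper name clean_text, the 20,000-character cap, and the "printable ASCII plus whitespace" rule are all assumptions made for the example, not anything from the original script. Under taint mode (perl -T), returning the regex capture is also what untaints the value:

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# Hypothetical upper bound on accepted input; pick one that suits the form.
my $MAX_LEN = 20_000;

# clean_text is an illustrative helper, not a CGI.pm function.
# It returns the validated text, or nothing if the input is rejected.
sub clean_text {
    my ($raw) = @_;
    return unless defined $raw;
    return if length($raw) > $MAX_LEN;
    # Accept only printable ASCII plus tab, CR and LF (an assumed policy).
    # Returning the capture $1 is what untaints the data under -T.
    if ($raw =~ /\A([\x20-\x7E\t\r\n]*)\z/) {
        return $1;
    }
    return;
}
```

    In a CGI script this would typically wrap the value coming back from param('textinputs'), with the script started under -T so that unvalidated data cannot reach the shell or the filesystem.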

      Okay, so I was thinking: Why would I be able to send an image of several MB in size via forms, but I can't send this few KB of text? When I send images, I have to include enctype='multipart/form-data' in the form tag, so I thought maybe that has something to do with it. I popped that in my code, and it worked! Woohoo! So now the HTML form tag looks like:
      <form action='server-scripts/spellchecker.pl' method='post' enctype='multipart/form-data'>
      Hopefully this helps others in the future.

      Update: I've had a couple questions about the code I posted being what I'm actually using on my server. I want to assure everyone that I copy-pasted from emacs when creating this thread. The HTML page is called blank.html (it was just a test page), and the CGI script is called spellchecker.pl. The original spellchecker.pl had a ton of stuff in it, but I stripped it down to the bare minimum when I was debugging.
      Your output appears to have a variety of markup (<ul> and <li>, for example) whose source is not apparent.
      CGI.pm's Dump method puts that in. But where is Foo which is a field in the html form but not in the output?
        "where is Foo which is a field in the html form but not in the output?"

        Exactly! Without the enctype, all other data was being lost!
Re: Perl truncating HTML form input
by Joost (Canon) on Apr 10, 2008 at 14:48 UTC
      1. That is really odd that it worked for you without the enctype. Now I'm confused again. I'm using Firefox 2.0.0.13 on Mac 10.4.11.
      2. The code that I used in this thread was a copy-paste from emacs, so I'm sure it's using POST.

      Unrelated to Joost's reply, the Chatterbox said someone was wondering what makes this form multipart. I was wondering that myself and found this:
      The content type "application/x-www-form-urlencoded" is inefficient for sending large quantities of binary data or text containing non-ASCII characters
      here. I'll be honest -- I'm not sure if my text falls into this category, but it does fit the "large quantities" part (relatively speaking).
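      To get a feel for what "inefficient" means in that quote, the sketch below percent-encodes a string roughly the way a browser does for application/x-www-form-urlencoded and compares sizes. It is a simplified approximation (form_urlencode is a made-up helper, the input is assumed to be bytes, and real browsers differ slightly in which punctuation they leave alone), so treat the numbers as indicative only:

```perl
#!/usr/local/bin/perl
use strict;
use warnings;

# form_urlencode is an illustrative helper, not a CGI.pm function.
# Letters, digits and - _ . * pass through, a space becomes "+",
# and every other byte becomes %XX (three bytes instead of one).
sub form_urlencode {
    my ($bytes) = @_;
    $bytes =~ s/([^A-Za-z0-9\-_.* ])/sprintf('%%%02X', ord $1)/ge;
    $bytes =~ tr/ /+/;
    return $bytes;
}

# Compare raw and encoded sizes for a piece of the test text.
my $sample  = "it will be enough not to stray a hair's breadth from the truth";
my $encoded = form_urlencode($sample);
printf "%d bytes raw, %d bytes encoded\n", length($sample), length($encoded);
```

      Plain English prose is mostly letters and spaces, so it grows very little when encoded this way, while binary data can grow to as much as three times its size. The quoted passage describes that overhead; it does not describe a size limit.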