http://qs1969.pair.com?node_id=266586

eweaverp has asked for the wisdom of the Perl Monks concerning the following question:

Hello monks...

Does anybody know if CGI textarea boxes do newlines msdos style or unix style? I think msdos because chomp() seems to be giving inappropriate behavior. How do I remove the extra blank characters? Even after chomping my code still prints out newlines on the web...

Thanks
~evan

Replies are listed 'Best First'.
Re: Textarea boxes in CGI
by dws (Chancellor) on Jun 17, 2003 at 18:58 UTC
    Does anybody know if CGI textarea boxes do newlines msdos style or unix style?

    Browsers tend to send both a carriage-return and a linefeed. I always do

    $text =~ s/\r//g;
    on whatever comes back from a text area. (Caveat: This might not be the right thing to do if you're on a Mac.)
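
    As a rough sketch of why chomp() appears to misbehave here: by default it removes only a trailing "\n", so with CRLF input a "\r" is left at the end of the string and the interior line endings are untouched. The value of $text below is a made-up stand-in for what param() might return, not anything taken from the original post.

```perl
use strict;
use warnings;

# Hypothetical value, standing in for what param() might return
# when a browser submits two lines with CRLF endings.
my $text = "first line\r\nsecond line\r\n";

my $chomped = $text;
chomp $chomped;    # removes only the final "\n" (the default $/)

# The trailing "\r" and the embedded "\r\n" are still there.
my $still_has_cr = ($chomped =~ /\r/) ? 1 : 0;

(my $stripped = $text) =~ s/\r//g;    # drop every carriage return
chomp $stripped;

printf "after chomp:    %s\n", $still_has_cr ? 'CR still present' : 'clean';
printf "after s/\\r//g:  %s\n", ($stripped =~ /\r/) ? 'CR still present' : 'clean';
```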

    Another common problem is to emit

    <textarea>
    stuff
    </textarea>
    instead of
    <textarea>stuff</textarea>
    The former adds newline on your behalf. The latter preserves the original text (except for newline conversion).
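
    A minimal sketch of emitting the tag with nothing between it and the value. Both escape_html and textarea_html are hypothetical helpers written for this example (not part of CGI.pm), and the escaping step is my own assumption about what you would want when echoing user input back.

```perl
use strict;
use warnings;

# Hypothetical helper: escape the characters that matter inside a textarea.
sub escape_html {
    my ($s) = @_;
    $s =~ s/&/&amp;/g;
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    $s =~ s/"/&quot;/g;
    return $s;
}

# Hypothetical helper: no whitespace between the tags and the value,
# so the browser sends back exactly what was put in.
sub textarea_html {
    my ($name, $value) = @_;
    return sprintf '<textarea name="%s">%s</textarea>',
        escape_html($name), escape_html($value);
}

print textarea_html('textbox', "BLAH\nBLAH <b>"), "\n";
```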

Re: Textarea boxes in CGI
by cfreak (Chaplain) on Jun 17, 2003 at 19:00 UTC

    The newline character is determined by the sending system. On Win/DOS that's \r\n, on *nix it's \n, and on Mac <= OS 9 I believe it's \r, but don't quote me on that (OS X uses \n like all Unixes).

    As for your problem: the textarea is going to be returned as a single string that contains newlines so chomp() would only get rid of the last one anyway. If you want the string as an array this should work:

    my $textarea = $q->param('textarea');
    $textarea =~ s/\r//g;    # get rid of any \r characters
    my @array = split(/\n/, $textarea);

    Hope that helps

    Lobster Aliens Are attacking the world!
      I hate to burst your bubble, but Phoenix (Mozilla Firebird :) under Linux returns \r\n as well :)

      cLive ;-)

Re: Textarea boxes in CGI
by sgifford (Prior) on Jun 17, 2003 at 19:24 UTC
    The three line-endings I've encountered are \n from UNIX, \r\n from DOS, and \r from some Macs. I usually preprocess multiline strings from forms with:
    s/\r\n/\n/g; s/\r/\n/g;
    This puts everything in UNIX-style line endings.

    You could also do this in your split:

    split(/\r\n|\r|\n/,$s)
    ; this is probably a little faster, and saves you a chomp.

      so a general input filter that leaves us with unix line endings for all three would be

      s/\r\n?/\n/g;
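
      To check that filter against all three styles, here is a small sketch. The sub name to_unix_newlines and the sample strings are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the one-line filter above.
sub to_unix_newlines {
    my ($s) = @_;
    $s =~ s/\r\n?/\n/g;    # "\r\n" or a lone "\r" becomes "\n"
    return $s;
}

# Made-up samples of the same two lines in each line-ending style.
my %samples = (
    unix => "a\nb\n",
    dos  => "a\r\nb\r\n",
    mac  => "a\rb\r",
);

for my $style (sort keys %samples) {
    my $ok = to_unix_newlines($samples{$style}) eq "a\nb\n";
    printf "%-4s -> %s\n", $style, $ok ? 'normalized' : 'unexpected';
}
```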

      cheers,

      J

      Thanks everyone!... I ended up using the 2nd suggestion above. Although I was slightly overwhelmed by the number of varying responses :). Keep in mind that I am primarily a python person... but I understand TIMTOWTDI (in perl).

      ~evan
Re: Textarea boxes in CGI
by crouchingpenguin (Priest) on Jun 17, 2003 at 19:29 UTC

    My memory is a little foggy (it's almost quitting time for today)... but I think the textarea's wrap attribute can have an effect on this as well. The textarea wrap attribute can be set to HARD, VIRTUAL, OFF, or SOFT. The sticky issue is that these attributes are browser dependent.

    The white-space CSS property is similar to the textarea's wrap attribute. Possible Values are: normal, pre, and nowrap

    A Google search shows this, this, and this. These describe wrap, but may not be worth much else.

    Update: Right, cLive. I bolded the part of my post that mentioned that. =]


    cp
    ----
    "Never be afraid to try something new. Remember, amateurs built the ark. Professionals built the Titanic."
      It's worth noting that wrap is not a valid HTML/XHTML attribute for a textarea tag, but a browser extension that (I think) was first brought in by Netscape, and then adopted (with slightly different attribute values) in IE.

      Use with caution!

      cLive ;-)

Re: Textarea boxes in CGI
by arthas (Hermit) on Jun 17, 2003 at 19:02 UTC

    I'm unsure, but it might depend on the browser and on the platform where it runs. Anyhow, you can just do:

    $textareacontent =~ s/\r//g;

    This should only leave you with newlines, which is always a good idea I think (if the server is Unix-style ;)).

    Michele.

Re: Textarea boxes in CGI
by Cody Pendant (Prior) on Jun 18, 2003 at 05:10 UTC
    I don't know if there's a point to summarising all the answers so far but in answer to the question:
    Does anybody know if CGI textarea boxes do newlines msdos style or unix style?
    The answer obviously is:
    1. "No. Because it varies. A lot".
    2. It's possible, due to the WRAP attribute, for there to be linebreaks sent by the form which the user never intended
    3. \r?\n seems to be a good regex.
    So if you're in charge of the form's HTML you can ignore the second point, but you should be very careful of the first, because you don't know what browsers are going to do.
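
    One caveat on point 3, as a quick sketch: \r?\n covers \n and \r\n, but it does not match a bare \r, which is the third style mentioned elsewhere in this thread. The input string below is made up for the comparison.

```perl
use strict;
use warnings;

# Hypothetical input using lone carriage returns as line endings.
my $cr_only = "one\rtwo\rthree";

my @narrow = split /\r?\n/,      $cr_only;    # never matches a bare "\r"
my @broad  = split /\r\n|\r|\n/, $cr_only;    # matches all three styles

printf "\\r?\\n       gives %d field(s)\n", scalar @narrow;
printf "\\r\\n|\\r|\\n gives %d field(s)\n", scalar @broad;
```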
    --
    “Every bit of code is either naturally related to the problem at hand, or else it's an accidental side effect of the fact that you happened to solve the problem using a digital computer.”
    M-J D
      Turn textarea wrapping off by adding 'wrap=off' to your textarea definition.
Re: Textarea boxes in CGI
by cLive ;-) (Prior) on Jun 17, 2003 at 18:49 UTC
    Sorry, I'm not very psychic. Perhaps you could post the stripped down sample code that's causing you problems and then we can try and help?

    cLive ;-)

      use CGI qw(:standard);    # :standard exports param()
      my(@v) = split(/\n/, param('textbox'));
      chomp(@v);
      print @v;
      results in an HTML file with, for instance, if textbox contained:
      BLAH
      BLAH
      BLAH
      
      the same thing, whereas I want it to be:
      BLAHBLAHBLAH
      
        Here are a couple of succinct methods:
        my $ta = param('textbox');
        $ta =~ s/\r?\n//g;
        print $ta;
        # or if you really want an array:
        my @v = split /\r?\n/, param('textbox');
        I'm not 100% sure whether *all* browsers will send the \r, hence the ? in the regex.

        .02

        cLive ;-)

        Given the advice of the others (you have a pesky \r that's causing the 'problem'), how about splitting on zero or one carriage returns, followed by one newline?
        use Data::Dumper;
        my @v = split(/\r?\n/, param('textbox'));
        print Dumper \@v;
        Note the use of Data::Dumper. Don't leave home without it. ;)
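
        A small sketch of how Data::Dumper can make a stray \r visible: setting $Data::Dumper::Useqq makes it print control characters as escapes instead of emitting them raw. The $raw string is a made-up stand-in for param('textbox'), and the split on /\n/ alone is deliberate so that the leftover \r shows up in the dump.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Useqq = 1;    # show control characters as escapes such as \r

# Hypothetical value standing in for param('textbox').
my $raw = "BLAH\r\nBLAH\r\nBLAH";

my @v = split /\n/, $raw;    # splitting on "\n" alone leaves "\r" behind
my $dump = Dumper(\@v);
print $dump;
```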

        UPDATE: Got \r and \n backwards (again!) and cLive ;-) beat me to the punch; I should have checked back before I posted. Anyhoo ... all of this should have you covered. If you care to read more about what causes this confusion, then check out A Little History on 0D0A and its replies.

        jeffa

        L-LL-L--L-LL-L--L-LL-L--
        -R--R-RR-R--R-RR-R--R-RR
        B--B--B--B--B--B--B--B--
        H---H---H---H---H---H---
        (the triplet paradiddle with high-hat)