Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello,

I am trying to perform substitutions on the text that has been submitted from an HTML form using the TEXTAREA element.

In my HTML I have a TEXTAREA element named info and I try to do the following in the called Perl script:

    ...
    my $info = $query->param('info');
    $info =~ s/\n/<p>/g;
    my $sql = sprintf "insert into test values(%s)", $dbh->quote($info);
    $dbh->do($sql) || die;
    ...

However, when I submit

    Hello,

    This is a test.

    Bye Bye.

to the HTML form, I see:

    test=> select * from test;
     stuff
    -----------------------------------------------
     <p>Bye Bye.a test.
    (1 row)

in the database.

Why?

Replies are listed 'Best First'.
Re: Replacing new line characters from HTML TEXTAREA submissions
by dvergin (Monsignor) on Mar 14, 2001 at 15:30 UTC
    Are you perhaps on a Windows machine? If so, you have dealt with the \n in each end-of-line \r\n pair, but you have left the \r behind, so the lines are overlaying each other. I submit that what you actually have as the result of your SELECT query is the following. (Ignore the fact that I am showing this on separate lines; you have gotten rid of the \n's, but it helps to show how things line up.)

        Hello,\r
        <p>\r
        <p>This is a test.\r
        <p>\r
        <p>Bye Bye.

    At the end of each line, the \r just moves the cursor back to the beginning of the line, and the next line is written right over the top of the previous one. Smash each of those lines over the top of the previous line and by the time you get to the end, you have:

        <p>Bye Bye.a test.

    Once you see that, you are probably clear that your regex should be:

        $info =~ s/\r\n/<p>/g;

    Hope that helps. (Actually, you probably didn't need anything past the heads-up in the first line or two of this post.)

    And if you are not on Windows, then I don't have a clue! ;-) MSgremlins, perhaps...
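
    To see the leftover \r characters without relying on how a terminal draws them, here is a minimal sketch. The sample text is an assumption (the poster's input with \r\n line ends), and the \r characters are made visible before printing.

    ```perl
    use strict;
    use warnings;

    # Assumed input: the poster's text with \r\n at the end of each line.
    my $submitted = "Hello,\r\n\r\nThis is a test.\r\n\r\nBye Bye.";

    # Original substitution: replaces only the \n and leaves every \r in place.
    (my $broken = $submitted) =~ s/\n/<p>/g;

    # Corrected substitution: replaces the whole \r\n pair.
    (my $fixed = $submitted) =~ s/\r\n/<p>/g;

    # Show each carriage return as a literal \r instead of letting the terminal act on it.
    (my $visible = $broken) =~ s/\r/\\r/g;

    print "broken: $visible\n";   # broken: Hello,\r<p>\r<p>This is a test.\r<p>\r<p>Bye Bye.
    print "fixed:  $fixed\n";     # fixed:  Hello,<p><p>This is a test.<p><p>Bye Bye.
    ```

    Printing $broken directly to a terminal would reproduce the overwritten line from the question.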

      Thank you!

      I changed my code to read

      $info =~ s/(\r?\n)+/<p>/g;

      and now it works (and catches blank lines too).

      --
      The Original Posting Monk
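
      A quick way to check that pattern against both kinds of line endings. This is only a sketch: to_paragraphs is a hypothetical helper name, and the sample strings are made up.

      ```perl
      use strict;
      use warnings;

      # Hypothetical helper wrapping the substitution from the post above.
      sub to_paragraphs {
          my ($text) = @_;
          # One or more line ends (\r\n or bare \n) in a row become a single <p>.
          $text =~ s/(\r?\n)+/<p>/g;
          return $text;
      }

      my $crlf = "Hello,\r\n\r\nThis is a test.\r\n\r\nBye Bye.";
      my $lf   = "Hello,\n\nThis is a test.\n\nBye Bye.";

      print to_paragraphs($crlf), "\n";   # Hello,<p>This is a test.<p>Bye Bye.
      print to_paragraphs($lf),   "\n";   # Hello,<p>This is a test.<p>Bye Bye.
      ```

      Because of the + quantifier, a blank line between paragraphs yields one <p> rather than two.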

Re: Replacing new line characters from HTML TEXTAREA submissions
by jlawrenc (Scribe) on Mar 14, 2001 at 21:12 UTC
    Sometimes I'll use the regular expression:

        $info =~ s/[\n\r]{1,2}/<p>/gms;

    or

        $info =~ s/[\n\r]+/<p>/gms;

    Maybe it is a bit too greedy for some people, but for me, ack, one or more newlines and/or carriage returns is enough for a P tag.

    I just noticed that your regular expression was just /g; keep in mind it WILL stop matching on the first line. You gotta let 'er match the entire string.
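
    The two character-class patterns above do not always give the same result. A small sketch comparing them, assuming made-up sample text with one blank line between paragraphs; bounded and greedy are hypothetical helper names, and the /m and /s flags are left off because the patterns contain no ^, $ or . for them to affect.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helpers, one per pattern.
    sub bounded { my ($t) = @_; $t =~ s/[\n\r]{1,2}/<p>/g; return $t }
    sub greedy  { my ($t) = @_; $t =~ s/[\n\r]+/<p>/g;     return $t }

    my $crlf = "one\r\n\r\ntwo";   # blank line between paragraphs, \r\n line ends
    my $lf   = "one\n\ntwo";       # same text, \n line ends

    print bounded($crlf), "\n";   # one<p><p>two  (four characters, matched two at a time)
    print greedy($crlf),  "\n";   # one<p>two
    print bounded($lf),   "\n";   # one<p>two
    print greedy($lf),    "\n";   # one<p>two
    ```

    So {1,2} produces a different number of <p> tags depending on the line endings, while + collapses any run into one.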

      Regarding:
              ...it WILL stop matching on the first line.
             You gotta let 'er match the entire string.

      Nope. (Did you test your assertion before posting?)

      Viz:

          my $s = "line1\r\nline2\r\nline3\r\n";
          print $s;
          $s =~ s/[\n\r]//g;
          print $s;

      Prints:

          line1
          line2
          line3
          line1line2line3

      It's good that you are alert to the pitfalls of matching in multi-line strings. (They can catch you off guard.) But your assertion perpetuates a common confusion that is only confusing if you let it be.

      Here's what perlman:perlre or a quick test script will show you:

      - /s simply allows . to match end-of-line chars anywhere in the string (if needed)
      - /m simply allows ^ and $ to match begin/end of lines in mid-string instead of only the begin/end of the entire string.
      - /g is happy to repeat searching over several lines
      - m// (simple search) will find something that occurs on a line after the first one
      - \n \r will both be found (if present) without resorting to /s or /m
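
      A quick test script along those lines might look like this. It is a sketch with a made-up three-line string, checking each point in turn.

      ```perl
      use strict;
      use warnings;

      my $s = "line1\nline2\nline3\n";

      # Without /s, . does not match a newline, so .+ stops at the end of line1.
      my ($no_s) = $s =~ /(.+)/;
      # With /s, . matches newlines too, so .+ runs to the end of the string.
      my ($with_s) = $s =~ /(.+)/s;

      # Without /m, ^ matches only at the very start of the string.
      my @no_m = $s =~ /^(\w+)/g;
      # With /m, ^ also matches just after each embedded newline.
      my @with_m = $s =~ /^(\w+)/mg;

      # A plain match finds text on a later line with no flags at all.
      my $found = $s =~ /line3/ ? "yes" : "no";

      # /g alone removes the newline from every line, not just the first.
      (my $joined = $s) =~ s/\n//g;

      print "no /s:   ", length($no_s), " chars\n";     # no /s:   5 chars
      print "with /s: ", length($with_s), " chars\n";   # with /s: 18 chars
      print "no /m:   @no_m\n";                         # no /m:   line1
      print "with /m: @with_m\n";                       # with /m: line1 line2 line3
      print "line3 found: $found\n";                    # line3 found: yes
      print "joined: $joined\n";                        # joined: line1line2line3
      ```
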
Re: Replacing new line characters from HTML TEXTAREA submissions
by dws (Chancellor) on Mar 15, 2001 at 00:14 UTC
    This may be complete Cargo Cult on my part, but it works on IIS and on Apache (FreeBSD and Linux), and it works for a variety of browsers on a variety of platforms. The idea is to first replace <cr><lf> with a platform newline sequence, then convert any lingering <cr>s. (I've not tried this out on a Mac-based web server.)
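        my $text = $query->param('textarea');   # copy the value; s/// needs a variable it can modify
        $text =~ s/\r\n/\n/g;                   # <cr><lf> -> newline
        $text =~ s/\r/\n/g;                     # any lingering <cr> -> newline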
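
    The same two-step idea as a standalone sketch that can be run without a web server. normalize_newlines is a hypothetical helper name (not part of CGI.pm), and the mixed-ending sample string is made up.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: works on a copy of the text and returns the result.
    sub normalize_newlines {
        my ($text) = @_;
        $text =~ s/\r\n/\n/g;   # step 1: <cr><lf> pairs become a single newline
        $text =~ s/\r/\n/g;     # step 2: any lingering bare <cr> becomes a newline
        return $text;
    }

    my $mixed = "dos\r\nmac\runix\nend";
    my $clean = normalize_newlines($mixed);

    # Show the newlines as literal \n so the result is easy to read on one line.
    (my $shown = $clean) =~ s/\n/\\n/g;
    print "$shown\n";   # dos\nmac\nunix\nend
    ```

    The order matters: running the bare <cr> step first would turn each <cr><lf> pair into two newlines.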