Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I'm sure this is a simple task, but I am a newbie & don't know how to accomplish this.

I have a form with an area for entering text. After saving the data entered from the text area I need to strip any carriage returns from the string, and have been doing so in the following manner:

$text_area =~ s/\n//g;

And then I'm saving the string to a file. When the form is submitted from a Windows machine & there are carriage returns they are passed as DOS carriage returns. After replacing the \n with nothing there are still ^M characters in the file when viewed in vi. What regex is needed to get rid of the remaining characters?

Re: DOS characters
by blakem (Monsignor) on Sep 05, 2001 at 23:52 UTC
    If $text_area =~ s/\n//g; is working for you now, other than the ^M issue, try:
    $text_area =~ s/[\r\n]+//g;
    Though, I would think you'd want to replace EOL chars with a space, otherwise the first word on one line will get glued onto the last word in the previous line. I'd suggest:
    $text_area =~ s/[\r\n]+/ /g;
    instead to avoid this situation.
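
    To make the difference between these substitutions concrete, here is a minimal sketch. The sample string is hypothetical and only stands in for whatever a Windows browser might submit; the variable names other than $text_area are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample of a textarea value submitted with CRLF line endings.
my $text_area = "first line\r\nsecond line\r\n\r\nthird line";

# Original attempt: only LF is removed, so every CR (shown as ^M in vi) survives.
(my $lf_only = $text_area) =~ s/\n//g;

# Delete both CR and LF: no ^M left, but neighbouring words are glued together.
(my $deleted = $text_area) =~ s/[\r\n]+//g;

# Replace each run of CR/LF characters with one space, keeping words apart.
(my $spaced = $text_area) =~ s/[\r\n]+/ /g;

# tr with an empty replacement list and no /d only counts matching characters.
my $cr_left = ($lf_only =~ tr/\r//);

print "CRs left after removing only LF: $cr_left\n";
print "deleted: $deleted\n";
print "spaced:  $spaced\n";
```

    Because the character class is followed by +, the blank line in the sample (two CRLF pairs in a row) collapses to a single space rather than two.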

    -Blake

      Transliteration would be more efficient than a regexp for this kind of operation:
      $text_area =~ tr/\r\n/ /s;
      This removes any sequence of carriage returns or newlines, replacing them with a single space.
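
      A small sketch of how that tr behaves, assuming a made-up input that mixes DOS and Unix line endings. $matched is a hypothetical name used only to show the return value.

```perl
use strict;
use warnings;

# Hypothetical input mixing DOS (CRLF) and Unix (LF) line endings.
my $text_area = "one\r\ntwo\n\nthree\r\n";

# Both \r and \n map to a space: the last replacement character is reused
# for the rest of the search list. The /s modifier squeezes each run of
# translated characters into a single space. tr returns the number of
# characters it matched.
my $matched = ($text_area =~ tr/\r\n/ /s);

print "matched $matched line-ending characters\n";
print "result: '$text_area'\n";
```

      Note that a line ending at the very end of the string also becomes a space, so the result may need a trailing-whitespace trim before it is saved.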
Re: DOS characters
by inverse (Acolyte) on Sep 05, 2001 at 23:58 UTC
    Look up the function chomp. It will do what you want.

      Look up the function chomp. It will do what you want.

      Actually it won't:

      $textarea = "Line1\015\012Line2\015\012Line3\015\012";
      chomp $textarea;                 # strips at most one trailing line ending
      print $textarea . '----';
      print "\n";
      $textarea =~ s/[\015\012]/ /g;   # every CR and every LF becomes a space
      print $textarea;

      It is better to target CR and LF specifically by their octal or hex escapes, hence \015\012 instead of \r\n, because what \r and \n stand for can vary between platforms.

      cheers

      tachyon

      s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

        I've always used /\cM\cJ/

        -Lee
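
        For comparison, a minimal sketch of the control-character form on an ASCII platform. The sample string and the names $same and $joined are hypothetical.

```perl
use strict;
use warnings;

# \cM is Control-M (CR, octal 015) and \cJ is Control-J (LF, octal 012).
my $same = "\cM\cJ" eq "\015\012";
print "control escapes match octal escapes: ", ($same ? "yes" : "no"), "\n";

# Hypothetical sample text with DOS line endings.
my $text_area = "alpha\015\012beta\015\012";

# Replace each CRLF pair with one space. A lone CR or a lone LF would not
# match this pattern, unlike the character-class form [\015\012].
(my $joined = $text_area) =~ s/\cM\cJ/ /g;
print "joined: '$joined'\n";
```

        The pair-only pattern is stricter than the character class, so it suits input known to be pure CRLF; mixed input is safer with the class form.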

        "To be civilized is to deny one's nature."
      I think he is trying to collapse multiple lines into a single one, in which case chomp will not work.
      #!/usr/bin/perl -wT
      use strict;
      my $text = "abc\ndef\r\nghi\njkl\n";
      print "1.)T='$text'\n";
      chomp($text);
      print "2.)T='$text'\n";
      $text =~ s/[\r\n]+/ /g;
      print "3.)T='$text'\n";

      -Blake