lev has asked for the wisdom of the Perl Monks concerning the following question:

Hello,
My Perl CGI program copies one-liners by number (using file I/O) from a list in a file in my HTML directory, which can be displayed as a web page. New one-line strings are added to the list, or old ones removed, dynamically with buttons on a separate web page. Each chosen one-liner needs to be modified with a regexp so that it can be passed to a JavaScript function for display in a text area on this web page. For example, quotes must be removed or re-escaped ($sqlQuery =~ s/\'/\\'/g;).

A character that shows up as "^M" at the end of some of the one-liners must also be removed. It appears only in the Unix (HTML) file, in blue, contrasting with the adjacent white text. I have no idea of its origin or meaning. Using the chop function (or $string =~ s/.$//g;) works where this "^M" tail is present, but it's indiscriminate, and $string =~ s/\\^M$//g; does nothing (with or without escaping "^"). What's the nature of this blue "^M" character and how do I select it out from the string?

My thanks for any answers.
  • Comment on A 'strange' character("^M") of contrasting color appearing unexpededly at the end of lines of a unix file. How can it be removed?

Replies are listed 'Best First'.
Re: A 'strange' character("^M") of contrasting color appearing unexpededly at the end of lines of a unix file. How can it be removed?
by jwkrahn (Abbot) on Dec 14, 2010 at 17:27 UTC

    "^M" is a carriage return character.    You can represent it in a regular expression as /\cM/ or /\x0D/ or /\015/ or /\r/.

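A minimal sketch to check that these four spellings name the same character, assuming an ASCII-based platform; the sample string is made up for illustration. It also hints at why matching a literal caret followed by "M" finds nothing: the editor's "^M" is display notation for one control character, not two characters in the file.

```perl
use strict;
use warnings;

# Hypothetical sample line with a Windows-style ending.
my $line = "SELECT 1\r\n";

# All four escapes denote character code 13 (0x0D) on ASCII-based platforms.
my @codes = map { ord($_) } ("\cM", "\x0D", "\015", "\r");
print "@codes\n";

# Remove a carriage return that sits directly before the end of the line.
(my $clean = $line) =~ s/\r$//;
print $clean eq "SELECT 1\n" ? "stripped\n" : "unchanged\n";
```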
Re: A 'strange' character("^M") of contrasting color appearing unexpededly at the end of lines of a unix file. How can it be removed?
by cdarke (Prior) on Dec 14, 2010 at 18:01 UTC
    Using the chop function

    When reading files from Windows on UNIX it is often useful to:
    local $/ = "\r\n"; chomp $string;
    (rather than chop)
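A short sketch of how that behaves, with made-up sample strings: chomp removes only what $/ currently holds, so a string that does not end in \r\n is left untouched, whereas chop would remove its last character regardless.

```perl
use strict;
use warnings;

# Hypothetical sample strings; only the first has a Windows-style ending.
my @strings = ("first line\r\n", "second line\n");

for my $string (@strings) {
    # Limit the changed record separator to this block.
    local $/ = "\r\n";
    # Removes a trailing "\r\n" if present; anything else is left alone.
    chomp $string;
}

print length($strings[0]), " ", length($strings[1]), "\n";
```

Note that the second string keeps its bare "\n", so a file with mixed line endings may need a substitution such as s/\r?\n\z// instead.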

      or even use the ":crlf" PerlIO layer to automatically translate \r\n <-> \n.

      On Windows, the layer is active by default, which is why the "^M problem" does not occur. But there's nothing that would keep you from using it on Unix, too, in case you need to handle Windows-style files.
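A sketch of the layer in use on a Unix system, assuming a throwaway file created with File::Temp purely for illustration (the file contents are made up):

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a temporary file with Windows-style line endings (illustrative data).
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
binmode $tmp_fh;                      # write the bytes exactly as given
print {$tmp_fh} "one\r\ntwo\r\n";
close $tmp_fh or die "Cannot close $tmp_name: $!";

# Reading through the :crlf layer turns each \r\n into \n.
open my $in, '<:crlf', $tmp_name or die "Cannot open $tmp_name: $!";
my @lines = <$in>;
close $in;

chomp @lines;                         # the default $/ ("\n") is enough now
print join('|', @lines), "\n";
```

With the layer in place, the rest of the program can treat the data as if it had Unix line endings.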

Re: A 'strange' character("^M") of contrasting color appearing unexpededly at the end of lines of a unix file. How can it be removed?
by planetscape (Chancellor) on Dec 14, 2010 at 20:54 UTC
Re: A 'strange' character("^M") of contrasting color appearing unexpededly at the end of lines of a unix file. How can it be removed?
by Anonyrnous Monk (Hermit) on Dec 14, 2010 at 17:32 UTC
Re: A 'strange' character("^M") of contrasting color appearing unexpededly at the end of lines of a unix file. How can it be removed?
by raybies (Chaplain) on Dec 14, 2010 at 18:30 UTC
    I use s/\r//gs;. \r\n is the default line terminator of DOS/Windows text. If you have a file from a Windows machine and you're on Linux, run "dos2unix" on it. --Ray
Re: A 'strange' character("^M") of contrasting color appearing unexpededly at the end of lines of a unix file. How can it be removed?
by TechFly (Scribe) on Dec 14, 2010 at 20:11 UTC

    I recently ran into this same issue. I wrote the following short script to fix it. I don't know if it will work with everything, but it has for everything I have given it. Just supply the info it asks for, and it will create the FIXED file for you.

    #!/usr/bin/perl -w
    use strict;
    use warnings;

    my $FILE;
    my $DST;
    my $filename;

    print("What file do you want to remove the trailing Carriage Returns from:");
    while($filename=<STDIN>){
        chomp($filename);
        last;
    }
    open (FILE, '<', $filename) or die "Failed to open file:$!";
    open (DST, '>', "FIXED".$filename) or die "Failed to open fix file:$!";
    while (<FILE>){
        if($_ =~ /\r$/){
            s/\r//;
            print DST $_;
        }
        else{
            print DST $_;
        }
    }
    print("The file $filename has been fixed. The new file name is FIXED".$filename.".\n");
    close FILE;

    I am sure there are improvements that could be made, but it was a nice study on input output and basic regex.

    I found that the ^M is indeed the same as a \r as is mentioned a couple of times above. This script just replaces it with nothing, essentially deleting it.

    Cheers.

      You may have an intention error in your while(<FILE>) loop. Right now, you're testing if the string ends with a \r and then you're removing the first occurrence of it. If your string is "Hello \rthere\r\n", your code would print "Hello there\r\n" to DST.

      If I understand your intention correctly, I would replace the while loop with the following:

      while (<FILE>){ s/\r$//; print DST $_; }

      If you want to get rid of all occurrences of \r in the line, you could just say:

      while (<FILE>){ s/\r//g; print DST $_; }
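To make the difference concrete, here is a small sketch that applies all three substitutions to the example string and prints the results with the control characters made visible (the variable names are only for illustration):

```perl
use strict;
use warnings;

my $line = "Hello \rthere\r\n";     # the example string discussed here

(my $first = $line) =~ s/\r//;      # removes only the first \r
(my $tail  = $line) =~ s/\r$//;     # removes only a \r right before the line end
(my $all   = $line) =~ s/\r//g;     # removes every \r

for my $result ($first, $tail, $all) {
    # Show \r and \n as visible escapes so the output is readable.
    (my $shown = $result) =~ s/\r/\\r/g;
    $shown =~ s/\n/\\n/g;
    print "$shown\n";
}
```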
      Heh, now that I look at it there, I notice that I neglected to close DST too. Oops.
Re: A 'strange' character("^M") of contrasting color appearing unexpededly at the end of lines of a unix file. How can it be removed?
by AR (Friar) on Dec 14, 2010 at 17:07 UTC
    The "strange character" is the line feed character. It's used in windows and macintosh line endings, but not in unix. It is \r in Perl. Disregard my entire post. Thanks.
      Warning to future readers: Sorry; ^M is 0x0D, the "carriage return"; a line feed is 0x0A (as correctly stated by jwkrahn, well before I posted this).

      AR's reference to a carriage return in "macintosh line endings" is at best out of date: classic Mac OS used a bare CR, but Apple adopted Unix-style LFs for Mac OS X and later.
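If a file might carry any of these conventions, one substitution can normalize them all. This is only a sketch with made-up sample text, not something proposed elsewhere in this thread:

```perl
use strict;
use warnings;

# Illustrative text mixing Windows (\r\n), classic Mac OS (\r), and Unix (\n) endings.
my $text = "dos\r\nmac\runix\n";

# \r\n? matches a CR optionally followed by an LF, so a CRLF pair is
# replaced as one unit rather than producing two newlines.
(my $normalized = $text) =~ s/\r\n?/\n/g;

my @lines = split /\n/, $normalized;
print scalar(@lines), " lines\n";
```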

      Apart from the name "line feed" which should have been "carriage return", you were pretty spot on.

      So don't be so hard on yourself.