The line endings of files generated by perl match the convention of the operating system that perl runs under. So if you run your perl script under Windows and then examine the output under Unix/Linux, you will see an extra ^M at the end of each line.

Try a simple test program:

    #! perl
    open OUTFILE, '>', 'test_out.txt' or die "Error writing to test_out.txt $!";
    print OUTFILE "Hello\n";
    print OUTFILE "world\n";
    close OUTFILE;

If you run the script above under Windows and then open its output under Unix, you should see those same ^M line endings.

To see what is going on, open the output in a hex editor. The file generated under Windows should look something like this:

00000000 48 65 6C 6C 6F 0D 0A 77 6F - 72 6C 64 0D 0A Hello..world..

The file generated under Unix will look like this:

00000000 48 65 6C 6C 6F 0A 77 6F - 72 6C 64 0A Hello.world.

Note that the file from Windows ends each line with two bytes, carriage return and line feed (0D 0A), whereas the Unix one has just a line feed (0A). This illustrates how the different platforms use different newline codes. If you open a file generated on one platform using a dumb editor on a different one, you will see artefacts from the difference in line endings. For example, if you open the Unix output file in an older version of Windows Notepad, you won't see a line break between Hello and world. (Smarter editors usually detect the difference in line endings and automatically do the right thing for you.)
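If you don't have a hex editor to hand, a few lines of perl will show the same bytes. This is only a minimal sketch: the helper name hex_bytes is made up for illustration, and it expects the file to inspect (for example test_out.txt from the test program) as its first argument.

```perl
use strict;
use warnings;

# Hypothetical helper: return the bytes of a file as space-separated hex pairs.
sub hex_bytes {
    my ($path) = @_;
    # ':raw' turns off line-ending translation, so we see exactly what is on disk.
    open my $fh, '<:raw', $path or die "Cannot read $path: $!";
    local $/;                       # slurp mode: read the whole file in one go
    my $data = <$fh>;
    close $fh;
    $data = '' unless defined $data;
    return join ' ', map { sprintf('%02X', ord($_)) } split //, $data;
}

# Print the hex dump of the file named on the command line, if any.
print hex_bytes($ARGV[0]), "\n" if @ARGV;
```

Opening with ':raw' matters here: without it, perl on Windows would quietly turn each 0D 0A pair back into a single newline as it reads, and the difference would be hidden.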

In other words, the ^M codes you are seeing have nothing to do with your program or how it loads its data; they come from the platform you run your program under and the platform that generated its input data.

If you are running your perl script under Windows and then seeing the ^M codes when you read its output under Unix, then, as viveksnv suggested, you should use dos2unix to convert the output file.
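If you control the script and would rather not convert the file afterwards, one alternative is to open the output with the :raw layer, which switches off the translation of "\n" to CR LF that perl applies by default on Windows. This was not suggested in the thread, so treat it as a sketch; the helper name write_unix_lines is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: write each line with a Unix (LF) ending on any platform.
sub write_unix_lines {
    my ($path, @lines) = @_;
    # '>:raw' disables the CRLF translation that perl uses by default on Windows.
    open my $out, '>:raw', $path or die "Error writing to $path: $!";
    print {$out} "$_\n" for @lines;
    close $out or die "Error closing $path: $!";
    return;
}

# Same output as the test program, but with LF endings even under Windows.
write_unix_lines('test_out.txt', 'Hello', 'world');
```

The trade-off is that the file will then look wrong in dumb Windows editors, so this only makes sense when the output is destined for Unix.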

If your script runs under Unix but is processing files generated under Windows, then you can either use dos2unix to pre-convert all the input files before processing, or you can use a regular expression such as $line =~ s/\s+$// to strip all trailing white space from the end of each input line before further processing. This is more powerful than chomp, which under Unix removes only the trailing line feed and leaves the carriage return behind; the regular expression removes both, along with any repeated newlines. Obviously you need to be careful with it if trailing white space on lines must be preserved.
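To make the difference concrete, here is a small sketch comparing chomp with the regular expression on a line that came from a Windows file. The helper names strip_trailing_ws and strip_eol are made up for illustration; strip_eol is a gentler variant, not part of the advice above, for when trailing spaces must survive.

```perl
use strict;
use warnings;

# Hypothetical helper: remove all trailing white space, including "\r" and "\n".
sub strip_trailing_ws {
    my ($line) = @_;
    $line =~ s/\s+$//;
    return $line;
}

# Hypothetical helper: remove only the line ending (LF or CR LF),
# keeping any other trailing white space.
sub strip_eol {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;
    return $line;
}

my $dos_line = "Hello  \r\n";   # a line as read on Unix from a Windows file

my $chomped = $dos_line;
chomp $chomped;                 # removes "\n" only; the "\r" is still there

printf "chomp:             %d chars\n", length $chomped;                      # 8
printf "strip_trailing_ws: %d chars\n", length strip_trailing_ws($dos_line);  # 5
printf "strip_eol:         %d chars\n", length strip_eol($dos_line);          # 7
```

The leftover "\r" after chomp is exactly the ^M you see in the editor.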


In reply to Re: ^M chars in output file by chrestomanci
in thread ^M chars in output file by locust
