Somewhere, Tom has a long write-up about this. But the basics are that DOS stores CR LF only for text files, and only when they are written to a physical device. As soon as you read a file in, the C library turns the physical line ending CR LF into the logical newline \n. And when you write it to a file, the reverse happens. That is, if you run the program under DOS.

If you take your DOS file to a Unix platform, only the LF gets mapped to the logical newline \n (which happens to be represented by an LF character as well). The preceding CR byte is considered by Unix to be just another byte. Also note that chop chops off the last character of a string. One character, nothing more. So, if you are on Unix, reading a line from either a Unix file or a DOS file, the last character will be LF, aka \x0A.
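To make that concrete, here is a minimal sketch. It assumes the two strings below stand in for lines as they would arrive on Unix from a DOS file and from a Unix file; the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Stand-ins for lines read on Unix (assumed sample data, not real files).
my $dos_line  = "hello\x0D\x0A";   # from a DOS file: CR LF survives as two bytes
my $unix_line = "hello\x0A";       # from a Unix file: just LF

# chop removes exactly one character, whatever that character is.
chop(my $dos_chopped  = $dos_line);
chop(my $unix_chopped = $unix_line);

printf "DOS line after chop:  %d chars, ends in CR: %s\n",
    length $dos_chopped,  ($dos_chopped  =~ /\x0D\z/ ? "yes" : "no");
printf "Unix line after chop: %d chars, ends in CR: %s\n",
    length $unix_chopped, ($unix_chopped =~ /\x0D\z/ ? "yes" : "no");
```

In both cases the character chop removed was the LF; the DOS line is simply left with its CR still attached.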

So, yes, the comparison should have been done with eq instead of ==, but that still doesn't make a difference: "\x0A" eq "\x0A".
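A small sketch of why eq is still the right operator even though the outcome is the same here. The variable $last is a made-up stand-in for the last character of a line; this is not the code from the original question.

```perl
use strict;
use warnings;

my $last = "\x0A";   # assumed: the last character of a line read on Unix

my ($num_lf, $num_cr);
{
    # == compares numerically; non-numeric strings all count as 0,
    # so the warning is silenced here only to keep the output clean.
    no warnings 'numeric';
    $num_lf = ($last == "\x0A") ? 1 : 0;
    $num_cr = ($last == "\x0D") ? 1 : 0;   # also true: 0 == 0
}

# eq compares as strings and does tell LF and CR apart.
my $str_lf = ($last eq "\x0A") ? 1 : 0;
my $str_cr = ($last eq "\x0D") ? 1 : 0;

print "==: LF=$num_lf CR=$num_cr\n";
print "eq: LF=$str_lf CR=$str_cr\n";
```

Both operators report a match against LF, which is why the choice made no difference to the result in this case; only eq rejects the CR.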

There is no flawless way to determine whether something is a "Unix line" or a "DOS line". "Unix line"s end with an LF character, and "DOS line"s with CR LF. However, there is nothing that forbids a "Unix line" from having a CR character just before the LF character.
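The best you can do is guess from the trailing bytes. The sketch below shows one way to do that; guess_line_ending and strip_line_ending are hypothetical helper names, and the guess is exactly that, for the reason given above.

```perl
use strict;
use warnings;

# Hypothetical helper: guess the convention from the bytes at the end of a line.
# A "Unix line" whose data happens to end in CR will be reported as CRLF.
sub guess_line_ending {
    my ($line) = @_;
    return 'CRLF' if $line =~ /\x0D\x0A\z/;
    return 'LF'   if $line =~ /\x0A\z/;
    return 'none';
}

# Hypothetical helper: remove a trailing LF together with an optional CR before it.
sub strip_line_ending {
    my ($line) = @_;
    $line =~ s/\x0D?\x0A\z//;
    return $line;
}

for my $line ("dos\x0D\x0A", "unix\x0A", "no ending") {
    printf "%-6s %s\n", guess_line_ending($line), strip_line_ending($line);
}
```

If you only care about getting the text without its line ending, stripping the optional CR along with the LF sidesteps the question, at the price of eating a CR that a "Unix line" might legitimately have ended with.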

-- Abigail


In reply to RE: Unix \n vs. DOS \n by Abigail
in thread Unix \n vs. DOS \n by greenhorn
