in reply to Unix \n vs. DOS \n

Somewhere, Tom has a long write-up about this. But the basics are that DOS stores CR LF only for text files, and only on the physical device. As soon as you read the file in, the C library turns the physical line ending CR LF into the logical newline \n. And when you write to a file, the reverse happens. That is, if you run the program under DOS.

If you take your DOS file to a Unix platform, only the LF gets mapped to the logical newline \n (which happens to be represented with an LF character as well). The preceding CR byte is considered by Unix to be just another byte. Also note that chop chops off the last character of a string. One character, nothing more. So, if you are on Unix, reading a line from either a Unix file or a DOS file, the last character will be LF, aka \x0A.

So, yes, the comparison should have been done with eq instead of ==, but that still doesn't make a difference: "\x0A" eq "\x0A".
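
To make the chop point concrete, here is a small sketch. The two strings are made up for illustration; they stand in for a line read on Unix from a DOS file and from a Unix file. In both cases chop hands back the LF, and the DOS line is simply left with a stray CR at the end.

```perl
use strict;
use warnings;

# Illustrative data: what a line looks like on Unix after being read
# from a DOS file (CR still present) and from a Unix file.
my $dos_line  = "hello\x0D\x0A";
my $unix_line = "hello\x0A";

# chop removes exactly one character and returns it.
my $dos_last  = chop(my $dos_rest  = $dos_line);
my $unix_last = chop(my $unix_rest = $unix_line);

# Both removed characters are LF; the DOS remainder keeps its CR.
print "same last character\n" if $dos_last eq $unix_last;
print "DOS remainder still ends in CR\n" if $dos_rest =~ /\x0D\z/;
```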

There is no flawless way to determine whether something is a "Unix line" or a "DOS line". "Unix lines" end with an LF character, and "DOS lines" with CR LF. However, there is nothing that forbids a "Unix line" from having a CR character just before the LF character.
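
If a guess is good enough, something along these lines might do. Both helper names are hypothetical, and this is only a heuristic: a Unix line whose data really ends in CR will be misidentified, for exactly the reason given above.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the line ends in CR LF, 0 otherwise.
# This is a guess at the line's origin, not proof of it.
sub looks_like_dos_line {
    my ($line) = @_;
    return $line =~ /\x0D\x0A\z/ ? 1 : 0;
}

# Hypothetical helper: strips a trailing LF and, if present, the CR
# before it. A CR that was real data on a Unix line gets stripped too.
sub strip_line_ending {
    my ($line) = @_;
    $line =~ s/\x0D?\x0A\z//;
    return $line;
}
```

With these, strip_line_ending gives the same result for "hello\x0D\x0A" and "hello\x0A", which is usually what you want when you don't care where the file came from.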

-- Abigail