I don't use "\n" because on some encodings this is not the real "\012".
Also, my regular expression solves some weird non-unix and non-mac files I've found, which have first the newline, then the carriage return.
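For illustration only, here is a minimal sketch of that kind of substitution. It is not the actual regular expression discussed in this thread, and the name `normalize_newlines` is made up for the example. It assumes the input is a string of single-byte characters and uses explicit octal escapes rather than "\n" and "\r":

```perl
use strict;
use warnings;

# Hypothetical helper: collapse CRLF, LFCR, a lone CR, or a lone LF
# into a single "\012". The two-character pairs are listed first so
# they are consumed as one line ending, not two.
sub normalize_newlines {
    my ($text) = @_;
    $text =~ s/\015\012|\012\015|\015|\012/\012/g;
    return $text;
}

my $sample = "dos\015\012odd\012\015mac\015unix\012";
my $clean  = normalize_newlines($sample);

# Show the result with the line feeds made visible.
(my $visible = $clean) =~ s/\012/<LF>/g;
print "$visible\n";
```

One caveat with this approach: a run such as LF CR LF is ambiguous (LFCR followed by LF, or LF followed by CRLF), although either reading yields two line endings.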
| [reply] |
I don't use "\n" because on some encodings this is not the real "\012".
What encodings? The Unicode code point for \n is 0x000a [0], according to Unicode Standard Annex #13: Unicode Newline Guidelines. Sure, there is EBCDIC, but translating around the \r doesn't help fix newlines on EBCDIC.
Also, my regular expression solves some weird non-unix and non-mac files I've found, which have first the newline, then the carriage return.
I've never heard of a system that used \n\r. Do you know what generates those files?
[0]: 0x0a is the same as \n in standard Unix land; the 16-bit Unicode equivalent just has a null byte prepended.
| [reply] |
Re the ordering of carriage return and line feed ("...some weird non-unix and non-mac files ... have first the newline, then the carriage return"), IIRC:
'doze: 0x0d, 0x0a
pre *n*x mac: 0x0a, 0x0d
*n*x: 0x0a
\n, in a Perlish sense, is NOT relevant; perl is compiled to use system defaults, so "\n" may be any of the above (and possibly some not mentioned), depending on where it's running.
| [reply] |
If your default encoding is UTF-16, \n will be two bytes. If the file is from DOS, it will never match.
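A small sketch of one way to read that point, using the core Encode module. The variable names are only for the example, and it assumes the data is UTF-16LE without a byte order mark:

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# A line feed is two bytes in UTF-16, in either byte order.
my $le = encode('UTF-16LE', "\x{0A}");   # bytes 0A 00
my $be = encode('UTF-16BE', "\x{0A}");   # bytes 00 0A
printf "LE: %s  BE: %s\n", unpack('H*', $le), unpack('H*', $be);

# A DOS line ending in UTF-16LE is four bytes: 0D 00 0A 00.
my $raw = encode('UTF-16LE', "one\x{0D}\x{0A}two");

# On the raw bytes, 0D is never directly followed by 0A.
my $raw_has_crlf = ($raw =~ /\x0D\x0A/) ? 1 : 0;

# After decoding to characters, the same pattern does match.
my $text = decode('UTF-16LE', $raw);
my $text_has_crlf = ($text =~ /\x0D\x0A/) ? 1 : 0;

print "raw: $raw_has_crlf, decoded: $text_has_crlf\n";
```

So a byte-oriented pattern only lines up with the data once it has been decoded, or once the pattern itself accounts for the interleaved null bytes.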
| [reply] |