Don't let the confusion of perlport and old Macs addle your brain too much. "A" isn't an uppercase A but is merely a logical uppercase A that stands in for the actual bit pattern that applies on that particular platform. When you talk to an SMTP server it doesn't want to hear "HELO" but to hear "HELO" in ASCII so, by the logic of perlport, you should never send "HELO" but instead should hard-code the ASCII bit patterns for that. Sounds ridiculous, doesn't it? There are several reasons why people don't see it as ridiculous when applied to "\n".
"\n" is quite simply the newline character on the current platform.
The fact that old Macs claimed to be ASCII systems and so don't provide any translation layers to real ASCII and yet defined "\n" as something other than ASCII newline (but "\n" is still the newline character for old Macs), has caused a lot of broken thinking in the Perl world. I have a few nodes where I go into this in more detail.
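To see what "\n" actually is on whatever system you have handy, a minimal sketch like the following is enough (the variable names are just for illustration). It compares the local "\n" against the code that ASCII assigns to line feed and makes no assumption about which answer you will get:

```perl
use strict;
use warnings;

# "\n" is a logical character: its code is whatever this Perl build uses locally.
my $local_newline = ord("\n");

# ASCII assigns line feed the code 10 (octal 012).
my $ascii_line_feed = 10;

my $verdict = $local_newline == $ascii_line_feed
    ? "so here \"\\n\" happens to be ASCII line feed"
    : "so here \"\\n\" is NOT ASCII line feed";

printf "local \"\\n\" has code %d, ASCII line feed has code %d, %s\n",
    $local_newline, $ascii_line_feed, $verdict;
```

On the usual ASCII-based Unix and Windows builds the two codes agree; the point of the check is that the script asks the platform rather than assuming.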
Following the advice in perlport means that you write code that isn't portable unless you are on an ASCII system, while just writing "\n" means that your code is portable to every system in the world except for old Macs. The popularity of old Macs vs. the popularity of Perl scripts on non-ASCII systems tips the balance toward favoring portability to one single (somewhat broken) system over just coding portably. But the mental hoops that people have jumped through to try to rationalize the practice of making things that work on old Macs (while ignoring the existence of non-ASCII systems) are impressive and have caused tons of confusion.
Sorry, I don't have time to go into this further right now. I'll try to throw in links to my prior discussions of this as I find time.
The fact that old Macs claimed to be ASCII systems and so don't provide any translation layers to real ASCII and yet defined "\n" as something other than ASCII newline (but "\n" is still the newline character for old Macs), has caused a lot of broken thinking in the Perl world.
I don't find the thinking to which you are referring to be "broken." The broken thinking, if there is any, is in conceptualizing the escape "\n" as popularized by C to be an ASCII character. In the above, even you called it an "ASCII newline". But there is no such thing. The C standard is very clear about "\n" being implementation dependent.
All that said, it would certainly be easier on everyone if LF were universally accepted as the newline character and we could finally put an end to the suffering that continues to be inflicted upon us by long dead hardware issues and designed incompatibilities.
-sauoq
"My two cents aren't worth a dime.";
But there is no such thing [as] "ASCII newline"
s/newline/line feed/, if that makes you feel better. I don't make a distinction between "newline" and "linefeed", since their usage is very often mixed and there is little in those names to make the distinction clear, but I may try to honor that distinction in future (since I see this distinction being made in many references I check). I make a distinction between "ASCII (newline|linefeed)" vs. "local (newline|linefeed)" vs. "filesystem (newline|linefeed)". I certainly find plenty of evidence that there is an "ASCII line feed", which is what I meant. Note that http://foldoc.org/?query=+newline even says "Though the term 'newline' appears in ASCII standards, it never caught on in the general computing world before Unix", so I'm not sure I believe your assertion (it is too bad that the ASCII standard is likely still not free to download).
The broken thinking, if there is any, is in conceptualizing the escape "\n" as popularized by C to be an ASCII character
No, "\n" isn't an ASCII character (and, to be clear, I never claimed that it was). It is a local character. On all ASCII systems it is ASCII line feed, except for old Macs where they chose to be "lazy" and try to avoid binmode by defining "\n" in C incorrectly. Though if you find me an ASCII or non-ASCII system besides old Macs where "\n" isn't the "line feed" character, then I'll have to adjust my thinking. I don't think you will, however.
All that said, it would certainly be easier on everyone if LF were universally accepted as the newline character
"newline character"?? There is no such thing. (: The newline sequence in file systems is widely varied and often isn't a single character. Even Unix knows that it has to translate "\n" to "\r\n" on output (it just doesn't translate when doing output to a file, waiting to do it only when doing output to a device; and translate "\r" to "\n" on input similarly). Many systems encode newlines outside of the data bytes of the file (in meta data). So "newline" can be one character, more than one character, or not a character at all. :)
I don't find the thinking to which you are referring to be "broken."
Then find a non-ASCII system with Perl on it and open a socket from it to some SMTP server on the internet and send "HELO\015\012" to it and tell me if it works. There are two possibilities: either "HELO" will show up in non-ASCII (and your system is broken), or "\015\012" will get translated from the local character set to the character set that is expected over internet TCP/IP connections and likely won't be "\015\012" any longer.
The thinking is broken in thinking that you should hard-code the ASCII bit patterns for some characters but not for all. It happens to work on ASCII systems and on old Macs. And those are the only systems that they seem to know or care about or understand.
You should never hard-code character bit patterns unless 1) you've first checked that you are on a system where such will work, or 2) you are hard-coding all of the ASCII bit patterns in order to write an ASCII stream no matter whether your system is ASCII or not. And perlport is "broken" for thinking otherwise.
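Those two options can be sketched roughly as follows. This is only an illustration under stated assumptions: %ASCII_CODE and ascii_bytes() are hypothetical names made up for the example, the table lists only the characters this one command needs, and the sketch assumes the result is later written to a handle in binmode so that no I/O layer translates it again.

```perl
use strict;
use warnings;

# Hypothetical table: logical character name => ASCII code.
# Only the entries needed for this example are listed.
my %ASCII_CODE = (
    H  => 0x48, E  => 0x45, L  => 0x4C, O  => 0x4F,
    CR => 0x0D, LF => 0x0A,
);

# Option 2: build the whole message from ASCII codes, never from local literals.
sub ascii_bytes {
    my @names = @_;
    my @codes;
    for my $name (@names) {
        die "no ASCII code listed for '$name'\n" unless exists $ASCII_CODE{$name};
        push @codes, $ASCII_CODE{$name};
    }
    return pack 'C*', @codes;
}

# Option 1: check first that the local literal really has the ASCII codes.
my $literal_is_safe = "HELO" eq pack('C*', 0x48, 0x45, 0x4C, 0x4F);

my $wire = $literal_is_safe
    ? "HELO\015\012"                    # verified: local "HELO" is ASCII "HELO"
    : ascii_bytes(qw(H E L O CR LF));   # otherwise hard-code everything

# Show the codes that would go out on the wire.
print join(' ', map { sprintf '%02X', ord } split //, $wire), "\n";
# prints "48 45 4C 4F 0D 0A"
```

Either branch ends up with the same six codes; the difference is that neither one hard-codes the line ending while trusting the letters to take care of themselves.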