Re^7: best sort

Which human languages are you fluent in, and which ones are you merely competent in?

Also, how much higher mathematics have you done?

That’s not a flip question, nor a prying one. It is directly relevant to the discussion at hand, because its answer makes a difference.

You see, I’m trying to understand your biases. I know what it looks like, but I’m hoping I’m wrong about it. That’s of course why I’ve asked.

If you never have cause to use anything that God didn’t put into ASCII because you’re an English monoglot who refuses to spell imported words or even people’s names the way they wish them spelled...
if you plan to stay in your humble hamlet the rest of your life...
if you have no cause for such specialist characters as dashes, curly quotes, degree symbols, or the extraordinarily rich set of symbols needed by scientists and mathematicians...
and if you don’t care to interact with people who do...

...then it makes perfect sense that you will have a very different set of biases compared with people who actually do any of those things.

All I keep hearing is the same old grumpy-grampa story about walking to school in the rain uphill every day both ways; that is, that ASCII was good enough for you when you were a tad, so by gawd awlmighty it should be good enough for these self-deluding young whippersnappers.

You are also brazen in your profoundly disturbing advocacy of the offensive position that everybody else in the world should learn your bloody language instead of ever once admitting that just maybe you ought to learn theirs.

As for Unicode, just because you don’t understand it or don’t like it doesn’t mean there is something “wrong” with it. Most of your statements about it are either flat-out wrong or so misleading and misrepresentative as to make any rational person wonder why you would be intentionally deceptive.

Unicode is not going away. You’ll be dead long before it is, something pretty much guaranteed by your “over my dead body” attitude about condescending to learning anything new. Unicode is here, and it’s here to stay, and no amount of old FUDdy bellyaching from you is going to change that. That means you jolly well ought to get used to it. Either that, or retire and crawl back into your tiny little hole and die. Your choice. Some of us prefer to engage the world, not fight against it. And that’s our choice.

I haven’t seen you doing anything to try to make Perl better, whether in its Unicode handling, its text processing, or indeed anything else. I haven’t seen any bug reports, patches, or even questions. And I certainly haven’t seen any feedback from you during the review period for the various public issues that come up. That gives the appearance that all you want to do is complain, and that makes you part of the problem set, not the solution set.

If you won’t work with the rest of us to make the world a better place, then at least have the human decency to stop trying to make it a worse one; just let us go about our own business unhindered and unharangued.

Perhaps I’m wrong. If so, it’s perfectly easy for you to show me that in a way that is publicly credible. Merely publish your full legal name like the rest of us responsible internet citizens do, and I’ll search the relevant discussion archives for your constructive participation in these matters. As soon as I find it, I’ll gladly reconsider my position. Frankly, I’m looking forward to it, because the alternative is pretty sickening.

Otherwise you’re just another greyheaded internet loudmouth who sickly enjoys bitching and attacking just to wind people up and waste their time, and who can’t be bothered to do one damn thing toward bettering the situation he won’t stop ranting about.

In other words, put up or shut up. One or another of those two little spirits who sits upon our shoulders feeding us counsel is whispering that the smart money says you’ll do neither, but we shall see what we shall see, shan’t we now? My cards are on the table for everyone to look at; time for you to show yours.

Re^8: best sort by BrowserUk (Patriarch) on Aug 16, 2011 at 20:20 UTC
BIG, BOLD, FLIP, AND ENTIRELY IRRELEVANT QUESTIONS HERE!

"... you plan to stay in your humble hamlet the rest of your life..."

I worked all over Europe and Scandinavia, including over 4 years in one European country with a multi-national programming team. That's one of the reasons that I advocate minimal commenting. If you need to translate comments, with all their typically informal language usages, in order to understand the code you are working on, it is a nonsense that slows work to a crawl.

That's one of several reasons why a lingua-franca is even more important when working across national borders than it is when working isolated within them. What language is the lingua-franca is irrelevant; it just happens, by virtue of history, to be English.

I was jointly responsible for adding bi-directional language support -- which meant working with Hebrew, Arabic and Farsi amongst others -- to OS/2 back in the day. I also took the lead in delivering magazine front install CDs for OS/2 Warp in 13 different languages.

In both cases, the native language text was treated internally as opaque binary indexed by language and message number. It would have been impossible to get sufficiently versed in all the required languages for either project, so pragmatism reigned as it should. Translations to target languages were performed by native-language/English bilinguals and verified by English/native language bilinguals. And the process iterated until they agreed.

The code was written by a variety of nationals -- me, an Israeli and an Egyptian for the former; and a whole bunch of nationalities for the latter -- in the English-based computer language of choice (C), with comments in English where necessary. Any other approach would have been silly.

I'm a mono-plus-several-less-than-halfs-glot, but my linguistic skills are irrelevant unless you are truly advocating that every programmer should learn every human language on the planet?

If not, in fact, even if you were, the only sensible solution is to have a lingua-franca so that each programmer only needs to learn their native language + that lingua-franca, not all 7000 natural languages, nor even the 200 or so in common usage around the world.

Whatever language is the lingua-franca, one bunch of programmers will have an advantage. As is, that means I haven't had to become properly bilingual. Had it been French or German or Italian or Spanish, I probably would have still had a successful career. How would you do if it suddenly changed to Mandarin Chinese or Urdu or one of the Cyrillic-based languages?

"you’re an English monoglot who refuses to spell imported words or even people’s names the way they wish them spelled..."

I never suggested for one moment that text should only be ASCII. Only that the Unicode mechanism whereby I can receive a file of "text" and have absolutely no way of determining which of the many Unicode encodings it contains -- nor even if it actually contains any of them -- is a nonsense.

Unicode as it stands is multiple fixed and variable length binary formats with no identifiers or headers. As I said, imagine taking a directory of mixed format image files, stripping out the headers and then writing a program to work out how to display them all. That's a direct analogy to the situation today with "unicode". It is farcical!

"In other words, put up or shut up."

When you address my question about how you are going to solve the problem of sorting names written in Latin, Cyrillic, Arabic, Farsi, Thai, Chinese, Japanese, Urdu, Gaelic, Ge'ez, Osmanya, Tifinagh ... et al. I'll consider it.

Because until then, you've only partially -- the Latin part -- solved the real problem. And that part was "solved" decades ago.

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
Re^9: best sort by tchrist (Pilgrim) on Aug 18, 2011 at 00:08 UTC
You seem to have managed to confuse Unicode with serialization formats. That’s a shame.

As for knowing what sort of data content you have, that has never been Unicode’s job. That is something one must relegate to a higher-level protocol. It’s just like with receiving a file over the web. If you expect to know what to do with the file, then you need various bits of metadata to know how to handle it. If someone sends you a file but doesn’t tell you what’s in it, that’s a personal problem. It’s not a Unicode problem at all. You have a social problem, which is something else altogether. You need a better higher-level protocol is all.

That said, because Unicode was both exceedingly careful and also reasonably clever about how it defined its approved variable‐width serialization schemes, I have no trouble in the world at all knowing which of the three I have:

`$ perl -CS -S unichars Singleton > sample-one`
`$ iconv -f UTF-8 -t UTF-16 < sample-one > sample-two`
`$ iconv -f UTF-8 -t UTF-32 < sample-one > sample-three`
`$ file sample-{one,two,three}`
`sample-one: UTF-8 Unicode text`
`sample-two: Little-endian UTF-16 Unicode text`
`sample-three: Unicode text, UTF-32, little-endian`

There aren’t many different flavors of Unicode as you frequently allege. There can be only one. That’s what the “uni” part is about. That’s why things like Perl and XML and HTML are always all Unicode, all the time: because it always means the same thing.

It makes no matter whether you say `chr(233)` in Perl, `&eacute;` in HTML, or `&#xE9;` in XML. Those are always the same character, because the Unicode mapping of assigned code points to characters is always the same and guaranteed never to change. And that character is always LATIN SMALL LETTER E WITH ACUTE. Similarly, something like HTML’s `&eacute;` always maps to Unicode code point 233. It’s not like the same character is a code point 142 on a Mac and code point 221 on NextStep. That would be wrong.
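As a rough illustration of the code-point claim (a sketch added for clarity, not part of the original post, and written in Python rather than Perl), the snippet below checks that the numeric code point, the HTML named entity, and a hexadecimal character reference all yield the same character. The variable names are made up for the example, and the Mac Roman comparison is only there to show what a legacy, platform-specific byte value looks like.

```python
import html
import unicodedata

# Three spellings of one character: a numeric code point, an HTML named
# entity, and a hexadecimal character reference as used in XML.
from_code_point = chr(233)
from_html_entity = html.unescape("&eacute;")
from_hex_reference = html.unescape("&#xE9;")

# All three resolve to the same single character.
same_character = from_code_point == from_html_entity == from_hex_reference

# The character's Unicode name and code point are the same on any platform.
name = unicodedata.name(from_code_point)
code_point = ord(from_html_entity)

# By contrast, a legacy single-byte encoding assigns its own byte value:
# in Mac Roman the same letter is stored as byte 142.
mac_byte = from_code_point.encode("mac_roman")[0]

print(same_character, name, code_point, mac_byte)
```

The point of the comparison is that 142 is a property of one legacy encoding, while 233 is a property of the character itself.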
That’s why modern systems like Perl and HTML and XML are 100% Unicode: so that assigned code points always mean the same character. There is only one flavor of Unicode, or it wouldn’t be Unicode. I suppose you might stump for Unicode 6.0 being a different flavor from Unicode 5.0, but that seems to be putting too fine a point on it. In any event, Unicode’s strong stability guarantees avoid train wrecks in that arena.

Which is quite all the time I have for a belligerent anonymous coward, and then some.
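For readers wondering how the three serializations mentioned earlier in this reply can be told apart without outside metadata, here is a minimal sketch, again in Python and not from the original post. It assumes the UTF-16 and UTF-32 data begin with a byte-order mark, as the `iconv` output above evidently does, and falls back to a strict UTF-8 validity check otherwise. The function name `sniff_encoding` is hypothetical, and this makes no claim about how the `file` utility itself decides.

```python
import codecs

# Byte-order marks, with the 4-byte UTF-32 marks checked before the
# 2-byte UTF-16 marks, because the UTF-32LE mark (FF FE 00 00) begins
# with the UTF-16LE mark (FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32LE"),
    (codecs.BOM_UTF32_BE, "UTF-32BE"),
    (codecs.BOM_UTF8, "UTF-8 (with BOM)"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
]


def sniff_encoding(data: bytes) -> str:
    """Guess the Unicode serialization of data from its leading bytes."""
    for bom, label in BOMS:
        if data.startswith(bom):
            return label
    # No mark: accept the data as UTF-8 only if it decodes strictly.
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "unknown"
    return "UTF-8"


text = "Singleton \u00e9"
samples = {
    "sample-one": text.encode("utf-8"),
    "sample-two": codecs.BOM_UTF16_LE + text.encode("utf-16-le"),
    "sample-three": codecs.BOM_UTF32_LE + text.encode("utf-32-le"),
}
for label, data in samples.items():
    print(label, sniff_encoding(data))
```

Two caveats apply to this sketch: plain ASCII is reported as UTF-8 because it is valid UTF-8, and data with no mark that is not valid UTF-8 is simply reported as unknown, which is where a higher-level protocol has to take over.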
Re^10: best sort by BrowserUk (Patriarch) on Aug 18, 2011 at 01:45 UTC
"you need various bits of metadata to know how to handle it."

Ah. So when a Unicode file is sent somewhere, it needs to be accompanied by another file containing metadata to identify which "unicode" the first contains. But what encoding is the metadata in? Now you need another file ... Oh yeah! That's great design.

"a belligerent anonymous coward,"

Translations:
belligerent: someone who doesn't immediately agree with the VIP tchrist.
anonymous coward: someone you can't intimidate when you run out of good arguments.
Re^11: best sort by tchrist (Pilgrim) on Aug 18, 2011 at 16:24 UTC
Re^12: best sort by BrowserUk (Patriarch) on Aug 18, 2011 at 18:18 UTC