your repeated misuse of the noun 'weight' was changed to what you really meant, the verb 'weigh,' is that being pedantic?
For you to be able to point out my error and inform me of "what you really meant", you must have understood what I was trying to communicate; thus there is no purpose in pointing it out, and feeling the need to do so is the very definition of pedantry. So yes!
I work with a lot of programmers. They are definitely not pedants. I wish more of them were more "pedantic," in fact. I have never heard any of them refer to a "Unicode string" -
Hearsay. Appeal to authority.
"That which is asserted without evidence can be dismissed without evidence." -- but where's the fun in that!
they all say "UTF-8 string" which at least narrows it down to the encoding.
If they and/or you think that UTF-8 is the only form of Unicode, it's no wonder that you find Unicode confusing.
I don't; I find it eminently clear. So clear in fact that I can see its inherent flaws.
If you actually read the context, you'd discover that I was referring to files of Unicode data, each of which could be encoded in any of the many Unicode encodings. The strings being referred to can therefore be encoded in any one of those Unicode encodings, and so I referred to them collectively as "Unicode strings".
In exactly the same way as the Unicode Consortium themselves do:
- "The Unicode string <U+0061, U+FFFF, U+0062> is just a sequence of 3 Unicode characters. It is valid *for* use in internal processing, because"
- "String is a Unicode string type--an array of UTF-16 code units. That is the internal encoding of String is UTF-16."
- "Computing the length or position of a "character" in a Unicode string can be a little complicated, as there are four different approaches to ..."
- "Unicode String. A code unit sequence containing code units of a particular Unicode encoding form (whether well-formed or not)."
- "Are noncharacters invalid in Unicode strings and UTFs?"
- "A Unicode string with the sequence, say, <U+0300, U+0061> (a combining grave mark, followed by "a"), is "valid" Unicode in the sense that"
- "At this point, if one is transforming a Unicode string to NFD or NFKD, the process is complete"
- "That is, any sequence of bytes can be converted by each charset converter to a Unicode string, and that Unicode string would be converted"
- "Briefly stated, the Unicode Collation Algorithm takes an input Unicode string and a Collation Element Table, containing mapping data for characters"
- "the corresponding Unicode string sequence"
- "occur in a UTF-16 Unicode string; instead, the code unit sequence"
NOTE: the qualification.
- "Depending on the programming environment, a Unicode string may or may not also be required to be in the corresponding Unicode encoding form."
- And many, many more.
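To make the glossary definition quoted above ("a code unit sequence containing code units of a particular Unicode encoding form") concrete, here is a minimal sketch. It is in Python purely for illustration; the sample text and the `code_units` helper are assumptions of mine, not anything taken from the Consortium's documents. It encodes one short string in three encoding forms and counts the code units in each:

```python
# Hypothetical sample text: 'a', U+0300 COMBINING GRAVE ACCENT,
# and U+1F600, a code point outside the Basic Multilingual Plane.
text = "a\u0300\U0001F600"

# Size in bytes of one code unit for each encoding form.
# The "-le" variants are used so that no byte order mark is prepended.
UNIT_SIZE = {"utf-8": 1, "utf-16-le": 2, "utf-32-le": 4}


def code_units(s: str, encoding: str) -> int:
    """Count the code units needed to hold s in the given encoding form."""
    return len(s.encode(encoding)) // UNIT_SIZE[encoding]


counts = {enc: code_units(text, enc) for enc in UNIT_SIZE}
code_points = len(text)  # Python 3 str length is measured in code points

# Every encoding form round-trips to the same abstract text.
round_trips = all(text.encode(enc).decode(enc) == text for enc in UNIT_SIZE)

print(counts, code_points, round_trips)
```

The same three code points occupy 7, 4, or 3 code units depending on the encoding form, yet each sequence is a "Unicode string" in the sense quoted above. That is also why "the length" of such a string has more than one defensible answer.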
Using your own misunderstanding, and appeals to the authority of "lots of programmers you work with", to 'correct' my appropriate use of the term is further evidence.
Why so cavalier about inter-human communication?
Three days or a week from now, once I've forgotten what I intended to write in the OP, I'll re-read it and those errors will stand out like sore thumbs, and I'll likely correct them, as I often do.
In the meantime, look up the word aphasia.
If you had a lisp or a limp or a lazy eye and I used it as a way of trying to mock you, I would rightly be condemned for it.
With the rise and rise of 'Social' network sites: 'Computers are making people easier to use every day'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
In the absence of evidence, opinion is indistinguishable from prejudice.