in reply to perl unicode docs
The fact of the matter is, if the pattern switched to the Unicode character scheme then the pattern couldn't possibly match a single character in a UTF-8 string.
Why not?
If by "a single character" you mean a codepoint encoded in multiple bytes, then yes, that's the default mode.
If by "a single character" you mean a byte of a multi-byte UTF-8 sequence, then yes, even that's possible with the \C escape. (Yes, it's... weird, but it is implemented.) But I can't find any indication in the Unicode docs that this is what is meant.
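To make the two readings concrete, here is a minimal sketch (my own example, not taken from the docs). It assumes the core Encode module and gets at the bytes by encoding explicitly rather than by using \C. The variable names are just for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode);

# One character: U+263A, which takes three bytes in UTF-8.
my $chars = "\x{263A}";

# The same character as a string of UTF-8 encoded octets.
my $bytes = encode('UTF-8', $chars);

# On the character string, "." matches the whole codepoint at once.
my @char_matches = $chars =~ /(.)/gs;

# On the octet string, "." matches one byte at a time.
my @byte_matches = $bytes =~ /(.)/gs;

printf "characters: %d, bytes: %d\n",
    scalar @char_matches, scalar @byte_matches;
```

So whether the regex engine sees codepoints or bytes depends on whether you hand it a decoded character string or the encoded octets.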
I think it would be nice if the person who wrote the perlunicode docs had adhered to the basic tenets of unicode when describing perl's state of unicode awareness.
And where did he not? By the way, if you find places where the docs need improvement, don't whine about it, but submit patches.
Or, was the idea to dumb down the docs and present factually incorrect descriptions so that beginners who think that Unicode characters are the same as UTF-8 characters are not confused?
I don't think so. I also don't see how anything in the docs is factually incorrect.
Re^2: perl unicode docs
by 7stud (Deacon) on Mar 27, 2010 at 13:09 UTC
by moritz (Cardinal) on Mar 27, 2010 at 14:55 UTC