Can someone point me to a comprehensive guide to encodings?
I'm still looking for that one, too. Don't expect too much, though; as far as I can tell, the biggest problem is the variety of forms in which the related issues show up.
It isn't a Perl problem, it isn't a system/OS/libs problem, it isn't a setup problem, it isn't an I/O problem: it is more than a bit of all of these.
Even once your Perl skills are sharp enough to know what's what, you may still be surprised at times.
Make sure you use a Unicode editor when looking at code or data that should contain Unicode characters; I'm not sure Vim is your best friend here. Take care: if you see correct Cyrillic characters when editing, this might already be a sign that the data is _not_ Unicode, since the editor may simply be displaying the bytes in a legacy 8-bit encoding. This has been the biggest issue for me personally so far: being sure about what is actually in the file/data, as opposed to how it looks (in editors, web pages, files, etc.).
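One way to look at what is in the data rather than how it is displayed is to dump the raw bytes and test whether they decode as strict UTF-8. The following is only a minimal sketch: `hex_dump` and `is_valid_utf8` are hypothetical helper names, and the sample bytes (the two Cyrillic letters U+0416 U+0438 in UTF-8 and in Windows-1251) are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK LEAVE_SRC);

# Hypothetical helper: show the bytes as hex, so we see what is stored,
# not what an editor or terminal chooses to display.
sub hex_dump {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02X', ord } split //, $bytes;
}

# Hypothetical helper: true if the byte string decodes as strict UTF-8.
# FB_CROAK makes decode() die on malformed input; eval catches that.
sub is_valid_utf8 {
    my ($bytes) = @_;
    return eval { decode('UTF-8', $bytes, FB_CROAK | LEAVE_SRC); 1 } ? 1 : 0;
}

# Illustrative data: the same two Cyrillic letters in two encodings.
my $utf8_bytes   = "\xD0\x96\xD0\xB8";   # U+0416 U+0438 as UTF-8
my $cp1251_bytes = "\xC6\xE8";           # the same letters as Windows-1251

for my $bytes ($utf8_bytes, $cp1251_bytes) {
    printf "%-12s valid UTF-8: %s\n",
        hex_dump($bytes), is_valid_utf8($bytes) ? 'yes' : 'no';
}
```

Both byte strings can look identical in an editor that guesses the encoding for you; only the hex dump and the strict decode show the difference. Note that passing the check does not prove the data was meant as UTF-8, only that it is well-formed as such.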
And be prepared: other people are confused too, and it is not unusual to get files, mails, or HTML pages where the 'announced' encoding is one thing, whereas the actual content is a real soup of Unicode and non-Unicode characters.
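For such a soup, one defensive approach is to trust the announced encoding only when the bytes are really valid in it, and to fall back to a guess otherwise. This is a rough sketch under stated assumptions: `decode_with_fallback` is a hypothetical helper, and the choice of `cp1251` as the fallback is just an example guess, not something you can know in general.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK LEAVE_SRC);

# Hypothetical helper: decode one chunk of bytes with the announced
# encoding if that works strictly, else with an assumed fallback.
# Returns the decoded text and the name of the encoding actually used.
sub decode_with_fallback {
    my ($bytes, $announced, $fallback) = @_;
    my $text = eval { decode($announced, $bytes, FB_CROAK | LEAVE_SRC) };
    return ($text, $announced) if defined $text;
    return (decode($fallback, $bytes), $fallback);
}

# Illustrative "soup": the first line is UTF-8, the second Windows-1251.
my @lines = ("\xD0\x96\xD0\xB8", "\xC6\xE8");

for my $line (@lines) {
    my ($text, $used) = decode_with_fallback($line, 'UTF-8', 'cp1251');
    printf "decoded as %s: %d character(s)\n", $used, length $text;
}
```

The fallback is still a guess: almost any byte sequence is "valid" in an 8-bit encoding, so this can recover text but cannot tell you which legacy encoding the sender really used.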
I hope, oh, I really, really hope that all these headaches will disappear as soon as virtually everyone and everything lines up behind Unicode. But I'm afraid that will still take quite a while.