I have lots of names in a UTF-8 text file.
When I try to lowercase or uppercase them, any accented characters remain in their original case.
The following translation works fine for lowercasing a string:
tr/AÁÀÂÅÄÃBCÇDEÉÈÊËFGHIÍÌÎÏJKLMNÑOÓÒÔÖÕPQRSTUÚÙÛÜVWXYŸZ/aáàâåäãbcçdeéèêëfghiíìîïjklmnñoóòôöõpqrstuúùûüvwxyÿz/;
but it doesn't allow me to make cool use of things like \u for capitalizing words in a substitution.
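A minimal sketch of the kind of \u usage meant here, assuming the names are first decoded from UTF-8 bytes into Perl character strings with the core Encode module. The sample bytes and the variable names are hypothetical, and this is an illustration rather than a confirmed fix for the setup described below:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical input: the raw UTF-8 bytes for an upper-case French name,
# as they might arrive when read from the text file without any decoding.
my $bytes = "\xC3\x89LODIE \xC3\x82G\xC3\x89";

# Decode the bytes into a character string so that lc, uc, \u and \w
# operate on whole characters instead of individual bytes.
my $name = decode('UTF-8', $bytes);

# Lowercase everything, then capitalize the first letter of each word.
(my $title = lc $name) =~ s/(\w+)/\u$1/g;

# Encode back to UTF-8 bytes for output.
print encode('UTF-8', $title), "\n";
```

Whether decoding happens through Encode or through an I/O layer on the filehandle is a design choice; the point of the sketch is only that the case operations see decoded characters.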
I'm running Perl v5.8.9 built for darwin-2level on Mac OS X Leopard (standard distro).
I've already got use utf8; in the script.
My setlocale call refuses to work; it fails with the error message "Undefined subroutine &main::setlocale called".
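The wording of that error suggests that no setlocale function was ever imported into the main package. A minimal sketch of importing it from the core POSIX module follows; the locale name fr_FR.UTF-8 is an assumption and may not be installed on a given system, and this only addresses the undefined-subroutine error, not necessarily the case-conversion problem:

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_CTYPE);   # setlocale is provided by POSIX, not a builtin

# Query the current LC_CTYPE locale without changing it.
my $current = setlocale(LC_CTYPE);
print "current LC_CTYPE: $current\n";

# Try to switch to a French UTF-8 locale (assumed name).
# setlocale returns undef when the requested locale is unavailable.
my $result = setlocale(LC_CTYPE, 'fr_FR.UTF-8');
print defined $result
    ? "LC_CTYPE set to $result\n"
    : "requested locale is not available\n";
```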
My system locales are all variations on LC_CTYPE="en_US.UTF-8", which may be hindering my adventure (the names are French).
I'm sure I'm not the first person to experience this behaviour - but a lot of googling has led to nothing but others with success by adding "use utf8;" (which I already had).
Advice? Ideas?
I don't want to have to iterate over every character in the string manually. The tr above is not elegant, but it works.
Thanks for any assistance you can provide!