PerlMonks |
I have a string that I'm getting from a UTF-8 database that contains a UTF-8-only character in amongst the ordinary ones. The string is "Mali Lošinj". I'm trying to turn the s with caron (š) into a normal s. My main script is doing a similar thing with other European characters (removing accents and other diacritics from them), but the other characters are all also present in ISO-8859-1. It's just the š character that I can't get to work, and I assume I'll have the same problem with any other UTF-8-only characters I come across in the future. I have reduced the problem down to the following code:
I assumed I needed "use utf8" as the Š characters are "in the code", but if I uncomment "use utf8", none of the translations or substitutions have any effect; I just print out the original text each time. When "use utf8" is commented out, I get the following:

Mali Lošinj1(Mali Los�inj)2(Mali Lossinj)3(Mali Lossinj)4(Mali Lošinj)5(Mali Losinj)6(Mali Lossinj)

The first one converts it to an "s" followed by an unidentified character (however, this should do nothing, because the pattern is the wrong case).

So, what am I not understanding here? And what would you suggest as the most appropriate course of action? I have a single tr/// line altering 51 other ISO-8859-1/UTF-8 characters without any problem.

Cheers.
MattLG

In reply to utf8 characters in tr/// or s/// by MattLG
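A minimal sketch of one likely explanation for the behaviour described above, not a reconstruction of the original test script. Without "use utf8", a literal š in the source is the two bytes 0xC5 0xA1, so tr/// treats it as two separate characters; that would account for both "Lossinj" and the stray byte left behind by the upper-case Š (0xC5 0xA0). With "use utf8", the literal is the single character U+0161, which cannot match undecoded bytes coming from the database. The sketch assumes the database hands back raw UTF-8 bytes; the names strip_carons, strip_marks and $from_db are hypothetical.

```perl
use strict;
use warnings;
use utf8;                           # literals in this file are characters, not bytes
use Encode qw(decode encode);
use Unicode::Normalize qw(NFD);

# Hypothetical helper: raw UTF-8 bytes in, raw UTF-8 bytes out.
sub strip_carons {
    my ($bytes) = @_;
    my $text = decode('UTF-8', $bytes);   # bytes -> characters
    $text =~ tr/šŠ/sS/;                   # one character maps to one character
    return encode('UTF-8', $text);        # characters -> bytes
}

# Hypothetical, more general variant: decompose each accented letter into
# base letter + combining mark, then delete the combining marks.
sub strip_marks {
    my ($bytes) = @_;
    my $text = NFD(decode('UTF-8', $bytes));
    $text =~ s/\p{Mn}//g;                 # \p{Mn} = nonspacing (combining) marks
    return encode('UTF-8', $text);
}

# Stand-in for the database value: "Mali Lošinj" as UTF-8 bytes (assumption).
my $from_db = "Mali Lo\xC5\xA1inj";

print strip_carons($from_db), "\n";
print strip_marks($from_db), "\n";
```

The second helper only covers characters that have a canonical decomposition; letters such as ø or đ do not decompose this way and would still need an explicit tr/// entry.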