Now, one can only guess at what is happening, but one possibility worth looking into is that a plain ISO-Latin-1 text string is being concatenated with something that Perl has flagged as a UTF-8 string. Whenever that happens, Perl will "promote" the ISO-Latin-1 string to UTF-8, turning each byte with a value >= 128 into two bytes.
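To make the promotion concrete, here is a minimal sketch. The sample strings are made up for illustration, and it only demonstrates the upgrade itself, not whatever the site code actually does.

```perl
use strict;
use warnings;
use Encode qw(encode_utf8);

# A plain byte string holding one Latin-1 character: e-acute (0xE9).
my $latin1 = "caf\xE9";

# A string Perl flags as UTF-8 because it holds a character above 0xFF.
my $wide = "\x{263A}";

# Concatenation upgrades the Latin-1 operand to Perl's internal UTF-8 form.
my $joined = $latin1 . $wide;

printf "latin1 flagged as UTF-8: %s\n", utf8::is_utf8($latin1) ? "yes" : "no";
printf "joined flagged as UTF-8: %s\n", utf8::is_utf8($joined) ? "yes" : "no";

# Written out as UTF-8, the single byte 0xE9 becomes the two bytes 0xC3 0xA9.
my $bytes = encode_utf8($latin1);
printf "Latin-1: %d bytes, UTF-8: %d bytes\n", length($latin1), length($bytes);
```

If those UTF-8 bytes are then served on a page declared as ISO-Latin-1, each accented character shows up as two garbage characters.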
A possible fix, which is on the safe side because it is applicable everywhere, is to turn every non-ASCII character into an entity: either a named entity, as produced by HTML::Entities, or a numerical entity like &#165;, where the number is nothing but the ordinal character code in the Unicode/Latin-1 character set.
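The numerical-entity variant can be sketched in a few lines of core Perl. The helper name to_numeric_entities is hypothetical, and the sketch assumes the input is a character string (Latin-1 or already decoded), not raw UTF-8 bytes.

```perl
use strict;
use warnings;

# Hypothetical helper: replace every character above 0x7F with &#NNN;,
# where NNN is the character's ordinal value.
sub to_numeric_entities {
    my ($text) = @_;
    $text =~ s/([^\x00-\x7F])/'&#' . ord($1) . ';'/ge;
    return $text;
}

print to_numeric_entities("price: \xA5 100"), "\n";   # price: &#165; 100
```

The result is pure ASCII, so it survives unchanged whether the page ends up being served as ISO-Latin-1 or as UTF-8.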
N.B. The characters in the above posts are actually not in the ISO-Latin-1 repertoire. They are in the Windows character set, though, which is compatible with ISO-Latin-1 and adds a few extra printable characters. So, to follow the rules, their numerical value should be replaced by their ordinal value in Unicode.
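As a rough sketch of that replacement, assuming the bytes really are Windows-1252 (Encode's cp1252) and using curly quotes as a made-up example, one can decode first and only then build the entities:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Curly quotes are bytes 0x93 and 0x94 in Windows-1252; in ISO-Latin-1
# those byte values are control characters, not printable ones.
my $bytes = "\x93quoted\x94";

# Decode as Windows-1252 so each character gets its real Unicode code point.
my $chars = decode('cp1252', $bytes);

# Build numeric entities from the Unicode ordinal, not from the byte value.
(my $html = $chars) =~ s/([^\x00-\x7F])/'&#' . ord($1) . ';'/ge;

print "$html\n";   # &#8220;quoted&#8221;
```

Without the decode step the same substitution would emit &#147; and &#148;, which are byte values from the Windows character set rather than Unicode ordinals.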
Update: The author of my first example fixed up his node, thereby removing my evidence. :( Well, I found another one here.
Re: ISO-Latin-1 as node and UTF-8 in frontpage (not for me)
by tye (Sage) on Sep 14, 2003 at 05:08 UTC
by bart (Canon) on Sep 14, 2003 at 16:42 UTC
by allolex (Curate) on Sep 14, 2003 at 10:51 UTC

Re: ISO-Latin-1 as node and UTF-8 in frontpage
by dws (Chancellor) on Sep 14, 2003 at 16:34 UTC

Re: ISO-Latin-1 as node and UTF-8 in frontpage
by mandog (Curate) on Sep 14, 2003 at 22:50 UTC

Re: ISO-Latin-1 as node and UTF-8 in frontpage
by castaway (Parson) on Sep 16, 2003 at 07:31 UTC