I do think the table being latin1 is part of the problem. On the other hand, the application that fills the table seems to use a reasonable encoding (UTF-8). If you change the table to something unicodey, that application most probably will NOT automagically insert a single Unicode character instead of the current 4 bytes.
Probably the easier solution is to check for bytes between 0x80 and 0x9F (because these are not defined for ISO 8859-1, the "official" Latin1). If they are not used otherwise in your variant of Latin1, it might be feasible to try it with Encode::decode.
What happens if you insert something like
{
use Encode qw(decode :fallbacks);
$text = decode('UTF-8', $text, FB_WARN);
}
after reading $text from the database?
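If that experiment looks promising, the check for bytes between 0x80 and 0x9F could be combined with the decode step roughly as follows. This is only a sketch under assumptions: decode_db_text is a made-up helper name, it assumes the database driver hands back the raw bytes unchanged, and it assumes your Latin1 variant does not use 0x80 to 0x9F for real characters. Note that UTF-8 sequences without any byte in that range (for example a lone é) will not be caught by this heuristic.

```perl
use strict;
use warnings;
use Encode qw(decode :fallbacks);

# Hypothetical helper (not from the original post): takes the raw bytes
# read from the latin1 column and returns a character string.
sub decode_db_text {
    my ($bytes) = @_;
    return unless defined $bytes;    # NULL column: nothing to decode

    # Bytes in 0x80-0x9F hint at UTF-8 that was stored in the latin1 column.
    if ($bytes =~ /[\x80-\x9F]/) {
        my $copy  = $bytes;          # decode() with a CHECK value may modify its input
        my $chars = eval { decode('UTF-8', $copy, FB_CROAK) };
        return $chars if defined $chars;
    }

    # No suspicious bytes, or not valid UTF-8 after all: treat as Latin1.
    return decode('ISO-8859-1', $bytes);
}
```

You would then call $text = decode_db_text($text); right after fetching the row. FB_CROAK is used instead of FB_WARN here so that a failed decode can be caught with eval and the original bytes kept, rather than ending up with a truncated string.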