"Somewhere" along the path in your new environment, at least one component changed the way it handles encoding/decoding data.

You will need to track down every border between your components and make sure that all data is in the format you expect. Preferably, you transfer all data between your components encoded as UTF-8, decode it to Unicode on input/retrieval, and encode it again on output (the web page).
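As a rough illustration of that "decode on input, encode on output" rule, here is a minimal sketch using the core Encode module. The sample string and the variable names are made up for the example; they are not taken from the original problem.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Bytes as they might arrive from a file, socket or database:
# the word "Käse" encoded as UTF-8 (the "ä" takes two bytes).
my $bytes = "K\xC3\xA4se";

# Input border: bytes -> Perl character string.
# FB_CROAK dies on malformed input; LEAVE_SRC keeps $bytes unmodified.
my $text = decode('UTF-8', $bytes, Encode::FB_CROAK | Encode::LEAVE_SRC);

# Inside the program you work with characters, not bytes.
my $char_count = length $text;     # 4 characters
my $byte_count = length $bytes;    # 5 bytes

# Combining with other character data (for example a template) is safe
# as long as everything has been decoded first.
my $page = "<p>$text</p>";

# Output border: characters -> UTF-8 bytes.
my $out = encode('UTF-8', $page);
```

If the two lengths come out equal for data that contains non-ASCII characters, the data was most likely never decoded at that border.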

The checklist is roughly:

  1. Find out how the data is stored in the database
  2. Find out how the database driver delivers the data. Preferably make it deliver the data encoded as UTF-8 and have the DBD decode it to Unicode.
  3. Find out how the data is stored in text files on disk. Preferably encode them as UTF-8.
  4. Find out how the data is read from the files. Preferably decode it to Unicode.
  5. Find out how the data is converted/concatenated with other data (for example, templates). Either encode to the target character set or decode to Unicode.
  6. Find out how the data is written. Make sure that the encoding used for the data matches the encoding used for the headers and the encoding stated in the HTML page.
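For the file and output steps (3, 4 and 6), a sketch along the following lines may help. It assumes UTF-8 as the target encoding and uses a temporary file purely for demonstration; the variable names and the sample text are hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my $charset = 'UTF-8';    # assumed encoding for files and for the web page

# A scratch file, only so that the example is self-contained.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh;

# Step 3: write text through an explicit encoding layer, so the
# characters are stored on disk as UTF-8 bytes.
open my $out, ">:encoding($charset)", $path or die "Cannot write $path: $!";
print {$out} "Gr\x{FC}\x{DF}e\n";    # "Grüße" as characters
close $out or die "Cannot close $path: $!";

# Step 4: read through the same layer, so the bytes are decoded
# back to characters on input.
open my $in, "<:encoding($charset)", $path or die "Cannot read $path: $!";
my $line = <$in>;
close $in;
chomp $line;

# Step 6: state the same encoding in the HTTP header and in the page.
my $header = "Content-Type: text/html; charset=$charset\r\n\r\n";
my $meta   = qq{<meta http-equiv="Content-Type" content="text/html; charset=$charset">};

# When printing the page, put the matching layer on the output handle:
# binmode STDOUT, ":encoding($charset)";
```

Keeping the charset in one variable is just one way to make sure the output layer, the header and the HTML declaration cannot drift apart.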

In reply to Re: UTF-8 trouble moving from perl 5.8.5 to 5.10.1 by Corion
in thread UTF-8 trouble moving from perl 5.8.5 to 5.10.1 by manni
