Hello,
I am using the Perl Encode module to convert UTF-8 data from a database to Latin-1 when writing it to a file.
Basically, the code is:
open(my $fh, '>:encoding(iso-8859-1)', $file) or die "Cannot open $file: $!";
It works fine except that some characters, such as quotes, double quotes, dashes, and apostrophes, are written as escape sequences, for example:
“ becomes \x{201c}
– becomes \x{2013}
The final Latin-1 output file is an XML file. Is there any way to convert these to the proper characters in Latin-1? Since the output is XML, should numeric character references be used instead? Are these escapes inserted because the characters have no Latin-1 equivalent after the conversion from UTF-8? Is there a module or subroutine that could do this conversion for me?
Thanks
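One possible approach, offered as a rough sketch rather than a definitive fix: Encode's `encode` function accepts a CHECK argument, and `Encode::FB_XMLCREF` asks it to replace characters that have no Latin-1 equivalent with XML numeric character references such as `&#x201c;`. The helper name `to_latin1_xml` and the sample string are assumptions made up for illustration; they are not part of the original code.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: turn a character string into Latin-1 bytes,
# replacing unmappable characters with XML numeric character references.
sub to_latin1_xml {
    my ($text) = @_;    # work on a copy; encode() with a CHECK value may modify its argument
    return encode('iso-8859-1', $text, Encode::FB_XMLCREF);
}

# Sample data (assumed): curly quotes and an en dash are not in Latin-1, e-acute is.
my $bytes = to_latin1_xml("\x{201c}caf\x{e9}\x{201d} \x{2013} ok");

# The string is already encoded, so write it as raw bytes without an :encoding layer.
binmode(STDOUT, ':raw');
print $bytes, "\n";
```

This only handles characters outside Latin-1. Markup-significant characters such as `&` and `<` in the data would still need to be escaped separately before the text goes into the XML file.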