Hello wise monks
I've run into a problem that I haven't been able to solve for the past two days. Here's what it's about:
o I have a UTF-8 encoded XML file
o I parse it and want to write some parts of it to a MySQL database that is ISO-8859-1 encoded
Everything seems to work: I read the XML in, build a hash out of it with XML::Parser, and the data gets written to MySQL as well. But when I check the data in the table, it's UTF-8 again.
So I started playing with Text::Iconv, and came up with this:
--- some stuff above ---
$parser->parsefile(shift @ARGV, ProtocolEncoding => "ISO-8859-1");
--- some stuff in between ---
my $converter = Text::Iconv->new("UTF-8","ISO-8859-1");
while (my ($key, $value) = each(%attrs))
{
    push(@value_stack, { $key => $value });
    if ($program_hash{$current_filmid}{$key} eq '')
    {
        $program_hash{$current_filmid}{$key} = $converter->convert($value);
    } else {
        if ($key ne "EventId" && $key ne "KanalId")
        {
            $program_hash{$current_filmid}{"$key.$value"} = $converter->convert($value);
        }
    }
}
--- some stuff below ---
When I print the values out (e.g. print $converter->convert($value)."\n"; ) they look correct (i.e. ISO-8859-1 encoded), but once written to the DB they are UTF-8 again (meaning all special characters, like öäüéàè etc., show up as weird characters like Ã| etc.).
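Since a terminal can make either encoding look right, one way to check what a value really contains before it goes to the DB is to dump its bytes. Below is a minimal sketch, assuming only core Perl and the Encode module; the sample string and the helper name hex_bytes are made up for illustration and are not part of my script:

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: render a byte string as space-separated hex.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02x', ord } split //, $bytes;
}

# Sample character string containing "ö" (U+00F6), written as an escape
# so the example does not depend on the source file's own encoding.
my $text = "sch\x{f6}n";

my $latin1 = encode('ISO-8859-1', $text);  # "ö" is the single byte f6
my $utf8   = encode('UTF-8', $text);       # "ö" is the two bytes c3 b6

print 'ISO-8859-1: ', hex_bytes($latin1), "\n";
print 'UTF-8:      ', hex_bytes($utf8),   "\n";
```

Running the same dump on the output of $converter->convert($value) should show whether a special character is one byte (ISO-8859-1) or two (UTF-8) at that point in the script.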
I'm really going nuts here, and would appreciate any help provided for this.
If more of the source is needed just tell me.
Thanks in advance
Emanuel