Hi all,
I am trying to read an XML file, extract the parts I need, and write them out as a new XML file. I can collect the data successfully with the following script:
use strict;
use warnings;

undef $/;
open OUT, ">:encoding(UTF-8)", "D:/wordpress/wordpress_categories.xml";
open (IN, "<:encoding(UTF-8)", "D:/wordpress/wordpress.2011-04-12.xml");
my $line = <IN>;
while ($line =~ /<title>(.*?)<\/title>\n\t\t<link>(.*?)<\/link>\n\t\t<pubDate>(.*?)<\/pubDate>\n\t\t<dc:creator>(.*?)<\/dc:creator>\n\t\t\n\t\t<category>(.*?)<\/category>/i) {
    $line =~ s/(<title>(.*?)<\/title>\n\t\t<link>(.*?)<\/link>\n\t\t<pubDate>(.*?)<\/pubDate>\n\t\t<dc\:creator>(.*?)<\/dc\:creator>\n\t\t\n\t\t<category>(.*?)<\/category>)//i;
    print OUT "$1\n\n";
}
close (IN);
close (OUT);
However, the output XML does not show the non-English characters correctly. Below is the wrong output:
<title>எழுத ‹வேண்டிய கட்டு‹ரைள்</title>
<link>http://naatkurippugal.wordpress.com/?p=501</link>
<pubDate>Wed, 30 Nov -0001 00:00:00 +0000</pubDate>
<dc:creator><![CDATA[ஸ்ரீஹரி]]></dc:creator>
<category><![CDATA[கட்டு‹ரை]]></category>
Can anybody help me solve this problem?
Thanks in advance,
srikrishnan
In reply to How to write a utf-8 file by srikrishnan