Re: Bug in Template?

Hello packetstormer

You said
> (using the same dbh call with UTF8 enabled)
That probably means the strings are in Perl's internal, decoded (character) form. Encode them to UTF-8 bytes before you pass them to Template.

#!/usr/bin/perl

use strict;
use warnings;

use Encode qw(encode decode);
use Template;

my @chars_not_encoded=();
my @chars_encoded=();
#foreach my $code ( hex('3041') .. hex('3096') ){
foreach my $code ( hex('00C0') .. hex('00F0') ){
    push @chars_not_encoded, chr($code);
    push @chars_encoded, encode('utf8', chr($code)) ;
};
my $t =Template->new();

#corrupt output
$t->process("test.tmpl", {lines=>\@chars_not_encoded}, "log_noenc" ) or die $t->error();

#OK
$t->process("test.tmpl", {lines=>\@chars_encoded}, "log_enc" ) or die $t->error();

And the template:

<html>
<head>
    <meta HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8">
</head>
<body>
    [% FOREACH item IN lines %]
        item=#[% item %]#<br>
    [% END %]
</body>
</html>

I am also confused about how Template handles encoding, and there seems to be a lot to read about these problems (for example, Template::Provider::Encoding)... good luck.
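As an alternative to encoding every value by hand, the strings can stay decoded and an output layer can do the encoding. The following is only a minimal core-Perl sketch of that idea: it does not use Template at all, the in-memory filehandle merely stands in for a real output file, and the variable names are made up for illustration. Whether Template's own output handling behaves the same way is not checked here.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode);

# A decoded (character) string holding one character, U+00E9.
my $char = chr(0xE9);

# Option 1: encode by hand, as the script above does for each element.
my $by_hand = encode('UTF-8', $char);

# Option 2: print the decoded string through an encoding layer.
# The in-memory filehandle is an assumption standing in for an output file.
my $via_layer = '';
open my $fh, '>:encoding(UTF-8)', \$via_layer or die "open failed: $!";
print {$fh} $char;
close $fh or die "close failed: $!";

# Show the resulting octets as hex; both lines should be identical.
printf "by hand:   %s\n", unpack 'H*', $by_hand;
printf "via layer: %s\n", unpack 'H*', $via_layer;
```

Both approaches should produce the same two bytes. The point is to encode exactly once, either by hand or through a layer, and never both.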

Re^2: Bug in Template? by remiah (Hermit) on Mar 22, 2012 at 00:53 UTC
Oh... it seems I am totally confused. I'll post again later once I have cleared my mind.
Re^3: Bug in Template? by remiah (Hermit) on Mar 22, 2012 at 03:54 UTC
This does not seem to be a problem with Template after all, and I would also like advice on it. The é in “Séan” is probably U+00E9 in the Unicode table (http://www.utf8-chartable.de/unicode-utf8-table.pl). I thought that decoding it to Perl's internal utf8 and then encoding it to UTF-8 before passing it to Template would work, but it does not. Even without Template, the behavior is strange.

#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(is_utf8 encode decode);
use Template;

my(@raw, @decoded_internal_utf8, @encoded_raw_utf8, @encoded_internal_utf8);

my @chars=hex('00C0') .. hex('00F0');   #target characters
#my @chars=hex('3041') .. hex('3096');  #hiragana
foreach my $code ( @chars ){
    my($raw, $chr);
    $raw =chr($code);
    if ( is_utf8($raw) ){
        $chr=$raw;
    } else {
        $chr=decode('utf8',$raw);
    }
    push @raw, $raw;
    push @decoded_internal_utf8, $chr;
    push @encoded_raw_utf8 , encode('utf8', $raw);
    push @encoded_internal_utf8, encode('utf8', $chr);
}
print "======================\n";
print "perl=$^X : version=$]\n";
print "1.###raw\n";
print "#$_#\n" for @raw;
print "2.###decoded_internal_utf8\n";
#print "#$_#\n" for @decoded_internal_utf8;
print "3.###encoded_raw_utf8\n";
print "#$_#\n" for @encoded_raw_utf8;
print "4.###encoded_internal_utf8\n";
print "#$_#\n" for @encoded_internal_utf8;

Strangely, only No. 3 works in this case. I usually print characters the No. 4 way. Japanese characters such as hiragana seem to have no problem (for example, '3041' .. '3096').

I saw a similar problem at "Why Doesn't Text::CSV_XS Print Valid UTF-8 Text When Used With the open Pragma?". At that time I didn't understand it well and thought a newer version would have no problem... Is this the same trouble? I tried with 5.012002 and 5.014002. They print exactly the same output except for the version number.
Re^4: Bug in Template? by Anonymous Monk on Mar 22, 2012 at 08:27 UTC
I'm confused by your code; what is it supposed to demonstrate? perlunitut: Unicode in Perl warns against using is_utf8, so I wouldn't use it.

Consider

$ perl -le " print chr hex q/C0/ " | od -tx1
0000000 c0 0d 0a
0000003

When viewed as Windows-1252 it is À

And this

$ perl -le " binmode STDOUT , q/:utf8/; print chr hex q/C0/ " | od -tx1
0000000 c3 80 0d 0a
0000004

When viewed as Windows-1252 it is Ã€ but viewed as UTF-8 it is À

And this

$ perl -MEncode -le " print decode(q/utf8/, chr hex q/C0/ )" | od -tx1
Wide character in print at -e line 1.
0000000 ef bf bd 0d 0a
0000005

When viewed as Windows-1252 it is ï¿½ but viewed as UTF-8 it is �

If you search for ef bf bd you'll see lots of questions about this erroneous conversion.

So if you want to treat chr 192 ( `perl -le " print hex q/C0/ "` ) as Unicode, you have to encode it, because characters 0 to 255 are also valid Latin-1 and are not output as UTF-8:

$ perl -le " print chr hex q/C0/ " | od -tx1
0000000 c0 0d 0a
0000003
$ perl -le " print chr 255 " | od -tx1
0000000 ff 0d 0a
0000003
$ perl -le " print chr 256 " | od -tx1
Wide character in print at -e line 1.
0000000 c4 80 0d 0a
0000004

Or, if you want chr 192 to return Unicode, use the encoding pragma (the utf8 pragma doesn't affect chr):

$ perl -le " use encoding q/utf8/; print chr 192 " | od -tx1
0000000 c3 80 0a
0000003
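The one-liners above can be collected into a single script that does not depend on od or on the terminal's encoding. This is a sketch only: `hex_bytes` is a hypothetical helper introduced for illustration, and the script uses nothing beyond the core Encode module. It suggests why the earlier test behaves differently for hiragana: chr(0x3041) is already flagged as utf8, so the is_utf8 branch skips the decode step, whereas chr(0xC0) goes through decode and is replaced by U+FFFD.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: render the octets of a byte string as hex pairs.
sub hex_bytes { return join ' ', unpack '(H2)*', $_[0] }

my $latin = chr(0xC0);      # U+00C0, stored internally as one byte
my $kana  = chr(0x3041);    # U+3041, a hiragana character

# chr() already returns a character, so encoding it gives valid UTF-8.
my $encoded_latin = encode('UTF-8', $latin);
my $encoded_kana  = encode('UTF-8', $kana);

# The mistake: decoding a character string as if it were UTF-8 bytes.
# A lone 0xC0 byte is malformed UTF-8, so decode substitutes U+FFFD.
my $wrongly_decoded = decode('UTF-8', $latin);
my $encoded_wrong   = encode('UTF-8', $wrongly_decoded);

print "encode(chr 0xC0):           ", hex_bytes($encoded_latin), "\n";
print "encode(chr 0x3041):         ", hex_bytes($encoded_kana),  "\n";
print "encode(decode(chr 0xC0)):   ", hex_bytes($encoded_wrong), "\n";
```

The last line should show the same ef bf bd sequence as the od output above, which is the UTF-8 encoding of the replacement character.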
Re^5: Bug in Template? by Anonymous Monk on Mar 22, 2012 at 08:33 UTC
Re^5: Bug in Template? by remiah (Hermit) on Mar 22, 2012 at 10:58 UTC