moritz has asked for the wisdom of the Perl Monks concerning the following question:

I'm constantly running into trouble while making my web apps Unicode-safe.

My usual work flow is

I believe this is the only sane approach; otherwise you lose track of which string is a text string and which is not.

The problems start when I put non-ASCII characters into the templates. I looked through the documentation of HTML::Template (I submitted a patch for this one), HTML::Template::Compiled (tinita says she's working on it), Template, Text::Template and Template::Simple, and none of them even mentions encoding issues (I searched for 'encoding', 'charset', 'utf8', 'utf-8' in the docs).

The problem is that when I write non-ASCII characters into the template files, the template engine doesn't decode them into text strings, yet I supply text strings to populate the templates. The result is a mix of text and binary strings.
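To make the mixing concrete, here is a minimal sketch using only the core Encode module. The template string, the NAME placeholder and the plain s/// substitution are stand-ins I made up for a template engine that reads its file as raw bytes; no real template module is involved.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical template as raw UTF-8 octets, the way an engine sees it
# when it reads the file without decoding ("Grüße, NAME!").
my $template_bytes = "Gr\xC3\xBC\xC3\x9Fe, NAME!";

# A properly decoded text string supplied by the application ("Jürgen").
my $name = "J\x{FC}rgen";

# Mixing: bytes and text end up in the same string. Encoding the result
# treats every template octet as a character and encodes it a second time.
(my $mixed = $template_bytes) =~ s/NAME/$name/;
my $broken = encode('UTF-8', $mixed);

# Fix: decode the template first, so everything is text until output.
my $template_text = decode('UTF-8', $template_bytes);
(my $clean = $template_text) =~ s/NAME/$name/;
my $fixed = encode('UTF-8', $clean);

# Show both results as hex so the difference is visible on any terminal.
# In $broken the template's "ü" has become C3 83 C2 BC instead of C3 BC.
for my $pair ([broken => $broken], [fixed => $fixed]) {
    my ($label, $octets) = @$pair;
    printf "%-6s %s\n", $label,
        join ' ', map { sprintf '%02X', ord } split //, $octets;
}
```

The parameter comes out right in both cases; it is the template's own non-ASCII characters that get double-encoded when the template was never decoded.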

Now comes my question: Which template system provides sane handling of encodings? For me that's a good reason to switch to such a module.

My idea of "sane" is something along these lines: On opening the template files I can specify an encoding in which I want the file to be opened, and the template engine handles everything as text strings internally. Any other notion of what "sane" could mean is greatly appreciated as well.
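As a rough sketch of what I mean, assuming nothing beyond core Perl: the template file is opened through an :encoding layer so its content arrives as a text string, parameters are text strings too, and bytes only reappear at output. The slurp_template helper, the [% name %] placeholder and the regex "engine" are made up for illustration and are not any module's API.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read a whole template file as decoded text.
sub slurp_template {
    my ($path, $encoding) = @_;
    open my $fh, "<:encoding($encoding)", $path
        or die "Cannot open $path: $!";
    local $/;                 # slurp mode
    my $text = <$fh>;
    close $fh;
    return $text;
}

# Write a small UTF-8 template to a temporary file ("Grüße, [% name %]!").
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
binmode $tmp_fh;
print {$tmp_fh} "Gr\xC3\xBC\xC3\x9Fe, [% name %]!\n";
close $tmp_fh or die "Cannot close $path: $!";

# Everything from here on is a text string.
my $text = slurp_template($path, 'UTF-8');
my %vars = (name => "J\x{FC}rgen");

# Toy substitution standing in for a real template engine.
$text =~ s/\[% \s* (\w+) \s* %\]/$vars{$1}/gx;

# Encode exactly once, at the output boundary.
binmode STDOUT, ':encoding(UTF-8)';
print $text;
```

The point of the design is that the encoding is named in exactly two places, on input and on output, and nothing in between has to care.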

Rant: are encoding issues really so rare and unimportant that 4* out of 5 modules that I've looked at don't seem to care about them?

Is this a cultural issue? I could imagine that people whose native language can be fully expressed in ASCII characters tend not to care too much about charsets.

(*) Regarding Template::Toolkit: a friend told me it had the ENCODING option, but I couldn't find it in the docs; for me this is equivalent to "it doesn't exist".

Grepping through the TT tarball from CPAN I found the note "The ENCODING options needs testing and documenting." in the TODO file, so at least there is some awareness.

Update 1: small formatting updates.

Re: Handling Encoding in Templates
by Corion (Patriarch) on Nov 12, 2007 at 10:23 UTC

    Looking through the source of HTML::Template, I see that you can pass it a filehandle for the input template. So as an intermediate stopgap, you could open / binmode the template file yourself and then let HTML::Template handle it. Template::Simple seems to mostly render templates from scalars or scalar refs, so there the same technique could apply. Petal's templates are XML and it claims to handle (i.e. lets you specify) the encoding.

    But yes, it would be convenient to have this on a "template checklist" that compares the features or solutions the templating systems provide.
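A minimal sketch of that stopgap, assuming HTML::Template is installed and relying on the filehandle option to new() mentioned above. The in-memory "file", the variable names and the template content are made up for illustration; a real application would open the template file on disk with the same :encoding layer.

```perl
use strict;
use warnings;
use HTML::Template;

# Hypothetical template content as raw UTF-8 octets ("Grüße, ...").
my $template_bytes = qq{<p>Gr\xC3\xBC\xC3\x9Fe, <TMPL_VAR NAME="name"></p>\n};

# Open it yourself with a decoding layer, so HTML::Template reads
# characters rather than octets from the handle.
open my $fh, '<:encoding(UTF-8)', \$template_bytes
    or die "Cannot open in-memory template: $!";

my $tmpl = HTML::Template->new(filehandle => $fh);

# Parameters are text strings as well ("Jürgen").
$tmpl->param(name => "J\x{FC}rgen");

my $html = $tmpl->output;

# Encode once, on the way out.
binmode STDOUT, ':encoding(UTF-8)';
print $html;
```

This only covers the top-level template: anything pulled in through TMPL_INCLUDE is still opened by HTML::Template itself, so the layer chosen here does not apply to it.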

      So as an intermediate stopgap, you could open / binmode the template file yourself
      how would that help with includes? it's a solution for simple templates, yes, but as soon as you have includes, it no longer helps.
      (and as moritz said, i'm working on it for HTC)
      update: i implemented it and it will be in version 0.90.
Re: Handling Encoding in Templates
by Rhandom (Curate) on Nov 12, 2007 at 19:12 UTC
    Thanks to the patches by Carl Franks, Template::Alloy has included the ENCODING parameter since version 1.008 (which was released several weeks ago).

    This means that you can use ENCODING on TT2-style templates as well as on HTML::Template and Text::Tmpl templates, since Template::Alloy supports them all.

    The following example shows how to set the encoding. Although it only pulls the template from a string ref here, the encoding also works properly when a filename is used instead.
        use strict;
        use warnings;
        use Template::Alloy;

        my $t = Template::Alloy->new(ENCODING => 'UTF-8');
        my $in  = "[% foo %]|BAR ¥";
        my $foo = 'fü';
        my $out = '';
        $t->process(\$in, {foo => $foo}, \$out) || die $t->error;
        print "\$out: $out\n";

        # Prints
        # $out: fü|BAR ¥

    my @a=qw(random brilliant braindead); print $a[rand(@a)];
      Great, it works like a charm. Thanks to you and Carl Franks!
Re: Handling Encoding in Templates
by perrin (Chancellor) on Nov 12, 2007 at 18:31 UTC
Re: Handling Encoding in Templates
by Anonymous Monk on Nov 12, 2007 at 13:38 UTC